[ 
https://issues.apache.org/jira/browse/ARROW-2351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16420087#comment-16420087
 ] 

ASF GitHub Bot commented on ARROW-2351:
---------------------------------------

gaolizhou commented on a change in pull request #1803: ARROW-2351 [C++] 
StringBuilder::append(vector<string>...) not impleme…
URL: https://github.com/apache/arrow/pull/1803#discussion_r178223472
 
 

 ##########
 File path: cpp/src/arrow/builder.cc
 ##########
 @@ -1385,6 +1385,28 @@ const uint8_t* BinaryBuilder::GetValue(int64_t i, 
int32_t* out_length) const {
 
 StringBuilder::StringBuilder(MemoryPool* pool) : BinaryBuilder(utf8(), pool) {}
 
+Status StringBuilder::Append(const std::vector<std::string>& values,
+                             uint8_t* null_bytes) {
+  std::size_t total_length = std::accumulate(
+      values.begin(), values.end(), 0ULL,
+      [](uint64_t sum, const std::string& str) { return sum + str.size(); });
 
 Review comment:
   Tested with below code, for-loop-solution and accumulate-solution almost has 
same performance.To count 100000000 strings, for-loop-solution takes 21.2312 ms,
    accumulate-solution takes 21.5120 ms.
   
   log:
   ```
   lizgao@lizgao-ubuntu:~/CLionProjects/cmake-test$ ./cmake_test 
   WARNING: Logging before InitGoogleLogging() is written to STDERR
   I0330 10:29:56.864248  3989 accmulate-lambda-performance.cc:15] Begin 
for_loop_solution
   I0330 10:29:57.076560  3989 accmulate-lambda-performance.cc:19] End 
for_loop_solution = 8000000000
   I0330 10:29:57.076603  3989 accmulate-lambda-performance.cc:23] Begin 
accumulate_solution
   I0330 10:29:57.291723  3989 accmulate-lambda-performance.cc:27] End 
accumulate_solution = 8000000000
   
   ```
   
   ```
   //
   // Created by lizgao on 3/30/18.
   //
   #include <cstdint>
   #include <numeric>
   #include <string>
   #include <vector>
   #include <glog/logging.h>
   
   namespace {
   
   void for_loop_solution(const std::vector<std::string> &values) {
     std::size_t total_length = 0;
     std::size_t cnt = values.size();
     LOG(INFO) << "Begin for_loop_solution";
     for (std::size_t i = 0; i < cnt; ++i) {
       total_length += values[i].size();
     }
     LOG(INFO) << "End for_loop_solution = " << total_length;
   }
   
   void accumulate_solution(const std::vector<std::string> &values) {
     LOG(INFO) << "Begin accumulate_solution";
     std::size_t total_length = std::accumulate(
           values.begin(), values.end(), 0ULL,
           [](uint64_t sum, const std::string& str) { return sum + str.size(); 
});
     LOG(INFO) << "End accumulate_solution = " << total_length;
   }
   }
   
   void accumulate_lambda_performance_test() {
     std::vector<std::string> str_list(100000000, std::string(80, '*'));
     for_loop_solution(str_list);
     accumulate_solution(str_list);
   }
   ```

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> [C++] StringBuilder::append(vector<string>...) not implemented
> --------------------------------------------------------------
>
>                 Key: ARROW-2351
>                 URL: https://issues.apache.org/jira/browse/ARROW-2351
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: C++
>    Affects Versions: 0.9.0
>            Reporter: Rares Vernica
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.10.0
>
>
> For {{StringBuilder}} an {{append(vector<string>, uint8_t*)}} function is 
> [declared|https://github.com/apache/arrow/blob/7b2c79765cf92760e1f8cca079159d9613b86412/cpp/src/arrow/builder.h#L721]
>  and 
> [documented|http://arrow.apache.org/docs/cpp/classarrow_1_1_string_builder.html#a59be34b5e11017a392b4ee019d90da3c]
>  but it does not seem to be implemented.
> {code:java}
> undefined reference to `arrow::StringBuilder::Append(std::vector<std::string, 
> std::allocator<std::string> > const&, unsigned char*)'
> collect2: error: ld returned 1 exit status
> {code}
> Also worth noting is that the similar function in {{NumericBuilder}} uses 
> {{vector<bool>}} for the null values instead of {{uint8_t*}}. It might be 
> worth making them consistent.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to