[ 
https://issues.apache.org/jira/browse/ARROW-2411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16429642#comment-16429642
 ] 

ASF GitHub Bot commented on ARROW-2411:
---------------------------------------

kou commented on a change in pull request #1852: ARROW-2411: [C++] Add StringBuilder::Append(const char **values)
URL: https://github.com/apache/arrow/pull/1852#discussion_r179936915
 
 

 ##########
 File path: cpp/src/arrow/builder.cc
 ##########
 @@ -1413,6 +1413,41 @@ Status StringBuilder::Append(const std::vector<std::string>& values,
   return Status::OK();
 }
 
+Status StringBuilder::Append(const char** values,
+                             int64_t length,
+                             const uint8_t* valid_bytes) {
+  std::size_t total_length = 0;
+  std::vector<std::size_t> value_lengths(length);
+  for (int64_t i = 0; i < length; ++i) {
+    if (values[i]) {
+      auto value_length = strlen(values[i]);
+      value_lengths[i] = value_length;
+      total_length += value_length;
+    }
+  }
+  RETURN_NOT_OK(Reserve(length));
+  RETURN_NOT_OK(value_data_builder_.Reserve(total_length));
+  RETURN_NOT_OK(offsets_builder_.Reserve(length));
+
+  if (valid_bytes) {
+    for (int64_t i = 0; i < length; ++i) {
+      RETURN_NOT_OK(AppendNextOffset());
+      if (valid_bytes[i]) {
+        RETURN_NOT_OK(value_data_builder_.Append(
+            reinterpret_cast<const uint8_t*>(values[i]), value_lengths[i]));
+      }
+    }
+  } else {
+    for (int64_t i = 0; i < length; ++i) {
+      RETURN_NOT_OK(AppendNextOffset());
+      RETURN_NOT_OK(value_data_builder_.Append(
+          reinterpret_cast<const uint8_t*>(values[i]), value_lengths[i]));
 
 Review comment:
   That's an interesting point.
   I've implemented it as follows:
   
     * If `values[i]` is `NULL` and `valid_bytes` isn't `nullptr`, the value is
       an empty string, not a null value. This respects the `valid_bytes` data.
     * If `values[i]` is `NULL` and `valid_bytes` is `nullptr`, the value is a
       null value.
   
   But users may find this confusing...
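
The two rules above can be made concrete with a small standalone sketch. It does not use Arrow and is not the code in this pull request: `ResolveCStrings` is a hypothetical helper, and `std::optional` stands in for a nullable slot. It also assumes that when `valid_bytes` is given, a zero byte marks the slot as null, which the comment implies but does not state.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical helper, not part of Arrow. Returns one entry per input slot:
// std::nullopt for a null slot, a (possibly empty) string otherwise.
std::vector<std::optional<std::string>> ResolveCStrings(
    const char* const* values, int64_t length, const uint8_t* valid_bytes) {
  assert(length >= 0);
  std::vector<std::optional<std::string>> out;
  out.reserve(static_cast<std::size_t>(length));
  for (int64_t i = 0; i < length; ++i) {
    if (valid_bytes != nullptr) {
      // valid_bytes decides nullness (assumption: 0 means null).
      // A NULL pointer in a valid slot becomes an empty string.
      if (valid_bytes[i]) {
        out.emplace_back(values[i] ? std::string(values[i]) : std::string());
      } else {
        out.emplace_back(std::nullopt);
      }
    } else {
      // Without valid_bytes, a NULL pointer is the only null marker.
      if (values[i]) {
        out.emplace_back(std::string(values[i]));
      } else {
        out.emplace_back(std::nullopt);
      }
    }
  }
  return out;
}
```

With the input `{"a", NULL, "bc"}`, the `NULL` entry resolves to an empty string when `valid_bytes` is `{1, 1, 0}` and to a null slot when `valid_bytes` is `nullptr`. The same pointer means two different things depending on the third argument, and that asymmetry is what may confuse users.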
   



> [C++] Add method to append batches of null-terminated strings to StringBuilder
> ------------------------------------------------------------------------------
>
>                 Key: ARROW-2411
>                 URL: https://issues.apache.org/jira/browse/ARROW-2411
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++, GLib
>            Reporter: Uwe L. Korn
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.10.0
>
>
> We should add a method {{StringBuilder::AppendCStrings(const char** values, 
> const uint8_t* valid_bytes = NULLPTR)}} to the {{StringBuilder}} class to 
> have fast inserts for these strings. See 
> https://github.com/apache/arrow/pull/1845/files for a use case.


