nealrichardson commented on a change in pull request #7819:
URL: https://github.com/apache/arrow/pull/7819#discussion_r472278210



##########
File path: r/tests/testthat/test-buffer.R
##########
@@ -39,6 +39,7 @@ test_that("Buffer can be created from numeric vector", {
 })
 
 test_that("Buffer can be created from complex vector", {
+  skip("until cpp11 has a complex vector class")

Review comment:
       Is this something we need?

##########
File path: r/src/array_from_vector.cpp
##########
@@ -1064,42 +1062,42 @@ class FixedSizeBinaryVectorConverter : public 
VectorConverter {
   FixedSizeBinaryBuilder* typed_builder_;
 };
 
-template <typename Builder>
+template <typename StringBuilder>
 class StringVectorConverter : public VectorConverter {
  public:
   ~StringVectorConverter() {}
 
   Status Init(ArrayBuilder* builder) {
-    typed_builder_ = checked_cast<Builder*>(builder);
+    typed_builder_ = checked_cast<StringBuilder*>(builder);
     return Status::OK();
   }
 
   Status Ingest(SEXP obj) {
     ARROW_RETURN_IF(TYPEOF(obj) != STRSXP,
                     Status::RError("Expecting a character vector"));
-    R_xlen_t n = XLENGTH(obj);
 
-    // Reserve enough space before appending
-    int64_t size = 0;
-    for (R_xlen_t i = 0; i < n; i++) {
-      SEXP string_i = STRING_ELT(obj, i);
-      if (string_i != NA_STRING) {
-        size += XLENGTH(Rf_mkCharCE(Rf_translateCharUTF8(string_i), CE_UTF8));
-      }
+    cpp11::strings s(obj);
+    RETURN_NOT_OK(typed_builder_->Reserve(s.size()));
+
+    // note: the total length is calculated without utf8
+    //       conversion, so see this more as a hint rather than
+    //       the actual total length

Review comment:
       Is this true? Is this safe to do? Previously you'd get a truncated 
string when converting latin1 to utf8 unless you converted before measuring 
size.

##########
File path: r/DESCRIPTION
##########
@@ -50,6 +49,8 @@ Suggests:
     rmarkdown,
     testthat,
     tibble
+Remotes:
+    r-lib/cpp11#97

Review comment:
       When we're ready to merge this, we should vendor cpp11. I made 
https://issues.apache.org/jira/browse/ARROW-9786 to go back to the released 
cpp11 before we do the 2.0.0 release (October-ish).

##########
File path: r/tests/testthat/test-python.R
##########
@@ -17,6 +17,8 @@
 
 context("To/from Python")
 
+skip("for now")

Review comment:
       Need to remove this before we merge, of course

##########
File path: r/tests/testthat/test-buffer.R
##########
@@ -39,6 +39,7 @@ test_that("Buffer can be created from numeric vector", {
 })
 
 test_that("Buffer can be created from complex vector", {
+  skip("until cpp11 has a complex vector class")

Review comment:
       Doing some archaeology, it was added specifically in 
https://github.com/apache/arrow/commit/13c63bdec9675f08b1631e0514dcf2a890d7e36b 
(https://issues.apache.org/jira/browse/ARROW-3823). No context in the PR or the 
JIRA to know if this was just added because it was possible or if it was 
required for something (that said, no evidence that it was demanded by some 
real use case).




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to