nealrichardson commented on a change in pull request #7819:
URL: https://github.com/apache/arrow/pull/7819#discussion_r472278210
##########
File path: r/tests/testthat/test-buffer.R
##########
@@ -39,6 +39,7 @@ test_that("Buffer can be created from numeric vector", {
})
test_that("Buffer can be created from complex vector", {
+ skip("until cpp11 has a complex vector class")
Review comment:
Is this something we need?
##########
File path: r/src/array_from_vector.cpp
##########
@@ -1064,42 +1062,42 @@ class FixedSizeBinaryVectorConverter : public
VectorConverter {
FixedSizeBinaryBuilder* typed_builder_;
};
-template <typename Builder>
+template <typename StringBuilder>
class StringVectorConverter : public VectorConverter {
public:
~StringVectorConverter() {}
Status Init(ArrayBuilder* builder) {
- typed_builder_ = checked_cast<Builder*>(builder);
+ typed_builder_ = checked_cast<StringBuilder*>(builder);
return Status::OK();
}
Status Ingest(SEXP obj) {
ARROW_RETURN_IF(TYPEOF(obj) != STRSXP,
Status::RError("Expecting a character vector"));
- R_xlen_t n = XLENGTH(obj);
- // Reserve enough space before appending
- int64_t size = 0;
- for (R_xlen_t i = 0; i < n; i++) {
- SEXP string_i = STRING_ELT(obj, i);
- if (string_i != NA_STRING) {
- size += XLENGTH(Rf_mkCharCE(Rf_translateCharUTF8(string_i), CE_UTF8));
- }
+ cpp11::strings s(obj);
+ RETURN_NOT_OK(typed_builder_->Reserve(s.size()));
+
+ // note: the total length is calculated without utf8
+ // conversion, so see this more as a hint rather than
+ // the actual total length
Review comment:
Is this true? Is this safe to do? Previously you'd get a truncated
string when converting latin1 to utf8 unless you converted before measuring
size.
##########
File path: r/DESCRIPTION
##########
@@ -50,6 +49,8 @@ Suggests:
rmarkdown,
testthat,
tibble
+Remotes:
+ r-lib/cpp11#97
Review comment:
When we're ready to merge this, we should vendor cpp11. I made
https://issues.apache.org/jira/browse/ARROW-9786 to go back to the released
cpp11 before we do the 2.0.0 release (October-ish).
##########
File path: r/tests/testthat/test-python.R
##########
@@ -17,6 +17,8 @@
context("To/from Python")
+skip("for now")
Review comment:
Need to remove this before we merge, of course
##########
File path: r/tests/testthat/test-buffer.R
##########
@@ -39,6 +39,7 @@ test_that("Buffer can be created from numeric vector", {
})
test_that("Buffer can be created from complex vector", {
+ skip("until cpp11 has a complex vector class")
Review comment:
Doing some archaeology, it was added specifically in
https://github.com/apache/arrow/commit/13c63bdec9675f08b1631e0514dcf2a890d7e36b
(https://issues.apache.org/jira/browse/ARROW-3823). No context in the PR or the
JIRA to know if this was just added because it was possible or if it was
required for something (that said, no evidence that it was demanded by some
real use case).
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]