nealrichardson commented on a change in pull request #7527:
URL: https://github.com/apache/arrow/pull/7527#discussion_r444530279



##########
File path: r/src/array_from_vector.cpp
##########
@@ -159,6 +159,9 @@ struct VectorToArrayConverter {
       if (s == NA_STRING) {
         RETURN_NOT_OK(binary_builder->AppendNull());
         continue;
+      } else {
+        // Make sure we're ingesting UTF-8
+        s = Rf_mkCharCE(Rf_translateCharUTF8(s), CE_UTF8);

Review comment:
       This `Rf_mkCharCE(Rf_translateCharUTF8(s), CE_UTF8)` is dropped in 
several places; should we factor this out to a macro or something?

##########
File path: r/src/recordbatch.cpp
##########
@@ -246,6 +246,7 @@ std::shared_ptr<arrow::RecordBatch> 
RecordBatch__from_arrays__known_schema(
   SEXP names = Rf_getAttrib(lst, R_NamesSymbol);
 
   auto fill_array = [&arrays, &schema](int j, SEXP x, SEXP name) {
+    name = Rf_mkCharCE(Rf_translateCharUTF8(name), CE_UTF8);

Review comment:
       There are a few places where I `Rf_mkCharCE()` and then immediately call 
`CHAR()`, which IIUC is boxing in a SEXP and then immediately unboxing it. 
Maybe that can be eliminated some places or done better?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to