romainfrancois commented on a change in pull request #7527:
URL: https://github.com/apache/arrow/pull/7527#discussion_r446220256



##########
File path: r/src/array_from_vector.cpp
##########
@@ -159,6 +159,9 @@ struct VectorToArrayConverter {
       if (s == NA_STRING) {
         RETURN_NOT_OK(binary_builder->AppendNull());
         continue;
+      } else {
+        // Make sure we're ingesting UTF-8
+        s = Rf_mkCharCE(Rf_translateCharUTF8(s), CE_UTF8);

Review comment:
       Yeah, maybe some sort of helper, e.g. `Rf_mkCharUtf8()` or `Rf_mkUtf8()`.
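
       For context on what the `Rf_translateCharUTF8()` step in this hunk guards against: an R string can be marked as native, Latin-1, UTF-8 or bytes, while Arrow's string type holds UTF-8. The sketch below is not R's implementation and is not code from this PR. It is a standalone illustration of the Latin-1 case only, and `latin1_to_utf8` is a hypothetical name.

```cpp
#include <string>

// Illustration only: re-encode a Latin-1 (ISO-8859-1) byte string as UTF-8.
// `latin1_to_utf8` is a hypothetical name, not part of R or Arrow.
std::string latin1_to_utf8(const std::string& in) {
  std::string out;
  out.reserve(in.size() * 2);  // each Latin-1 byte becomes at most two UTF-8 bytes
  for (unsigned char c : in) {
    if (c < 0x80) {
      // ASCII bytes are identical in Latin-1 and UTF-8.
      out.push_back(static_cast<char>(c));
    } else {
      // U+0080..U+00FF encode as two bytes: 110xxxxx 10xxxxxx.
      out.push_back(static_cast<char>(0xC0 | (c >> 6)));
      out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
    }
  }
  return out;
}
```

       For example, the single Latin-1 byte 0xE9 (é) becomes the two bytes 0xC3 0xA9, which is why ingesting the raw bytes without translation would produce invalid UTF-8.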

##########
File path: r/src/recordbatch.cpp
##########
@@ -246,6 +246,7 @@ std::shared_ptr<arrow::RecordBatch> RecordBatch__from_arrays__known_schema(
   SEXP names = Rf_getAttrib(lst, R_NamesSymbol);
 
   auto fill_array = [&arrays, &schema](int j, SEXP x, SEXP name) {
+    name = Rf_mkCharCE(Rf_translateCharUTF8(name), CE_UTF8);

Review comment:
       I don't think R offers an API for this.

##########
File path: r/R/schema.R
##########
@@ -83,16 +83,21 @@ Schema <- R6Class("Schema",
     }
   ),
   active = list(
-    names = function() Schema__field_names(self),
+    names = function() {
+      out <- Schema__field_names(self)
+      # Hack: Rcpp should set the encoding

Review comment:
       I believe `cpp11` will rescue us from that sort of trouble. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

