ianmcook commented on a change in pull request #10269:
URL: https://github.com/apache/arrow/pull/10269#discussion_r629628488



##########
File path: r/R/record-batch.R
##########
@@ -161,6 +161,17 @@ RecordBatch$create <- function(..., schema = NULL) {
     out <- RecordBatch__from_arrays(schema, arrays)
     return(dplyr::group_by(out, !!!dplyr::groups(arrays[[1]])))
   }
+
+  # If any arrays are length 1, recycle them
+  arr_lens <- map_int(arrays, length)
+  if (length(arrays) > 1 && any(arr_lens == 1) && !all(arr_lens == 1)) {
+    max_array_len <- max(arr_lens)
+    arrays <- modify2(
+      arrays,
+      arr_lens == 1,
+      ~ if (.y) MakeArrayFromScalar(Scalar$create(as.vector(.x)), max_array_len) else .x

Review comment:
       We should find a way to do this without calling `as.vector()`, because that converts Arrow objects to R vectors, which can cause the data type to be lost in the conversion.
   
   But the problem is that without `as.vector()`, length-1 `ChunkedArray` objects will error when passed to `Array$create()` inside `Scalar$create()`. I think the cleanest way to solve that is to improve the `Array$create` function in `array.R` to handle `ChunkedArray` objects and convert them to `Array`. (This is getting into yak-shaving territory, but I think it's important.)




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

