Neal Richardson created ARROW-5718:
--------------------------------------

             Summary: [R] Add as_record_batch()
                 Key: ARROW-5718
                 URL: https://issues.apache.org/jira/browse/ARROW-5718
             Project: Apache Arrow
          Issue Type: Improvement
          Components: R
            Reporter: Neal Richardson
             Fix For: 0.14.0


ARROW-3814 / 
[https://github.com/apache/arrow/pull/3565/files#diff-95ad459e0128bfecf0d72ebd6d6ee8aaR94]
 changed the API of `record_batch()` and `arrow::table()` so that you can no 
longer pass a data.frame directly to either function; you first have to [massage it 
yourself|https://github.com/apache/arrow/pull/3565/files#diff-09c05d1a6ff41bed094fbccfa76395a6R27].
 That broke the sparklyr integration tests with an opaque `cannot infer type from 
data` error. It is also unfortunate that there's no longer a direct way to go 
from a data.frame to a record batch, which seems like a common need.

After some discussion, we resolved that a solution would be to (1) add an 
{{as_record_batch()}} function, whose data.frame method is probably just 
{{as_record_batch.data.frame <- function(x) record_batch(!!!x)}}; and (2) if a 
user supplies a single, unnamed data.frame as the argument to 
{{record_batch()}}, raise an error that says to use {{as_record_batch()}}. We 
may later decide that {{record_batch()}} should call {{as_record_batch()}} 
automatically, but in case that is too magical and prevents some legitimate use 
case, let's hold off for now. It's easier to add magic than to remove it.
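
A rough sketch of how (1) and (2) could fit together, so the shape of the proposal is concrete. It is not the arrow implementation: the {{record_batch()}} below is a stand-in that only collects its arguments into a list so the example runs without arrow installed, the {{fake_record_batch}} class and the error wording are made up, and {{do.call()}} is used as a base-R substitute for the {{!!!x}} splice.

```r
# Stand-in for arrow's record_batch(), ASSUMED for illustration only:
# it collects named columns into a list instead of building a real batch.
record_batch <- function(...) {
  cols <- list(...)
  unnamed <- is.null(names(cols)) || !nzchar(names(cols)[1])
  # (2) A single, unnamed data.frame gets a pointer to as_record_batch().
  if (length(cols) == 1L && is.data.frame(cols[[1]]) && unnamed) {
    stop("record_batch() does not accept a data.frame; use as_record_batch()",
         call. = FALSE)
  }
  structure(cols, class = "fake_record_batch")
}

# (1) S3 generic plus a data.frame method.
as_record_batch <- function(x, ...) UseMethod("as_record_batch")

as_record_batch.data.frame <- function(x, ...) {
  # Base-R equivalent of record_batch(!!!x): pass each column as a named argument.
  do.call(record_batch, as.list(x))
}
```

With these definitions, {{as_record_batch(df)}} succeeds while {{record_batch(df)}} stops with the hint; {{record_batch(a = df)}} is left alone because the argument is named.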

Once this function exists, sparklyr tests can try to use {{as_record_batch()}} 
and, if that function doesn't exist, fall back to {{record_batch()}}. A missing 
{{as_record_batch()}} means an older released version of arrow is installed, 
so {{record_batch(df)}} should still work.
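
The fallback could look something like the sketch below. The helper name {{to_record_batch()}} and its {{ns}} argument are hypothetical, not sparklyr code; {{ns}} defaults to the arrow namespace but can be any environment, which makes the lookup easy to exercise without arrow installed.

```r
# Hypothetical helper: prefer as_record_batch() when the installed arrow
# provides it, otherwise fall back to record_batch() (older releases).
to_record_batch <- function(df, ns = asNamespace("arrow")) {
  if (exists("as_record_batch", envir = ns, inherits = FALSE)) {
    get("as_record_batch", envir = ns)(df)
  } else {
    get("record_batch", envir = ns)(df)
  }
}
```

Checking for the function rather than comparing version numbers keeps the test code working against both released and development builds of arrow.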

cc [~javierluraschi]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)