[
https://issues.apache.org/jira/browse/ARROW-7662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17022276#comment-17022276
]
Michael Chirico commented on ARROW-7662:
----------------------------------------
Hey [~npr], I'm taking a look at this now (first time looking at the Arrow
source). It looks like it should be as simple as adding a branch for ArrayType,
similar to the existing one for StructType, here:
https://github.com/apache/arrow/blob/master/r/src/array_from_vector.cpp#L861-L871
{code:cpp}
    case VECSXP:
      if (Rf_inherits(x, "data.frame")) {
        R_xlen_t n = XLENGTH(x);
        SEXP names = Rf_getAttrib(x, R_NamesSymbol);
        std::vector<std::shared_ptr<arrow::Field>> fields(n);
        for (R_xlen_t i = 0; i < n; i++) {
          fields[i] = std::make_shared<arrow::Field>(CHAR(STRING_ELT(names, i)),
                                                     InferType(VECTOR_ELT(x, i)));
        }
        return std::make_shared<StructType>(std::move(fields));
      }
{code}
Does that sound about right?
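To make the idea concrete, here is a minimal, self-contained sketch of the inference logic such a branch might need. It does not use the R C API or Arrow: {{Kind}}, {{RValue}}, and {{InferTypeName}} are hypothetical stand-ins for illustration, and the type names are returned as plain strings. It also assumes that every element of a list column must share one type and that an empty list cannot be inferred; the real implementation may well decide otherwise.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for an R value; this is not the R C API.
enum class Kind { Integer, Double, String, List };

struct RValue {
  Kind kind;
  std::vector<RValue> children;  // elements, used only when kind == Kind::List
};

// Toy analogue of InferType(): returns a type name instead of an arrow::DataType.
std::string InferTypeName(const RValue& x) {
  // Invariant of this toy model: only lists carry children.
  assert(x.kind == Kind::List || x.children.empty());
  switch (x.kind) {
    case Kind::Integer:
      return "int32";
    case Kind::Double:
      return "double";
    case Kind::String:
      return "utf8";
    case Kind::List: {
      // Assumption: an empty list gives no evidence of its value type.
      if (x.children.empty()) {
        throw std::invalid_argument("cannot infer type from empty list");
      }
      // Infer the value type from the first element, then require agreement.
      const std::string value_type = InferTypeName(x.children[0]);
      for (std::size_t i = 1; i < x.children.size(); i++) {
        if (InferTypeName(x.children[i]) != value_type) {
          throw std::invalid_argument("list elements have different types");
        }
      }
      return "list<" + value_type + ">";
    }
  }
  throw std::logic_error("unknown kind");
}
```

In this model, a column shaped like {{DF$b}} from the issue (a list whose elements are all integer vectors) is reported as {{list<int32>}}, while a list mixing integers and strings raises an error.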
> [R] Support auto-inferring list column type
> -------------------------------------------
>
> Key: ARROW-7662
> URL: https://issues.apache.org/jira/browse/ARROW-7662
> Project: Apache Arrow
> Issue Type: Improvement
> Components: R
> Reporter: Michael Chirico
> Priority: Major
>
> {code:r}
> DF = data.frame(a = 1:10)
> DF$b = as.list(DF$a)
> arrow::write_parquet(DF, 'test.parquet')
> # Error in Table__from_dots(dots, schema) : cannot infer type from data
> {code}
> This appears to be supported naturally already in Python:
> {code:python}
> import pandas as pd
> pd.DataFrame({'a': [1, 2, 3], 'b': [[1, 2], [3, 4], [5, 6]]}).to_parquet('test.parquet')
> {code}