[ 
https://issues.apache.org/jira/browse/ARROW-7662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17022276#comment-17022276
 ] 

Michael Chirico commented on ARROW-7662:
----------------------------------------

Hey [~npr], I'm taking a look at this now (first time looking at the arrow 
source). It looks like it should be as simple as adding a case for ArrayType, 
similar to the existing one for StructType, here:

https://github.com/apache/arrow/blob/master/r/src/array_from_vector.cpp#L861-L871

{code:cpp}
case VECSXP:
  if (Rf_inherits(x, "data.frame")) {
    R_xlen_t n = XLENGTH(x);
    SEXP names = Rf_getAttrib(x, R_NamesSymbol);
    std::vector<std::shared_ptr<arrow::Field>> fields(n);
    for (R_xlen_t i = 0; i < n; i++) {
      fields[i] = std::make_shared<arrow::Field>(CHAR(STRING_ELT(names, i)),
                                                 InferType(VECTOR_ELT(x, i)));
    }
    return std::make_shared<StructType>(std::move(fields));
  }
{code}

Does that sound about right?
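
To make the idea concrete, here is a toy, standard-library-only sketch of what a list case might do: infer the element type from the list's children and wrap it in a list type. Everything in it ({{Value}}, {{Int}}, {{Str}}, {{List}}, and the string type names) is a hypothetical stand-in, not the R or Arrow API. The real code would work on a SEXP and presumably return a {{std::shared_ptr<arrow::DataType>}}. How empty or mixed-type lists should be handled is also an assumption here.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an R value (SEXP); not the real R API.
struct Value {
  enum Kind { kInt, kDouble, kString, kList } kind;
  std::vector<Value> children;  // only meaningful when kind == kList
};

// Small helpers to build toy values.
Value Int() { return {Value::kInt, {}}; }
Value Str() { return {Value::kString, {}}; }
Value List(std::vector<Value> items) { return {Value::kList, std::move(items)}; }

// Toy analogue of InferType(): returns a type name instead of an
// arrow::DataType. The kList branch is the part the comment proposes.
std::string InferType(const Value& x) {
  switch (x.kind) {
    case Value::kInt:
      return "int32";
    case Value::kDouble:
      return "double";
    case Value::kString:
      return "string";
    case Value::kList: {
      // Assumption: an empty list carries no element type information.
      if (x.children.empty()) return "list<null>";
      // Infer the element type from the first child...
      const std::string elem = InferType(x.children[0]);
      // ...and require every other child to agree with it.
      for (const Value& child : x.children) {
        if (InferType(child) != elem) {
          throw std::invalid_argument("cannot infer type from data");
        }
      }
      return "list<" + elem + ">";
    }
  }
  throw std::logic_error("unknown kind");
}

// Usage: mirrors DF$b = as.list(DF$a), a list whose elements are integers.
void CheckExample() {
  assert(InferType(List({Int(), Int(), Int()})) == "list<int32>");
}
```

Checking every child rather than only the first is a design choice in this sketch. It is slower but rejects mixed lists up front instead of failing later during conversion.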

> [R] Support auto-inferring list column type
> -------------------------------------------
>
>                 Key: ARROW-7662
>                 URL: https://issues.apache.org/jira/browse/ARROW-7662
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: R
>            Reporter: Michael Chirico
>            Priority: Major
>
> {code:r}
> DF = data.frame(a = 1:10)
> DF$b = as.list(DF$a)
> arrow::write_parquet(DF, 'test.parquet')
> # Error in Table__from_dots(dots, schema) : cannot infer type from data
> {code}
> This appears to be supported naturally already in Python:
> {code:python}
> import pandas as pd
> pd.DataFrame({'a': [1, 2, 3], 'b': [[1, 2], [3, 4], [5, 6]]}).to_parquet('test.parquet')
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
