[ 
https://issues.apache.org/jira/browse/ARROW-1599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16176899#comment-16176899
 ] 

Wes McKinney commented on ARROW-1599:
-------------------------------------

Yes. Decoding lists within structs, and vice versa, takes quite a bit of work. 
We support reads of structs and of lists / lists-of-lists, but the fully nested 
read case has not yet been implemented. I just did some supporting work for 
this in PARQUET-1100 
https://github.com/apache/parquet-cpp/commit/4b09ac703bc75fee72f94bed8ecfe571096b04c1
 and I hope that we can get this completed by the end of the year. Development 
help is always appreciated.
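
To make the distinction concrete, here is a minimal sketch (plain Python, no 
PyArrow calls) that labels each top-level column by its nesting, using the leaf 
paths from the schema in the report below. The helper name classify_columns and 
its labels are hypothetical, and it assumes list columns follow the 
"<name>.list.element" path convention seen in that schema; it is a heuristic 
for illustration, not how the reader itself decides.

```python
# Leaf column paths copied from the schema in the report below.
LEAF_COLUMNS = [
    "mbc",
    "deltae",
    "labels",
    "features.type",
    "features.size",
    "features.indices.list.element",
    "features.values.list.element",
]


def classify_columns(leaf_paths):
    """Group leaf paths by top-level column and label each one's nesting.

    Hypothetical helper: assumes lists appear as '<name>.list.element'.
    """
    groups = {}
    for path in leaf_paths:
        parts = path.split(".")
        # Key on the top-level name; keep the remaining path components.
        groups.setdefault(parts[0], []).append(parts[1:])

    kinds = {}
    for name, children in groups.items():
        if children == [[]]:
            # No nested components at all: a flat column.
            kinds[name] = "primitive"
        elif len(children) == 1 and children[0][0] == "list":
            # A single child path that starts with 'list': a top-level list.
            kinds[name] = "list"
        else:
            # Several named fields: a struct, possibly holding lists.
            has_list = any("list" in child for child in children)
            kinds[name] = "struct with list fields" if has_list else "struct"
    return kinds


kinds = classify_columns(LEAF_COLUMNS)
for name, kind in kinds.items():
    print(f"{name}: {kind}")
```

Under these assumptions, "features" comes out as a struct with list fields, 
which is the lists-within-structs case described above; the other three columns 
are flat.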

> PyArrow unable to read Parquet files with vector as column
> ----------------------------------------------------------
>
>                 Key: ARROW-1599
>                 URL: https://issues.apache.org/jira/browse/ARROW-1599
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>    Affects Versions: 0.7.0
>         Environment: Ubuntu
>            Reporter: Jovann Kung
>
> Is PyArrow currently unable to read in Parquet files with a vector as a 
> column? For example, the schema of such a file is below:
> {{<pyarrow._parquet.ParquetSchema object at 0x7f2d42493c88>
> mbc: FLOAT
> deltae: FLOAT
> labels: FLOAT
> features.type: INT32 INT_8
> features.size: INT32
> features.indices.list.element: INT32
> features.values.list.element: DOUBLE}}
> Using either pq.read_table() or pq.ParquetDataset('/path/to/parquet').read() 
> yields the following error: ArrowNotImplementedError: Currently only nesting 
> with Lists is supported.
> From the error, I assume that this may be implemented in future releases?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)