[jira] [Updated] (ARROW-5030) [Python] read_row_group fails with Nested data conversions not implemented for chunked array outputs

2019-04-25 Thread Joris Van den Bossche (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-5030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joris Van den Bossche updated ARROW-5030:
-
Labels: parquet  (was: )

> [Python] read_row_group fails with Nested data conversions not implemented 
> for chunked array outputs
> 
>
> Key: ARROW-5030
> URL: https://issues.apache.org/jira/browse/ARROW-5030
> Project: Apache Arrow
> Issue Type: Bug
> Components: C++, Python
> Affects Versions: 0.12.0
> Reporter: Jakub Okoński
> Priority: Major
> Labels: parquet
>
> Hey, I'm trying to concatenate two files. To avoid reading everything into 
> memory at once, I wanted to use `read_row_group` for my solution, but it 
> fails.
>  
> I think it's due to fields like these:
> {{pyarrow.Field>}}
>  
> But I'm not sure. Is this a duplicate? The issue linked in the code is 
> resolved 
> https://github.com/apache/arrow/blob/fd0b90a7f7e65fde32af04c4746004a1240914cf/cpp/src/parquet/arrow/reader.cc#L915
>  
> Stacktrace is
>  
> {{  File "/data/teftel/teftel-data/teftel_data/parquet_stream.py", line 163, in read_batches}}
> {{    table = pf.read_row_group(ix, columns=self._columns)}}
> {{  File "/home/kuba/.local/share/virtualenvs/teftel-o6G5iH_l/lib/python3.6/site-packages/pyarrow/parquet.py", line 186, in read_row_group}}
> {{    use_threads=use_threads)}}
> {{  File "pyarrow/_parquet.pyx", line 695, in pyarrow._parquet.ParquetReader.read_row_group}}
> {{  File "pyarrow/error.pxi", line 89, in pyarrow.lib.check_status}}
> {{pyarrow.lib.ArrowNotImplementedError: Nested data conversions not implemented for chunked array outputs}}
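
For context, the row-group-at-a-time concatenation the reporter describes might look roughly like the sketch below. It is not the reporter's code: the helper names iter_row_groups and concat_parquet_files are hypothetical, and it assumes all input files share one schema. It relies only on ParquetFile.num_row_groups, ParquetFile.read_row_group, and ParquetWriter.write_table from pyarrow.parquet. It does not work around the error above; a file whose nested columns trigger the ArrowNotImplementedError would presumably still fail at the read_row_group call.

```python
def iter_row_groups(parquet_file, columns=None):
    """Yield one table per row group, so only one group is held in memory.

    `parquet_file` is expected to behave like pyarrow.parquet.ParquetFile,
    i.e. expose `num_row_groups` and `read_row_group(ix, columns=...)`.
    """
    for ix in range(parquet_file.num_row_groups):
        yield parquet_file.read_row_group(ix, columns=columns)


def concat_parquet_files(in_paths, out_path, columns=None):
    """Append every row group of each input file to one output file.

    Assumption: all inputs have the same schema. The writer is created
    lazily from the schema of the first table that is read.
    """
    # Imported here so that iter_row_groups itself has no dependencies.
    import pyarrow.parquet as pq

    writer = None
    try:
        for path in in_paths:
            source = pq.ParquetFile(path)
            for table in iter_row_groups(source, columns=columns):
                if writer is None:
                    writer = pq.ParquetWriter(out_path, table.schema)
                writer.write_table(table)
    finally:
        # Close the writer even if a read fails partway through.
        if writer is not None:
            writer.close()
```

A call such as concat_parquet_files(["a.parquet", "b.parquet"], "out.parquet") (file names made up for illustration) would then copy the inputs group by group instead of loading either file whole.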



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (ARROW-5030) [Python] read_row_group fails with Nested data conversions not implemented for chunked array outputs

2019-04-17 Thread Antoine Pitrou (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-5030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Antoine Pitrou updated ARROW-5030:
--
Component/s: Python, C++

> [Python] read_row_group fails with Nested data conversions not implemented 
> for chunked array outputs
> 
>
> Key: ARROW-5030
> URL: https://issues.apache.org/jira/browse/ARROW-5030
> Project: Apache Arrow
> Issue Type: Bug
> Components: C++, Python
> Affects Versions: 0.12.0
> Reporter: Jakub Okoński
> Priority: Major





[jira] [Updated] (ARROW-5030) [Python] read_row_group fails with Nested data conversions not implemented for chunked array outputs

2019-03-28 Thread Wes McKinney (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-5030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney updated ARROW-5030:

Summary: [Python] read_row_group fails with Nested data conversions not 
implemented for chunked array outputs  (was: read_row_group fails with Nested 
data conversions not implemented for chunked array outputs)

> [Python] read_row_group fails with Nested data conversions not implemented 
> for chunked array outputs
> 
>
> Key: ARROW-5030
> URL: https://issues.apache.org/jira/browse/ARROW-5030
> Project: Apache Arrow
> Issue Type: Bug
> Affects Versions: 0.12.0
> Reporter: Jakub Okoński
> Priority: Major


