[jira] [Updated] (ARROW-3208) [Python] Segmentation fault when reading a Parquet partitioned dataset to a Parquet file

Wes McKinney (JIRA) Tue, 13 Nov 2018 06:58:26 -0800


     [ https://issues.apache.org/jira/browse/ARROW-3208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Wes McKinney updated ARROW-3208:
--------------------------------
    Labels: parquet  (was: )

> [Python] Segmentation fault when reading a Parquet partitioned dataset to a Parquet file
> ----------------------------------------------------------------------------------------
>
>                 Key: ARROW-3208
>                 URL: https://issues.apache.org/jira/browse/ARROW-3208
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>    Affects Versions: 0.9.0
>         Environment: Ubuntu 16.04 LTS; System76 Oryx Pro
>            Reporter: Ying Wang
>            Priority: Major
>              Labels: parquet
>
> Steps to reproduce:
>  # Create a partitioned dataset with the following code:
> ```python
> import numpy as np
> import pandas as pd
> import pyarrow as pa
> import pyarrow.parquet as pq
> df = pd.DataFrame({
>     'one': [-1, 10, 2.5, 100, 1000, 1, 29.2],
>     'two': [-1, 10, 2, 100, 1000, 1, 11],
>     'three': [0, 0, 0, 0, 0, 0, 0],
> })
> table = pa.Table.from_pandas(df)
> pq.write_to_dataset(table, root_path='/home/yingw787/misc/example_dataset',
>                     partition_cols=['one', 'two'])
> ```
>  # Read the partitioned Parquet dataset from step 1 into a PyArrow Table and write that Table to a single Parquet file:
> ```python
> import pyarrow.parquet as pq
> table = pq.ParquetDataset('/path/to/dataset').read()
> pq.write_table(table, '/path/to/example.parquet')
> ```
> EXPECTED:
>  * Successful write
> GOT:
>  * Segmentation fault
> Issue reference on GitHub mirror: https://github.com/apache/arrow/issues/2511



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
