Wes McKinney updated ARROW-6573:
--------------------------------
    Summary: [Python] Segfault when writing to parquet  (was: Segfault when writing to parquet)

> [Python] Segfault when writing to parquet
> -----------------------------------------
>
>                 Key: ARROW-6573
>                 URL: https://issues.apache.org/jira/browse/ARROW-6573
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: C++, Python
>    Affects Versions: 0.14.1
>         Environment: Ubuntu 16.04. Pyarrow 0.14.1 installed through pip. Using Anaconda distribution of Python 3.7.
>            Reporter: Josh Weinstock
>            Priority: Minor
>              Labels: pull-request-available
>             Fix For: 0.15.0
>
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> When attempting to write out a pyarrow table to parquet I am observing a segfault when there is a mismatch between the schema and the datatypes. Here is a reproducible example:
>
> {code:java}
> import pyarrow as pa
> import pyarrow.parquet as pq
>
> data = dict()
> data["key"] = [0, 1, 2, 3]           # segfault
> # data["key"] = ["0", "1", "2", "3"] # no segfault
>
> schema = pa.schema({"key": pa.string()})
> table = pa.Table.from_pydict(data, schema=schema)
> print("now writing out test file")
> pq.write_table(table, "test.parquet")
> {code}
>
> This results in a segfault when writing the table. Running
>
> {code:java}
> gdb -ex r --args python test.py
> {code}
>
> Yields
>
> {noformat}
> Program received signal SIGSEGV, Segmentation fault.
> 0x00007fffe8173917 in virtual thunk to
> parquet::DictEncoderImpl<parquet::DataType<(parquet::Type::type)6> >::Put(parquet::ByteArray const*, int) ()
> from /net/fantasia/home/jweinstk/anaconda3/lib/python3.7/site-packages/pyarrow/libparquet.so.14
> {noformat}
>
> Thanks for all of your arrow work,
> Josh
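For anyone hitting this on pyarrow 0.14.x (the issue is marked Fix For 0.15.0 above), a minimal workaround sketch is to convert the Python values so they actually match the declared schema before the table is built; the str() conversion and the file name below are illustrative only, not the fix that landed in Arrow itself:

{code:python}
import pyarrow as pa
import pyarrow.parquet as pq

schema = pa.schema({"key": pa.string()})

# Make the column data agree with the declared string type before the
# table is constructed, so the Parquet writer never sees a mismatch
# between the schema and the underlying values.
data = {"key": [str(x) for x in [0, 1, 2, 3]]}

table = pa.Table.from_pydict(data, schema=schema)
pq.write_table(table, "test.parquet")
{code}

With matching types the same write_table call completes normally; the crash in the report appears to come from the writer treating integer storage as byte-array (string) data, as the DictEncoderImpl over BYTE_ARRAY in the Put frame of the backtrace suggests.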