[ https://issues.apache.org/jira/browse/ARROW-3208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17660231#comment-17660231 ]
Rok Mihevc commented on ARROW-3208:
-----------------------------------
This issue has been migrated to [issue
#19552|https://github.com/apache/arrow/issues/19552] on GitHub. Please see the
[migration documentation|https://github.com/apache/arrow/issues/14542] for
further details.
> [C++] Segmentation fault when casting dictionary to numeric with nullptr
> valid_bitmap
> --------------------------------------------------------------------------------------
>
> Key: ARROW-3208
> URL: https://issues.apache.org/jira/browse/ARROW-3208
> Project: Apache Arrow
> Issue Type: Bug
> Components: C++
> Affects Versions: 0.9.0
> Environment: Ubuntu 16.04 LTS; System76 Oryx Pro
> Reporter: Ying Wang
> Assignee: Francois Saint-Jacques
> Priority: Major
> Labels: parquet, pull-request-available
> Fix For: 0.13.0
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> Steps to reproduce:
> # Create a partitioned dataset with the following code:
> ```python
> import numpy as np
> import pandas as pd
> import pyarrow as pa
> import pyarrow.parquet as pq
> df = pd.DataFrame({
>     'one': [-1, 10, 2.5, 100, 1000, 1, 29.2],
>     'two': [-1, 10, 2, 100, 1000, 1, 11],
>     'three': [0, 0, 0, 0, 0, 0, 0],
> })
> table = pa.Table.from_pandas(df)
> pq.write_to_dataset(table, root_path='/home/yingw787/misc/example_dataset',
>                     partition_cols=['one', 'two'])
> ```
> # Read the partitioned dataset back into a PyArrow Table and write that
> Table to a single Parquet file:
> ```python
> import pyarrow.parquet as pq
> table = pq.ParquetDataset('/path/to/dataset').read()
> pq.write_table(table, '/path/to/example.parquet')
> ```
> EXPECTED:
> * Successful write
> GOT:
> * Segmentation fault
> Issue reference on GitHub mirror: https://github.com/apache/arrow/issues/2511
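
For context on the issue title: in Arrow's columnar format, an array with no nulls may omit its validity bitmap, so any code that decodes a dictionary-encoded column has to treat a missing bitmap as "every slot is valid" instead of dereferencing it. The sketch below is a plain-Python model of that rule, assuming LSB-first bit order as in Arrow bitmaps. It is not Arrow's C++ cast kernel, and `bit_is_set` and `decode_dictionary` are hypothetical names chosen for illustration.

```python
from typing import List, Optional, Sequence


def bit_is_set(bitmap: bytes, i: int) -> bool:
    """Return True if bit i is set, using LSB-first order within each byte."""
    return (bitmap[i // 8] >> (i % 8)) & 1 == 1


def decode_dictionary(
    indices: Sequence[int],
    dictionary: Sequence[float],
    validity: Optional[bytes] = None,
) -> List[Optional[float]]:
    """Expand dictionary indices into values (hypothetical helper).

    validity=None models a null bitmap pointer: no slot is null.
    """
    out: List[Optional[float]] = []
    for i, idx in enumerate(indices):
        # Guard against the missing bitmap before reading any bit from it.
        if validity is not None and not bit_is_set(validity, i):
            out.append(None)
        else:
            out.append(dictionary[idx])
    return out


# No bitmap supplied: all three slots decode to dictionary values.
all_valid = decode_dictionary([0, 1, 0], [1.5, 2.5])

# Bitmap 0b101: slots 0 and 2 are valid, slot 1 is null.
with_null = decode_dictionary([0, 1, 0], [1.5, 2.5], bytes([0b00000101]))
```

The `validity is not None` check is the whole point of the sketch: skipping it is the Python analogue of reading through a null pointer.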
--
This message was sent by Atlassian Jira
(v8.20.10#820010)