[ https://issues.apache.org/jira/browse/ARROW-7638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Francois Saint-Jacques reassigned ARROW-7638:
---------------------------------------------
Assignee: Francois Saint-Jacques
> [Python] Segfault when inspecting dataset.Source with invalid file/partitioning
> -------------------------------------------------------------------------------
>
> Key: ARROW-7638
> URL: https://issues.apache.org/jira/browse/ARROW-7638
> Project: Apache Arrow
> Issue Type: Bug
> Reporter: Joris Van den Bossche
> Assignee: Francois Saint-Jacques
> Priority: Major
>
> Getting a segfault with:
> {code}
> In [1]: import pyarrow.dataset as ds
>
> In [2]: !touch test_empty.txt
>
> In [3]: source_factory = ds.source("test_empty.txt",
>    ...:     partitioning=ds.partitioning(field_names=['a', 'b']))
>
> In [4]: source_factory.inspect()
> Segmentation fault (core dumped)
> {code}
> I haven't investigated further yet what the cause might be. Several things are
> "wrong" here: the file is empty, it is not a valid Parquet file, the
> partitioning does not match the files, etc.
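To illustrate the "not a valid Parquet file" point in the description: a Parquet file with a plaintext footer starts and ends with the 4-byte magic `PAR1`, so an empty file can be rejected before any reader touches it. The sketch below uses only the Python standard library; the helper name `looks_like_parquet` is hypothetical and is not part of pyarrow, and this is not a statement about how the segfault is or should be fixed.

```python
import os

PARQUET_MAGIC = b"PAR1"
# Smallest plausible size: 4-byte leading magic + 4-byte footer length
# + 4-byte trailing magic.
MIN_PARQUET_SIZE = 12


def looks_like_parquet(path):
    """Hypothetical helper: cheap sanity check on the Parquet magic bytes.

    Only covers files with a plaintext footer; it does not validate the
    footer metadata itself.
    """
    if os.path.getsize(path) < MIN_PARQUET_SIZE:
        # Covers the empty-file case from the report.
        return False
    with open(path, "rb") as f:
        head = f.read(4)
        f.seek(-4, os.SEEK_END)
        tail = f.read(4)
    return head == PARQUET_MAGIC and tail == PARQUET_MAGIC
```

With the empty `test_empty.txt` from the session above, `looks_like_parquet("test_empty.txt")` returns `False`, since the file is shorter than the minimum size.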
--
This message was sent by Atlassian Jira
(v8.3.4#803005)