[ 
https://issues.apache.org/jira/browse/ARROW-13198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17375710#comment-17375710
 ] 

Joris Van den Bossche commented on ARROW-13198:
-----------------------------------------------

(Copying my comment from ARROW-13248.) In case this helps, I went back through 
the nightly builds. The segfault is certainly not limited to a single Python 
version or packaging solution:

- 2021-07-06: test-conda-python-3.9-pandas-master 
(https://github.com/ursacomputing/crossbow/runs/2995817516), segfaulted in 
{{test_filter_timestamp[threaded-async]}}
- 2021-06-30: test-conda-python-3.8-pandas-latest 
(https://github.com/ursacomputing/crossbow/runs/2949300069), segfaulted in 
{{test_ipc_format[threaded-async]}}
- 2021-06-26: test-conda-python-3.7-pandas-latest 
(https://github.com/ursacomputing/crossbow/runs/2919782027), segfaulted in 
{{test_open_dataset_partitioned_directory[threaded-async]}} (and also in 
test-conda-python-3.7-pandas-0.24)
- 2021-06-25: test-conda-python-3.6-pandas-0.23 
(https://github.com/ursacomputing/crossbow/runs/2911496222), segfaulted in 
{{test_open_dataset_partitioned_directory[serial-async]}}
- 2021-06-26: wheel-manylinux2010-cp37-amd64 
(https://github.com/ursacomputing/crossbow/runs/2919785672), segfaulted in 
{{test_open_dataset_partitioned_directory}}

This last one has a bit more of a traceback, at least showing it's coming from 
the {{to_table}} call:

{code}
Fatal Python error: Segmentation fault

Current thread 0x00007f3897f43740 (most recent call first):
  File "/usr/local/lib/python3.7/site-packages/pyarrow/tests/test_dataset.py", line 241 in to_table
  File "/usr/local/lib/python3.7/site-packages/pyarrow/tests/test_dataset.py", line 1678 in _check_dataset
  File "/usr/local/lib/python3.7/site-packages/pyarrow/tests/test_dataset.py", line 1689 in _check_dataset_from_path
  File "/usr/local/lib/python3.7/site-packages/pyarrow/tests/test_dataset.py", line 1983 in test_open_dataset_partitioned_directory
{code}
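
Since the crash only shows up occasionally, it may help to re-run the affected test in a loop until it segfaults and keep the stderr of the failing run. The following is only a rough sketch, not something taken from the CI setup: the helper name {{run_until_segfault}} and the example command are made up for illustration, and it assumes a POSIX system where a child killed by a signal reports a negative return code.

```python
import signal
import subprocess
import sys


def run_until_segfault(cmd, max_runs=20):
    """Run cmd up to max_runs times.

    Returns (attempt, stderr) for the first run that died from SIGSEGV,
    or None if no run crashed. Hypothetical helper, POSIX only.
    """
    for attempt in range(1, max_runs + 1):
        proc = subprocess.run(cmd, capture_output=True, text=True)
        # subprocess reports death by signal as -signum; a shell reports the
        # same event as 128 + signum, which is the 139 seen in the CI log.
        if proc.returncode in (-signal.SIGSEGV, 128 + signal.SIGSEGV):
            return attempt, proc.stderr
    return None


# Assumed example command (not from the issue): run only the affected test,
# with faulthandler enabled so a "Fatal Python error" traceback is printed.
EXAMPLE_CMD = [
    sys.executable, "-X", "faulthandler", "-m", "pytest",
    "--pyargs", "pyarrow.tests.test_dataset",
    "-k", "test_open_dataset_partitioned_directory",
]
```

Calling {{run_until_segfault(EXAMPLE_CMD)}} would then return the attempt number and the captured stderr of the first crashing run, if any. Whether 20 runs are enough to trigger the failure is a guess.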

> [C++][Dataset] Async scanner occasionally segfaulting in CI
> -----------------------------------------------------------
>
>                 Key: ARROW-13198
>                 URL: https://issues.apache.org/jira/browse/ARROW-13198
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: C++
>            Reporter: David Li
>            Assignee: Weston Pace
>            Priority: Major
>              Labels: dataset, datasets
>             Fix For: 5.0.0
>
>         Attachments: AMD64 Conda Python 3.7 Pandas latest.log
>
>
> See attached log; it's failing in 
> {{test_open_dataset_partitioned_directory[threaded-async]}} [^AMD64 Conda 
> Python 3.7 Pandas latest.log]
> {noformat}
> 2021-06-28T15:53:47.0090857Z opt/conda/envs/arrow/lib/python3.7/site-packages/pyarrow/tests/test_dataset.py::test_scan_iterator[True-True] PASSED [ 27%]
> 2021-06-28T15:53:48.5943186Z /arrow/ci/scripts/python_test.sh: line 32:  7137 Segmentation fault      (core dumped) pytest -r s -v ${PYTEST_ARGS} --pyargs pyarrow
> 2021-06-28T15:53:49.1303267Z 139
> /pyarrow/tests/test_dataset.py::test_open_dataset_partitioned_directory[threaded-async]
>  
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
