Mike Gevaert created ARROW-17319:
------------------------------------
Summary: pyarrow seems to set default CPU affinity to 0 on
shutdown, crashes if CPU 0 is not available
Key: ARROW-17319
URL: https://issues.apache.org/jira/browse/ARROW-17319
Project: Apache Arrow
Issue Type: Bug
Components: Python
Affects Versions: 9.0.0
Environment: Ubuntu 20.04 / Python 3.8.10 (default, Jun 22 2022,
20:18:18)
$ pip list
Package Version
--------------- -------
numpy 1.23.1
pandas 1.4.3
pip 20.0.2
pkg-resources 0.0.0
pyarrow 9.0.0
python-dateutil 2.8.2
pytz 2022.1
setuptools 44.0.0
six 1.16.0
Reporter: Mike Gevaert
I get the following traceback when exiting Python after loading
{{pyarrow.parquet}}:
{code}
Python 3.8.10 (default, Jun 22 2022, 20:18:18)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> os.getpid()
25106
>>> import pyarrow.parquet
>>>
Fatal error condition occurred in
/opt/vcpkg/buildtrees/aws-c-io/src/9e6648842a-364b708815.clean/source/event_loop.c:72:
aws_thread_launch(&cleanup_thread, s_event_loop_destroy_async_thread_fn,
el_group, &thread_options) == AWS_OP_SUCCESS
Exiting Application
################################################################################
Stack trace:
################################################################################
/tmp/venv/lib/python3.8/site-packages/pyarrow/libarrow.so.900(+0x200af06)
[0x7f831b2b3f06]
/tmp/venv/lib/python3.8/site-packages/pyarrow/libarrow.so.900(+0x20028e5)
[0x7f831b2ab8e5]
/tmp/venv/lib/python3.8/site-packages/pyarrow/libarrow.so.900(+0x1f27e09)
[0x7f831b1d0e09]
/tmp/venv/lib/python3.8/site-packages/pyarrow/libarrow.so.900(+0x200ba3d)
[0x7f831b2b4a3d]
/tmp/venv/lib/python3.8/site-packages/pyarrow/libarrow.so.900(+0x1f25948)
[0x7f831b1ce948]
/tmp/venv/lib/python3.8/site-packages/pyarrow/libarrow.so.900(+0x200ba3d)
[0x7f831b2b4a3d]
/tmp/venv/lib/python3.8/site-packages/pyarrow/libarrow.so.900(+0x1ee0b46)
[0x7f831b189b46]
/tmp/venv/lib/python3.8/site-packages/pyarrow/libarrow.so.900(+0x194546a)
[0x7f831abee46a]
/lib/x86_64-linux-gnu/libc.so.6(+0x468a7) [0x7f831c6188a7]
/lib/x86_64-linux-gnu/libc.so.6(on_exit+0) [0x7f831c618a60]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xfa) [0x7f831c5f608a]
{code}
To replicate this, one needs to make sure that CPU 0 isn't available to
schedule tasks on. In our HPC environment, that happens because Slurm uses
cgroups to constrain CPU usage.
On a Linux workstation, one should be able to:
1) open Python as a normal user
2) get the pid of the interpreter (e.g. with {{os.getpid()}})
3) as root:
{code}
cd /sys/fs/cgroup/cpuset/
mkdir pyarrow
cd pyarrow
echo 0 > cpuset.mems
echo 1 > cpuset.cpus # sets the cgroup to only have access to cpu 1
echo $PID > tasks # $PID is the pid obtained in step 2
{code}
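Before importing pyarrow, it may be worth confirming that the restriction has actually taken effect. The following is a minimal sketch, assuming Linux (where {{os.sched_getaffinity}} is available); the helper names {{allowed_cpus}} and {{cpu0_available}} are made up for illustration and are not part of pyarrow:

```python
import os


def allowed_cpus(pid: int = 0) -> set:
    """Return the set of CPUs the process may run on (pid 0 = calling process)."""
    return set(os.sched_getaffinity(pid))


def cpu0_available(pid: int = 0) -> bool:
    """Return True if CPU 0 is in the process's allowed CPU set."""
    return 0 in allowed_cpus(pid)


if __name__ == "__main__":
    print("allowed CPUs:", sorted(allowed_cpus()))
    print("CPU 0 available:", cpu0_available())
```

If the cgroup setup above worked, running this in the constrained interpreter should report that CPU 0 is not available.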
Then, in the Python environment:
{code}
import pyarrow.parquet
exit()
{code}
This should trigger the crash.
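For checking the crash non-interactively, a child interpreter can be launched from the constrained Python session; child processes start in the cgroup of their parent. This is only a sketch, and the helper name {{exit_status}} is hypothetical:

```python
import subprocess
import sys


def exit_status(snippet: str) -> int:
    """Run `snippet` in a fresh interpreter and return its exit status.

    A negative value means the child was terminated by a signal
    (this is how subprocess reports it on POSIX).
    """
    result = subprocess.run([sys.executable, "-c", snippet])
    return result.returncode
```

Calling {{exit_status("import pyarrow.parquet")}} from the constrained session would then be expected to return a non-zero status if the crash on shutdown occurs, and 0 otherwise.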
Sadly, I couldn't track down which versions of {{aws-c-common}} and {{aws-c-io}}
are being used for the 9.0.0 py38 manylinux wheels. (libarrow.so.900 has
BuildID[sha1]=dd6c5a2efd5cacf09657780a58c40f7c930e4df1)