F. H. created ARROW-15141:
-----------------------------

             Summary: Fatal error condition occurred in aws_thread_launch
                 Key: ARROW-15141
                 URL: https://issues.apache.org/jira/browse/ARROW-15141
             Project: Apache Arrow
          Issue Type: Bug
          Components: Python
    Affects Versions: 6.0.1, 6.0.0
         Environment: - `uname -a`:
Linux datalab2 5.4.0-91-generic #102-Ubuntu SMP Fri Nov 5 16:31:28 UTC 2021 
x86_64 x86_64 x86_64 GNU/Linux
- `mamba list | grep -i "pyarrow\|tensorflow\|^python"`
pyarrow                   6.0.0           py39hff6fa39_1_cpu    conda-forge
python                    3.9.7           hb7a2778_3_cpython    conda-forge
python-dateutil           2.8.2              pyhd8ed1ab_0    conda-forge
python-flatbuffers        1.12               pyhd8ed1ab_1    conda-forge
python-irodsclient        1.0.0              pyhd8ed1ab_0    conda-forge
python-rocksdb            0.7.0            py39h7fcd5f3_4    conda-forge
python_abi                3.9                      2_cp39    conda-forge
tensorflow                2.6.2           cuda112py39h9333c2f_0    conda-forge
tensorflow-base           2.6.2           cuda112py39h7de589b_0    conda-forge
tensorflow-estimator      2.6.2           cuda112py39h9333c2f_0    conda-forge
tensorflow-gpu            2.6.2           cuda112py39h0bbbad9_0    conda-forge



            Reporter: F. H.


Hi, I randomly get the following error when I first run inference with a 
TensorFlow model and then write the result to a `.parquet` file:
```

Fatal error condition occurred in /home/conda/feedstock_root/build_artifacts/aws-c-io_1633633131324/work/source/event_loop.c:72: aws_thread_launch(&cleanup_thread, s_event_loop_destroy_async_thread_fn, el_group, &thread_options) == AWS_OP_SUCCESS
Exiting Application
################################################################################
Stack trace:
################################################################################
/home/<user>/miniconda3/envs/spliceai_env/lib/python3.9/site-packages/pyarrow/../../../././libaws-c-common.so.1(aws_backtrace_print+0x59)
 [0x7ffb14235f19]
/home/<user>/miniconda3/envs/spliceai_env/lib/python3.9/site-packages/pyarrow/../../../././libaws-c-common.so.1(aws_fatal_assert+0x48)
 [0x7ffb14227098]
/home/<user>/miniconda3/envs/spliceai_env/lib/python3.9/site-packages/pyarrow/../../.././././libaws-c-io.so.1.0.0(+0x10a43)
 [0x7ffb1406ea43]
/home/<user>/miniconda3/envs/spliceai_env/lib/python3.9/site-packages/pyarrow/../../../././libaws-c-common.so.1(aws_ref_count_release+0x1d)
 [0x7ffb14237fad]
/home/<user>/miniconda3/envs/spliceai_env/lib/python3.9/site-packages/pyarrow/../../.././././libaws-c-io.so.1.0.0(+0xe35a)
 [0x7ffb1406c35a]
/home/<user>/miniconda3/envs/spliceai_env/lib/python3.9/site-packages/pyarrow/../../../././libaws-c-common.so.1(aws_ref_count_release+0x1d)
 [0x7ffb14237fad]
/home/<user>/miniconda3/envs/spliceai_env/lib/python3.9/site-packages/pyarrow/../../../././libaws-crt-cpp.so(_ZN3Aws3Crt2Io15ClientBootstrapD1Ev+0x3a)
 [0x7ffb142a2f5a]
/home/<user>/miniconda3/envs/spliceai_env/lib/python3.9/site-packages/pyarrow/../../.././libaws-cpp-sdk-core.so(+0x5f570)
 [0x7ffb147fd570]
/lib/x86_64-linux-gnu/libc.so.6(+0x49a27) [0x7ffb17f7da27]
/lib/x86_64-linux-gnu/libc.so.6(on_exit+0) [0x7ffb17f7dbe0]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xfa) [0x7ffb17f5b0ba]
/home/<user>/miniconda3/envs/spliceai_env/bin/python3.9(+0x20aa51) 
[0x562576609a51]
/bin/bash: line 1: 2341494 Aborted                 (core dumped)

```
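
The lowest frames of the trace above sit in libc's start-up and exit code (`__libc_start_main`, `on_exit`), which suggests the abort may happen while the interpreter is shutting down rather than during the write itself. A minimal sketch for checking this from a wrapper script, assuming a POSIX system; `run_and_classify` is a hypothetical helper for illustration and is not part of pyarrow or tensorflow:

```python
import signal
import subprocess
import sys


def run_and_classify(argv):
    """Run a command and report whether it was killed by SIGABRT.

    Hypothetical helper for illustration only.
    """
    proc = subprocess.run(argv, capture_output=True, text=True)
    # On POSIX, subprocess reports death by signal N as returncode == -N.
    aborted = proc.returncode == -signal.SIGABRT
    return aborted, proc.returncode, proc.stderr


if __name__ == "__main__":
    # Stand-in child that aborts on purpose; replace it with the real job.
    child = [sys.executable, "-c", "import os; os.abort()"]
    aborted, code, _ = run_and_classify(child)
    print("aborted by SIGABRT:", aborted, "returncode:", code)
```

If the wrapper reports an abort but the `.parquet` file exists and can be read back, that would be consistent with the crash occurring after the write had completed. This only narrows down the timing; it does not identify the cause.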

 

My colleague ran into the same issue on CentOS 8 when running the same job in 
the same environment on SLURM, so I guess it could be an issue with the 
combination of tensorflow and pyarrow.

I also found a GitHub issue where multiple people report the same error:
[https://github.com/huggingface/datasets/issues/3310]

 

It is very important to my lab that this bug gets resolved, as we can no 
longer work with parquet. Unfortunately, we do not have the knowledge to fix 
it ourselves.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
