Hi,

When I try to install 'dorado' via

  dorado-0.3.1-foss-2022a-CUDA-11.7.0.eb

the tests stall at some point.  The process tree is as follows:

  ├─ /bin/bash /var/spool/slurmd/job14269886/slurm_script
  │  └─ /usr/bin/python3.6 -m easybuild.main dorado-0.3.1-foss-2022a-CUDA-11.7.0.eb --robot --cuda-compute-capabilities=6.1,7.5 --buildpath=/dev/shm --tmpdir=/scratch/eb-bu
  │     └─ /bin/bash -c export PYTHONPATH=/scratch/eb-build/eb-yoirmakd/tmpc14mrksg/lib/python3.10/site-packages:$PYTHONPATH &&  cd test && PYTHONUNBUFFERED=1 /trinity/shar
  │        └─ /easybuild/software/Python/3.10.4-GCCcore-11.3.0/bin/python run_test.py --continue-through-error --verbose -x distributed/elastic/utils/distrib
  │           ├─ /easybuild/software/Python/3.10.4-GCCcore-11.3.0/bin/python distributed/rpc/test_share_memory.py -v --subprocess
  │           │  ├─ /easybuild/software/Python/3.10.4-GCCcore-11.3.0/bin/python distributed/rpc/test_share_memory.py -v TestRPCPickler.test_case
  │           │  │  ├─ /easybuild/software/Python/3.10.4-GCCcore-11.3.0/bin/python distributed/rpc/test_share_memory.py -v TestRPCPickler.test_case
  │           │  │  ...
  │           │  │  ├─ /easybuild/software/Python/3.10.4-GCCcore-11.3.0/bin/python distributed/rpc/test_share_memory.py -v TestRPCPickler.test_case
  │           │  │  ├─ /easybuild/software/Python/3.10.4-GCCcore-11.3.0/bin/python -s -c from multiprocessing.resource_tracker import main;main(30)
  │           │  │  ├─ /easybuild/software/Python/3.10.4-GCCcore-11.3.0/bin/python distributed/rpc/test_share_memory.py -v TestRPCPickler.test_case
  │           │  │  ├─ /easybuild/software/Python/3.10.4-GCCcore-11.3.0/bin/python distributed/rpc/test_share_memory.py -v TestRPCPickler.test_case
  │           │  │  └─ /easybuild/software/Python/3.10.4-GCCcore-11.3.0/bin/python distributed/rpc/test_share_memory.py -v TestRPCPickler.test_case
  │           │  └─ /easybuild/software/Python/3.10.4-GCCcore-11.3.0/bin/python distributed/rpc/test_share_memory.py -v --subprocess
  │           └─ /easybuild/software/Python/3.10.4-GCCcore-11.3.0/bin/python run_test.py --continue-through-error --verbose -x distributed/elastic/utils/dist

The problem seems to be that the following process hangs while
calling 'read':

  Trace of process 95404 - /easybuild/software/Python/3.10.4-GCCcore-11.3.0/bin/python -s -c from multiprocessing.resource_tracker import main;main(30)
  strace: Process 95404 attached
  read(30,
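
One caveat: as far as I understand CPython's multiprocessing.resource_tracker, main(30) just reads lines from descriptor 30 until every write end of that pipe has been closed, so a blocked 'read' there may be a symptom of a stuck sibling process rather than the cause itself. A minimal sketch of that behaviour, assuming an ordinary pipe (the names read_until_eof and demo are made up for illustration; this is not the tracker's actual code):

```python
import os
import threading


def read_until_eof(fd, out):
    # Read lines from fd until EOF. On a pipe, EOF only arrives once
    # *every* copy of the write end has been closed.
    with open(fd, "rb") as f:
        for line in f:
            out.append(line.strip())


def demo():
    r, w = os.pipe()
    w_dup = os.dup(w)  # stands in for a write end held by another process
    received = []
    reader = threading.Thread(target=read_until_eof, args=(r, received))
    reader.start()

    os.write(w, b"REGISTER:example\n")  # illustrative message only
    os.close(w)

    # One write end (w_dup) is still open, so the reader stays blocked in read().
    reader.join(timeout=0.5)
    still_blocked = reader.is_alive()

    # Closing the last write end delivers EOF and lets the reader finish.
    os.close(w_dup)
    reader.join(timeout=5)
    finished = not reader.is_alive()
    return received, still_blocked, finished
```

If that picture is right, the more interesting question would be which of the TestRPCPickler.test_case processes still holds the other end of the pipe and what it is waiting on.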

I have tried this twice and both times the installation has stopped like
this, so I assume it is not some temporary issue with the file system.

Does anyone have any ideas about what else I could look at?

Cheers,

Loris

-- 
Dr. Loris Bennett (Herr/Mr)
ZEDAT, Freie Universität Berlin
