>> As you probably can imagine that is hard to reproduce. See if
>> you can make smaller example fail - preferably something that
>> can run on smaller machines.
> 
> I wrote a small script that shows the problem. It completes
> in less than 10 seconds on my desktop (two cores), but hangs
> (read: "does not complete within hours") on two other
> machines (8/32 cores).

I left the script running and it did not complete within 3 days!
A modified version of the trigger is attached. Having a look at
the temporary directory, 'parallel' hangs _after_ all files
have been created (or removed).

I just tested the new script on all machines again: "2core" and
"8core" successfully completed 10 consecutive runs, but "32core"
still hungs _everytime_ a script is run.

Could someone with 8-32 (or even more?) cores please try to
reproduce the issue?

Thomas
#!/bin/bash

# use a tmp-dir in a RAM disk
mkdir -p /dev/shm/pissue || exit
cd /dev/shm/pissue || exit

# cleanup
ls | xargs -rd '\n' rm

# this seems to be important
export PARALLEL="--load 100% --verbose"
echo PARALLEL=$PARALLEL

echo -n "cores on $HOSTNAME: "
parallel --number-of-cores
echo

for i in $(seq 2 10); do
  i2=$[i*i]
  echo creating $i2 files:
  seq $i2 | parallel -X touch
  echo deleting $i2 files:
  ls -v | parallel -X rm
  echo
done

rmdir /dev/shm/pissue

Reply via email to