Reuti writes:
> The h_vmem isn't multiplied on the slave nodes even if you are getting
> slots from one queue only, despite the fact that the correct value of
> $NSLOTS on the slave node is known:
>
> $ qsub -pe mpich 4 -l h_vmem=256M test.sh
> $ cat test.sh.o5664
> pc15370 2 all.q@pc15370 UNDEFINED
> pc15381 2 all.q@pc15381 UNDEFINED
> Script pc15370: /tmp/5664.1.all.q 4
> ...
> virtual memory (kbytes, -v) 524288
> ...
> Call pc15370: /tmp/5664.1.all.q 4
> ...
> virtual memory (kbytes, -v) 262144
> ...
> Call pc15381: /tmp/5664.1.all.q 2
> ...
> virtual memory (kbytes, -v) 262144
> ...
> Call pc15381: /tmp/5664.1.all.q 2
> ...
> virtual memory (kbytes, -v) 262144
> ...
>
> It should be 524288 also on pc15381, at least for the first call.

I can't reproduce that with Open MPI tight integration. I submitted the
following job, which gets three four-core nodes:

  qsub -pe openmpi 12 -l h_vmem=256M

  echo "Script $(hostname): $TMPDIR $NSLOTS"
  ulimit -v
  for HOST in $(tail -n +2 $PE_HOSTFILE|cut -f1 -d' '); do
    qrsh -inherit $HOST 'echo "Call $(hostname): $TMPDIR $NSLOTS"; ulimit -v;
                         sleep 60' &
  done
  wait
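
I see:

  Script node193: /tmp/179483.1.parallel 12
  1048576
  Call node228: /tmp/179483.1.parallel 4
  1048576
  Call node214: /tmp/179483.1.parallel 4
  1048576

That is 4 x 256M (1048576 kbytes) on the master and on both slave nodes,
so here the limit is scaled by the per-node slot count everywhere.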
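
For comparison, here is a rough sketch of the value I would expect
"ulimit -v" to report on each host if h_vmem is multiplied by the slots
granted there. The helper names (expected_vmem_kb, expected_from_hostfile)
are made up for illustration and are not part of Grid Engine; the hostfile
layout (host, slots, queue instance, processor range) is assumed to match
the lines shown in the quoted output.

```shell
#!/bin/sh
# Hypothetical helpers, not part of Grid Engine.

# Expected "ulimit -v" value in kbytes, assuming h_vmem is scaled by the
# number of slots granted on the host.
# $1 = h_vmem per slot in MiB, $2 = slots on the host
expected_vmem_kb() {
    echo $(( $1 * 1024 * $2 ))
}

# Print "host expected_kbytes" for every line of a PE hostfile.
# $1 = h_vmem per slot in MiB, $2 = path to the hostfile
expected_from_hostfile() {
    while read -r host slots _rest; do
        echo "$host $(expected_vmem_kb "$1" "$slots")"
    done < "$2"
}
```

Inside a job, something like expected_from_hostfile 256 "$PE_HOSTFILE"
could be compared against what each qrsh call actually prints. With 256M
per slot, two slots per host should give 524288 and four slots 1048576.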
--
Community Grid Engine: http://arc.liv.ac.uk/SGE/