Hi
h_vmem is set to the physical memory size on every node.
I set h_vmem on the 48-CPU hosts to 100G and submitted a job to a queue
instance on one host with these requests:
#$ -pe mpifill 48
#$ -l  h_vmem=50M

$> qsub -q [email protected] detdet.job

While the job is running:
$> qhost -F -h  amd-9-6.local
...
hc:h_vmem=97.656G

1) The total h_vmem request for that job is 48*50M=2400M, and the remaining
h_vmem is 97.656G.
The consumed h_vmem plus the remaining h_vmem should equal 100G, but
2400M+97.656G=100056M (100.056G), which is greater than the h_vmem set at
host level. The job ran successfully.
How can the extra 56M be explained?
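
The arithmetic in 1) can be re-checked with a short sketch. It assumes that the upper-case suffixes M and G are 1024-based multipliers (as described for memory_specifier in sge_types(1)) and that qhost prints the remaining value rounded to three decimals. The helper name to_mib is made up for illustration.

```python
# Sketch: redo the h_vmem bookkeeping in MiB.
# Assumption: upper-case M and G are 1024-based multipliers.
MIB_PER_GIB = 1024

def to_mib(value, suffix):
    """Convert a value with suffix 'M' or 'G' to MiB (hypothetical helper)."""
    return value * MIB_PER_GIB if suffix == "G" else value

host_total = to_mib(100, "G")          # h_vmem configured on the host
job_request = 48 * to_mib(50, "M")     # 48 slots * 50M per slot
remaining = host_total - job_request   # expected remaining h_vmem, in MiB

# The same remaining value expressed in G, rounded to three decimals
remaining_gib = round(remaining / MIB_PER_GIB, 3)

# Reading 97.656G as 97656M (1000-based) produces the apparent surplus
decimal_reading = 97.656 * 1000 + job_request - 100 * 1000

print(host_total, job_request, remaining, remaining_gib, round(decimal_reading))
```

Under these assumptions the remaining value is 100000M, which is 97.656G when divided by 1024, so consumed plus remaining adds up to exactly 100G. The 56M only appears when 97.656G is read as 97656M.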

2) Another job:

#$ -pe mpi12amd 120     # PE config is 12 cores per node
#$ -l  h_vmem=1.4G

The h_vmem value I have to request is strange!
I must set h_vmem for this job to 1.4G, otherwise the job fails.
That means the job consumes
12*1.4G of memory on each node (16.8G).

While the job is running, <qhost -F -h host> shows that the remaining mem_free
does not match the consumed h_vmem value.
That means the job does not actually use that much real memory!

hl:mem_free=17.057G  # mem_free was 23.7G before this job was accepted
hc:h_vmem=6.200G

The consumed mem_free (roughly 23G-17.057G=6G) is about 10G less than the
consumed h_vmem (16.8G).
Why must h_vmem be requested higher than what the job actually needs?
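
The numbers in 2) can be laid side by side in a short sketch. It assumes the host shown is one of the xeon hosts with 23G h_vmem and that 12 slots of the job run on it. Note that hc:h_vmem is a consumable booked from the per-slot request, while hl:mem_free is a measured load value, so the two are not expected to move together.

```python
# Sketch: compare booked h_vmem with the observed drop in mem_free.
# Assumption: the host is a xeon node with 23G h_vmem running 12 slots.
slots_per_node = 12
h_vmem_per_slot = 1.4      # G, from "#$ -l h_vmem=1.4G"
host_h_vmem = 23.0         # G, assumed host-level h_vmem

booked = slots_per_node * h_vmem_per_slot   # h_vmem booked on the node
remaining_h_vmem = host_h_vmem - booked     # compare with hc:h_vmem=6.200G

mem_free_before = 23.7     # G, before the job was accepted
mem_free_during = 17.057   # G, hl:mem_free while the job runs
used_real = mem_free_before - mem_free_during

print(f"booked h_vmem:    {booked:.3f}G")
print(f"remaining h_vmem: {remaining_h_vmem:.3f}G")
print(f"mem_free drop:    {used_real:.3f}G")
print(f"difference:       {booked - used_real:.3f}G")
```

With these inputs the remaining h_vmem matches the reported hc:h_vmem=6.200G, while the drop in mem_free stays about 10G below the booked amount.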

The required h_vmem value also differs from host to host! On the xeon hosts
(24 CPUs and 23G h_vmem) the requested h_vmem for that job must be at least
1G, and on the AMD hosts (24 CPUs and 100G h_vmem) it must be at least 1.4G!
Why?

Thx