Hello,
I have been using h_vmem as a consumable resource to limit the amount of
memory users can request and to make sure jobs don't use more than they
requested. It all has been working fine until we added nodes with GPU
modules.
The memory model in CUDA applications is such that the virtual address
space is expanded to provide a single address space covering both
regular host memory and GPU memory (and apparently something else,
because I have seen reported memory usage at twice the virtual memory
size).
This results in jobs getting killed for exceeding h_vmem, whereas in
fact the actual memory usage was really low. So this effectively
renders h_vmem useless, because it either kills jobs that should not be
killed or must be set very high (2-3 times the actual virtual memory
size) on the host and in job requests to allow jobs to run.
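To illustrate the gap described above: h_vmem is enforced against a
process's virtual size (VmSize), while its real RAM footprint is the
resident set (VmRSS). A minimal sketch, assuming a Linux /proc
filesystem (the helper name memory_kb is mine, not an SGE facility);
run inside a CUDA job, VmSize balloons far beyond VmRSS once a CUDA
context is created:

```python
def memory_kb(pid="self"):
    """Return (VmSize, VmRSS) in kB from /proc/<pid>/status.

    VmSize is the virtual address space h_vmem is checked against;
    VmRSS is the memory actually resident in RAM.
    """
    sizes = {}
    with open(f"/proc/{pid}/status") as f:
        for line in f:
            if line.startswith(("VmSize:", "VmRSS:")):
                key, value = line.split(":")
                sizes[key] = int(value.split()[0])  # value is "<n> kB"
    return sizes["VmSize"], sizes["VmRSS"]

if __name__ == "__main__":
    vmsize, vmrss = memory_kb()
    print(f"VmSize {vmsize} kB vs VmRSS {vmrss} kB")
```

For an ordinary process the two numbers are of the same order; for a
CUDA process the first can exceed the second many times over, which is
exactly what trips the h_vmem limit.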
I was wondering if there were good solutions or practices on controlling
memory usage on GPU-equipped nodes. I am using SGE 6.2u5.
Thank you,
Ilya.
_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users