Re: [gridengine users] Alternative to h_vmem?

2014-06-08 Thread Jewell, Chris
Hi, FWIW, this is a very important distinction for CUDA applications. CUDA registers a combined memory address space for all host, device, and page-locked memory as virtual memory. The OS may detect a virtual memory usage of tens if not hundreds of GB, while the application is only using
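
The gap between virtual and resident memory described in this message can be observed on Linux by comparing the VmSize and VmRSS fields of /proc/<pid>/status. The following is a minimal sketch, not taken from the thread: the helper name parse_vm_fields is hypothetical, and it assumes a Linux /proc filesystem.

```python
import os


def parse_vm_fields(status_text):
    """Return the VmSize and VmRSS values (in kB) found in /proc/<pid>/status text."""
    fields = {}
    for line in status_text.splitlines():
        key, sep, rest = line.partition(":")
        if sep and key in ("VmSize", "VmRSS"):
            # Values look like "   123456 kB"; keep only the integer part.
            fields[key] = int(rest.split()[0])
    return fields


if __name__ == "__main__":
    path = "/proc/self/status"  # assumption: Linux-only location
    if os.path.exists(path):
        with open(path) as fh:
            vm = parse_vm_fields(fh.read())
        # VmSize is the virtual address space; VmRSS is what is resident in RAM.
        print(vm)
```

For a process that reserves a large address space, VmSize can be far larger than VmRSS, which is why a limit based on virtual size may reject a job whose real memory use is modest.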

Re: [gridengine users] Random queue errors, and suspect pe_hostfiles

2013-08-25 Thread Jewell, Chris
The message is from a failure of setuid(2) or similar. I don't know if it's a libc bug that errno seems not to be set (Success) as it should be. The two possible cases are: EAGAIN The uid does not match the current uid and uid brings process over its RLIMIT_NPROC resource

Re: [gridengine users] Random queue errors, and suspect pe_hostfiles

2013-08-22 Thread Jewell, Chris
I started with a search of the SGE mailing list archive, and found your post. :) Have you found a solution? Hello all, Sorry for the long leave of absence. I've been thoroughly testing my system for this issue. I checked my RAID1 for consistency, and performed an xfs_repair to make