Re: [gridengine users] Convert from file to bdb spooling?

Jesse Becker Fri, 02 Mar 2012 07:05:44 -0800

On Thu, Mar 01, 2012 at 11:03:51PM -0500, Simon Matthews wrote:



On Thu, Mar 1, 2012 at 8:00 PM, Rayson Ho wrote:
On Thu, Mar 1, 2012 at 10:55 PM, Simon Matthews wrote:

I installed "iotop" and it shows multiple nfsd processes driving a lot of
I/O. I have always assumed that the qmaster and the execution clients need
to share a common SGE_ROOT directory. Is this true?


You don't need to have a shared SGE_ROOT, see:

http://gridscheduler.sourceforge.net/howto/nfsreduce.html

Great! I'll reconfigure things to use a local spool directory -- that should 
eliminate much of the issue.
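
Before and after moving a spool directory, it can help to confirm which filesystem the path actually lives on. The following is a minimal Python sketch, assuming a Linux host where /proc/mounts lists one mount per line as "device mount-point fstype options ...". The function names, the sample mount table, and the paths in it are hypothetical and only for illustration; they are not taken from this thread or from Grid Engine itself.

```python
import posixpath

def fstype_for_path(path, mounts_text):
    """Return (mount_point, fstype) for the longest mount point containing path.

    mounts_text is text in /proc/mounts format. Returns None if nothing matches.
    """
    path = posixpath.normpath(path)
    best = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        mount_point = fields[1].rstrip("/") or "/"
        fstype = fields[2]
        contains = (
            mount_point == "/"
            or path == mount_point
            or path.startswith(mount_point + "/")
        )
        # ">=" lets a later mount on the same mount point shadow an earlier one.
        if contains and (best is None or len(mount_point) >= len(best[0])):
            best = (mount_point, fstype)
    return best

def is_nfs(path, mounts_text):
    """True if path sits on a filesystem whose type starts with 'nfs'."""
    hit = fstype_for_path(path, mounts_text)
    return hit is not None and hit[1].startswith("nfs")

# Hypothetical mount table: SGE_ROOT on NFS, a local disk for execd spooling.
SAMPLE_MOUNTS = """\
rootfs / ext4 rw 0 0
fileserver:/export/sge /opt/sge nfs rw,vers=3 0 0
/dev/sdb1 /var/spool/sge ext4 rw 0 0
"""

if __name__ == "__main__":
    for p in ("/opt/sge/default/spool", "/var/spool/sge/node01"):
        print(p, "-> NFS" if is_nfs(p, SAMPLE_MOUNTS) else "-> local")
```

On a real machine the second argument would be the contents of /proc/mounts, e.g. open("/proc/mounts").read(). Mount points containing spaces are escaped in that file and are not handled by this sketch.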



FWIW, I always use classic spooling with $SGE_ROOT shared over NFS, and
haven't seen any problems. That is with 50+ compute nodes and users
routinely pushing hundreds of jobs (many of them large array jobs, so
there are thousands of tasks).

The current stats (although things are very quiet right now...):

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
10948 sgeadmin  16   0  640m  41m  10m S  0.7  0.1  11499:57 sge_qmaster


Simon

And for SGE 6.2u5 or below, you can't have BerkeleyDB on NFS (unless
it is NFSv4). For Grid Engine 2011.11, you can place your BerkeleyDB
spool directory on any version of NFS.

Rayson
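
The version rule above can be written down as a small check. This is only a sketch of the rule as stated in this message, in Python; the function names and the way release strings are parsed are assumptions made for illustration, and releases the message does not mention are deliberately left undecided rather than guessed.

```python
import re

def parse_release(release):
    """Parse strings like '6.2u5' or '2011.11' into a comparable tuple.

    '6.2u5' -> (6, 2, 5); '2011.11' -> (2011, 11, 0).
    """
    m = re.fullmatch(r"(\d+)\.(\d+)(?:u(\d+))?", release)
    if m is None:
        raise ValueError("unrecognised release string: %r" % release)
    return (int(m.group(1)), int(m.group(2)), int(m.group(3) or 0))

def bdb_spool_on_nfs_ok(release, nfs_major_version):
    """Apply the rule from the message above.

    Returns True or False where the message gives an answer, and None for
    releases it does not cover.
    """
    version = parse_release(release)
    if version <= (6, 2, 5):
        # "SGE 6.2u5 or below": BerkeleyDB on NFS only if it is NFSv4.
        return nfs_major_version == 4
    if version == (2011, 11, 0):
        # "Grid Engine 2011.11": any version of NFS.
        return True
    return None  # not covered by the message

if __name__ == "__main__":
    for rel, nfs in (("6.2u5", 3), ("6.2u5", 4), ("2011.11", 3)):
        print(rel, "on NFSv%d:" % nfs, bdb_spool_on_nfs_ok(rel, nfs))
```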

If not, then I can make
each execution machine have a local SGE_ROOT directory, which will eliminate
the I/O from nfsd.

Simon

_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users



--
Jesse Becker
NHGRI Linux support (Digicon Contractor)