On Thu, Mar 01, 2012 at 11:03:51PM -0500, Simon Matthews wrote:
On Thu, Mar 1, 2012 at 8:00 PM, Rayson Ho
<[email protected]> wrote:
On Thu, Mar 1, 2012 at 10:55 PM, Simon Matthews
<[email protected]> wrote:
I installed "iotop" and it shows multiple nfsd processes driving a lot of
I/O. I have always assumed that the qmaster and the execution clients need
to share a common SGE_ROOT directory. Is this true?
You don't need to have a shared SGE_ROOT, see:
http://gridscheduler.sourceforge.net/howto/nfsreduce.html
Great! I'll reconfigure things to use a local spool directory -- that should
eliminate much of the issue.
FWIW, I always use classic spooling with $SGE_ROOT shared over NFS, and
haven't seen any problems, even with 50+ compute nodes and users routinely
pushing hundreds of jobs (many of them large array jobs, so there are
thousands of tasks).
The current stats (although things are very quiet right now...):
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
10948 sgeadmin  16   0  640m  41m  10m S  0.7  0.1  11499:57 sge_qmaster
Simon
And for SGE 6.2u5 or below, you can't have BerkeleyDB on NFS (unless
it is NFSv4). For Grid Engine 2011.11, you can place your BerkeleyDB
spool directory on any version of NFS.
Rayson
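For anyone following along who is not sure which spooling method their cell uses: it is recorded in the bootstrap file, normally $SGE_ROOT/$SGE_CELL/common/bootstrap. Below is a rough Python sketch that reads the `spooling_method` and `spooling_params` entries, assuming the usual whitespace-separated "key value" layout of that file. The sample contents and the /opt/sge paths are made up for illustration, and the helper names are hypothetical, not part of Grid Engine.

```python
def parse_bootstrap(text):
    """Parse whitespace-separated 'key value' lines into a dict.

    Blank lines and lines starting with '#' are skipped.
    """
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        key = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        settings[key] = value
    return settings


def spooling_summary(text):
    """Return (spooling_method, spooling_params) from bootstrap text."""
    settings = parse_bootstrap(text)
    return (
        settings.get("spooling_method", "unknown"),
        settings.get("spooling_params", ""),
    )


# Hypothetical sample; a real file lives under $SGE_ROOT/$SGE_CELL/common/
# and its paths will differ from site to site.
SAMPLE = """\
# illustrative bootstrap contents (assumed layout)
admin_user        sgeadmin
spooling_method   classic
spooling_params   /opt/sge/default/common;/opt/sge/default/spool/qmaster
"""

if __name__ == "__main__":
    method, params = spooling_summary(SAMPLE)
    print("spooling method:", method)
    print("spooling params:", params)
```

To run it against a real cell, read the bootstrap file from disk and pass its contents to `spooling_summary` in place of `SAMPLE`. If the method comes back as `berkeleydb`, the NFS restriction mentioned above for 6.2u5 and earlier is the thing to check.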
If not, then I can make
each execution machine have a local SGE_ROOT directory, which will eliminate
the I/O from nfsd.
Simon
_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
--
Jesse Becker
NHGRI Linux support (Digicon Contractor)