On 01.11.2013 at 14:39, Sylvain Foisy Ph. D. wrote:

> Hi Reuti,
>
> Everything seems to be working fine now. My $SGE_ROOT is located on a SAN
> volume, connected to my cluster via NFS. Could network saturation issues
> cause this type of behaviour?

Yes. It's best to keep the spool directories for the exechosts local too, instead of having everything in a shared $SGE_ROOT:

http://arc.liv.ac.uk/SGE/howto/nfsreduce.html

Unless you have a shadow qmaster, the qmaster's spool directory should also be local, IMO. During installation you can give different directories for the qmaster spool (which must exist already) and the exechosts' spool (which will be created with the name of each exechost during that exechost's startup). The latter can also be changed quite easily by editing each exechost's configuration (`qconf -mconf node001`...).

-- Reuti

> Thanks in advance
>
> S
>
> On 2013-10-31, at 1:25 PM, Reuti wrote:
>
>> Hi,
>>
>> On 31.10.2013 at 15:38, Sylvain Foisy Ph. D. wrote:
>>
>>> I sent a whole bunch of next gen sequencing alignment jobs to our cluster
>>> that completed just fine on the slaves, but my qmaster process died along
>>> the way and I had to restart it. Following this, I tried to submit
>>> sleeper.sh test jobs to check if everything was fine, but they get stuck in
>>> the queue in qw state, never being submitted for execution. When I look
>>> into the qmaster log file, I see this message a number of times (I guess
>>> each time the master tries to submit):
>>>
>>> rule "default rule (spool dir)" in spooling context "flatfile spooling"
>>> failed writing an object
>>>
>>> OK, I did my googling on this and found out that the problem is lack of
>>> space for spooling into the $SGE_ROOT folder. All good and fine, but my df
>>> inspection shows me that my $SGE_ROOT is only at 90% free...
>>
>> The spool directory is at the location you specified during installation. So
>> all the flat files are in $SGE_ROOT/default/spool/qmaster? This location is
>> writable too?
>>
>> -- Reuti
>>
>>
>>> Before I go and restart the master server, is there anything that I should
>>> be looking for?
>>>
>>> Best regards and thanks in advance
>>>
>>> Sylvain
>>>
>>> ==============================================================
>>> Sylvain Foisy, Ph. D.
>>> Chargé de projet | Project Manager
>>> Bioinformatics
>>> Labo. de génétique et médecine génomique de l'inflammation
>>> Centre de recherche
>>> Institut de cardiologie de Montréal
>>> 5000 Bélanger Est
>>> Montréal, Qc H1T 1C8
>>> CANADA
>>> ==============================================================

_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
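
For anyone hitting the same "failed writing an object" message, the questions raised in the thread (does the spool directory exist, is it really writable, is there space left) can be checked with a small script. This is only a rough sketch, not something posted in the thread: the function name `check_spool_dir` and the probe file name are made up for illustration, and the inode check is an extra assumption on my part, since a filesystem can refuse writes with free blocks still available.

```shell
#!/bin/sh
# Sketch: sanity-check a Grid Engine spool directory.
# check_spool_dir is a hypothetical helper, not part of Grid Engine.

check_spool_dir() {
    dir=$1

    # The directory has to exist in the first place.
    if [ ! -d "$dir" ]; then
        echo "missing: $dir"
        return 1
    fi

    # Attempt a real write; permission bits alone can be misleading on NFS.
    # The subshell keeps a failed redirection from aborting the caller.
    probe="$dir/.spool_probe.$$"
    if ! ( : > "$probe" ) 2>/dev/null; then
        echo "not writable: $dir"
        return 2
    fi
    rm -f "$probe"

    # Report free blocks and, where df supports it, free inodes.
    df -P "$dir"
    df -i "$dir" 2>/dev/null || true

    echo "ok: $dir"
    return 0
}
```

It would be called as `check_spool_dir "$SGE_ROOT/default/spool/qmaster"` on the qmaster host, using the path mentioned in the thread (yours may differ). To see where a given exechost spools, something like `qconf -sconf node001 | grep execd_spool_dir` should show the configured directory, assuming the host has its own local configuration entry.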
