Hi Rad,

Are you sure that the execution daemons are running on your compute nodes? Can you log in to one of the nodes, say 'compute001', and run ps to look for the execd (sge_execd) process? When an execd is functioning normally it reports the load, memory, etc. for its host; none of your nodes are showing that.
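To spot the affected nodes quickly, something like the following might help. It is only a rough sketch: it assumes qhost prints the eight whitespace-separated columns seen in the listing from the original message, and the function name hosts_without_load is made up for illustration. The compute002 row in the sample carries invented healthy values purely to show the contrast.

```python
def hosts_without_load(qhost_output):
    """Return hostnames whose LOAD column is '-' in qhost-style text.

    Assumes the column order HOSTNAME ARCH NCPU LOAD MEMTOT MEMUSE SWAPTO SWAPUS.
    """
    silent = []
    for line in qhost_output.splitlines():
        fields = line.split()
        # Skip blank lines, the dashed separator, the header row
        # and the 'global' pseudo-host.
        if len(fields) < 4 or fields[0] in ("HOSTNAME", "global"):
            continue
        # A '-' in the LOAD column means no load report has arrived for the host.
        if fields[3] == "-":
            silent.append(fields[0])
    return silent


# Sample text: compute001 is copied from the listing in this thread;
# the compute002 values are hypothetical, for illustration only.
sample = """\
HOSTNAME ARCH NCPU LOAD MEMTOT MEMUSE SWAPTO SWAPUS
-------------------------------------------------------------------------------
global - - - - - - -
compute001 lx26-amd64 4 - 31.4G - 0.0 -
compute002 lx26-amd64 4 0.01 31.4G 1.2G 0.0 0.0
"""

print(hosts_without_load(sample))  # ['compute001']
```

On a real cluster you would feed it the captured text of the qhost command instead of the sample string. Any host it returns is a candidate for the ps check above.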
Regards,
Bill.

> On May 30, 2016, at 1:20 PM, Radhouane Aniba <arad...@gmail.com> wrote:
>
> Hello all,
>
> I am trying to submit a simple "hello world" to test a gridengine (I used it
> before with no problems)
>
> The problem is that my job is waiting in the queue forever
>
> The qhost command shows a weird state of the compute nodes
>
> HOSTNAME      ARCH        NCPU  LOAD  MEMTOT  MEMUSE  SWAPTO  SWAPUS
> -------------------------------------------------------------------------------
> global        -           -     -     -       -       -       -
> compute001    lx26-amd64  4     -     31.4G   -       0.0     -
> compute002    lx26-amd64  4     -     31.4G   -       0.0     -
> compute003    lx26-amd64  4     -     31.4G   -       0.0     -
> compute004    lx26-amd64  4     -     31.4G   -       0.0     -
> compute005    lx26-amd64  4     -     31.4G   -       0.0     -
> compute006    lx26-amd64  4     -     31.4G   -       0.0     -
> compute007    lx26-amd64  4     -     31.4G   -       0.0     -
> compute008    lx26-amd64  4     -     31.4G   -       0.0     -
> compute009    lx26-amd64  4     -     31.4G   -       0.0     -
> compute010    lx26-amd64  4     -     31.4G   -       0.0     -
> compute011    lx26-amd64  4     -     31.4G   -       0.0
>
> In normal times, even when the compute nodes are not used, I used to have some
> information in the LOAD and MEMUSE columns
>
> I am not an SGE person but I am familiar with all the commands, any help
> would be much appreciated
>
> the qstat -f command shows all my nodes in au state. I've been reading a lot
> about it and I understood it's an alarm state (overloaded?)
>
> the only heavy activity I had on the head node was a script downloading 19T
> of data, could the head node be the problem and not the compute nodes?
>
> sge_execd is working on all the compute/exec nodes :/
>
> --
> Rad
> _______________________________________________
> users mailing list
> users@gridengine.org
> https://gridengine.org/mailman/listinfo/users

William Bryce | VP Products
Univa Corporation, Toronto
E: bbr...@univa.com | D: 647-9742841 | Toll-Free (800) 370-5320
W: Univa.com <http://univa.com/> | FB: facebook.com/univa.corporation <http://facebook.com/univa.corporation> | T: twitter.com/Grid_Engine <http://twitter.com/Grid_Engine>