Hi,

> On 15.10.2015 at 01:16, Hatem Elshazly <hmelsha...@gmail.com> wrote:
>
> Hi there,
>
> I'm having a problem getting an execution host to work. The master node seems
> it can't sense the execution node, when I submit a job it stalls in the queue.

Is it in state "qw" or "t"?

$ qalter -w v <job_id>

will check whether the job could be started in an empty cluster with the current configuration.

Is the home directory shared in the cluster, so that the user's home directory can be accessed?

> Both daemons are running on master and executing node, I added the execution
> node to the queue and made sure the ports are open and can ssh without
> password from/to both nodes

It's not necessary to have passphraseless SSH in the cluster. Even parallel jobs can run without this setting. In fact, I allow SSH access to nodes only for admin staff.

> , sge_root and sge_cell are open to read and write. The strange thing is when
> I change the ncpu of the execution node it gets reflected when I use qhost
> command on master node.

You mean "num_proc"? This should be treated as a read-only value, and it's normally not necessary to adjust it. The slot count in the queues is independent of this setting.

-- Reuti

> This is the output of qhost command: (Arch and mem is NA although I set them
> in the node's values)
>
> HOSTNAME   ARCH        NCPU  LOAD  MEMTOT  MEMUSE  SWAPTO  SWAPUS
> -------------------------------------------------------------------------------
> global     -           -     -     -       -       -       -
> node001    -           1     -     -       -       -       -
> master     linux-x64   1     0.01  3.7G    157.8M  0.0     0.0
>
>
> Any suggestions on what might be wrong is really appreciated.
>
> Thanks.
> _______________________________________________
> users mailing list
> users@gridengine.org
> https://gridengine.org/mailman/listinfo/users
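
The checks discussed in this thread could be gathered into a small helper. This is only a sketch, not a definitive procedure: the function name `sge_checks` and the `DRY_RUN` variable are made up for illustration, and `qstat -j` and `qconf -sel` are additions not mentioned above (they show the scheduling details of a job and the list of configured execution hosts). By default the helper only prints the commands, so it can be read without a working cluster.

```shell
#!/bin/sh
# Hypothetical helper (not part of Grid Engine): list or run basic checks
# for a job that stays pending. DRY_RUN is an illustrative variable;
# DRY_RUN=1 (the default) only prints the commands.
sge_checks() {
    job_id="$1"
    for cmd in \
        "qstat -j $job_id" \
        "qalter -w v $job_id" \
        "qhost" \
        "qconf -sel"
    do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            # Dry run: show the command that would be executed
            echo "$cmd"
        else
            # Real run: print a header, then execute the command
            echo "== $cmd"
            $cmd
        fi
    done
}
```

Calling `sge_checks <job_id>` prints the four commands; `DRY_RUN=0 sge_checks <job_id>` runs them on a host where the Grid Engine client tools are installed.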