Howdy All,
We are running GE 1.8.2. When a node gets oversubscribed and jobs get
"suspended" -- turned into the T state -- users whose jobs are in the
T state cannot SSH directly into that node. Am I correct that GE is
the cause of this (users not being able to SSH into the nodes)? If so,
we would like users to be able to log in directly to those nodes.
The reason I suspect GE is the cause is that SSH'ing to other nodes
works, and another user account that has no running job on the
oversubscribed node can SSH directly to that node.
When the load on the node returns to a normal level and the jobs that
were in the T state go back to R, the user is again able to SSH
directly to the node.
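
For anyone who wants to confirm the same symptom on their own node, here
is a minimal sketch of how one might list a user's stopped (T state)
processes. It assumes a Linux node with procps `ps`; the function names
`stopped_processes` and `list_stopped` are hypothetical helpers for
illustration, not part of GE, and very long usernames may be displayed
differently by `ps`.

```python
import subprocess


def stopped_processes(ps_output, user):
    """Return (pid, command) pairs owned by `user` whose state starts with 'T'.

    `ps_output` is expected to be the text printed by
    `ps -eo user,pid,stat,comm`, including its header line.
    """
    found = []
    for line in ps_output.splitlines()[1:]:  # skip the header line
        fields = line.split(None, 3)
        if len(fields) < 4:
            continue  # ignore blank or malformed lines
        owner, pid, stat, comm = fields
        # 'T' in the STAT column means stopped by a job-control signal.
        if owner == user and stat.startswith("T"):
            found.append((int(pid), comm.strip()))
    return found


def list_stopped(user):
    """Run ps on the local node and report stopped processes for `user`."""
    out = subprocess.run(
        ["ps", "-eo", "user,pid,stat,comm"],
        capture_output=True, text=True, check=True,
    ).stdout
    return stopped_processes(out, user)
```

Running `list_stopped("someuser")` from another account on the
oversubscribed node should show whether that user's job processes are
actually stopped at the moment SSH fails.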
In our case, it is very helpful for our users to SSH directly into the
nodes to determine what is wrong with their qsub scripts, etc. This is
a follow-up to the following thread by Joseph and Harry:
https://gridengine.org/pipermail/users/2013-February/005585.html
Thanks,
-Adam
--
Adam Brenner
Computer Science, Undergraduate Student
Donald Bren School of Information and Computer Sciences
Research Computing Support
Office of Information Technology
http://www.oit.uci.edu/rcs/
University of California, Irvine
www.ics.uci.edu/~aebrenne/
[email protected]