I found the problem. For some reason, two of the nodes refused to accept jobs. Instead of the scheduler choosing different nodes, the jobs were deferred. Once I had disabled those two nodes, the jobs started.
I still do not know why the scheduler insisted on choosing those nodes, or why these nodes failed to communicate properly with the pbs_server, but at least the cluster is functioning all right now.

Lydia

On Thu, 31 Mar 2011, Lydia Heck wrote:

> Our cluster has ~2,600 cores, there are parallel jobs running to fill ~1,700
> and there are many sequential jobs queued that are now in "front" of other
> parallel jobs. But only one or two of the sequential jobs are running, the
> others being held by the user.
>
> The parallel jobs are not scheduled. The scheduler is maui. Any idea what I
> am missing here?
>
> Lydia
>
> _______________________________________________
> mauiusers mailing list
> [email protected]
> http://www.supercluster.org/mailman/listinfo/mauiusers
