Hello all,
I've got an error with Globus WebMDS, the Sun Grid Engine scheduler and the
GridWise Tech GE-GT Adaptor, specifically the component that should provide
SGE queue information to WebMDS. This is in the container.log:
2007-08-10 09:31:27,183 WARN transforms.GLUESchedulerElementTransform [Timer-6,transformElement:377] Unhandled exception during GLUE ComputeElement transformation
java.lang.Exception: Batch provider generated no useful information.
    at org.globus.mds.usefulrp.rpprovider.transforms.GLUESchedulerElementTransform.transformElement(GLUESchedulerElementTransform.java:121)
    at org.globus.mds.usefulrp.rpprovider.TransformElementListener.executionPerformed(TransformElementListener.java:81)
    at org.globus.mds.usefulrp.rpprovider.ResourcePropertyProviderTask.timerExpired(ResourcePropertyProviderTask.java:155)
    at org.globus.wsrf.impl.timer.TimerListenerWrapper.executeTask(TimerListenerWrapper.java:65)
    at org.globus.wsrf.impl.timer.TimerListenerWrapper.run(TimerListenerWrapper.java:82)
    at java.util.TimerThread.mainLoop(Timer.java:512)
    at java.util.TimerThread.run(Timer.java:462)
Apparently, the component that should provide this information is not working
correctly. I have two GT 4.0.5 installations that are more or less similar,
so I can compare them. Both stream to one WebMDS, and the Service Group
Entry Details are there for both hosts. However, only one host has GLUECE queue
information in the AggregatorData section; the host with this error does not.
On the host without the error, the WebMDS page section looks like this:
Content
  * AggregatorConfig:
      o GetResourcePropertyPollType:
          + PollIntervalMillis: 60000
          + ResourcePropertyName: glue:GLUECE
  * AggregatorData:
      o GLUECE:
          + ComputingElement:
              # Name: all.q
              # UniqueID: all.q
              # Info:
                  * GRAMVersion: 4.0.5
                  * HostName: bibgrid1lx.eu.boehringer.com
                  * LRMSType: SGE
                  * LRMSVersion: 6.0u9
                  * TotalCPUs: 4
[...]
On the host with the error, the AggregatorData section is empty. I think
this information is provided by a Perl script called
globus-scheduler-provider-sge. This script, however, runs on both hosts and
produces (almost) identical output, like this:
<scheduler xmlns="http://mds.globus.org/batchproviders/2004/09">
  <Info LRMSType="SGE" LRMSVersion="6.0u9" GRAMVersion="4.0.5"
        HostName="bibgrid2lx" TotalCPUs="8"/>
  <Queue name="all.q">
    <totalnodes>4</totalnodes>
    <freenodes>4</freenodes>
    <maxCount>4</maxCount>
    <maxtime>unknown</maxtime>
    <maxCPUtime>unknown</maxCPUtime>
    <maxReqNodes>unknown</maxReqNodes>
    <maxRunningJobs>unknown</maxRunningJobs>
    <maxJobsInQueue>unknown</maxJobsInQueue>
    <maxTotalMemory>unknown</maxTotalMemory>
    <maxSingleMemory>unknown</maxSingleMemory>
    <whenActive>unknown</whenActive>
    <dispatchType>batch</dispatchType>
    <status>enabled</status>
    <numJobs>0</numJobs>
  </Queue>
  <Queue name="restricted.q">
    <totalnodes>4</totalnodes>
    <freenodes>4</freenodes>
    <maxCount>4</maxCount>
    <maxtime>unknown</maxtime>
    <maxCPUtime>unknown</maxCPUtime>
    <maxReqNodes>unknown</maxReqNodes>
    <maxRunningJobs>unknown</maxRunningJobs>
    <maxJobsInQueue>unknown</maxJobsInQueue>
    <maxTotalMemory>unknown</maxTotalMemory>
    <maxSingleMemory>unknown</maxSingleMemory>
    <whenActive>unknown</whenActive>
    <dispatchType>batch</dispatchType>
    <status>enabled</status>
    <numJobs>0</numJobs>
  </Queue>
</scheduler>
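Since the output is only (almost) identical on the two hosts, the exact differences may matter. The following is a rough sketch of how the two saved outputs could be compared line by line. The function name, the host labels and the sample lines in the usage part are made up for illustration; they are not part of Globus or the adaptor.

```python
import difflib


def diff_provider_output(text_a, text_b, name_a="host-a", name_b="host-b"):
    """Return unified-diff lines between two saved provider outputs.

    Leading/trailing whitespace and blank lines are ignored so that
    indentation differences do not show up as changes.
    """
    lines_a = [line.strip() for line in text_a.splitlines() if line.strip()]
    lines_b = [line.strip() for line in text_b.splitlines() if line.strip()]
    return list(
        difflib.unified_diff(
            lines_a, lines_b, fromfile=name_a, tofile=name_b, lineterm="", n=0
        )
    )


if __name__ == "__main__":
    # Hypothetical, shortened sample lines; real input would be the saved
    # output of the provider script from each host.
    good = '<Info HostName="host-a" TotalCPUs="4"/>\n<Queue name="all.q">'
    bad = '<Info HostName="host-b" TotalCPUs="8"/>\n<Queue name="all.q">'
    for line in diff_provider_output(good, bad):
        print(line)
```

An empty result would mean the two outputs differ only in whitespace, which would point away from the script and towards the container side.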
My best guess right now is that I am missing some XML- or XSLT-related
library, so that whatever component should parse this output is unable to do
so.
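To at least rule out malformed output from the script, a check along these lines could be used. It is only a sketch: it uses Python's standard-library XML parser, which is not the parser the Globus container uses, so a successful parse only shows that the output is well-formed and would shift suspicion to the container side. The function name and the shortened SAMPLE are my own; the namespace is the one from the output above.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the provider output shown above.
NS = {"b": "http://mds.globus.org/batchproviders/2004/09"}

# Shortened copy of the provider output, for illustration only.
SAMPLE = """<scheduler xmlns="http://mds.globus.org/batchproviders/2004/09">
  <Info LRMSType="SGE" LRMSVersion="6.0u9" GRAMVersion="4.0.5"
        HostName="bibgrid2lx" TotalCPUs="8"/>
  <Queue name="all.q"><status>enabled</status></Queue>
  <Queue name="restricted.q"><status>enabled</status></Queue>
</scheduler>"""


def summarize_provider_output(xml_text):
    """Parse provider output; return (Info attributes, [(queue name, status)]).

    Raises xml.etree.ElementTree.ParseError if the text is not well-formed.
    """
    root = ET.fromstring(xml_text)
    info = root.find("b:Info", NS)
    queues = []
    for queue in root.findall("b:Queue", NS):
        status = queue.findtext("b:status", default="", namespaces=NS)
        queues.append((queue.get("name"), status))
    info_attrs = dict(info.attrib) if info is not None else {}
    return info_attrs, queues


if __name__ == "__main__":
    info_attrs, queues = summarize_provider_output(SAMPLE)
    print(info_attrs)
    print(queues)
```

If the real output parses and both queues show up here, the script output itself is probably fine and the transformation step in the container is the more likely culprit.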
Any suggestions as to what might cause this issue, which components are
involved, or where I might get more information on what's going on here?
Kind regards,
Christian Aßfalg
BA student, Information Technology
AIV-ITB-SYS2, c/o Ms. Zocher