Hi Shahbaz,

This could be because the jobID is not set properly in BESProvider. We read the JobID as shown below.

jobExecutionContext.getJobDetails().getJobID();

Hope this helps.

Lahiru


On Wed, Apr 23, 2014 at 10:12 AM, Shahbaz Memon <[email protected]> wrote:

>
> Thanks Lahiru. It has somehow passed the NPE. Now I see the following
> error,
>
> org.apache.airavata.job.monitor.exception.AiravataMonitorException: Error retrieving the job status
>     at org.apache.airavata.job.monitor.impl.pull.bes.BESPullJobMonitor.startPulling(BESPullJobMonitor.java:165)
>     at org.apache.airavata.job.monitor.impl.pull.bes.BESPullJobMonitor.run(BESPullJobMonitor.java:58)
>     at java.lang.Thread.run(Thread.java:744)
> Caused by: org.apache.xmlbeans.XmlException: error: Unexpected element: CDATA
>     at org.apache.xmlbeans.impl.store.Locale$SaxLoader.load(Locale.java:3511)
>     at org.apache.xmlbeans.impl.store.Locale.parse(Locale.java:713)
>     at org.apache.xmlbeans.impl.store.Locale.parseToXmlObject(Locale.java:697)
>     at org.apache.xmlbeans.impl.store.Locale.parseToXmlObject(Locale.java:684)
>     at org.apache.xmlbeans.impl.schema.SchemaTypeLoaderBase.parse(SchemaTypeLoaderBase.java:208)
>     at org.w3.x2005.x08.addressing.EndpointReferenceType$Factory.parse(Unknown Source)
>     at org.apache.airavata.job.monitor.impl.pull.bes.BESStatusChecker.getJobStatuses(BESStatusChecker.java:114)
>     at org.apache.airavata.job.monitor.impl.pull.bes.BESPullJobMonitor.startPulling(BESPullJobMonitor.java:98)
>     ... 2 more
> Caused by: org.xml.sax.SAXParseException; systemId: file:; lineNumber: 1; columnNumber: 1; Unexpected element: CDATA
>     at org.apache.xmlbeans.impl.piccolo.xml.Piccolo.reportFatalError(Piccolo.java:1038)
>     at org.apache.xmlbeans.impl.piccolo.xml.Piccolo.parse(Piccolo.java:723)
>     at org.apache.xmlbeans.impl.store.Locale$SaxLoader.load(Locale.java:3479)
>     ... 9 more
>
> It is happening due to an unexpected jobid "DO_NOT_SET_AT_CLIENTS", which
> is not a correct structure and should be the endpoint reference structure
> of the submitted activity.
>
> Lahiru, any comments?
>
> Cheers,
>
> Shahbaz
>
>
> On Wed, Apr 23, 2014 at 3:34 PM, Lahiru Gunathilake <[email protected]> wrote:
>
>> Hi Shahbaz,
>>
>> I had a look at the code and I think the actual error is not an NPE.
>> Rather, inside the catch clause we get an NPE because currentMonitorID
>> is null, so if you change the code as follows and run again, we will
>> get some meaningful information. I can see you have followed the same
>> implementation as QstatMonitor; I will change the code in QstatMonitor too.
>>
>>
>>         else if (!this.queue.contains(take)) {
>>             // we put the job back to the queue only if its state is not unknown
>>             if (currentMonitorID == null) {
>>                 logger.error("Monitoring the jobs failed, for user: "
>>                         + take.getUserName()
>>                         + " in Host: " + currentHostDescription.getType().getHostAddress());
>>             } else {
>>                 if (currentMonitorID != null) {
>>                     if (currentMonitorID.getFailedCount() < 2) {
>>                         try {
>>                             currentMonitorID.setFailedCount(currentMonitorID.getFailedCount() + 1);
>>                             this.queue.put(take);
>>                         } catch (InterruptedException e1) {
>>                             e1.printStackTrace();
>>                         }
>>                     } else {
>>                         logger.error(e.getMessage());
>>                         logger.error("Tried to monitor the job 3 times, so dropping the Job with ID: " + currentMonitorID.getJobID());
>>                     }
>>                 }
>>             }
>>         }
>>         throw new AiravataMonitorException("Error retrieving the job status", e);
>>     }
>>
>> Thanks
>> Lahiru
>>
>>
>> On Wed, Apr 23, 2014 at 9:18 AM, Shahbaz Memon <[email protected]> wrote:
>>
>>> Thanks Lahiru.
>>>
>>> airavata.log -> https://gigamove.rz.rwth-aachen.de/d/id/3pxEa6Ksf9Vf39
>>>
>>> Cheers,
>>>
>>> Shahbaz
>>>
>>>
>>> On Wed, Apr 23, 2014 at 3:07 PM, Lahiru Gunathilake
>>> <[email protected]> wrote:
>>>
>>>> Hi Shahbaz,
>>>>
>>>> Are you seeing any logs on the server?
>>>>
>>>> Regards
>>>> Lahiru
>>>>
>>>>
>>>> On Wed, Apr 23, 2014 at 9:00 AM, Shahbaz Memon
>>>> <[email protected]> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> I am facing one issue while testing the BES pull monitor
>>>>> implementation.
>>>>>
>>>>> Before stating my issue, let me give some details on the current
>>>>> state of the implementation.
>>>>>
>>>>> For the BES extension I have forked the GitHub repository under the
>>>>> following URL:
>>>>>
>>>>> https://github.com/msmemon/airavata
>>>>>
>>>>> In the forked sources most of the classes are untouched, apart from a
>>>>> couple of modifications and additions. I have also modified the project
>>>>> POMs with multiple dependency exclusions to avoid class loading horrors.
>>>>>
>>>>> There is a partially tested implementation available with input/output
>>>>> handlers, provider, and monitor classes.
>>>>>
>>>>> For monitoring (which is where I am facing an issue), I have written
>>>>> a pull monitor that is very similar to the QStat one; the only
>>>>> difference is the connection object, which contains a different
>>>>> credential and proxy client instance suitable for BES-supported
>>>>> endpoints.
>>>>>
>>>>> Now my issue is:
>>>>>
>>>>> During the job submission process, the input handler and provider are
>>>>> properly invoked, and after that BESPullJobMonitor [1] throws an NPE,
>>>>> so my workflow does not reach the final phase of output handler
>>>>> invocation and completion.
>>>>>
>>>>> java.lang.NullPointerException
>>>>>     at org.apache.airavata.job.monitor.impl.pull.bes.BESPullJobMonitor.startPulling(BESPullJobMonitor.java:173)
>>>>>     at org.apache.airavata.job.monitor.impl.pull.bes.BESPullJobMonitor.run(BESPullJobMonitor.java:60)
>>>>>     at java.lang.Thread.run(Thread.java:744)
>>>>>
>>>>> Maybe I am not following the new monitoring extensions correctly. Any
>>>>> feedback is more than welcome.
>>>>>
>>>>> [1]
>>>>> https://github.com/msmemon/airavata/blob/master/tools/job-monitor/src/main/java/org/apache/airavata/job/monitor/impl/pull/bes/BESPullJobMonitor.java
>>>>>
>>>>> Thanks in advance,
>>>>>
>>>>> Shahbaz
>>>>>
>>>>
>>>> --
>>>> System Analyst Programmer
>>>> PTI Lab
>>>> Indiana University
>>>
>>
>> --
>> System Analyst Programmer
>> PTI Lab
>> Indiana University
>

--
System Analyst Programmer
PTI Lab
Indiana University
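
For reference, here is a minimal sketch of a guard against the "Unexpected element: CDATA" failure discussed in this thread: it checks whether a job ID string is plausibly an endpoint reference before it is handed to an XML parser. This is only an illustration under stated assumptions. The class name `JobIdGuard` and the method `looksLikeEndpointReference` are hypothetical and are not part of Airavata; the sketch uses the JDK DOM parser rather than XMLBeans, and it assumes the root element's local name is `EndpointReference`, which the thread does not confirm for every BES endpoint.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of Airavata.
public class JobIdGuard {

    // Placeholder value reported in this thread.
    static final String PLACEHOLDER = "DO_NOT_SET_AT_CLIENTS";

    // Returns true only if jobId is well-formed XML whose root element
    // has the local name "EndpointReference" (an assumption of this sketch).
    static boolean looksLikeEndpointReference(String jobId) {
        if (jobId == null) {
            return false;
        }
        String trimmed = jobId.trim();
        if (trimmed.isEmpty() || PLACEHOLDER.equals(trimmed)) {
            return false;
        }
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing to stderr.
            builder.setErrorHandler(new DefaultHandler());
            Document doc = builder.parse(new InputSource(new StringReader(trimmed)));
            return "EndpointReference".equals(doc.getDocumentElement().getLocalName());
        } catch (ParserConfigurationException | SAXException | IOException e) {
            // Not parseable as XML, so it cannot be an endpoint reference.
            return false;
        }
    }

    public static void main(String[] args) {
        String epr = "<a:EndpointReference xmlns:a=\"http://www.w3.org/2005/08/addressing\">"
                + "<a:Address>https://example.org/bes</a:Address>"
                + "</a:EndpointReference>";
        System.out.println(looksLikeEndpointReference(PLACEHOLDER));
        System.out.println(looksLikeEndpointReference(epr));
    }
}
```

A monitor could call such a check before parsing and log the offending job ID instead of failing deep inside the parser. Whether the right fix is a guard like this or setting the job ID correctly in the provider is a design decision for the BES extension.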
