I tried again with my current application. As is, it works fine in AllInOne and in the DUCC service. When I return null from getProgress(), AllInOne throws a "No Work Items" exception, and the DUCC job hangs in the Initializing state.
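
To make the failure path concrete, here is a minimal, self-contained sketch modeled on the initialize() and initTotal() code quoted further down in this thread. ProgressStub and NullProgressSketch are hypothetical stand-ins, not the UIMA or DUCC classes, and the sketch assumes that total is an int that stays at 0 when no progress is available:

```java
// Stand-in for the Progress object used in the quoted CasGenerator code;
// only the one method that the quoted code calls is modeled here.
interface ProgressStub {
    long getTotal();
}

// Hypothetical model of the quoted CasGenerator/AllInOne logic (not DUCC source).
class NullProgressSketch {

    // Mirrors initTotal(): total is only assigned when progress is non-null.
    // Assumption: total is an int, so it remains 0 in the null case.
    static int totalFrom(ProgressStub progress) {
        int total = 0;
        if (progress != null) {
            total = (int) progress.getTotal();
        }
        return total;
    }

    // Mirrors the check in AllInOne.initialize(): the pipeline is only
    // created when total > 0, otherwise NoWorkItems is raised.
    static String initialize(ProgressStub progress) {
        if (totalFrom(progress) > 0) {
            return "pipeline initialized";
        }
        return "NoWorkItems";
    }

    public static void main(String[] args) {
        System.out.println(initialize(null));      // getProgress() returned null
        System.out.println(initialize(() -> 20L)); // progress reports 20 work items
    }
}
```

Under these assumptions a null progress can only end in the NoWorkItems branch in AllInOne, while any positive total lets the pipeline initialize. Whether the same value is behind the hang in the full DUCC job is not established here.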

I can't share my code on the forum, but I recompiled the RawTextCasCR from the sample, and returning null from getProgress() reproduces the same results in both AllInOne and the DUCC job. I'm not sure about the JobDriver, but the CasGenerator code suggests that "total" is left unset if there is no progress. It's possible that I'm performing some operations out of order, but I'm following the guidelines from the sample application. I haven't rebuilt DUCC since November 5th; would that be a problem?

From: Lou DeGenaro <[email protected]>
To: [email protected]
Date: 11/20/2013 08:55 AM
Subject: Re: DUCC not leaving Initializing State

The DUCC Job Driver (JD) employs the user-supplied CR and handles the case where getProgress() returns null, or so is my claim anyway. The CR's getProgress() is not required for the JD to operate. I'm not clear why this did not work for you?

Adding to what I said above, I created my own CR that a) returned null for getProgress() and b) produced 20 CASes in response to 20 getNext() invocations by the JD. My Job ran properly and processed all 20 CASes.

Do you have a simple CR that you could share that demonstrates the problem you are seeing?

Lou.

On Mon, Nov 18, 2013 at 4:42 PM, Neal R Lewis <[email protected]> wrote:

> getNext was returning a CAS, which was why it worked in my SimplePipeline
> in uimaFIT.
>
> After digging in with debug and AllInOne, I found the exception being
> thrown was "No Workitem in CAS". It will throw an exception if the
> CasGenerator does not generate any CASes, which it determines through the
> CasGenerator's total variable. This happens in the initialization in
> AllInOne.java, which is where the exception is thrown:
>
>     //AllInOne.java
>     private void initialize() throws Exception {
>         String mid = "initialize";
>         mh.frameworkTrace(cid, mid, "enter");
>         // Generator
>         casGenerator = new CasGenerator(jobRequestProperties, mh);
>         casGenerator.initialize();
>         int total = casGenerator.getTotal();
>         if(total > 0) {
>             // Pipeline
>             casPipeline = new CasPipeline(jobRequestProperties, mh);
>             casPipeline.initialize();
>         }
>         else {
>             throw new NoWorkItems();
>         }
>         mh.frameworkTrace(cid, mid, "exit");
>     }
>
> casGenerator.getTotal() gets the private class variable total, which is
> initialized from the progress:
>
>     //CasGenerator.java
>     private void initTotal() {
>         String mid = "initTotal";
>         mh.frameworkTrace(cid, mid, "enter");
>         Progress progress = getProgress();
>         if(progress != null) {
>             total = (int)progress.getTotal();
>         }
>         mh.frameworkTrace(cid, mid, "exit");
>     }
>
> So my getProgress() was returning null, and the CasGenerator just skips
> the assignment and leaves total unset. I don't know how this goes through
> in the DUCC Flow Controller, but it looks like outside of AllInOne it
> causes a hang.
>
>
> From: Lou DeGenaro <[email protected]>
> To: [email protected]
> Date: 11/18/2013 01:17 PM
> Subject: Re: DUCC not leaving Initializing State
>
> Neal,
>
> Did getNext() get called and did it return a filled-in CAS? I tried a
> simple test case with my CR returning null for getProgress() and the Job
> seemed to run just fine. However, one may be penalized by the scheduler if
> it has to guess how much work the Job comprises.
>
> Lou.
>
>
> On Fri, Nov 15, 2013 at 5:15 PM, Neal R Lewis <[email protected]> wrote:
>
> > I debugged using the all_in_one setup and found that I was returning null
> > from getProgress(), which was not creating any workitems. So I set up the
> > function and CASes started moving.
> >
> > Thanks for all of your help,
> >
> > Neal.
> >
> >
> > From: Neal Lewis <[email protected]>
> > To: [email protected]
> > Date: 11/15/2013 05:48 AM
> > Subject: Re: DUCC not leaving Initializing State
> >
> > Thanks Eddie,
> >
> > I missed the all_in_one in the documentation, so I'll try it out and get
> > back to you guys.
> >
> > -Neal
> >
> > On 11/14/2013 06:02 PM, Eddie Epstein wrote:
> > > Good job. FYI, it is strongly recommended to use all_in_one as described
> > > in the documentation on developing a new sample application. Your
> > > aggregate most likely does not include the built-in DUCC flow controller
> > > which implements SendToLast.
> > >
> > > After all child CASes created for a work item have been generated, if
> > > the CC needs a signal to clean up, then use SendToLast. The DuccCasCC
> > > does need that signal to close the output zip package.
> > >
> > > Eddie
> > >
> > >
> > > On Thu, Nov 14, 2013 at 6:38 PM, Neal R Lewis <[email protected]> wrote:
> > >
> > >> So, I found one GLARING problem that I missed completely: when testing
> > >> my CR and CM setup, I looped through the CR's getNext(), adding to the
> > >> JCas that I had just created. But the CR's getNext() didn't loop
> > >> internally; it only added IDs one at a time. So, instead of the CR
> > >> creating a CAS with N Workitems and halting, it was creating N CASes
> > >> with 1 workitem each. I've fixed that.
> > >>
> > >> So I've made an Aggregate AE of my CM and a simple AE, and ran them with
> > >> my CR in a uimaFIT SimplePipeline. The outputs are as I expect for now.
> > >>
> > >> But I do have a question. The Workitems have an option of SendToLast,
> > >> which in the RawText Example is set to "true". The DuccCasCC in the
> > >> sample pulls WorkItem FSes for output stuff. The CR adds the Workitems,
> > >> but the CM doesn't reattach them to the new CASes before returning
> > >> next().
> > >>
> > >> Should the Workitems stay in the CAS throughout processing (i.e., should
> > >> the CMs add them to the new CASes)? Or is SendToLast enough?
> > >>
> > >> Thanks!
> > >>
> > >> Neal
> > >>
> > >>
> > >> From: Neal R Lewis/Almaden/IBM@IBMUS
> > >> To: [email protected]
> > >> Date: 11/14/2013 01:20 PM
> > >> Subject: Re: DUCC not leaving Initializing State
> > >>
> > >> The minimal wrapper creates a CAS using uimaFIT, loops over the CR,
> > >> then runs over the iterator from cm.processAndOutputNewCASes and
> > >> outputs each CAS in the loop, like below:
> > >>
> > >>     CollectionReader cr =
> > >>         CollectionReaderFactory.createReader("path.to.my.CasReader");
> > >>     AnalysisEngine cm =
> > >>         AnalysisEngineFactory.createEngine("path.to.my.CasMultiplier");
> > >>
> > >>     JCas jcas = JCasFactory.createJCas("desc.DuccJobFlowControlTS");
> > >>
> > >>     while (cr.hasNext()){
> > >>         cr.getNext(jcas.getCas());
> > >>     }
> > >>
> > >>     CasIterator casIterator =
> > >>         cm.processAndOutputNewCASes(jcas.getCas());
> > >>     while (casIterator.hasNext()) {
> > >>         CAS outCas = casIterator.next();
> > >>         System.out.println(outCas.getDocumentText());
> > >>     }
> > >>
> > >> I'll need to debug a full standalone PEAR or Pipeline. But at this point
> > >> I was able to at least get the CASes out.
> > >>
> > >> Below is the output of the JP file before the stall and after loading
> > >> JVM stuff. It stays like this until shutdown. As far as I can tell, the
> > >> CM loads and is waiting.
> > >>
> > >> + JVM LIB_PATH:/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
> > >> +------------------------------------------------------------------
> > >> 13 Nov 2013 15:53:44,761 INFO DUCC.ThreadPoolTaskExecutor - Initializing ExecutorService 'pooling_ducc.jd.queue.91_1'
> > >> 03:53:44.794 - 1: org.apache.uima.adapter.jms.activemq.JmsInputChannel.setEndpointName:
> > >> INFO: top_level_input_queue_service_1 Service Starting - Listening for Messages
> > >> 13 Nov 2013 15:53:44,795 INFO DUCC.ThreadPoolTaskExecutor - Initializing ExecutorService 'pooling_ducc.jd.queue.91_1'
> > >> 03:53:44.797 - 30: org.apache.uima.aae.UimaAsThreadFactory$1.UimaAsThreadFactory.run():
> > >> INFO: Controller: ducc.jd.queue.91 Initializing AE instance on Thread Id: 30
> > >> 03:53:44.799 - 1: org.apache.uima.adapter.jms.activemq.SpringContainerDeployer.waitForServiceNotification:
> > >> INFO: Uima EE Client Blocking - Awaiting Top Level Controller Initialization Notification
> > >> 03:53:44.887 - 30: com.ibm.almaden.disco.quartermaster.SimpleQuarterMasterCasMultiplier.logParameters(70):
> > >> INFO: Initializing Cas Multiplier Parameters
> > >> 03:53:44.887 - 30: com.ibm.almaden.disco.quartermaster.SimpleQuarterMasterCasMultiplier.logParameters(71):
> > >> INFO: init HOST ADDRESS: https://thepit.element.almaden.ibm.com/cgi-bin/qm2
> > >> 03:53:44.887 - 30: com.ibm.almaden.disco.quartermaster.SimpleQuarterMasterCasMultiplier.logParameters(72):
> > >> INFO: init LENIENT: false
> > >> 03:53:44.887 - 30: com.ibm.almaden.disco.quartermaster.SimpleQuarterMasterCasMultiplier.logParameters(73):
> > >> INFO: init PARAM_ENCODING: UTF-8
> > >> 03:53:44.887 - 30: com.ibm.almaden.disco.quartermaster.SimpleQuarterMasterCasMultiplier.logParameters(74):
> > >> INFO: init PARAM_LANGUAGE: en
> > >> 03:53:44.933 - 30: org.apache.uima.ducc.sampleapps.DuccCasCC.initialize(74):
> > >> INFO: Outputting CASes in XmiCas format, zip compressed at level=7
> > >> Service:ducc.jd.queue.91 Initialized. Ready To Process Messages From Queue:ducc.jd.queue.91
> > >> 03:53:45.97 - 30: org.apache.uima.aae.controller.PrimitiveAnalysisEngineController_impl.postInitialize:
> > >> INFO: ********* Initialized the Controller. ducc.jd.queue.91 Ready To Process. ********
> > >> 03:53:45.97 - 1: org.apache.uima.adapter.jms.activemq.SpringContainerDeployer.doStartListeners:
> > >> INFO: Controller: ducc.jd.queue.91 Starting Listener on Endpoint: queue://ducc.jd.queue.91 Selector: Command=2000 OR Command=2002 Broker: tcp://greenfairy:61617?wireFormat.maxInactivityDuration=0&closeAsync=false
> > >> 03:53:45.290 - 1: org.apache.uima.adapter.jms.activemq.SpringContainerDeployer.doStartListeners:
> > >> INFO: Controller: ducc.jd.queue.91 Starting Listener on Endpoint: queue://ducc.jd.queue.91 Selector: Command=2001 Broker: tcp://greenfairy:61617?wireFormat.maxInactivityDuration=0&closeAsync=false
> > >> 13 Nov 2013 15:53:45,299 INFO DUCC.DuccService - boot N/A ... Component started: managedService
> > >> 13 Nov 2013 15:53:45,300 INFO DUCC.DuccService - boot N/A Starting Camel. Use ctrl + c to terminate the JVM.
> > >>
> > >>
> > >> From: Eddie Epstein <[email protected]>
> > >> To: [email protected]
> > >> Date: 11/14/2013 10:37 AM
> > >> Subject: Re: DUCC not leaving Initializing State
> > >>
> > >> On Wed, Nov 13, 2013 at 7:10 PM, Neal R Lewis <[email protected]> wrote:
> > >>
> > >>> I have modeled a CasReader and CasMultiplier based on the
> > >>> RawTextExample in the duccbook, and have successfully run the CR and
> > >>> CM in Eclipse with a minimal wrapper.
> > >>
> > >> Did you debug the job using one of the --all_in_one varieties, or some
> > >> other minimal wrapper?
> > >>
> > >>> I am using the same CC from the Sample (DuccCasCC).
> > >>> I am experiencing a problem where my DUCC job goes through
> > >>> WaitingForDriver, WaitingForResources, then Initializing, then doesn't
> > >>> change state. After several minutes I just cancel the job.
> > >>
> > >> Did you look in the JP logfile?
> > >>
> > >> Eddie
