Re: Batch Checkpoints with DUCC?
Hi,

Yes, exactly. DUCC jobs that specify CM, AE, CC use a custom flow controller that routes the WorkItem CAS as desired. By default the route is (CM, CC), but this can be modified by the contents of the WorkItem feature structure ...
http://uima.apache.org/d/uima-ducc-2.2.2/duccbook.html#x1-1930009.5.3

Eddie

On Wed, May 16, 2018 at 2:56 AM, Erik Fäßler wrote:
> Hey Eddie, thanks again! :-)
>
> So the idea is that the work item is the CAS that the CR sent to the CM,
> right? The work item CAS consists of a list of artifacts which are output
> by the CM, processed by the pipeline and finally cached by the CC.
> Then, I can somehow (have to read this up) have the work item CAS sent to
> the CC as the effective “batch processing complete” signal.
>
> Is that correct?
Re: Batch Checkpoints with DUCC?
Hey Eddie, thanks again! :-)

So the idea is that the work item is the CAS that the CR sent to the CM, right? The work item CAS consists of a list of artifacts which are output by the CM, processed by the pipeline and finally cached by the CC.
Then, I can somehow (have to read this up) have the work item CAS sent to the CC as the effective “batch processing complete” signal.

Is that correct?

> On 15. May 2018, at 20:50, Eddie Epstein wrote:
>
> Hi Erik,
>
> There is a brief discussion of this in the duccbook in section 9.3 ...
> https://uima.apache.org/d/uima-ducc-2.2.2/duccbook.html#x1-1880009.3
>
> In particular, the 3rd option, "Flushing cached data". This assumes that
> the batch of work to be flushed is represented by each workitem CAS.
>
> Regards,
> Eddie
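The pattern described in this message can be sketched in plain Java, as a rough illustration only. Nothing below is UIMA or DUCC API: `Cas`, `Artifact`, `WorkItem` and `BatchingConsumer` are hypothetical stand-ins, and the sketch assumes that the work item CAS reaches the consumer only after all artifacts belonging to it. The consumer caches artifacts as they arrive and treats the arrival of the work item as its "batch complete" signal.

```java
import java.util.ArrayList;
import java.util.List;

public class WorkItemFlushSketch {

    /** Hypothetical stand-in for anything delivered to the consumer. */
    interface Cas {}

    /** Hypothetical artifact CAS, as emitted by the CM for one document. */
    static final class Artifact implements Cas {
        final String payload;
        Artifact(String payload) { this.payload = payload; }
    }

    /** Hypothetical work item CAS; assumed to arrive after its artifacts. */
    static final class WorkItem implements Cas {
        final String id;
        WorkItem(String id) { this.id = id; }
    }

    /** Caches artifact payloads and flushes them when a work item arrives. */
    static final class BatchingConsumer {
        private final List<String> buffer = new ArrayList<>();
        private final List<List<String>> flushed = new ArrayList<>();

        void process(Cas cas) {
            if (cas instanceof Artifact) {
                // Ordinary artifact: cache it, do not touch the database yet.
                buffer.add(((Artifact) cas).payload);
            } else if (cas instanceof WorkItem) {
                // Work item: all artifacts of this batch have been seen.
                flush();
            }
        }

        private void flush() {
            if (buffer.isEmpty()) {
                return; // nothing cached, so no write is needed
            }
            // A real consumer would send one batched database write here.
            flushed.add(new ArrayList<>(buffer));
            buffer.clear();
        }

        List<List<String>> flushedBatches() { return flushed; }

        int pending() { return buffer.size(); }
    }

    public static void main(String[] args) {
        BatchingConsumer consumer = new BatchingConsumer();
        consumer.process(new Artifact("doc-1"));
        consumer.process(new Artifact("doc-2"));
        consumer.process(new WorkItem("wi-1")); // flushes doc-1, doc-2
        consumer.process(new Artifact("doc-3"));
        consumer.process(new WorkItem("wi-2")); // flushes doc-3
        System.out.println(consumer.flushedBatches());
    }
}
```

In this sketch the batch size is determined entirely by how many artifacts belong to one work item, so no separate batch counter is needed in the consumer.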
Re: Batch Checkpoints with DUCC?
Hi Erik,

There is a brief discussion of this in the duccbook in section 9.3 ...
https://uima.apache.org/d/uima-ducc-2.2.2/duccbook.html#x1-1880009.3

In particular, the 3rd option, "Flushing cached data". This assumes that the batch of work to be flushed is represented by each workitem CAS.

Regards,
Eddie

On Tue, May 15, 2018 at 9:21 AM, Erik Fäßler wrote:
> And another question concerning DUCC :-)
>
> With my CPEs I use a lot the batchProcessingComplete() and
> collectionProcessingComplete() methods. I need them because I do a lot of
> database interactions where I need to send data in batches due to the
> overhead of network communication.
> How is that handled in DUCC? The documentation does not talk about it, at
> least it not find anything.
>
> Hints are appreciated.
>
> Thanks!
>
> Erik
Batch Checkpoints with DUCC?
And another question concerning DUCC :-)

With my CPEs I make heavy use of the batchProcessingComplete() and collectionProcessingComplete() methods. I need them because I do a lot of database interactions where I need to send data in batches due to the overhead of network communication.
How is that handled in DUCC? The documentation does not talk about it, or at least I did not find anything.

Hints are appreciated.

Thanks!

Erik