On 27.02.2017, at 16:39, James Baker <[email protected]> wrote:
>
> Thanks Richard, switching over to SimplePipeline did the trick. I'll update
> the GitHub repository with a working solution for reference.
>
> Is there any information available on the advantages/disadvantages of
> SimplePipeline over using the CPE? The application I'm using already uses
> CPE, so I'd like to understand what the impact of moving away from that
> might be.
SimplePipeline is just a basic single-threaded runner. Internally, it creates an aggregate from all the engines it receives, creates a CAS, and reuses that CAS while looping over the collection reader and all the engines. No extra threads are created and nothing is parallelized.

CpePipeline and CpeBuilder make use of the rather deprecated UIMA CPE [1]. They are very simple wrappers around the CPE that mimic the API of SimplePipeline. CpePipeline configures the CPE to scale up the analysis engines by creating one parallel instance per CPU core (reserving one core for your other work). CpeBuilder gives you a bit more control over the settings: for example, you can change the number of threads to use, and you can post-process the CpeDescription if you want.

Cheers,

-- Richard

[1] https://uima.apache.org/d/uimaj-2.9.0/references.html#ugr.ref.xml.cpe_descriptor
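To make the difference between the two execution models a bit more concrete, here is a rough plain-Java sketch. It deliberately does not use the uimaFIT or UIMA APIs: the class PipelineSketch, its methods, and the use of strings as "documents" and UnaryOperator as "engines" are all hypothetical stand-ins. The Math.max(1, ...) guard on the thread count is also an assumption of the sketch, not a statement about what CpePipeline does on a single-core machine.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.UnaryOperator;

// Hypothetical illustration only: not uimaFIT code.
final class PipelineSketch {

    // SimplePipeline-like: one thread, every document passes through all
    // engines in order before the next document is read.
    static List<String> runSingleThreaded(List<String> docs,
                                          List<UnaryOperator<String>> engines) {
        List<String> results = new ArrayList<>();
        for (String doc : docs) {
            results.add(applyAll(doc, engines));
        }
        return results;
    }

    // CPE-like: documents are handed to a fixed pool of worker threads.
    // Unlike the CPE, which creates one engine instance per thread, this
    // sketch shares the engines, so they must be stateless.
    static List<String> runParallel(List<String> docs,
                                    List<UnaryOperator<String>> engines,
                                    int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String doc : docs) {
                futures.add(pool.submit(() -> applyAll(doc, engines)));
            }
            // Collect in submission order so the output order matches the input.
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    // One worker per CPU core, leaving one core free; never fewer than one
    // (the lower bound is an assumption of this sketch).
    static int defaultThreadCount() {
        return Math.max(1, Runtime.getRuntime().availableProcessors() - 1);
    }

    // Runs a single document through every engine in sequence.
    private static String applyAll(String doc, List<UnaryOperator<String>> engines) {
        String current = doc;
        for (UnaryOperator<String> engine : engines) {
            current = engine.apply(current);
        }
        return current;
    }
}
```

Both methods return the same results for stateless engines. The parallel variant only pays off when the per-document work is expensive enough to outweigh the threading overhead, and it requires that the engines tolerate running concurrently.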
