Thanks Richard, I don’t think we’re using any multi-threading bits at the moment, so it sounds like the SimplePipeline might be ok for the time being. If we do decide to move to a multi-threaded system in the future, presumably implementing our own is the best way forwards, or is there an alternative to CPE/SimplePipeline that I’ve missed?
James

> On 27 Feb 2017, at 16:35, Richard Eckart de Castilho <[email protected]> wrote:
>
> On 27.02.2017, at 16:39, James Baker <[email protected]> wrote:
>>
>> Thanks Richard, switching over to SimplePipeline did the trick. I'll update
>> the GitHub repository with a working solution for reference.
>>
>> Is there any information available on the advantages/disadvantages of
>> SimplePipeline over using the CPE? The application I'm using already uses
>> CPE, so I'd like to understand what the impact of moving away from that
>> might be.
>
> SimplePipeline is just a basic single-threaded thing. What it does internally
> is basically creating an aggregate from all the engines that it receives,
> creating a CAS, and using that CAS to loop over the collection reader and
> all the engines. All no extra threads created, no parallelization.
>
> CpePipeline and CpeBuilder make use of the rather deprecated UIMA CPE [1].
> They are very simple wrappers around CPE mimicking the API of SimplePipeline.
> CpePipeline configures CPE to scale up the analysis engines creating one
> parallel instance per CPU core (reserving one core for your other work).
> With CpeBuilder, you have a bit more control over the settings, e.g. you
> can change the number of threads to use and you can post-process the
> CpeDescription if you want that.
>
> Cheers,
>
> -- Richard
>
> [1]
> https://uima.apache.org/d/uimaj-2.9.0/references.html#ugr.ref.xml.cpe_descriptor
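
For reference, the difference described above between a single-threaded loop and a "one worker per core, minus one" setup can be sketched in plain Java. This is only an illustration under assumptions: it does not use any UIMA or uimaFIT classes, `PipelineSketch` and its method names are hypothetical, a document is modelled as a `String` and an engine as a `UnaryOperator<String>`, and the engines are shared between threads and therefore assumed to be stateless (CPE, per the description above, creates parallel engine instances instead).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.UnaryOperator;

// Hypothetical stand-in for a pipeline runner; NOT the uimaFIT or UIMA CPE API.
class PipelineSketch {

    // Run every engine, in order, over a single document.
    static String process(String doc, List<UnaryOperator<String>> engines) {
        String result = doc;
        for (UnaryOperator<String> engine : engines) {
            result = engine.apply(result);
        }
        return result;
    }

    // Single-threaded variant: everything happens on the calling thread.
    static List<String> runSingleThreaded(List<String> docs,
                                          List<UnaryOperator<String>> engines) {
        List<String> out = new ArrayList<>();
        for (String doc : docs) {
            out.add(process(doc, engines));
        }
        return out;
    }

    // One worker per CPU core, reserving one core, but never fewer than one.
    static int defaultThreadCount() {
        return Math.max(1, Runtime.getRuntime().availableProcessors() - 1);
    }

    // Parallel variant: documents are processed on a fixed-size thread pool.
    // Assumption: the engines are stateless, since all threads share them.
    static List<String> runParallel(List<String> docs,
                                    List<UnaryOperator<String>> engines,
                                    int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String doc : docs) {
                Callable<String> task = () -> process(doc, engines);
                futures.add(pool.submit(task));
            }
            // Collect in submission order so the output order matches the input.
            List<String> out = new ArrayList<>();
            for (Future<String> future : futures) {
                out.add(future.get());
            }
            return out;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        } catch (ExecutionException e) {
            throw new RuntimeException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<UnaryOperator<String>> engines = new ArrayList<>();
        engines.add(s -> s.trim());
        engines.add(s -> s.toUpperCase());

        List<String> docs = new ArrayList<>();
        docs.add(" first document ");
        docs.add("second document");

        System.out.println(runSingleThreaded(docs, engines));
        System.out.println(runParallel(docs, engines, defaultThreadCount()));
    }
}
```

Both variants should return the same results in the same order; the parallel one only changes where the work runs. If the engines keep per-document state, each worker would need its own instances rather than sharing one list as this sketch does.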
