Hi, just saw this... I'll take a look. This kind of thing is "on the list" for UIMA v3; see https://cwiki.apache.org/confluence/display/UIMA/Ideas+for+UIMAJ+v3
-Marshall

On 6/14/2015 8:20 PM, Petr Baudis wrote:
> Hi!
>
> I have created an extension of UIMA that replaces its default ASB
> with a multi-threaded one, so that if you have a CAS multiplier in your
> pipeline, multiple generated CASes may be processed in parallel in
> different threads. It has a few warts, but should be generally much
> simpler to use than UIMA-AS if you do not need fancy things like cluster
> deployment.
>
> It even has some documentation now. Find it at:
>
> https://github.com/brmson/yodaqa/tree/master/src/main/java/cz/brmlab/yodaqa/flow/asb
>
> (Right now, it just lives as part of my YodaQA software, simply copy
> that directory to your project. I can spin-off the package properly
> if there'll be enough interest in it. It shares the YodaQA licence
> statement, i.e. ASL2.)
>
>
> On Wed, May 20, 2015 at 03:27:20AM +0200, Petr Baudis wrote:
>> I'm looking into ways to run a part of my pipeline multi-threaded:
> ..snip..
>> (i) I'm using UIMAfit heavily, and multiple CAS multipliers and
>> mergers (even within the parallel branches). So I can't use CPE.
>>
>> (ii) I need multi-threading, not separate processes. (I have just
>> a meager 24G RAM (sigh) and one Java process with all the linguistic
>> models and stuff loaded takes 3GB RAM. So I really need to load these
>> resources to memory only once.)
> ..snip..
>> However, (before actually trying) it still seems to me to be much
>> easier to rewrite a piece of the stock ASB than use UIMA-AS with complex
>> pipeline construed by UIMAfit... So I think I will try that first (and
>> report back).
> Whew, this was not so easy! It took a good few days (and a few
> start-overs) to do and debug, and I learnt more about UIMAj internals
> than I ever cared to. ;-) But I think I'm still happier with the result
> than if I used UIMA-AS and it doesn't seem to deadlock or crash anymore
> even on (IMHO) a fairly massive pipeline.
>
> (What I'm bothered by the most at this point is the fixed-size CAS
> pool, though there are a few more issues; I tried to document them all
> as well.)
>
> P.S.: Would there be any interest in merging this to UIMA proper,
> or at least cleaning up some UIMA API bits to simplify and future-proof
> the external package? I admit up-front that I probably won't have time
> to do all that work myself, but I'd be happy to cooperate with someone.
>
