Hi, just saw this ...

I'll take a look.  This kind of thing is "on the list" for UIMA v3; see
https://cwiki.apache.org/confluence/display/UIMA/Ideas+for+UIMAJ+v3

-Marshall

On 6/14/2015 8:20 PM, Petr Baudis wrote:
>   Hi!
>
>   I have created an extension of UIMA that replaces its default ASB
> with a multi-threaded one, so that if you have a CAS multiplier in your
> pipeline, multiple generated CASes may be processed in parallel in
> different threads.  It has a few warts, but should be generally much
> simpler to use than UIMA-AS if you do not need fancy things like cluster
> deployment.
>
>   It even has some documentation now.  Find it at:
>
>       https://github.com/brmson/yodaqa/tree/master/src/main/java/cz/brmlab/yodaqa/flow/asb
>
>   (Right now, it just lives as part of my YodaQA software; simply copy
> that directory to your project.  I can spin off the package properly
> if there is enough interest in it.  It shares the YodaQA licence
> statement, i.e. ASL2.)
>
>
> On Wed, May 20, 2015 at 03:27:20AM +0200, Petr Baudis wrote:
>>   I'm looking into ways to run a part of my pipeline multi-threaded:
> ..snip..
>>   (i) I'm using UIMAfit heavily, and multiple CAS multipliers and
>> mergers (even within the parallel branches).  So I can't use CPE.
>>
>>   (ii) I need multi-threading, not separate processes.  (I have just
>> a meager 24 GB of RAM (sigh) and one Java process with all the linguistic
>> models and stuff loaded takes 3 GB of RAM.  So I really need to load these
>> resources into memory only once.)
> ..snip..
>>   However, (before actually trying) it still seems to me to be much
>> easier to rewrite a piece of the stock ASB than use UIMA-AS with complex
>> pipeline constructed by UIMAfit...  So I think I will try that first (and
>> report back).
>
>   Whew, this was not so easy!  It took a good few days (and a few
> start-overs) to do and debug, and I learnt more about UIMAj internals
> than I ever cared to. ;-)  But I think I'm still happier with the result
> than if I had used UIMA-AS, and it doesn't seem to deadlock or crash
> anymore, even on (IMHO) a fairly massive pipeline.
>
>   (What I'm bothered by the most at this point is the fixed-size CAS
> pool, though there are a few more issues; I tried to document them all
> as well.)
>
>   P.S.: Would there be any interest in merging this to UIMA proper,
> or at least cleaning up some UIMA API bits to simplify and future-proof
> the external package?  I admit up-front that I probably won't have time
> to do all that work myself, but I'd be happy to cooperate with someone.
>
