On 9/16/11 12:12 PM, Alexander Klenner wrote:
Hi Jörn,
we are using a combination of UNICORE and UIMA. It would take too long to explain the
complete architecture, but in a nutshell: UNICORE handles the workflow (passing XCAS
objects to the next job and so on), and all our UNICORE "grid beans" are independent UIMA
applications (XCAS in - AE - XCAS out). So far it works really well, but at some point we
have to merge our CAS objects. The nice thing is that we can use the UNICORE Rich Client
to drag and drop any UIMA AE into the workflow, so it now acts as a GUI for UIMA AEs. We
have a small test pipeline with 60 diverse documents (8 million characters) and a mix of
simple and complex AEs. It needs 180 minutes if all AEs are single-threaded. This drops
to 60 minutes with multithreaded AEs (4 instances each), and to 20 minutes if we let the
AEs annotate at the same time using 4 threads each. So it really does make a difference
whether the annotation is done one after another or at the same time.
I see, that is because you can utilize more cores this way with a small
set of text items. I had assumed you have thousands or millions
of text items.
Jörn
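
The pattern discussed in this thread (independent AEs annotating the same document at the same time, followed by a merge of their results) can be sketched roughly as follows. This is only an illustration under stated assumptions: the two annotator functions and the tuple-based annotation format are hypothetical stand-ins, not UIMA or UNICORE APIs, and a real setup would exchange XCAS/CAS objects instead. In CPython, threads will not speed up CPU-bound annotators, so the sketch shows the fan-out and merge structure rather than the timings. The `speedup` helper just restates the ratios implied by the reported 180, 60 and 20 minute runs.

```python
from concurrent.futures import ThreadPoolExecutor


# Hypothetical stand-ins for independent analysis engines (AEs).
# Each returns a list of (begin, end, type) tuples for the given text.
def token_annotator(text):
    spans, pos = [], 0
    for token in text.split():
        begin = text.index(token, pos)  # locate the token at or after pos
        pos = begin + len(token)
        spans.append((begin, pos, "Token"))
    return spans


def number_annotator(text):
    # Marks every purely numeric token as a "Number".
    return [(b, e, "Number") for (b, e, _) in token_annotator(text)
            if text[b:e].isdigit()]


def annotate_sequentially(text, annotators):
    # One AE after another; results accumulate in a single list.
    merged = []
    for annotate in annotators:
        merged.extend(annotate(text))
    return sorted(merged)


def annotate_concurrently(text, annotators, workers=4):
    # All AEs work on the same text at once; their outputs are merged afterwards.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(lambda annotate: annotate(text), annotators))
    return sorted(span for spans in results for span in spans)


def speedup(baseline_minutes, improved_minutes):
    # Ratio of baseline runtime to improved runtime.
    return baseline_minutes / improved_minutes
```

Both functions yield the same merged annotations as long as the AEs do not depend on each other's output, which is the condition under which running them side by side is safe. With the numbers from the thread, `speedup(180, 60)` is 3.0 and `speedup(180, 20)` is 9.0.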