Just to clarify, what do you mean by "annotation"? Is there a specific
Analysis Engine that you are using? What is a "record"? Is it a
document? It would actually be surprising for many applications if
annotation were not the bottleneck, given that some annotation processes
are quite expensive, but that doesn't seem to be what you mean here. I
can't tell from your question whether the burden is the process that
determines the annotations or the actual adding of the annotations to
the CAS.
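
One way to tell the two apart would be to time them separately. Below is
a minimal sketch in plain Java, not anything that ships with UIMA: the
PhaseTimer class and its method names are hypothetical. The assumption
is that inside your annotator you can wrap the analysis logic in
timeCompute() and the loop that calls addToIndexes() on each annotation
in timeIndex().

```java
import java.util.function.Supplier;

/**
 * Hypothetical helper (not a UIMA class): accumulates wall-clock time
 * for two phases so they can be compared after a run.
 */
public class PhaseTimer {
    private long computeNanos = 0;
    private long indexNanos = 0;

    /** Times the step that works out what the annotations should be. */
    public <T> T timeCompute(Supplier<T> step) {
        long start = System.nanoTime();
        try {
            return step.get();
        } finally {
            computeNanos += System.nanoTime() - start;
        }
    }

    /** Times the step that adds the annotations to the CAS. */
    public void timeIndex(Runnable step) {
        long start = System.nanoTime();
        try {
            step.run();
        } finally {
            indexNanos += System.nanoTime() - start;
        }
    }

    public long computeNanos() {
        return computeNanos;
    }

    public long indexNanos() {
        return indexNanos;
    }

    /** Totals for both phases, converted to milliseconds. */
    public String report() {
        return String.format("compute=%.3f ms, index=%.3f ms",
                computeNanos / 1e6, indexNanos / 1e6);
    }
}
```

If you keep one PhaseTimer for the whole run and print report() at the
end, the two totals should show which side dominates.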

-----Original Message-----
From: rohan rai [mailto:[EMAIL PROTECTED] 
Sent: Thursday, June 26, 2008 7:36 AM
To: [email protected]
Subject: Annotation (Indexing) a bottleneck in UIMA in terms of speed

When I profile a UIMA application, I see that annotation takes a lot of
time. Annotating 1 record takes around 0.06 seconds. You may say that is
good, but now scale up. It does not scale up linearly, but here is a
rough estimate from experiments: 6000 records take 6 min to annotate, and
800000 records take around 10 hrs to annotate, which is bad.
I am treating each record individually as a CAS. Even if I treat all
the records as a single CAS, it takes around 6-7 hrs, which is still
not good in terms of speed.

Is there a way out?
Can I improve performance by any means??

Regards
Rohan
