Hi Rohan,

6000 records in 6 min => 1000/min
800000 records in 10 h => 1333/min

Not that bad! I guess one of these numbers is wrong.

Are you distributing the load across several machines? Vinci is not that good at load balancing across a lot of machines (>20-50, depending on your annotator speed).

Pascal

> -----Original Message-----
> From: rohan rai [mailto:[EMAIL PROTECTED]
> Sent: Thursday, June 26, 2008 7:36 AM
> To: [email protected]
> Subject: Annotation (Indexing) a bottleneck in UIMA in terms of speed
>
> When I profile a UIMA application,
> what I see is that annotation takes a lot of time.
> If I profile, I see that annotating 1 record takes around 0.06
> seconds.
> Now you may say that's good.
> Now scale up.
> It does not scale up linearly, but here is a rough estimate based on
> experiments done:
> 6000 records take 6 min to annotate
> 800000 records take around 10 hrs to annotate
> Which is bad.
> One thing is that I am treating each record individually as a CAS.
> Even if I treat all the records as a single CAS, it takes around 6-7 hrs,
> which is still not good in terms of speed.
>
> Is there a way out?
> Can I improve performance by any means?
>
> Regards
> Rohan
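
For anyone who wants to check the throughput figures in the reply, here is a small Python sketch of the arithmetic. It uses only the numbers quoted in this thread; the function and variable names are made up for illustration and are not part of UIMA.

```python
def records_per_minute(records: int, minutes: float) -> float:
    """Average throughput in records per minute."""
    return records / minutes


# Figures taken from the thread (names are illustrative only).
small_run = records_per_minute(6_000, 6)          # 6000 records in 6 min
large_run = records_per_minute(800_000, 10 * 60)  # 800000 records in 10 h
per_record = 60 / 0.06                            # 0.06 s per record, as profiled

print(f"small run: {small_run:.0f} records/min")
print(f"large run: {large_run:.0f} records/min")
print(f"profiled:  {per_record:.0f} records/min")
```

The large run comes out at a slightly higher rate than the small one, which is why the reported numbers do not by themselves show a slowdown at scale.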
