When I run the following loop, it takes about 6 seconds on my 2 GHz machine:

for (int i = 0; i < 10000; i++) {
    jCas.reset();
}

That comes out to about 0.6 milliseconds per call, which is pretty slow when you have many short documents. For example, it would add about 10 minutes of processing time for a corpus of 1M documents. Is this a known issue, and is there anything I can do to minimize the impact?
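For reference, the arithmetic behind those two figures can be sketched in plain Java. This is only an illustration of the numbers above: the class and method names are hypothetical, nothing here calls the UIMA API, and it assumes the per-call cost stays constant across the whole corpus.

```java
// Hypothetical helper, not part of UIMA: reproduces the back-of-the-envelope
// numbers from the measurement above.
final class ResetCost {

    // Average cost of one call in milliseconds, given the total wall-clock
    // time in seconds and the number of calls made in that time.
    static double perCallMillis(double totalSeconds, long calls) {
        return totalSeconds * 1000.0 / calls;
    }

    // Total overhead in minutes for a corpus, assuming one call per document
    // and a constant per-call cost.
    static double extrapolateMinutes(double perCallMillis, long documents) {
        return perCallMillis * documents / 1000.0 / 60.0;
    }

    public static void main(String[] args) {
        double perCall = perCallMillis(6.0, 10_000);            // 6 s for 10,000 resets
        double minutes = extrapolateMinutes(perCall, 1_000_000); // 1M documents
        System.out.printf("per call: %.1f ms, corpus overhead: %.1f min%n",
                perCall, minutes);
    }
}
```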

Thanks,

Philip
