I previously described how, when I used XmiCasSerializer with many (10) 
concurrent AnalysisEngines, my throughput dropped to about half and stopped 
scaling up.

I did some profiling of my code using JProbe, and I think I've found the 
problem.

I discovered that my application spent 64% of its elapsed time in 
XmiCasSerializer and its child methods.  Within that, one method rose to the 
top: 72% of elapsed time was spent in TypeSystemImpl.ll_isValidTypeCode().  In 
fact, this exceeded the time spent in XmiCasSerializer (114%).

This in turn was almost all in SymbolTable.getSymbol().  This was called over 
17 million times in my application, which spent 72% of its elapsed time in this 
one method.  99.9% of that time was spent in the method itself, not in its 
children (Vector.get(int) was the highest child, at 0.1%).

I'm not exactly sure why this method takes so long.  I suspect it's a 
concurrency issue.  I see a synchronized block in the set() method, so that 
would be something to look into.  Given that some of my AnalysisEngines may be 
inserting annotations while others are executing XmiCasSerializer, I can see 
potential for conflict.
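
To make the suspicion concrete, here is a minimal sketch of the kind of 
contention I have in mind.  It is not the UIMA code: SymbolLookupContention, 
SynchronizedTable and ConcurrentTable are hypothetical stand-ins, and whether 
the real getSymbol() actually contends on the same lock as set() is an 
assumption I have not verified.  The sketch only shows how one might time the 
same lookup workload against a table that takes a single monitor on every call 
and against one that does not.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical stand-ins for illustration only; NOT the UIMA SymbolTable.
final class SymbolLookupContention {

    interface Table {
        void set(String symbol);      // registers a symbol under the next code (1, 2, ...)
        String getSymbol(int code);   // returns the symbol for a code, or null
    }

    // Assumed worst case: reads and writes all take the same monitor,
    // so concurrent lookups are serialized.
    static final class SynchronizedTable implements Table {
        private final Map<Integer, String> byCode = new HashMap<>();
        public synchronized void set(String symbol) { byCode.put(byCode.size() + 1, symbol); }
        public synchronized String getSymbol(int code) { return byCode.get(code); }
    }

    // Alternative for comparison: lookups do not block each other.
    static final class ConcurrentTable implements Table {
        private final ConcurrentHashMap<Integer, String> byCode = new ConcurrentHashMap<>();
        private final AtomicInteger nextCode = new AtomicInteger();
        public void set(String symbol) { byCode.put(nextCode.incrementAndGet(), symbol); }
        public String getSymbol(int code) { return byCode.get(code); }
    }

    // Fills the table with 100 symbols, then runs lookupsPerThread lookups on
    // each of `threads` threads and returns the elapsed wall-clock milliseconds.
    static long timeLookupsMillis(Table table, int threads, int lookupsPerThread) {
        for (int i = 0; i < 100; i++) {
            table.set("Type" + i);
        }
        LongAdder hits = new LongAdder();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        long start = System.nanoTime();
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                for (int i = 0; i < lookupsPerThread; i++) {
                    if (table.getSymbol(1 + i % 100) != null) {
                        hits.increment();
                    }
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(5, TimeUnit.MINUTES)) {
                throw new IllegalStateException("lookups did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        long elapsed = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        // Every lookup should have found its symbol.
        if (hits.sum() != (long) threads * lookupsPerThread) {
            throw new IllegalStateException("unexpected lookup misses");
        }
        return elapsed;
    }

    public static void main(String[] args) {
        int threads = 10;              // matches the 10 concurrent engines above
        int lookups = 1_700_000;       // 17 million lookups in total
        System.out.println("synchronized: "
                + timeLookupsMillis(new SynchronizedTable(), threads, lookups) + " ms");
        System.out.println("concurrent:   "
                + timeLookupsMillis(new ConcurrentTable(), threads, lookups) + " ms");
    }
}
```

If the synchronized variant turns out much slower at 10 threads than at 1, 
that would be consistent with lock contention, though a toy benchmark like 
this says nothing definite about the real code path.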

Hopefully, these clues will be enough for someone familiar with the code to 
figure it out.


Greg Holmberg
