I previously described how, when I used XmiCasSerializer with many (10) concurrent AnalysisEngines, my throughput dropped to about half and did not scale up.
I did some profiling of my code using JProbe, and I think I've found the problem.

I discovered that my application spent 64% of its elapsed time in XmiCasSerializer and its child methods. Within that, one method rose to the top: 72% of elapsed time was spent in TypeSystemImpl.ll_isValidTypeCode(). In fact, this exceeded the time spent in XmiCasSerializer (114%).

This in turn was almost all in SymbolTable.getSymbol(). That method was called over 17 million times in my application, which spent 72% of its elapsed time in this one method. 99.9% of that time was spent in the method itself, not in its children (Vector.get(int) was the highest child, at 0.1%).

I'm not exactly sure why this method takes so long. I suspect it's a concurrency issue. I see a synchronized block in the set() method, so that would be something to look into. Given that some of my AnalysisEngines may be inserting annotations while others are executing XmiCasSerializer, I can see potential for conflict.

Hopefully, these clues will be enough for someone familiar with the code to figure it out.

Greg Holmberg
