I'm wondering how best to use CasPools in my system. My system is a general service--that is, it may receive concurrent requests with arbitrary AnalysisEngineDescriptions from different applications.
Some requests may use the exact same AED object, in which case it's obvious that documents processed from those requests can share the same AnalysisEngine and CasPool.

However, it's also likely that the system will receive AEDs that are equivalent--that is, they have the same annotators with the same configuration parameters, but they are two physically different AED objects. In this case it would be possible to use the same AE and CasPool, if there were a way to tell that they were equivalent. Unfortunately, the equals() methods on AnalysisEngineDescription and AnalysisEngine won't tell me this, so what I currently do is create separate CasPools.

Is it worth it, for performance and memory usage, to write a method that compares two AEDs to determine whether they are equivalent? Or is creating CASes and CasPools not expensive enough to justify the work, so that I should just continue with separate CasPools?

Going further, it appears that two AnalysisEngines could share the same CasPool as long as their type systems are the same--the AEs themselves don't even have to be the same (they could have different configuration parameter values, for example). They merely need the same CasDefinition. Is there an easy way to determine whether two AEDs have the same type system or CasDefinition, and so could share a CasPool?

Thanks,
Greg Holmberg
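
To make the question concrete, here is a minimal sketch of the kind of sharing described above. It is deliberately independent of the UIMA API: the class name SharedPoolRegistry, the fingerprint() helper, and the idea of keying on a descriptor's serialized text are all assumptions made for illustration, and the pool type P is just a placeholder for whatever pool object is actually used. It assumes that equivalent descriptors can be turned into identical text; descriptors that are equivalent but serialize differently would simply get separate pools, which is wasteful but safe.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

/** Hypothetical registry that shares one pool among descriptors with identical text. */
final class SharedPoolRegistry<P> {
    // Maps a fingerprint of the descriptor text to the pool created for it.
    private final Map<String, P> pools = new ConcurrentHashMap<>();

    /** Hex-encoded SHA-256 of the descriptor text; identical text yields an identical key. */
    static String fingerprint(String descriptorText) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(descriptorText.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required to be available on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    /** Returns the pool for this descriptor text, invoking the factory only on first use. */
    P poolFor(String descriptorText, Supplier<P> factory) {
        return pools.computeIfAbsent(fingerprint(descriptorText), key -> factory.get());
    }

    /** Number of distinct pools created so far. */
    int size() {
        return pools.size();
    }
}
```

With something like this, two requests whose descriptors serialize to the same text would receive the same pool instance, and the factory (the expensive part) would run once per distinct descriptor. Keying on only the type-system portion of the descriptor instead of the whole text would be the analogous way to share a pool across AEs that differ only in configuration.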
