I'm wondering how best to use CasPools in my system.

My system is a general service--that is, it may receive concurrent requests 
with arbitrary AnalysisEngineDescriptions from different applications.

Some requests may use the exact same AED object, in which case, it's obvious 
that documents processed from those requests can share the same AnalysisEngine 
and CasPool.

However, it's also likely that the system will receive AEDs that are 
equivalent--that is, they have the same annotators with the same configuration 
parameters, but they are two physically different AED objects.

Now, in this case, it would be possible to use the same AE and CasPool, if 
there were a way to tell that they are equivalent.  Unfortunately, the equals() 
methods on AnalysisEngineDescription and AnalysisEngine won't tell me this.  So 
what I currently do is create a separate CasPool for each AED object.

Is it worth it, in terms of performance and memory usage, to write a method 
that compares two AEDs to determine whether they are equivalent?  Or is 
creating CASes and CasPools not expensive enough to justify the work, so that 
I should just continue with separate CasPools?
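
One possible shape for such a comparison, as a minimal sketch only: serialize 
each descriptor to XML, hash the text, and key the shared pool on the hash.  
SharedPoolRegistry, fingerprint() and poolFor() are hypothetical names, not 
part of UIMA, and the sketch assumes the descriptor XML is obtained elsewhere 
(I believe via the descriptor's toXML(Writer) method).  The test is 
conservative: descriptors that are equivalent but serialize differently 
simply end up with separate pools, which is no worse than what I do today.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Supplier;

/**
 * Hypothetical helper (not a UIMA class): hands out one shared pool per
 * distinct serialized descriptor. P stands in for the pool type, e.g. CasPool.
 */
final class SharedPoolRegistry<P> {
    private final ConcurrentMap<String, P> poolsByFingerprint = new ConcurrentHashMap<>();

    /** Hex-encoded SHA-256 of the descriptor text, used as the map key. */
    static String fingerprint(String descriptorXml) {
        try {
            MessageDigest sha = MessageDigest.getInstance("SHA-256");
            byte[] digest = sha.digest(descriptorXml.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 must be supported by every Java platform implementation.
            throw new IllegalStateException(e);
        }
    }

    /** Returns the existing pool for this descriptor text, or creates one. */
    P poolFor(String descriptorXml, Supplier<P> factory) {
        return poolsByFingerprint.computeIfAbsent(
                fingerprint(descriptorXml), key -> factory.get());
    }

    /** Number of distinct pools created so far. */
    int size() {
        return poolsByFingerprint.size();
    }
}
```

The factory is only invoked when no pool exists yet for that fingerprint, so 
concurrent requests carrying identical descriptor text get the same pool.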

Going further, it appears that two AnalysisEngines could share the same CasPool 
if only their type systems are the same--the AEs themselves don't even have 
to be the same (they could have different configuration parameter values, for 
example).  They merely need the same CasDefinition.  Is there an easy way to 
determine if two AEDs have the same type system or CasDefinition, and so could 
share a CasPool?

Thanks,


Greg Holmberg
