Trust me on this. I won't do anything that causes a major performance degradation for low-level annotators.

It's interesting that you all seem to be expecting a performance
degradation. I'm really hoping for an improvement :-) I'll be
disappointed if an object-based heap is slower than what we have now,
and maybe then we should not switch. Let's discuss performance when
we actually have some numbers to discuss.

--Thilo

Marshall Schor wrote:
> Adam Lally wrote:
>> On 10/17/07, Thilo Goetz <[EMAIL PROTECTED]> wrote:
>>
>>> I'm thinking about experimenting with alternative heap
>>> implementations in the CAS. In particular, I would like
>>> to try out a heap impl that uses regular Java objects to
>>> represent feature structures, as opposed to our proprietary
>>> binary heap.
>>> <snip/>
>>>
>> My two cents: I'm in favor of experimenting with a new heap
>> implementation. For co-located deployments Java object overhead
>> should not be an issue at all, since in almost all cases we end up
>> creating a Java object for each FeatureStructure anyway.
> Except for one -- maybe major -- case of several commercial (and maybe
> research) implementations using low-level CAS interfaces for
> performance, for components like tokenizers, which have short execution
> paths. I agree the overhead won't be there if a later annotator then
> uses JCas (or plain CAS, which also creates Java objects when iterating,
> unless the low-level APIs are used) in the co-located case, and iterates
> over the tokens.
>
> -Marshall
>> However for
>> remote services I think it's a different story. Services may only
>> access some of the objects in the CAS and therefore in the current
>> implementation we never have to create Java objects for many of them.
>> I don't know how significant this is though, since as you said JREs
>> have gotten much better about their object creation overhead and
>> per-object memory footprint.
>>
>> Also what about the logistics of managing the source code - would this
>> work be done in a separate branch?
>>
>> -Adam
