My low-memory sorting/faceting hack requires terms to be accessed by ordinal. With Lucene 4.0 I cannot rely on a TermsEnum supporting ord() and seek(long), so when those are not implemented the code falls back to a cache that holds every Xth term. When the term for an ordinal is requested, it seeks to the nearest preceding cached term and calls next() from there until the ordinal matches. So far so good.
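To make the fallback concrete, here is a simplified sketch of the idea (not my actual code; I am using the seekCeil/iterator(null) names from the current TermsEnum API, and the cache interval of 128 is just an example value for X):

  import java.io.IOException;
  import java.util.ArrayList;
  import java.util.List;

  import org.apache.lucene.index.Terms;
  import org.apache.lucene.index.TermsEnum;
  import org.apache.lucene.util.BytesRef;

  public class SparseOrdinalLookup {
    private static final int CACHE_INTERVAL = 128; // the "X" above; tunable
    private final List<BytesRef> cachedTerms = new ArrayList<BytesRef>();
    private final Terms terms;

    public SparseOrdinalLookup(Terms terms) throws IOException {
      this.terms = terms;
      TermsEnum te = terms.iterator(null);
      BytesRef term;
      long ord = 0;
      while ((term = te.next()) != null) {
        if (ord % CACHE_INTERVAL == 0) {
          cachedTerms.add(BytesRef.deepCopyOf(term)); // next() reuses its BytesRef, so copy
        }
        ord++;
      }
    }

    // Resolve an ordinal by seeking to the nearest cached term at or below it
    // and scanning forward with next() until the ordinal matches.
    public BytesRef getTerm(long ordinal) throws IOException {
      TermsEnum te = terms.iterator(null);
      int cacheIndex = (int) (ordinal / CACHE_INTERVAL);
      te.seekCeil(cachedTerms.get(cacheIndex));
      BytesRef term = te.term();
      for (long ord = (long) cacheIndex * CACHE_INTERVAL; ord < ordinal; ord++) {
        term = te.next();
      }
      return term;
    }
  }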
Two methods for seeking terms are seek(BytesRef text) and seek(BytesRef term, TermState state). The JavaDoc indicates that the seek with TermState is (potentially) the fastest in this scenario, since implementations can seek very efficiently using a custom TermState. My problem is that I am aiming for low memory use, and it seems I need to keep both the BytesRef term and the TermState state around in order to use this method, which is quite a burden memory-wise.

I tried calling the method with an empty BytesRef term. The call itself gave me an empty result back, but subsequent calls to next() returned the correct terms, which works perfectly for my scenario (a small sketch of the experiment is below). However, that was just an experiment with the default variable-gap codec, so I am unsure whether I can count on this behavior for any given codec.

Any thoughts on how to reduce the memory needed for ordinal-based lookup, without killing performance, would be appreciated.
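For reference, the experiment looked roughly like this (again simplified; I write seekExact/termState here for the seek(term, state) method mentioned above, and the behavior of next() after the empty-term seek is exactly what I am unsure is guaranteed across codecs):

  import java.io.IOException;

  import org.apache.lucene.index.TermState;
  import org.apache.lucene.index.Terms;
  import org.apache.lucene.index.TermsEnum;
  import org.apache.lucene.util.BytesRef;

  public class EmptyTermSeekExperiment {
    public static void demonstrate(Terms terms) throws IOException {
      TermsEnum te = terms.iterator(null);
      te.next();                            // position the enum on some term
      TermState state = te.termState();     // keep only the state, not the term bytes

      TermsEnum te2 = terms.iterator(null);
      te2.seekExact(new BytesRef(), state); // empty term + cached state: term() right
                                            // after this call is not meaningful
      BytesRef following = te2.next();      // ...but next() returned the correct
                                            // following term in my test
      System.out.println(following == null ? "null" : following.utf8ToString());
    }
  }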
