On Fri, Mar 19, 2010 at 1:46 PM, Toke Eskildsen <t...@statsbiblioteket.dk> wrote:
> From: Robert Muir [rcm...@gmail.com]:
>> Toke, only partially-on-topic here, is it possible to describe your
>> use-case a little more where its preferable to use this Locale-based
>> sort instead of indexing collation keys (e.g. you have to support so
>> many locales this would be too much indexing overhead?)
>
> My original use case was to avoid the memory overhead: Looking at our current
> index, we have ~7.5M documents with ~7M unique titles. They take up about
> 362MB as UTF-8 bytes, which translates to a neat 1GB of RAM as Java Strings.
> That's 1GB less heap for other stuff for us, plus a sort is fairly slow.
> Indexing collation keys only helps with the speed problem.

I don't really understand this measurement: collation keys are byte[]...
(although it's true we don't yet encode them this way in flex; I think we
should)

-- 
Robert Muir
rcm...@gmail.com
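
To make the sizes under discussion concrete, here is a minimal sketch using only the JDK's java.text.Collator. It is not Lucene code. The class name CollationKeySketch, its helper methods, the sample titles, and the choice of Locale.ENGLISH are all assumptions made for illustration. It prints the UTF-8 size of a title, the character data of the same title as a UTF-16 String (object headers excluded), and the size of its collation key as a byte[], then checks that comparing the key bytes as unsigned values gives the same order as the Collator.

```java
import java.nio.charset.StandardCharsets;
import java.text.Collator;
import java.util.Locale;

// Hypothetical illustration class; not part of Lucene.
public class CollationKeySketch {

    /** Bytes needed to hold the title as UTF-8. */
    static int utf8Bytes(String title) {
        return title.getBytes(StandardCharsets.UTF_8).length;
    }

    /**
     * Bytes of character data when the title is held as UTF-16 (2 bytes per
     * char). Object headers are not counted, and newer JVMs may store
     * Latin-1-only strings more compactly.
     */
    static int utf16CharBytes(String title) {
        return 2 * title.length();
    }

    /** The collation key for the title, as a plain byte[]. */
    static byte[] collationKeyBytes(Collator collator, String title) {
        return collator.getCollationKey(title).toByteArray();
    }

    /** Lexicographic comparison treating each byte as unsigned. */
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        // Locale chosen only for the example.
        Collator collator = Collator.getInstance(Locale.ENGLISH);
        String[] titles = {"Sm\u00f8rrebr\u00f8d", "Lucene in Action"};

        for (String title : titles) {
            System.out.println(title
                    + ": utf8=" + utf8Bytes(title)
                    + " utf16Chars=" + utf16CharBytes(title)
                    + " collationKey=" + collationKeyBytes(collator, title).length);
        }

        // Byte-wise order of the keys should agree with the Collator itself.
        int byKey = Integer.signum(compareUnsigned(
                collationKeyBytes(collator, titles[0]),
                collationKeyBytes(collator, titles[1])));
        int byCollator = Integer.signum(collator.compare(titles[0], titles[1]));
        System.out.println("orders agree: " + (byKey == byCollator));
    }
}
```

The key sizes printed depend on the JDK's collation rules and locale, so they are not a prediction of what an indexed key would cost in Lucene. The sketch only shows that a key is a byte[] that can be compared without building a String.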