"What's the point of using a sorted interval list for a category?"
Just terminology first to avoid misunderstanding :), category is "category field" that can take N valus Now, the case I am facing goes as follows: I have category field in 50Mio collection which has more or less uniform distribution of values, imagine 100 values (terms) each with 0.5Mio documents. when I sort index on it, bit vectors look like 00000000......1111111111....00000000 So, if I cache this in any other representation (Filter usage, no scoring contribution as typical for categories) this adds up in memory rather quickly. With interval list for Filter/Matcher I need 2 Ints to store cache entry per value/term and have really fast next()-SkipTo(). In total 200 Ints for sorted category field with 100 values.... Not an unusual case I guess, a lot of people can sort their indexes, I guess at least. Of course, it is a bit simplified, as we should not forget dynamic nature of the index. In real life, you can afford to sort your index once per month or once per week, but small percentage of updates do not introduce significant problems for interval lists. Does this make sense to you, or I misread your aproach? What woud be sexy to do, is to use such representation for postings storage, so each postings list would have type field that determines optimal bit vector representation for the term (high level, did not look into it yet in details). So instead of VInts, we could have interval lists for such cases. This would be equivalent to RLE encoding... which can be done fast I guess. This field type could be determined in some kind of *oprional* post-optimize proces that scanns postings and tries to find optimal "compresion", only VInt, VInt with rle for "predominantly sorted" postings.... ANd this is not limited to rle only, I could imagine some smart people to come up with even smarter things... It looks like good idea to me, but I would not be surprised if somebody proves that I missed the point totaly :) --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]