Eks,

On Tuesday 19 February 2008 21:48:03, eks dev wrote:
...
> > Btw. there is some room in SortedVIntList to add interval
> > coding. Normally the VInt value 0 cannot occur in the current
> > version, and this could be used as a prefix to encode a run of
> > set bits.
>
> I like this! I was just experimenting with
>
>   int[] leftIntervalExtreme
>   int[] intervalLength
>
> representation of interval lists, this has one nice feature, you can
> binary search left intervals for really fast long skipTo(), but has
> somewhat higher memory consumption in case bit vector gets ugly
> distributed... SortedVIntList with RLEncoding could prove more robust
> in that sense.
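
To make the interval coding idea concrete, here is a rough sketch, not the actual SortedVIntList code. It assumes gaps are taken against a previous value that starts at -1, so every gap is at least 1 and the VInt value 0 is free to act as a run prefix. The class name RunLengthVIntList, the MIN_RUN threshold and the exact byte layout (0, then the run length as a VInt) are all made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: gap-coded VInts where a 0 prefix announces a run of set bits. */
final class RunLengthVIntList {
    // A run costs two bytes or more (0 + length), so only runs of 3+ docs save space.
    private static final int MIN_RUN = 3;

    private static void writeVInt(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80); // low 7 bits, continuation bit set
            value >>>= 7;
        }
        out.write(value);
    }

    private static int readVInt(byte[] bytes, int[] pos) {
        int b = bytes[pos[0]++];
        int value = b & 0x7F;
        for (int shift = 7; (b & 0x80) != 0; shift += 7) {
            b = bytes[pos[0]++];
            value |= (b & 0x7F) << shift;
        }
        return value;
    }

    /** Encodes strictly increasing, non-negative doc ids. */
    static byte[] encode(int[] sortedDocs) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int prev = -1; // assumption: start at -1 so that a gap of 0 never occurs
        int i = 0;
        while (i < sortedDocs.length) {
            writeVInt(out, sortedDocs[i] - prev);
            int end = i;
            while (end + 1 < sortedDocs.length && sortedDocs[end + 1] == sortedDocs[end] + 1) {
                end++;
            }
            int run = end - i; // docs that directly follow sortedDocs[i]
            if (run >= MIN_RUN) {
                writeVInt(out, 0);   // run prefix
                writeVInt(out, run); // number of consecutive docs that follow
            } else {
                for (int k = 0; k < run; k++) {
                    writeVInt(out, 1); // short run: plain gaps of 1
                }
            }
            prev = sortedDocs[end];
            i = end + 1;
        }
        return out.toByteArray();
    }

    static int[] decode(byte[] bytes) {
        List<Integer> docs = new ArrayList<>();
        int[] pos = {0};
        int prev = -1;
        while (pos[0] < bytes.length) {
            int v = readVInt(bytes, pos);
            if (v == 0) {
                int run = readVInt(bytes, pos);
                for (int k = 0; k < run; k++) {
                    docs.add(++prev);
                }
            } else {
                prev += v;
                docs.add(prev);
            }
        }
        return docs.stream().mapToInt(Integer::intValue).toArray();
    }
}
```

With this layout a long run of set bits costs a few bytes regardless of its length, while sparse regions are coded as plain gaps.
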
skipTo() on a SortedVIntList as it stands is not nice: it is a linear
search. I'd like to add skip info to it, much like the multilevel
skip info that was added to the index not too long ago. With that
addition, skipTo() on a SortedVIntList should be OK, too. At that
point, it might also be possible to split the underlying byte[] into
blocks of a maximum size, as Robert suggested.

> Friend of mine sent me this link, looks very
> interesting
> http://repositories.cdlib.org/cgi/viewcontent.cgi?article=3104&context=lbnl

First impression: nice article, good for relational DBs and for
bitwise boolean ops. In Lucene, though, there is normally the need to
score each matching doc, and for that the doc number is needed. That
does not really fit the data structures discussed in the article.

Regards,
Paul Elschot

> On Tuesday 19 February 2008 12:58:34, eks dev wrote:
> > hi Mark,
> >
> > just out of curiosity, do you know the distribution of set bits in
> > these terms you have tried to cache? maybe this simple tip could
> > help.
> > If you are lucky like we were, such terms typically used for
> > filters are good candidates to be used to sort your index before
> > indexing (once in a while) and then with some sort of
> > IntervalDocIdSet you can reduce memory requirements dramatically.
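
For comparison with the linear skipTo() above, here is a minimal sketch of the two-array interval representation with a binary search over the left extremes. The field names come from the quoted mail; the class name IntervalList, the method firstDocAtOrAfter and the NO_MORE_DOCS sentinel are hypothetical, and the sketch assumes sorted, non-overlapping intervals.

```java
import java.util.Arrays;

/** Hypothetical sketch: interval i covers leftIntervalExtreme[i] .. leftIntervalExtreme[i] + intervalLength[i] - 1. */
final class IntervalList {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE; // made-up sentinel for "exhausted"

    private final int[] leftIntervalExtreme; // sorted starts of non-overlapping intervals
    private final int[] intervalLength;      // number of consecutive set bits per interval

    IntervalList(int[] leftIntervalExtreme, int[] intervalLength) {
        this.leftIntervalExtreme = leftIntervalExtreme;
        this.intervalLength = intervalLength;
    }

    /** Returns the smallest set doc id that is >= target, or NO_MORE_DOCS. */
    int firstDocAtOrAfter(int target) {
        int pos = Arrays.binarySearch(leftIntervalExtreme, target);
        if (pos >= 0) {
            return target; // target is exactly the start of an interval
        }
        int next = -pos - 1; // index of the first interval starting after target
        int prev = next - 1; // last interval starting before target, if any
        if (prev >= 0 && target - leftIntervalExtreme[prev] < intervalLength[prev]) {
            return target; // target falls inside the previous interval
        }
        return next < leftIntervalExtreme.length ? leftIntervalExtreme[next] : NO_MORE_DOCS;
    }
}
```

The lookup is logarithmic in the number of intervals rather than in the number of set bits, which is why it pays off mostly when the set bits are clustered, for example after sorting the index on the filter terms.
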