On Wed, Feb 10, 2010 at 12:33:27PM -0500, Michael McCandless wrote:

> In Lucene, skipping is done through the aggregator,

I had a look at MultiDocsEnum in the flex branch.  It doesn't know when
a sub-enum is reading skip data.

> > I suppose another possibility would have been to have the aggregator
> > keep its own Posting and copy all data over from the
> > SegPostingList's Posting on each iteration then add its offset.
>
> I think this is what Lucene does (?).  EG the aggregator holds its own
> "int doc" which it must copy to (adding the offset) from the
> underlying sub enum.

That's fine for a *primitive* type.  Modifying an int returned by a
sub-enum doesn't affect the sub-enum.  :)

The problem arises when there's an opaque *object* conveying data to the
consumer.  The aggregator knows everything there is to know about an
int, but it doesn't know what it needs to do to prepare an opaque object
owned by the sub-enum for consumption at the aggregate level.

> > However, that would have been a lot less efficient, and it still
> > wouldn't have worked for the "flat positions space" example because
> > the generic aggregator would not have known about the needs of the
> > specific codec.
>
> But aggregator could also add the positions offset on each
> nextPosition() call, in Lucene.  Like that use case could be made to
> work, if Lucene had used a flat position space.

A generic aggregator wouldn't know that it needed to do that.  The
postings codec developer would be forced to write aggregation code in
addition to segment-level code.

Marvin Humphrey
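
Illustration (not part of the original message): a minimal Java sketch of
the distinction drawn above, under stated assumptions.  Only the names
Posting and SegPostingList come from the thread; their shapes here, plus
FlatPosting, FlatSegPostingList, Aggregator, docBase and posBase, are
hypothetical and are not Lucene or Lucy APIs.  The sketch assumes one
possible arrangement in which the codec's segment-level enumerator applies
its own doc and position offsets, so the generic aggregator can hand the
opaque object through without knowing what is inside it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class AggregatorSketch {

    /** Opaque to the aggregator; only the codec knows the fields behind it. */
    interface Posting {}

    /** Hypothetical codec posting using a flat, index-wide position space. */
    static final class FlatPosting implements Posting {
        final int doc;
        final int[] positions;

        FlatPosting(int doc, int[] positions) {
            this.doc = doc;
            this.positions = positions;
        }
    }

    /** Segment-level enumerator; returns null when exhausted. */
    interface SegPostingList {
        Posting next();
    }

    /** Codec-specific segment enumerator that applies its own offsets. */
    static final class FlatSegPostingList implements SegPostingList {
        private final int docBase;          // offset added to segment-local doc ids
        private final int posBase;          // offset added to segment-local positions
        private final int[] localDocs;
        private final int[][] localPositions;
        private int cursor = 0;

        FlatSegPostingList(int docBase, int posBase,
                           int[] localDocs, int[][] localPositions) {
            this.docBase = docBase;
            this.posBase = posBase;
            this.localDocs = localDocs;
            this.localPositions = localPositions;
        }

        @Override
        public Posting next() {
            if (cursor >= localDocs.length) {
                return null;
            }
            // Only codec code knows that positions need shifting too.
            int[] shifted = localPositions[cursor].clone();
            for (int i = 0; i < shifted.length; i++) {
                shifted[i] += posBase;
            }
            int doc = docBase + localDocs[cursor];
            cursor++;
            return new FlatPosting(doc, shifted);
        }
    }

    /** Generic aggregator: walks segments in order, never inspects a Posting. */
    static final class Aggregator {
        private final List<SegPostingList> subs;
        private int current = 0;

        Aggregator(List<SegPostingList> subs) {
            this.subs = subs;
        }

        Posting next() {
            while (current < subs.size()) {
                Posting p = subs.get(current).next();
                if (p != null) {
                    return p;
                }
                current++; // this segment is exhausted, move to the next one
            }
            return null;
        }
    }

    /** Runs the sketch over two made-up segments and formats each posting. */
    public static List<String> run() {
        SegPostingList seg0 = new FlatSegPostingList(
                0, 0, new int[] {0, 2}, new int[][] {{1, 4}, {0}});
        SegPostingList seg1 = new FlatSegPostingList(
                10, 100, new int[] {1}, new int[][] {{3, 7}});
        Aggregator agg = new Aggregator(Arrays.asList(seg0, seg1));

        List<String> out = new ArrayList<>();
        for (Posting p = agg.next(); p != null; p = agg.next()) {
            FlatPosting fp = (FlatPosting) p; // only a codec-aware consumer casts
            out.add(fp.doc + ":" + Arrays.toString(fp.positions));
        }
        return out;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

With an "int doc" alone, the aggregator could add docBase itself.  Here it
could not have added posBase without codec-specific knowledge, which is the
case the message argues a generic aggregator cannot cover.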