Hi Robust Links: I think you want to build a class that implements the LeafCollector. For example:
public class theLeafCollectorDocid implements LeafCollector { theLeafCollectorDocid( final LeafReaderContext context ) { } collect( int doc ) { } } Once you done this then build another class that implements the Collector. For example: public class docCollectorKeyDocid implements Collector { public LeafCollector getLeafCollector( final LeafReaderContext context ) { final LeafCollector tlc = new theLeafCollectorDocid(context ); } } This will, I believe, allow you to realize your goal. regards, west suhanic On Wed, Apr 29, 2015 at 10:41 AM, Robust Links <pey...@robustlinks.com> wrote: > Hi > > I need help porting my lucene code from 4 to 5. In particular, I need to > customize a collector (to collect all doc Ids in the index - which can be > >30MM docs..). Below is how I achieved this in lucene 4. Is there some > guidelines how to do this in lucene 5, specially on semantics changes of > AtomicReaderContext (which seems deprecated) and the new LeafReaderContext? > > thank you in advance > > > public class CustomCollector extends Collector { > > private HashSet<String> data = new HashSet<String>(); > > private Scorer scorer; > > private int docBase; > > private BinaryDocValues dataList; > > > public boolean acceptsDocsOutOfOrder() { > > return true; > > } > > public void setScorer(Scorer scorer) { > > this.scorer = scorer; > > } > > public void setNextReader(AtomicReaderContext ctx) throws IOException{ > > this.docBase = ctx.docBase; > > dataList = FieldCache.DEFAULT.getTerms(ctx.reader(),"title",false); > > } > > public void collect(int doc) throws IOException { > > BytesRef t = new BytesRef(); > > dataList(doc); > > if (t.bytes != BytesRef.EMPTY_BYTES && t.bytes != BytesRef.EMPTY_BYTES) { > > data((t.utf8ToString())); > > } > > } > > public void reset() { > > data.clear(); > > dataList = null; > > } > > public HashSet<String> getData() { > > return data; > > } > > } >