Thanks, Simon, for the quick update. Our documents are always uniform, with the same set of fields added to each one, and that is what led to the confusion.
--
Ravi

On Wed, Mar 20, 2013 at 6:33 PM, Simon Willnauer <simon.willna...@gmail.com> wrote:

> The BitSet basically counts how many documents have one or more values
> in this field. Some docs might not have values in this field.
> state.segmentInfo.getDocCount() is the # of docs in this segment, but
> we are flushing a single field here. We pass down the cardinality
> here because we have kept the doc count statistics per field in the
> index since 4.0, so we can't use the segment's doc count.
>
> hope that helps
>
> simon
>
> On Wed, Mar 20, 2013 at 1:12 PM, Ravikumar Govindarajan
> <ravikumar.govindara...@gmail.com> wrote:
> > This is some internal code I came across in Lucene today and I am
> > unable to decipher it.
> >
> > FreqProxTermsWriterPerField.java
> >
> > void flush(String fieldName, FieldsConsumer consumer, final SegmentWriteState state)
> > {
> >   .............
> >   FixedBitSet visitedDocs = new FixedBitSet(state.segmentInfo.getDocCount());
> >   for (int i = 0; i < numTerms; i++)
> >   {
> >     .............
> >     visitedDocs.set(docID);
> >     .........
> >     termsConsumer.finishTerm(text, new TermStats(docFreq, writeTermFreq ? totTF : -1));
> >     //We plan to pass the state.segmentInfo.getDocCount() in TermStats, above.
> >     //Is it wrong to do this here?
> >   }
> >   //Once all terms are over
> >   termsConsumer.finish(writeTermFreq ? sumTotalTermFreq : -1, sumDocFreq, visitedDocs.cardinality());
> >   //Why are we doing cardinality() instead of getDocCount() here?
> >   //Can there be un-visited docs during a flush?
> > }
> >
> > Can someone help me understand this?
> >
> > --
> > Ravi
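
For anyone finding this thread later, here is a minimal sketch of the distinction Simon describes: the number of docs in a segment versus the number of docs that actually have a value in one field. It is not Lucene code. It uses java.util.BitSet in place of Lucene's FixedBitSet, and the class name, the docsWithField helper, and the sample postings are all made up for illustration.

```java
import java.util.BitSet;

public class FieldDocCountDemo {

    // Hypothetical helper (not a Lucene API): given the docIDs that appear in
    // the postings of each term of ONE field, count how many distinct
    // documents have at least one value in that field.
    public static int docsWithField(int segmentDocCount, int[][] postingsPerTerm) {
        // One bit per document in the segment, all initially unset.
        BitSet visitedDocs = new BitSet(segmentDocCount);
        for (int[] postings : postingsPerTerm) {
            for (int docID : postings) {
                // Setting an already-set bit changes nothing, so a doc that
                // contains several terms is still counted once.
                visitedDocs.set(docID);
            }
        }
        return visitedDocs.cardinality();
    }

    public static void main(String[] args) {
        int segmentDocCount = 5; // docs 0..4 in the segment

        // Assumed postings for a made-up "title" field: docs 1 and 4 have no
        // value in this field, so they never show up here.
        int[][] titlePostings = {
            {0, 2}, // docs containing the first term
            {2, 3}, // docs containing the second term
        };

        System.out.println("segment doc count = " + segmentDocCount);
        System.out.println("docs with field   = " + docsWithField(segmentDocCount, titlePostings));
    }
}
```

With these sample postings the two numbers differ (5 docs in the segment, 3 with a value in the field). If every document carries every field, as in the uniform-docs case above, the two counts coincide, which is why the difference is easy to miss.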