Mike, I have started playing with this, holy cow... it is a lot of code.

A question about SegmentMerger.mergeFields()... there is a big block:

    else {
      addIndexed(reader, fieldInfos,
                 reader.getFieldNames(IndexReader.FieldOption.TERMVECTOR_WITH_POSITION_OFFSET),
                 true, true, true, false);
      addIndexed(reader, fieldInfos,
                 reader.getFieldNames(IndexReader.FieldOption.TERMVECTOR_WITH_POSITION),
                 true, true, false, false);
      addIndexed(reader, fieldInfos,
                 reader.getFieldNames(IndexReader.FieldOption.TERMVECTOR_WITH_OFFSET),
                 true, false, true, false);
      addIndexed(reader, fieldInfos,
                 reader.getFieldNames(IndexReader.FieldOption.TERMVECTOR),
                 true, false, false, false);
      addIndexed(reader, fieldInfos,
                 reader.getFieldNames(IndexReader.FieldOption.STORES_PAYLOADS),
                 false, false, false, true);
      addIndexed(reader, fieldInfos,
                 reader.getFieldNames(IndexReader.FieldOption.INDEXED),
                 false, false, false, false);
      fieldInfos.add(reader.getFieldNames(IndexReader.FieldOption.UNINDEXED), false);
    }

I simply do not understand it. I have changed the addIndexed(...)
signature to include omitTf, but I am not sure what needs to be done
here.

----- Original Message ----
> From: Michael McCandless <[EMAIL PROTECTED]>
> To: java-dev@lucene.apache.org
> Sent: Friday, 18 July, 2008 11:48:20 AM
> Subject: Re: Index without tf, anyone?
>
> I just committed LUCENE-1301, which is a first step (top down) towards
> flexible indexing. I hope I didn't break anything....
>
> While flexible indexing should make this simpler, it's not too bad to
> modify Lucene to do this today, if you want. I think this is what
> you'll need to do (but I haven't tested!):
>
>   * Add something to Fieldable/AbstractField/Field that "knows"
>     whether a field should store the tf. Also add this to
>     FieldInfo.java, and make sure that bit is saved to the fnm file.
>
>   * In the new oal.index.DocFieldProcessorPerThread, in the
>     processDocument method, fix the FieldInfos.add call to also pass
>     in your new "storeTermFreq" bit. Probably, assert that this
>     cannot change -- ie a field must be created with
>     storeTermFreq=true or false and must never change.
>
>   * The new oal.index.FreqProxTermsWriter, in appendPostings, has the
>     code that creates a new segment. Change that to skip writing tf
>     if the FieldInfo says so.
>
>   * Fix SegmentTermDocs to not read tf if FieldInfo says so.
>
>   * Fix SegmentMerger.appendPostings to not merge/write tf if
>     FieldInfo says so. Likewise assert here that the "storeTermFreq"
>     does not change in the merged segments.
>
> It's also possible to fix FreqProxTermsWriterPerField to not even
> compute & store the tf in its RawPostingList, per term. This is an
> optimization (saves RAM & CPU) that you can do after first getting the
> above working...
>
> On the search side, you'll need to fix scoring to be OK with tf=0.
>
> I think this would be a useful addition to Lucene (it comes up every
> so often), even before we fully work out flexible indexing.
>
> Mike
>
> eks dev wrote:
>
> > hi all,
> > is there any solution to have pure postings lists without
> > interleaved tf? This eats a lot of CPU for VInt decoding on dense
> > terms (and also doubles IO...) in our case. It can be an untested
> > patch, tips on how to do it, or whatever... I know about flexible
> > indexing, but cannot wait (I guess it will take some time?).
> >
> > Does it make sense to start working on it? Can this somehow be
> > incorporated later into Flexible Indexing? I would hate to do it and
> > then throw it away when Mike does his magic with Flexible Indexing.
> >
> > We are simply sure this could help performance a lot (some dense
> > fields always have constant tf, so there is no need to read them
> > from the index). I am just asking for help in case somebody
> > accidentally happens to have some Quick 'n Dirty solution/idea.
> >
> > thanks, eks
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: [EMAIL PROTECTED]
> > For additional commands, e-mail: [EMAIL PROTECTED]
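
One possible reading of the mergeFields() block, sketched as a toy model rather than Lucene code: each call walks one set of field names and switches on the flags that belong to that set, so a field that appears in several sets ends up with the union of their flags. Everything below is an assumption made for illustration. Info, Infos, names(...) and the Option enum are hypothetical stand-ins, OMIT_TF in particular is not an existing option in the thread, and giving omitTf its own pass is a guess at one way to extend the pattern, not the committed approach.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.EnumMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: none of these types are Lucene classes.
class FieldFlagMergeSketch {

    // Stand-ins for the option sets in the quoted block; OMIT_TF is an assumed addition.
    enum Option {
        TERMVECTOR_WITH_POSITION_OFFSET, TERMVECTOR_WITH_POSITION, TERMVECTOR_WITH_OFFSET,
        TERMVECTOR, STORES_PAYLOADS, OMIT_TF, INDEXED, UNINDEXED
    }

    // Hypothetical per-field record holding the merged flags.
    static final class Info {
        boolean indexed, termVector, positions, offsets, payloads, omitTf;

        @Override
        public String toString() {
            return "indexed=" + indexed + " tv=" + termVector + " pos=" + positions
                + " off=" + offsets + " payloads=" + payloads + " omitTf=" + omitTf;
        }
    }

    static final class Infos {
        final Map<String, Info> byName = new LinkedHashMap<>();

        // Assumption: flags are only ever switched on, so a later pass that
        // passes "false" cannot clear a "true" recorded by an earlier pass.
        void add(List<String> names, boolean indexed, boolean tv, boolean pos,
                 boolean off, boolean payloads, boolean omitTf) {
            for (String name : names) {
                Info fi = byName.computeIfAbsent(name, n -> new Info());
                fi.indexed |= indexed;
                fi.termVector |= tv;
                fi.positions |= pos;
                fi.offsets |= off;
                fi.payloads |= payloads;
                fi.omitTf |= omitTf;
            }
        }
    }

    // Field names a segment reports for one option set (empty if none).
    static List<String> names(Map<Option, List<String>> segment, Option option) {
        List<String> found = segment.get(option);
        return found == null ? Collections.<String>emptyList() : found;
    }

    // One pass per option set, in the order of the quoted block, plus the assumed OMIT_TF pass.
    static void mergeFields(Infos infos, Map<Option, List<String>> segment) {
        infos.add(names(segment, Option.TERMVECTOR_WITH_POSITION_OFFSET), true, true, true, true, false, false);
        infos.add(names(segment, Option.TERMVECTOR_WITH_POSITION), true, true, true, false, false, false);
        infos.add(names(segment, Option.TERMVECTOR_WITH_OFFSET), true, true, false, true, false, false);
        infos.add(names(segment, Option.TERMVECTOR), true, true, false, false, false, false);
        infos.add(names(segment, Option.STORES_PAYLOADS), true, false, false, false, true, false);
        infos.add(names(segment, Option.OMIT_TF), true, false, false, false, false, true);
        infos.add(names(segment, Option.INDEXED), true, false, false, false, false, false);
        infos.add(names(segment, Option.UNINDEXED), false, false, false, false, false, false);
    }

    public static void main(String[] args) {
        Map<Option, List<String>> segment = new EnumMap<>(Option.class);
        segment.put(Option.TERMVECTOR_WITH_POSITION, Arrays.asList("body"));
        segment.put(Option.OMIT_TF, Arrays.asList("id"));
        segment.put(Option.INDEXED, Arrays.asList("body", "id"));
        segment.put(Option.UNINDEXED, Arrays.asList("blob"));

        Infos infos = new Infos();
        mergeFields(infos, segment);
        for (Map.Entry<String, Info> e : infos.byName.entrySet()) {
            System.out.println(e.getKey() + ": " + e.getValue());
        }
    }
}
```

In this model the final INDEXED pass with all-false flags is harmless for "body" and "id" because their earlier flags stay on; it only matters for indexed fields that appear in no other set. The sketch does not model the suggested assertion that the tf bit must never change between merged segments; a real patch would have to decide what happens when two segments disagree.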