Mike,
I have started playing with this, holy cow... it is a lot of code.
Question: in SegmentMerger.mergeFields() there is a big block:
    else {
      addIndexed(reader, fieldInfos,
          reader.getFieldNames(IndexReader.FieldOption.TERMVECTOR_WITH_POSITION_OFFSET),
          true, true, true, false);
      addIndexed(reader, fieldInfos,
          reader.getFieldNames(IndexReader.FieldOption.TERMVECTOR_WITH_POSITION),
          true, true, false, false);
      addIndexed(reader, fieldInfos,
          reader.getFieldNames(IndexReader.FieldOption.TERMVECTOR_WITH_OFFSET),
          true, false, true, false);
      addIndexed(reader, fieldInfos,
          reader.getFieldNames(IndexReader.FieldOption.TERMVECTOR),
          true, false, false, false);
      addIndexed(reader, fieldInfos,
          reader.getFieldNames(IndexReader.FieldOption.STORES_PAYLOADS),
          false, false, false, true);
      addIndexed(reader, fieldInfos,
          reader.getFieldNames(IndexReader.FieldOption.INDEXED),
          false, false, false, false);
      fieldInfos.add(reader.getFieldNames(IndexReader.FieldOption.UNINDEXED),
          false);
    }
I simply do not understand it. I have changed the addIndexed(...) signature to
include omitTf, but I am not sure what needs to be done here.
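
One way to read the block above, as a minimal self-contained sketch and not the real SegmentMerger code: each addIndexed call takes the field names reported for one FieldOption and switches the matching flags on in the merged field infos. The sketch assumes that a flag, once set for a field, stays set, so an omitTf bit could be carried by one more call of the same shape. FieldFlags, MergeFieldsSketch, the parameter names, and the field names "body" and "tags" are all hypothetical stand-ins, not Lucene API.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for per-field flags; not Lucene's FieldInfo class.
class FieldFlags {
    boolean indexed, termVector, positions, offsets, payloads, omitTf;
}

// Toy model of the merge step: each call switches flags on for a set of names.
class MergeFieldsSketch {
    final Map<String, FieldFlags> byName = new LinkedHashMap<>();

    // Assumption: a flag, once set for a field, stays set (logical OR).
    void addIndexed(List<String> names, boolean termVector, boolean positions,
                    boolean offsets, boolean payloads, boolean omitTf) {
        for (String name : names) {
            FieldFlags f = byName.computeIfAbsent(name, n -> new FieldFlags());
            f.indexed = true;
            f.termVector |= termVector;
            f.positions |= positions;
            f.offsets |= offsets;
            f.payloads |= payloads;
            f.omitTf |= omitTf;
        }
    }

    public static void main(String[] args) {
        MergeFieldsSketch merged = new MergeFieldsSketch();
        // One call per option, mirroring the shape of the block above.
        merged.addIndexed(Arrays.asList("body"), true, true, false, false, false);
        // Hypothetical extra call for the fields that omit tf.
        merged.addIndexed(Arrays.asList("tags"), false, false, false, false, true);
        // The plain "indexed" call adds nothing new to fields already seen.
        merged.addIndexed(Arrays.asList("body", "tags"), false, false, false, false, false);
        for (Map.Entry<String, FieldFlags> e : merged.byName.entrySet()) {
            System.out.println(e.getKey() + " omitTf=" + e.getValue().omitTf);
        }
    }
}
```

Whether omitTf should be OR-ed like the other flags is itself an assumption; the quoted message below suggests asserting that the bit never changes for a field, which would replace the OR with a consistency check.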
----- Original Message ----
> From: Michael McCandless <[EMAIL PROTECTED]>
> To: [email protected]
> Sent: Friday, 18 July, 2008 11:48:20 AM
> Subject: Re: Index without tf, anyone?
>
> I just committed LUCENE-1301, which is a first step (top down) towards
> flexible indexing. I hope I didn't break anything....
>
> While flexible indexing should make this simpler, it's not too bad to
> modify Lucene to do this today, if you want. I think this is what
> you'll need to do (but I haven't tested!):
>
> * Add something to Fieldable/AbstractField/Field that "knows"
> whether a field should store the tf. Also add this to
> FieldInfo.java, and make sure that bit is saved to the fnm file.
>
> * In the new oal.index.DocFieldProcessorPerThread, in the
> processDocument method, fix the FieldInfos.add call to also pass
> in your new "storeTermFreq" bit. Probably, assert that this
> cannot change -- ie a field must be created with
> storeTermFreq=true or false and must never change.
>
> * The new oal.index.FreqProxTermsWriter, in appendPostings, has the
> code that creates a new segment. Change that to skip writing tf
> if the FieldInfo says so.
>
> * Fix SegmentTermDocs to not read tf if FieldInfo says so.
>
> * Fix SegmentMerger.appendPostings to not merge/write tf if
> FieldInfo says so. Likewise assert here that the "storeTermFreq"
> does not change in the merged segments.
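
The write and read steps in the list above can be sketched outside Lucene as follows. In the existing freq file format each posting is a VInt holding the doc delta shifted left by one bit, with the low bit set when the frequency is 1 and a second VInt carrying the frequency otherwise. The sketch assumes that a field without tf would simply store the plain doc delta. PostingsSketch, encode, decode, and the storeTermFreq parameter are hypothetical names, and reporting a frequency of 1 when none is stored is an arbitrary placeholder.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;

// Toy postings codec; a sketch of the idea, not Lucene's writer or reader.
class PostingsSketch {
    // VInt: 7 bits per byte, low-order bits first, high bit means "more follows".
    static void writeVInt(ByteArrayOutputStream out, int i) {
        while ((i & ~0x7F) != 0) {
            out.write((i & 0x7F) | 0x80);
            i >>>= 7;
        }
        out.write(i);
    }

    static int readVInt(ByteArrayInputStream in) {
        int b = in.read();
        int i = b & 0x7F;
        for (int shift = 7; (b & 0x80) != 0; shift += 7) {
            b = in.read();
            i |= (b & 0x7F) << shift;
        }
        return i;
    }

    // docs must be ascending; freqs is ignored when storeTermFreq is false.
    static byte[] encode(int[] docs, int[] freqs, boolean storeTermFreq) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int lastDoc = 0;
        for (int k = 0; k < docs.length; k++) {
            int delta = docs[k] - lastDoc;
            lastDoc = docs[k];
            if (!storeTermFreq) {
                writeVInt(out, delta);             // doc delta only
            } else if (freqs[k] == 1) {
                writeVInt(out, (delta << 1) | 1);  // low bit set: freq is 1
            } else {
                writeVInt(out, delta << 1);        // low bit clear: freq follows
                writeVInt(out, freqs[k]);
            }
        }
        return out.toByteArray();
    }

    // Returns { docs, freqs }; freqs are 1 (placeholder) when tf was not stored.
    static int[][] decode(byte[] data, int count, boolean storeTermFreq) {
        ByteArrayInputStream in = new ByteArrayInputStream(data);
        int[] docs = new int[count];
        int[] freqs = new int[count];
        int doc = 0;
        for (int k = 0; k < count; k++) {
            int code = readVInt(in);
            if (!storeTermFreq) {
                doc += code;
                freqs[k] = 1;
            } else {
                doc += code >>> 1;
                freqs[k] = (code & 1) != 0 ? 1 : readVInt(in);
            }
            docs[k] = doc;
        }
        return new int[][] { docs, freqs };
    }
}
```

The reader has to be told which layout a field uses before it decodes anything, which is why the bit needs to live in the field infos and stay fixed for the field.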
>
> It's also possible to fix FreqProxTermsWriterPerField to not even
> compute & store the tf in its RawPostingList, per term. This is an
> optimization (saves RAM & CPU) that you can do after first getting the
> above working...
>
> On the search side, you'll need to fix scoring to be OK with tf=0.
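
Why tf=0 matters on the search side can be shown with a small sketch. DefaultSimilarity computes the tf factor as the square root of the within-document frequency, so a frequency of 0 multiplies the score down to 0. One possible workaround, shown here purely as an assumption, is to substitute a constant frequency of 1 for fields that do not store tf. TfScoringSketch and tfFactor are hypothetical names.

```java
// Toy tf factor; a sketch of the idea, not Lucene's scorer.
class TfScoringSketch {
    // Same shape as DefaultSimilarity.tf: square root of the frequency.
    static float tf(float freq) {
        return (float) Math.sqrt(freq);
    }

    // Hypothetical helper: fields without stored tf score as if freq were 1.
    static float tfFactor(int freqFromIndex, boolean tfStored) {
        return tf(tfStored ? freqFromIndex : 1);
    }
}
```

With this choice a matching document in a tf-less field gets a neutral factor of 1 instead of dropping out of the ranking.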
>
> I think this would be a useful addition to Lucene (it comes up every
> so often), even before we fully work out flexible indexing.
>
> Mike
>
> eks dev wrote:
>
> > hi all,
> > is there any solution to have pure postings lists without
> > interleaved tf ... this eats a lot of CPU for VInt decoding on dense
> > terms (also doubles IO...) in our case. It can be an untested patch,
> > tips on how to do it, or whatever... I know about flexible indexing, but
> > cannot wait (I guess it will take some time?).
> >
> > Does it make sense to start working on it? Can this somehow be
> > incorporated into Flexible Indexing later... I would hate to do it and then
> > throw it away when Mike does his magic with Flexible Indexing.
> >
> > Simply we are sure this could help performance a lot (some dense
> > fields have always constant tf, no need to read them from index).
> > Simply asking for help if somebody accidentally happens to have some
> > Quick 'n Dirty solution/idea.
> >
> > thanks, eks
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: [EMAIL PROTECTED]
> > For additional commands, e-mail: [EMAIL PROTECTED]