Mike,
I have started playing with this. Holy cow... it is a lot of code.

Question:

In SegmentMerger.mergeFields() there is a big block:

else {
        addIndexed(reader, fieldInfos,
            reader.getFieldNames(IndexReader.FieldOption.TERMVECTOR_WITH_POSITION_OFFSET),
            true, true, true, false);
        addIndexed(reader, fieldInfos,
            reader.getFieldNames(IndexReader.FieldOption.TERMVECTOR_WITH_POSITION),
            true, true, false, false);
        addIndexed(reader, fieldInfos,
            reader.getFieldNames(IndexReader.FieldOption.TERMVECTOR_WITH_OFFSET),
            true, false, true, false);
        addIndexed(reader, fieldInfos,
            reader.getFieldNames(IndexReader.FieldOption.TERMVECTOR),
            true, false, false, false);
        addIndexed(reader, fieldInfos,
            reader.getFieldNames(IndexReader.FieldOption.STORES_PAYLOADS),
            false, false, false, true);
        addIndexed(reader, fieldInfos,
            reader.getFieldNames(IndexReader.FieldOption.INDEXED),
            false, false, false, false);
        fieldInfos.add(reader.getFieldNames(IndexReader.FieldOption.UNINDEXED), false);
      }


I simply do not understand it. I have changed the addIndexed(...) signature to
include omitTf, but I am not sure what needs to be done here.
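
One possible reading of that block, as a minimal sketch with toy classes rather than Lucene's own: each addIndexed call seems to add the field names of one FieldOption with a fixed set of flags, and a field that is listed under several options ends up with the union of those flags. Assuming that OR-ing behaviour (not verified against FieldInfos), the existing calls could pass false for omitTf and one extra call could pass true for a hypothetical omit-tf field set. ToyFieldInfos, Flags, mergeReader and the omitTf flag are all made up for illustration; the cross-reader check follows Mike's suggestion that the bit must not change between segments.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model only: hypothetical stand-ins, not Lucene's FieldInfos/FieldInfo.
class MergeFieldFlagsSketch {

    /** Per-field flags, mirroring the addIndexed booleans plus a hypothetical omitTf. */
    static final class Flags {
        boolean indexed, termVector, positions, offsets, payloads, omitTf;
    }

    /** Accumulates flags by field name; each flag is OR-ed in, so a later add never clears a bit. */
    static final class ToyFieldInfos {
        final Map<String, Flags> byName = new LinkedHashMap<>();

        void add(Collection<String> names, boolean indexed, boolean termVector,
                 boolean positions, boolean offsets, boolean payloads, boolean omitTf) {
            for (String name : names) {
                Flags f = byName.computeIfAbsent(name, k -> new Flags());
                f.indexed |= indexed;
                f.termVector |= termVector;
                f.positions |= positions;
                f.offsets |= offsets;
                f.payloads |= payloads;
                f.omitTf |= omitTf;
            }
        }
    }

    /** Folds one reader's fields into the merged view; omitTf must agree for fields already seen. */
    static void mergeReader(ToyFieldInfos merged, ToyFieldInfos reader) {
        for (Map.Entry<String, Flags> e : reader.byName.entrySet()) {
            Flags seen = merged.byName.get(e.getKey());
            Flags in = e.getValue();
            if (seen != null && seen.omitTf != in.omitTf) {
                throw new IllegalStateException("omitTf changed for field " + e.getKey());
            }
            merged.add(Collections.singletonList(e.getKey()), in.indexed, in.termVector,
                       in.positions, in.offsets, in.payloads, in.omitTf);
        }
    }

    public static void main(String[] args) {
        // Reader A: "id" is listed under the hypothetical omit-tf set and under the indexed set.
        ToyFieldInfos readerA = new ToyFieldInfos();
        readerA.add(Arrays.asList("id"), true, false, false, false, false, true);
        readerA.add(Arrays.asList("id", "body"), true, false, false, false, false, false);

        ToyFieldInfos merged = new ToyFieldInfos();
        mergeReader(merged, readerA);
        System.out.println("id omitTf=" + merged.byName.get("id").omitTf
                + ", body omitTf=" + merged.byName.get("body").omitTf);
    }
}
```

The check is done per reader, after all of that reader's calls have been folded together. A "must not change" assert placed inside add itself would fire as soon as the same field shows up in a second call with omitTf=false.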





----- Original Message ----
> From: Michael McCandless <[EMAIL PROTECTED]>
> To: java-dev@lucene.apache.org
> Sent: Friday, 18 July, 2008 11:48:20 AM
> Subject: Re: Index without tf, anyone?
> 
> I just committed LUCENE-1301, which is a first step (top down) towards
> flexible indexing.  I hope I didn't break anything....
> 
> While flexible indexing should make this simpler, it's not too bad to
> modify Lucene to do this today, if you want.  I think this is what
> you'll need to do (but I haven't tested!):
> 
>    * Add something to Fieldable/AbstractField/Field that "knows"
>      whether a field should store the tf.  Also add this to
>      FieldInfo.java, and make sure that bit is saved to the fnm file.
> 
>    * In the new oal.index.DocFieldProcessorPerThread, in the
>      processDocument method, fix the FieldInfos.add call to also pass
>      in your new "storeTermFreq" bit.  Probably, assert that this
>      cannot change -- ie a field must be created with
>      storeTermFreq=true or false and must never change.
> 
>    * The new oal.index.FreqProxTermsWriter, in appendPostings, has the
>      code that creates a new segment.  Change that to skip writing tf
>      if the FieldInfo says so.
> 
>    * Fix SegmentTermDocs to not read tf if FieldInfo says so.
> 
>    * Fix SegmentMerger.appendPostings to not merge/write tf if
>      FieldInfo says so.  Likewise assert here that the "storeTermFreq"
>      does not change in the merged segments.
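
To make the write and read steps above concrete, here is a minimal sketch in toy code (not Lucene's FreqProxTermsWriter or SegmentTermDocs). Doc ids are delta-coded as VInts. When tf is kept, the delta is shifted left one bit, the low bit is set when tf is 1, and otherwise the tf follows as its own VInt, which is how I understand the current .frq layout. The omitTf switch and all names here are hypothetical.

```java
import java.io.ByteArrayOutputStream;

// Toy postings codec; names and the omitTf switch are illustrative assumptions.
class PostingsTfSketch {

    /** Writes a non-negative int as a VInt: 7 payload bits per byte, high bit means "more follows". */
    static void writeVInt(ByteArrayOutputStream out, int i) {
        while ((i & ~0x7F) != 0) {
            out.write((i & 0x7F) | 0x80);
            i >>>= 7;
        }
        out.write(i);
    }

    /** Reads one VInt starting at pos[0] and advances pos[0] past it. */
    static int readVInt(byte[] data, int[] pos) {
        byte b = data[pos[0]++];
        int value = b & 0x7F;
        for (int shift = 7; (b & 0x80) != 0; shift += 7) {
            b = data[pos[0]++];
            value |= (b & 0x7F) << shift;
        }
        return value;
    }

    /** Encodes ascending doc ids, interleaving tf unless omitTf is set. */
    static byte[] encode(int[] docs, int[] freqs, boolean omitTf) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int last = 0;
        for (int i = 0; i < docs.length; i++) {
            int delta = docs[i] - last;
            last = docs[i];
            if (omitTf) {
                writeVInt(out, delta);              // pure doc delta, no tf at all
            } else if (freqs[i] == 1) {
                writeVInt(out, (delta << 1) | 1);   // low bit set: tf is 1
            } else {
                writeVInt(out, delta << 1);         // low bit clear: tf follows
                writeVInt(out, freqs[i]);
            }
        }
        return out.toByteArray();
    }

    /** Decodes doc ids only; any stored tf is read and discarded. */
    static int[] decodeDocs(byte[] data, int count, boolean omitTf) {
        int[] docs = new int[count];
        int[] pos = {0};
        int last = 0;
        for (int i = 0; i < count; i++) {
            int code = readVInt(data, pos);
            if (omitTf) {
                last += code;
            } else {
                last += code >>> 1;
                if ((code & 1) == 0) {
                    readVInt(data, pos);            // skip the tf value
                }
            }
            docs[i] = last;
        }
        return docs;
    }

    public static void main(String[] args) {
        int[] docs = {3, 7, 8, 300};
        int[] freqs = {1, 4, 1, 2};
        System.out.println("bytes with tf:    " + encode(docs, freqs, false).length);
        System.out.println("bytes without tf: " + encode(docs, freqs, true).length);
    }
}
```

The reader has to know the same bit as the writer, since the two layouts cannot be told apart from the bytes alone. That is why the flag needs to live in FieldInfo and be persisted.
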
> 
> It's also possible to fix FreqProxTermsWriterPerField to not even
> compute & store the tf in its RawPostingList, per term.  This is an
> optimization (saves RAM & CPU) that you can do after first getting the
> above working...
> 
> On the search side, you'll need to fix scoring to be OK with tf=0.
> 
> I think this would be a useful addition to Lucene (it comes up every
> so often), even before we fully work out flexible indexing.
> 
> Mike
> 
> eks dev wrote:
> 
> > hi all,
> > is there any solution to have pure postings lists without
> > interleaved tf? It eats a lot of CPU for VInt decoding on dense
> > terms (and also doubles IO) in our case. It can be an untested patch,
> > tips on how to do it, or whatever. I know about flexible indexing, but
> > cannot wait (I guess it will take some time?).
> >
> > Does it make sense to start working on it? Can this somehow be
> > incorporated into Flexible Indexing later? I would hate to do it and
> > then throw it away when Mike does his magic with Flexible Indexing.
> >
> > We are simply sure this could help performance a lot (some dense
> > fields always have constant tf, so there is no need to read it from
> > the index). I am just asking for help in case somebody happens to
> > have a Quick 'n Dirty solution or idea.
> >
> > thanks, eks
> >
> >
> >
> >      __________________________________________________________
> > Not happy with your email address?.
> > Get the one you really want - millions of new email addresses  
> > available now at Yahoo! http://uk.docs.yahoo.com/ymail/new.html
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: [EMAIL PROTECTED]
> > For additional commands, e-mail: [EMAIL PROTECTED]
> >


