Also, another question:

What should happen with the payloads and omitTf options in the case

storePayloads==true && omitTf==true

Should we say:
1. ignore omitTf and go on with payloads
or
2. disable payloads and omit tf

The other combinations are clear.
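
To make the two options concrete, here is a minimal sketch. None of it is Lucene API: the class FieldFlags, the enum ConflictPolicy and the method resolve are hypothetical names made up for illustration. It only shows the conflicting combination being resolved in one place, so that indexing and merging would see the same pair of flags.

```java
// Hypothetical sketch: none of these names exist in Lucene.
final class FieldFlags {
    // Which option wins when storePayloads and omitTf are both requested.
    enum ConflictPolicy { PAYLOADS_WIN, OMIT_TF_WINS }

    final boolean storePayloads;
    final boolean omitTf;

    private FieldFlags(boolean storePayloads, boolean omitTf) {
        this.storePayloads = storePayloads;
        this.omitTf = omitTf;
    }

    // Returns the effective flags; only the (true, true) combination is altered.
    static FieldFlags resolve(boolean storePayloads, boolean omitTf, ConflictPolicy policy) {
        if (storePayloads && omitTf) {
            return policy == ConflictPolicy.PAYLOADS_WIN
                    ? new FieldFlags(true, false)   // option 1: ignore omitTf
                    : new FieldFlags(false, true);  // option 2: disable payloads
        }
        return new FieldFlags(storePayloads, omitTf);
    }

    public static void main(String[] args) {
        for (ConflictPolicy policy : ConflictPolicy.values()) {
            FieldFlags f = resolve(true, true, policy);
            System.out.println(policy + ": storePayloads=" + f.storePayloads
                    + ", omitTf=" + f.omitTf);
        }
    }
}
```

Whichever option is picked, the three non-conflicting combinations pass through unchanged.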



----- Original Message ----
> From: eks dev <[EMAIL PROTECTED]>
> To: java-dev@lucene.apache.org
> Sent: Friday, 18 July, 2008 9:20:09 PM
> Subject: Re: Index without tf, anyone?
> 
> Mike, 
> I have started playing with this, holy cow... it is a lot of code.
> 
> Question:
> 
> In SegmentMerger.mergeFields() there is a big block:
> 
> else {
>   addIndexed(reader, fieldInfos,
>       reader.getFieldNames(IndexReader.FieldOption.TERMVECTOR_WITH_POSITION_OFFSET),
>       true, true, true, false);
>   addIndexed(reader, fieldInfos,
>       reader.getFieldNames(IndexReader.FieldOption.TERMVECTOR_WITH_POSITION),
>       true, true, false, false);
>   addIndexed(reader, fieldInfos,
>       reader.getFieldNames(IndexReader.FieldOption.TERMVECTOR_WITH_OFFSET),
>       true, false, true, false);
>   addIndexed(reader, fieldInfos,
>       reader.getFieldNames(IndexReader.FieldOption.TERMVECTOR),
>       true, false, false, false);
>   addIndexed(reader, fieldInfos,
>       reader.getFieldNames(IndexReader.FieldOption.STORES_PAYLOADS),
>       false, false, false, true);
>   addIndexed(reader, fieldInfos,
>       reader.getFieldNames(IndexReader.FieldOption.INDEXED),
>       false, false, false, false);
>   fieldInfos.add(reader.getFieldNames(IndexReader.FieldOption.UNINDEXED), false);
> }
> 
> 
> I simply do not understand it. I have changed the addIndexed(...) signature to
> include omitTf, but I am not sure what needs to be done here.
> 
> 
> 
> 
> 
> ----- Original Message ----
> > From: Michael McCandless 
> > To: java-dev@lucene.apache.org
> > Sent: Friday, 18 July, 2008 11:48:20 AM
> > Subject: Re: Index without tf, anyone?
> > 
> > I just committed LUCENE-1301, which is a first step (top down) towards
> > flexible indexing.  I hope I didn't break anything....
> > 
> > While flexible indexing should make this simpler, it's not too bad to
> > modify Lucene to do this today, if you want.  I think this is what
> > you'll need to do (but I haven't tested!):
> > 
> >    * Add something to Fieldable/AbstractField/Field that "knows"
> >      whether a field should store the tf.  Also add this to
> >      FieldInfo.java, and make sure that bit is saved to the fnm file.
> > 
> >    * In the new oal.index.DocFieldProcessorPerThread, in the
> >      processDocument method, fix the FieldInfos.add call to also pass
> >      in your new "storeTermFreq" bit.  Probably, assert that this
> >      cannot change -- ie a field must be created with
> >      storeTermFreq=true or false and must never change.
> > 
> >    * The new oal.index.FreqProxTermsWriter, in appendPostings, has the
> >      code that creates a new segment.  Change that to skip writing tf
> >      if the FieldInfo says so.
> > 
> >    * Fix SegmentTermDocs to not read tf if FieldInfo says so.
> > 
> >    * Fix SegmentMerger.appendPostings to not merge/write tf if
> >      FieldInfo says so.  Likewise assert here that the "storeTermFreq"
> >      does not change in the merged segments.
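
The write-side and read-side bullets above come down to a change in how each posting is encoded. The following is a standalone sketch, not Lucene code: the class PostingsSketch and its methods are hypothetical. It assumes the encoding described in the Lucene index file format documentation for the .frq file (VInt doc deltas, with the low bit of the shifted delta flagging tf == 1) and compares it with a plain doc-delta list when tf is omitted.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical sketch: not part of Lucene.
final class PostingsSketch {
    // Variable-length int: 7 data bits per byte, high bit set means more bytes follow.
    static void writeVInt(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
    }

    // Encodes one term's postings. docs must be ascending; freqs[i] is the tf in docs[i].
    static byte[] encode(int[] docs, int[] freqs, boolean omitTf) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int lastDoc = 0;
        for (int i = 0; i < docs.length; i++) {
            int delta = docs[i] - lastDoc;
            lastDoc = docs[i];
            if (omitTf) {
                writeVInt(out, delta);             // doc delta only, no tf at all
            } else if (freqs[i] == 1) {
                writeVInt(out, (delta << 1) | 1);  // low bit set: tf is 1
            } else {
                writeVInt(out, delta << 1);        // low bit clear: tf follows
                writeVInt(out, freqs[i]);
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        int[] docs = {3, 7, 8, 200};
        int[] freqs = {2, 1, 5, 3};
        System.out.println("with tf:    " + encode(docs, freqs, false).length + " bytes");
        System.out.println("without tf: " + encode(docs, freqs, true).length + " bytes");
    }
}
```

A reader for the omitTf case would decode one VInt per document and never branch on the low bit, which is where the CPU saving on dense terms would come from.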
> > 
> > It's also possible to fix FreqProxTermsWriterPerField to not even
> > compute & store the tf in its RawPostingList, per term.  This is an
> > optimization (saves RAM & CPU) that you can do after first getting the
> > above working...
> > 
> > On the search side, you'll need to fix scoring to be OK with tf=0.
> > 
> > I think this would be a useful addition to Lucene (it comes up every
> > so often), even before we fully work out flexible indexing.
> > 
> > Mike
> > 
> > eks dev wrote:
> > 
> > > hi all,
> > > is there any solution to have pure postings lists without
> > > interleaved tf? In our case this eats a lot of CPU for VInt decoding on
> > > dense terms (it also doubles IO). It can be an untested patch,
> > > tips on how to do it, or whatever... I know about flexible indexing, but
> > > cannot wait (I guess it will take some time?).
> > >
> > > Does it make sense to start working on it? Can this somehow be
> > > incorporated into Flexible Indexing later? I would hate to do it and then
> > > throw it away when Mike does his magic with Flexible Indexing.
> > >
> > > We are simply sure this could help performance a lot (some dense
> > > fields always have a constant tf, so there is no need to read them from
> > > the index). I am simply asking for help in case somebody happens to
> > > have some Quick 'n Dirty solution/idea.
> > >
> > > thanks, eks
> > >
> > >
> > >
> > >      __________________________________________________________
> > > Not happy with your email address?.
> > > Get the one you really want - millions of new email addresses  
> > > available now at Yahoo! http://uk.docs.yahoo.com/ymail/new.html
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: [EMAIL PROTECTED]
> > > For additional commands, e-mail: [EMAIL PROTECTED]
> > >
> > 
> > 
> 
> 


