am I boring :)

Would it be OK to assume tf == 1 always if we use omitTf? In that case docDelta 
remains odd, and the current index format interprets this as tf == 1... If all terms 
have tf == 1, the relative score is factored out, so it makes no difference.
  

In that case, there is no need to change anything on the reader side!
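
To make the idea concrete, here is a rough sketch of the doc/freq encoding as I understand it from the index file formats doc: a posting is written as a VInt code, and when tf == 1 the code is (docDelta << 1) | 1 with no separate tf value following it. This is not the actual Lucene code; the class DocDeltaSketch and its methods are made up for illustration. If the writer always emits the odd code for omitTf fields, a reader using the existing decoding sees tf == 1 and never reads a second VInt.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;

// Hypothetical illustration only; these names do not exist in Lucene.
class DocDeltaSketch {

    // Write an int as a VInt: 7 data bits per byte, high bit set when more bytes follow.
    public static void writeVInt(ByteArrayOutputStream out, int i) {
        while ((i & ~0x7F) != 0) {
            out.write((i & 0x7F) | 0x80);
            i >>>= 7;
        }
        out.write(i);
    }

    // Read a VInt written by writeVInt.
    public static int readVInt(ByteArrayInputStream in) {
        int b = in.read();
        int i = b & 0x7F;
        for (int shift = 7; (b & 0x80) != 0; shift += 7) {
            b = in.read();
            i |= (b & 0x7F) << shift;
        }
        return i;
    }

    // Encode one posting: an odd code means tf == 1, an even code is followed by tf.
    public static void writePosting(ByteArrayOutputStream out, int docDelta, int tf) {
        if (tf == 1) {
            writeVInt(out, (docDelta << 1) | 1);
        } else {
            writeVInt(out, docDelta << 1);
            writeVInt(out, tf);
        }
    }

    // Decode one posting with the same rule; returns {docDelta, tf}.
    public static int[] readPosting(ByteArrayInputStream in) {
        int code = readVInt(in);
        int docDelta = code >>> 1;
        int tf = ((code & 1) != 0) ? 1 : readVInt(in);
        return new int[] {docDelta, tf};
    }

    public static void main(String[] args) {
        // Assumed omitTf behaviour: the writer always passes tf == 1.
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int[] deltas = {7, 3, 200};
        for (int d : deltas) {
            writePosting(out, d, 1);
        }
        ByteArrayInputStream in = new ByteArrayInputStream(out.toByteArray());
        for (int k = 0; k < deltas.length; k++) {
            int[] p = readPosting(in);
            System.out.println("docDelta=" + p[0] + " tf=" + p[1]);
        }
    }
}
```

The reader side in this sketch is unchanged between the two cases, which is the whole point: only the writer has to be told to stop emitting real tf values.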


----- Original Message ----
> From: eks dev <[EMAIL PROTECTED]>
> To: java-dev@lucene.apache.org
> Sent: Friday, 18 July, 2008 9:48:04 PM
> Subject: Re: Index without tf, anyone?
> 
> also, another one:
> 
> What should happen with payloads and the omitTf option in the case
> storePayloads==true && omitTf==true?
> Should we say:
> 1. ignore omitTf and go on with payloads
> or
> 2. disable payloads and omit tf
> 
> The other combinations are clear.
> 
> 
> 
> ----- Original Message ----
> > From: eks dev 
> > To: java-dev@lucene.apache.org
> > Sent: Friday, 18 July, 2008 9:20:09 PM
> > Subject: Re: Index without tf, anyone?
> > 
> > Mike, 
> > I have started playing with this, holy cow... it is a lot of code.
> > 
> > Question 
> > 
> > SegmentMerger.mergeFields()... there is a big block
> > 
> > else {
> >   addIndexed(reader, fieldInfos,
> >       reader.getFieldNames(IndexReader.FieldOption.TERMVECTOR_WITH_POSITION_OFFSET),
> >       true, true, true, false);
> >   addIndexed(reader, fieldInfos,
> >       reader.getFieldNames(IndexReader.FieldOption.TERMVECTOR_WITH_POSITION),
> >       true, true, false, false);
> >   addIndexed(reader, fieldInfos,
> >       reader.getFieldNames(IndexReader.FieldOption.TERMVECTOR_WITH_OFFSET),
> >       true, false, true, false);
> >   addIndexed(reader, fieldInfos,
> >       reader.getFieldNames(IndexReader.FieldOption.TERMVECTOR),
> >       true, false, false, false);
> >   addIndexed(reader, fieldInfos,
> >       reader.getFieldNames(IndexReader.FieldOption.STORES_PAYLOADS),
> >       false, false, false, true);
> >   addIndexed(reader, fieldInfos,
> >       reader.getFieldNames(IndexReader.FieldOption.INDEXED),
> >       false, false, false, false);
> >   fieldInfos.add(reader.getFieldNames(IndexReader.FieldOption.UNINDEXED), false);
> > }
> > 
> > 
> > I simply do not understand it. I have changed the addIndexed(...) signature to
> > include omitTf, but I am not sure what needs to be done here.
> > 
> > 
> > 
> > 
> > 
> > ----- Original Message ----
> > > From: Michael McCandless 
> > > To: java-dev@lucene.apache.org
> > > Sent: Friday, 18 July, 2008 11:48:20 AM
> > > Subject: Re: Index without tf, anyone?
> > > 
> > > I just committed LUCENE-1301, which is a first step (top down) towards
> > > flexible indexing.  I hope I didn't break anything....
> > > 
> > > While flexible indexing should make this simpler, it's not too bad to
> > > modify Lucene to do this today, if you want.  I think this is what
> > > you'll need to do (but I haven't tested!):
> > > 
> > >    * Add something to Fieldable/AbstractField/Field that "knows"
> > >      whether a field should store the tf.  Also add this to
> > >      FieldInfo.java, and make sure that bit is saved to the fnm file.
> > > 
> > >    * In the new oal.index.DocFieldProcessorPerThread, in the
> > >      processDocument method, fix the FieldInfos.add call to also pass
> > >      in your new "storeTermFreq" bit.  Probably, assert that this
> > >      cannot change -- ie a field must be created with
> > >      storeTermFreq=true or false and must never change.
> > > 
> > >    * The new oal.index.FreqProxTermsWriter, in appendPostings, has the
> > >      code that creates a new segment.  Change that to skip writing tf
> > >      if the FieldInfo says so.
> > > 
> > >    * Fix SegmentTermDocs to not read tf if FieldInfo says so.
> > > 
> > >    * Fix SegmentMerger.appendPostings to not merge/write tf if
> > >      FieldInfo says so.  Likewise assert here that the "storeTermFreq"
> > >      does not change in the merged segments.
> > > 
> > > It's also possible to fix FreqProxTermsWriterPerField to not even
> > > compute & store the tf in its RawPostingList, per term.  This is an
> > > optimization (saves RAM & CPU) that you can do after first getting the
> > > above working...
> > > 
> > > On the search side, you'll need to fix scoring to be OK with tf=0.
> > > 
> > > I think this would be a useful addition to Lucene (it comes up every
> > > so often), even before we fully work out flexible indexing.
> > > 
> > > Mike
> > > 
> > > eks dev wrote:
> > > 
> > > > hi all,
> > > > > is there any solution to have pure postings lists without
> > > > > interleaved tf? It eats a lot of CPU for VInt decoding on dense
> > > > > terms (and also doubles IO) in our case. It can be an untested patch,
> > > > > tips on how to do it, or whatever... I know about flexible indexing, but
> > > > > cannot wait (I guess it will take some time?).
> > > > >
> > > > > Does it make sense to start working on it? Can this somehow be
> > > > > incorporated into Flexible Indexing later? I would hate to do it and then
> > > > > throw it away when Mike does his magic with Flexible Indexing.
> > > > >
> > > > > Simply put, we are sure this could help performance a lot (some dense
> > > > > fields always have constant tf, so there is no need to read it from the index).
> > > > > I am just asking for help in case somebody happens to have a
> > > > > Quick 'n Dirty solution/idea.
> > > >
> > > > thanks, eks
> > > >
> > > >
> > > >
> > > > ---------------------------------------------------------------------
> > > > To unsubscribe, e-mail: [EMAIL PROTECTED]
> > > > For additional commands, e-mail: [EMAIL PROTECTED]
> > > >
> > > 
> > > 
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: [EMAIL PROTECTED]
> > > For additional commands, e-mail: [EMAIL PROTECTED]
> > 
> > 
> > 
> 
> 
> 


