Re: Disable IDF scoring on certain fields
I believe I have applied the patch correctly. However, I cannot seem to figure out where the similarity class I create should reside. Any tips on that?

Thanks,

Brian Lamb

On Tue, May 17, 2011 at 4:00 PM, Brian Lamb wrote:
> Thank you Robert for pointing this out. This is not being used for
> autocomplete. I already have another core set up for that :-)
>
> The idea is like I outlined above. I just want a multivalued field that
> treats every term in the field the same so that the only way documents
> separate themselves is by an unrelated boost and/or matching on multiple
> terms in that field.
Re: Disable IDF scoring on certain fields
Thank you Robert for pointing this out. This is not being used for autocomplete. I already have another core set up for that :-)

The idea is like I outlined above. I just want a multivalued field that treats every term in the field the same so that the only way documents separate themselves is by an unrelated boost and/or matching on multiple terms in that field.

On Tue, May 17, 2011 at 3:55 PM, Markus Jelsma wrote:
> Well, if you're experimental you can try trunk; as Robert points out, it
> has been fixed there. If not, I guess you're stuck with creating another
> core.
>
> Is this fieldType specifically used for auto-completion? If so, another
> core, preferably on another machine, is in my opinion the way to go.
> Auto-completion is tough in terms of performance.
>
> Thanks Robert for pointing to the Jira ticket.
>
> Cheers
Re: Disable IDF scoring on certain fields
Well, if you're experimental you can try trunk; as Robert points out, it has been fixed there. If not, I guess you're stuck with creating another core.

Is this fieldType specifically used for auto-completion? If so, another core, preferably on another machine, is in my opinion the way to go. Auto-completion is tough in terms of performance.

Thanks Robert for pointing to the Jira ticket.

Cheers

> Hi Markus,
>
> I was just looking at overriding DefaultSimilarity so your email was well
> timed. The problem I have with it is, as you mentioned, that it does not
> seem possible to do it on a field-by-field basis. Has anyone had any luck
> with doing some of the similarity functions on a field-by-field basis? I
> need to do more than one of them and, from what I can find, it seems that
> only computeNorm accounts for the name of the field.
>
> Thanks,
>
> Brian Lamb
Re: Disable IDF scoring on certain fields
On Tue, May 17, 2011 at 3:34 PM, Markus Jelsma wrote:
> If you still want IDF for other fields then I think you have a problem
> because Solr doesn't yet support per-field similarity.

It does in trunk: https://issues.apache.org/jira/browse/SOLR-2338
Re: Disable IDF scoring on certain fields
Hi Markus,

I was just looking at overriding DefaultSimilarity so your email was well timed. The problem I have with it is, as you mentioned, that it does not seem possible to do it on a field-by-field basis. Has anyone had any luck with doing some of the similarity functions on a field-by-field basis? I need to do more than one of them and, from what I can find, it seems that only computeNorm accounts for the name of the field.

Thanks,

Brian Lamb

On Tue, May 17, 2011 at 3:34 PM, Markus Jelsma wrote:
> Hi,
>
> Although you can configure per-field TF (by omitTermFreqAndPositions), you
> can't do this for IDF. If your index is only used for this specific purpose
> (seems like an auto-complete index) then you can override DefaultSimilarity
> and return a static value for IDF. If you still want IDF for other fields
> then I think you have a problem because Solr doesn't yet support per-field
> similarity.
>
> http://svn.apache.org/viewvc/lucene/dev/branches/branch_3x/lucene/src/java/org/apache/lucene/search/DefaultSimilarity.java?view=markup
>
> Cheers,
Re: Disable IDF scoring on certain fields
Hi,

Although you can configure per-field TF (by omitTermFreqAndPositions), you can't do this for IDF. If your index is only used for this specific purpose (seems like an auto-complete index) then you can override DefaultSimilarity and return a static value for IDF. If you still want IDF for other fields then I think you have a problem because Solr doesn't yet support per-field similarity.

http://svn.apache.org/viewvc/lucene/dev/branches/branch_3x/lucene/src/java/org/apache/lucene/search/DefaultSimilarity.java?view=markup

Cheers,

> Hi all,
>
> I have a field defined in my schema.xml file as
>
> [...] positionIncrementGap="1000">
> [...] maxGramSize="25" side="front" />
> [...] indexed="true" stored="true" required="false" omitNorms="true" />
>
> I would like to disable IDF scoring on this field. I am not interested in
> how rare the term is, I only care if the term is present or not. The idea
> is that if a user does a search for "myfield:dog OR myfield:pony", any
> document containing dog or pony would be scored identically. In the case
> that both showed up, that record would be moved to the top but all the
> records where they both showed up would have the same score.
>
> So long story short, how can I disable the idf score for this particular
> field?
>
> Thanks,
>
> Brian Lamb
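The "override and return a static value for IDF" suggestion can be illustrated roughly as follows. This is only a sketch under stated assumptions: `StandInSimilarity` is a hypothetical stand-in written for this example so that it compiles without Lucene on the classpath, and `ConstantIdfSimilarity` and `ConstantIdfDemo` are likewise made-up names. To my understanding the branch_3x class linked above exposes an `idf(int docFreq, int numDocs)` method with the same shape, so a real version would extend `org.apache.lucene.search.DefaultSimilarity` instead of the stand-in.

```java
// Hypothetical stand-in for Lucene's DefaultSimilarity; only the idf hook is modelled.
class StandInSimilarity {
    // Classic Lucene-style idf: 1 + ln(numDocs / (docFreq + 1)).
    float idf(int docFreq, int numDocs) {
        return (float) (Math.log(numDocs / (double) (docFreq + 1)) + 1.0);
    }
}

// Returns the same idf for every term, so term rarity no longer affects the score.
class ConstantIdfSimilarity extends StandInSimilarity {
    @Override
    float idf(int docFreq, int numDocs) {
        return 1.0f;
    }
}

class ConstantIdfDemo {
    public static void main(String[] args) {
        StandInSimilarity classic = new StandInSimilarity();
        StandInSimilarity flat = new ConstantIdfSimilarity();
        int numDocs = 1000;
        // A rare term (2 docs) versus a common term (500 docs).
        System.out.println("classic: rare=" + classic.idf(2, numDocs)
                + " common=" + classic.idf(500, numDocs));
        System.out.println("flat:    rare=" + flat.idf(2, numDocs)
                + " common=" + flat.idf(500, numDocs));
    }
}
```

Note that an override like this applies to every field searched with that similarity, which is exactly the limitation discussed in this thread.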
Disable IDF scoring on certain fields
Hi all,

I have a field defined in my schema.xml file as

[...] positionIncrementGap="1000">
[...] maxGramSize="25" side="front" />
[...] indexed="true" stored="true" required="false" omitNorms="true" />

I would like to disable IDF scoring on this field. I am not interested in how rare the term is, I only care if the term is present or not. The idea is that if a user does a search for "myfield:dog OR myfield:pony", any document containing dog or pony would be scored identically. In the case that both showed up, that record would be moved to the top but all the records where they both showed up would have the same score.

So long story short, how can I disable the idf score for this particular field?

Thanks,

Brian Lamb
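The ranking asked for here amounts to scoring each document by how many distinct query terms it contains, ignoring how rare or how frequent each term is. The sketch below shows that target ordering in plain Java, independent of Solr and Lucene; `MatchCountRanking` and `score` are hypothetical names chosen for illustration, not part of any library.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class MatchCountRanking {
    // Score = number of distinct query terms present among the field's terms.
    static int score(Set<String> fieldTerms, List<String> queryTerms) {
        int matches = 0;
        for (String term : new LinkedHashSet<>(queryTerms)) {
            if (fieldTerms.contains(term)) {
                matches++;
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<String> query = Arrays.asList("dog", "pony");
        Set<String> onlyDog = new HashSet<>(Arrays.asList("dog", "cat"));
        Set<String> onlyPony = new HashSet<>(Arrays.asList("pony"));
        Set<String> both = new HashSet<>(Arrays.asList("dog", "pony", "cat"));

        // Documents matching one term tie with each other; documents matching both rank above them.
        System.out.println("onlyDog=" + score(onlyDog, query));
        System.out.println("onlyPony=" + score(onlyPony, query));
        System.out.println("both=" + score(both, query));
    }
}
```

In Lucene's classic scoring, term frequency also contributes to the score, so a flat idf on its own may not be enough to make all single-term matches tie.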