Michael wrote: "so I wouldn't worry too much about making a special build for yourself with a few changes."
We did this a few versions back to fix a couple of bugs and add some functionality around sorting. It was absolutely fine, but it can be a bit of a pain for maintainability, depending on how much time you have to spend on Lucene and how much that area of the Lucene code base changes in subsequent releases.

Yours,
Moray
-------------------------------------
Moray McConnachie
Director of IT    +44 1865 261 600
Oxford Analytica  http://www.oxan.com

-----Original Message-----
From: Michael Garski [mailto:mgar...@myspace-inc.com]
Sent: 10 December 2009 18:10
To: lucene-net-user@incubator.apache.org
Subject: RE: idf on per-field basis

Artem,

I've made modifications to the internals of Lucene.Net to change scoring, specifically to be able to manually specify the length norm for a field, which allowed me to retain positional information while injecting multi-term synonyms, so I wouldn't worry too much about making a special build for yourself with a few changes.

Would using a QueryFilter in conjunction with a query work? The QueryFilter would be used on fields where scoring information is not necessary, while the other fields would be queried with the specific query you need.

Michael

-----Original Message-----
From: Artem Chereisky [mailto:a.cherei...@gmail.com]
Sent: Thursday, December 10, 2009 1:40 AM
To: lucene-net-user@incubator.apache.org
Cc: <lucene-net-user@incubator.apache.org>
Subject: Re: idf on per-field basis

Michael, thank you.

Query filter only solves half of my problem. Unfortunately I do need to have a proper score for some fields.

I ended up extending the Term class (I removed the sealed modifier, which is a bad thing). The new myTerm class has one boolean member, omitIdf. Then, when I compile my queries, I use myTerm with omitIdf set to true for some fields. Then I extended the Similarity class: I cast the Term passed into the Idf method to myTerm and only calculate Idf if omitIdf is false. Seems to work.
I don't like the solution but that's the best I could do today. Any thoughts?

Regards,
Artem

On 10/12/2009, at 15:51, Michael Garski <mgar...@myspace-inc.com> wrote:

> Artem,
>
> Do you need any scoring information at all on that field? How about
> using a QueryFilter for those fields?
>
> Michael
>
>
> -----Original Message-----
> From: Artem Chereisky [mailto:a.cherei...@gmail.com]
> Sent: Wed 12/9/2009 4:53 PM
> To: lucene-net-user@incubator.apache.org;
> lucene-net-develo...@incubator.apache.org
> Subject: idf on per-field basis
>
> Hi,
>
> I came across a situation where my scores are adversely affected by the
> IDF component. Let me explain.
>
> My index documents contain a number of fields. For some, TF and IDF
> are important and need to be taken into account; for others, neither TF
> nor IDF should apply. I dealt with TF by omitting norms during indexing,
> but I can't find a way to calculate IDF for certain fields only.
>
> The formula for IDF is defined in Similarity. I have my own
> implementation of Similarity where I can set it to 1 or use the
> default implementation.
> mySearcher.SetSimilarity is where I tell Lucene which similarity
> instance to use, but that's global, so it applies to all fields in the
> index.
>
> So, here's my question. Is there a way to calculate IDF on a per-field
> basis?
>
> Regards,
> Art
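
The per-field behaviour discussed in this thread can be sketched outside Lucene.Net as follows. This is a minimal illustration in Python, not Lucene.Net code: the class name PerFieldIdfSimilarity, the idf(field, doc_freq, num_docs) signature, and the field names "body" and "category" are all assumptions made for the example. It keys the decision on the field name rather than on a Term subclass, and it assumes the classic default formula 1 + ln(numDocs / (docFreq + 1)) for fields that keep IDF.

```python
import math


class PerFieldIdfSimilarity:
    """Toy stand-in for a custom Similarity; hypothetical, not the Lucene.Net API."""

    def __init__(self, omit_idf_fields):
        # Fields listed here get a constant IDF of 1, so IDF has no effect on them.
        self.omit_idf_fields = frozenset(omit_idf_fields)

    def idf(self, field, doc_freq, num_docs):
        if field in self.omit_idf_fields:
            return 1.0
        # Assumed default: 1 + ln(numDocs / (docFreq + 1)).
        return math.log(num_docs / (doc_freq + 1)) + 1.0


# Hypothetical field names, chosen only for the example.
similarity = PerFieldIdfSimilarity(omit_idf_fields={"category"})
idf_body = similarity.idf("body", doc_freq=9, num_docs=1000)
idf_category = similarity.idf("category", doc_freq=9, num_docs=1000)
```

With these inputs the "body" term keeps its rarity boost while the "category" term contributes a neutral factor of 1. Whether the same switch can be made on the field name alone in a given Lucene.Net version, without subclassing Term, depends on what that version passes to the Idf method.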