Thank you for all the help; it is greatly appreciated. I have looked at the related issues and I see a lot of patches attached to the JIRA issues mentioned. I am confused about which patch to use (please excuse my ignorance). Are the patches production ready? I would greatly appreciate it if you could point me to the correct patch, or do I have to apply all of the patches to make it work? Also, can I apply the patch to Solr 1.3?

Thanks,
Bharat Jain

On Sat, Jul 31, 2010 at 2:16 AM, Otis Gospodnetic <otis_gospodne...@yahoo.com> wrote:

> May I suggest looking at some of the related issues, say SOLR-1682
>
> This issue is related to:
>   SOLR-1682 Implement CollapseComponent
>   SOLR-1311 pseudo-field-collapsing
>   LUCENE-1421 Ability to group search results by field
>   SOLR-1773 Field Collapsing (lightweight version)
>   SOLR-237 Field collapsing
>
> Otis
> ----
> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> Lucene ecosystem search :: http://search-lucene.com/
>
> ----- Original Message ----
> > From: Bharat Jain <bharat.j...@gmail.com>
> > To: solr-user@lucene.apache.org
> > Sent: Fri, July 30, 2010 10:40:19 AM
> > Subject: Re: question about relevance
> >
> > Hi,
> > Thanks a lot for the info and your time. I think field collapse will work
> > for us. I looked at https://issues.apache.org/jira/browse/SOLR-236 but
> > which file should I use for the patch? We use solr-1.3.
> >
> > Thanks
> > Bharat Jain
> >
> > On Fri, Jul 30, 2010 at 12:53 AM, Chris Hostetter
> > <hossman_luc...@fucit.org> wrote:
> >
> > > : 1. There are user records of type A, B, C etc. (userId field in index is
> > > : common to all records)
> > > : 2. A user can have any number of A, B, C etc (e.g. think of A being a
> > > : language then user can know many languages like french, english, german
> > > : etc)
> > > : 3. Records are currently stored as a document in index.
> > > : 4. A given query can match multiple records for the user
> > > : 5. If for a user more records are matched (e.g. if he knows both french and
> > > : german) then he is more relevant and should come top in UI. This is the
> > > : reason I wanted to add lucene scores assuming the greater score means more
> > > : relevance.
> > >
> > > if your goal is to get back "users" from each search, then you should
> > > probably change your indexing strategy so that each "user" has a single
> > > document -- fields like "language" can be multivalued, etc...
> > >
> > > then a search for "language:en language:fr" will return users who speak
> > > english or french, and the ones that speak both will score higher.
> > >
> > > if you really can't change the index structure, then essentially what you
> > > are looking for is a "field collapsing" solution on the userId field,
> > > where you want each collapsed group to get a cumulative score. i don't
> > > know if the existing field collapsing patches support this -- if you are
> > > already willing/capable to do it in the client then that may be the
> > > simplest thing to support moving forward.
> > >
> > > Adding the scores is certainly one metric you could use -- it's generally
> > > suspicious to try and imply too much meaning to scores in lucene/solr but
> > > that's because people typically try to imply broader absolute meaning. in
> > > the case of a single query the scores are relative to each other, and adding
> > > up all the scores for a given userId is approximately what would happen in
> > > my example above -- except that there is also a "coord" factor that would
> > > penalize documents that only match one clause ... it's complicated, but
> > > as an approximation adding the scores might give you what you are looking
> > > for -- only you can know for sure based on your specific data.
> > >
> > > -Hoss
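
For reference, the client-side approach Hoss mentions (adding up the scores of all records matched for a given userId within a single query) could be sketched roughly as below. This is only an illustration under assumptions: it presumes the client has already fetched every matching record along with its userId and score, the function name and sample data are hypothetical, and it ignores the "coord" factor, so the ranking is an approximation at best.

```python
from collections import defaultdict


def rank_users_by_summed_score(hits):
    """Sum per-record scores by user and rank users by the total.

    hits: iterable of (user_id, score) pairs, all taken from the
    results of ONE query (scores from different queries are not comparable).
    """
    totals = defaultdict(float)
    for user_id, score in hits:
        totals[user_id] += score
    # Highest cumulative score first; ties broken by user_id so the order is stable.
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical data: user u1 matched two records (e.g. french and german).
hits = [("u1", 1.5), ("u2", 2.0), ("u1", 1.0), ("u3", 0.5)]
ranked = rank_users_by_summed_score(hits)
for user_id, total in ranked:
    print(user_id, total)
```

Note that u1 ends up ahead of u2 even though u2 has the single highest-scoring record, which is the behaviour described in point 5 of the original question.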