Re: question about relevance

Bharat Jain Thu, 05 Aug 2010 11:08:13 -0700

Thank you for all the help; it is greatly appreciated. I have looked at the
related issues and I see a lot of patches in the JIRA issues mentioned. I am
confused about which patch to use (please excuse my ignorance). Also, are the
patches production ready? I would greatly appreciate it if you could point me
to the correct patch, or do I have to apply all the patches to make it
work? Can I apply the patch to Solr 1.3?


Thanks
Bharat Jain


On Sat, Jul 31, 2010 at 2:16 AM, Otis Gospodnetic <
otis_gospodne...@yahoo.com> wrote:

> May I suggest looking at some of the related issues, say SOLR-1682
>
>
> This issue is related to:
>  SOLR-1682 Implement CollapseComponent
>  SOLR-1311 pseudo-field-collapsing
>  LUCENE-1421 Ability to group search results by field
>  SOLR-1773 Field Collapsing (lightweight version)
>  SOLR-237  Field collapsing
>
>
>
> Otis
> ----
> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> Lucene ecosystem search :: http://search-lucene.com/
>
>
>
> ----- Original Message ----
> > From: Bharat Jain <bharat.j...@gmail.com>
> > To: solr-user@lucene.apache.org
> > Sent: Fri, July 30, 2010 10:40:19 AM
> > Subject: Re: question about relevance
> >
> > Hi,
> >    Thanks a lot for the info and your time. I think field collapse will
> > work for us. I looked at https://issues.apache.org/jira/browse/SOLR-236
> > but which file should I use for the patch? We use solr-1.3.
> >
> > Thanks
> > Bharat Jain
> >
> >
> > On Fri, Jul 30, 2010 at 12:53 AM, Chris Hostetter
> > <hossman_luc...@fucit.org> wrote:
> >
> > >
> > > : 1. There are user records of type A, B, C etc. (userId field in index is
> > > :    common to all records)
> > > : 2. A user can have any number of A, B, C etc (e.g. think of A being a
> > > :    language then user can know many languages like french, english,
> > > :    german etc)
> > > : 3. Records are currently stored as a document in index.
> > > : 4. A given query can match multiple records for the user
> > > : 5. If for a user more records are matched (e.g. if he knows both french
> > > :    and german) then he is more relevant and should come top in UI. This
> > > :    is the reason I wanted to add lucene scores assuming the greater score
> > > :    means more relevance.
> > >
> > > if your goal is to get back "users" from each search, then you should
> > > probably change your indexing strategy so that each "user" has a single
> > > document -- fields like "language" can be multivalued, etc...
> > >
> > > then a search for "language:en language:fr" will return users who speak
> > > english or french, and the ones that speak both will score higher.
> > >
> > > if you really can't change the index structure, then essentially what you
> > > are looking for is a "field collapsing" solution on the userId field,
> > > where you want each collapsed group to get a cumulative score.  i don't
> > > know if the existing field collapsing patches support this -- if you are
> > > already willing/capable to do it in the client then that may be the
> > > simplest thing to support moving forward.
> > >
> > > Adding the scores is certainly one metric you could use -- it's generally
> > > suspicious to try and imply too much meaning to scores in lucene/solr but
> > > that's because people typically try to imply broader absolute meaning. in
> > > the case of a single query the scores are relative to each other, and
> > > adding up all the scores for a given userId is approximately what would
> > > happen in my example above -- except that there is also a "coord" factor
> > > that would penalize documents that only match one clause ... it's
> > > complicated, but as an approximation adding the scores might give you
> > > what you are looking for -- only you can know for sure based on your
> > > specific data.
> > >
> > >
> > >
> > > -Hoss
> > >
> > >
> >
>
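
For the client-side option described in the quoted reply (adding up the
scores of all matching records for a given userId), a minimal Python sketch
might look like the one below. It assumes the search response has already
been parsed into a list of dicts that carry "userId" and "score"; the
function name and the sample hits are hypothetical, and no Solr API is
called.

```python
from collections import defaultdict


def rank_users_by_total_score(hits):
    """Sum per-record scores by userId and rank users by the total.

    `hits` is assumed to be an iterable of dicts with "userId" and "score"
    keys (a hypothetical shape for already-parsed search results).
    """
    totals = defaultdict(float)
    for hit in hits:
        totals[hit["userId"]] += hit["score"]
    # Highest cumulative score first; ties broken by userId for stable output.
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical sample: one record per (user, language) pair, as in the thread.
hits = [
    {"userId": "u1", "language": "fr", "score": 1.5},
    {"userId": "u2", "language": "fr", "score": 2.0},
    {"userId": "u1", "language": "de", "score": 1.0},
]

ranked = rank_users_by_total_score(hits)
for user_id, total in ranked:
    print(user_id, total)
```

This only approximates what a single multivalued document per user would
score: as noted in the quoted reply, the "coord" factor is not reproduced
here, and the totals are meaningful only within one query. The client also
has to fetch enough rows to see every matching record for each user.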
