OK thanks, I will pass it, but I don't know how to verify the result. Even
if I try it, I get some scores, but I don't know whether comparing them is
reliable.
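
One possible way to sanity-check the comparison is to compute a plain cosine similarity outside Lucene and see whether it ranks document pairs the same way as the Lucene scores. The sketch below is only an illustration under assumptions: the class name CosineDemo, the four-term vocabulary and the term counts are hypothetical, and it does not reproduce Lucene's scoring formula.

```java
// Standalone sketch: plain cosine similarity, independent of Lucene.
final class CosineDemo {

    // Cosine of the angle between two term-weight vectors of equal length.
    // Returns 0 when either vector is all zeros, to avoid dividing by zero.
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Hypothetical term counts over a shared 4-term vocabulary.
        double[] query = {1, 1, 0, 2};
        double[] docA = {1, 1, 0, 2}; // same direction as the query
        double[] docB = {0, 0, 3, 0}; // no term in common with the query
        System.out.println(cosine(query, docA));
        System.out.println(cosine(query, docB));
    }
}
```

For non-negative term weights the result always lies between 0 and 1, so values obtained for different queries are on the same scale, which is the property the raw Lucene scores do not guarantee.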


On 28 March 2011 11:36, Uwe Schindler <u...@thetaphi.de> wrote:

> Hi,
>
> You don't need to extend BooleanQuery; you can just pass "true" to its
> constructor, see: http://s.apache.org/QvK
> Of course you can also subclass DefaultSimilarity and return 1 as coord,
> but that is more work than passing true to a constructor.
>
> For your type of queries, disabling coord should be enough, but I am not
> 100% sure! Why not simply try it out?
>
> Uwe
>
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: u...@thetaphi.de
>
>
> > -----Original Message-----
> > From: Patrick Diviacco [mailto:patrick.divia...@gmail.com]
> > Sent: Monday, March 28, 2011 10:49 AM
> > To: java-user@lucene.apache.org
> > Subject: Re: comparing lucene scores across queries
> >
> > One more thing: instead of extending the BooleanQuery class to remove the
> > coord factor, can I also extend the Similarity class to do it?
> >
> > The other question is still open: just to be sure, if I disable the coord
> > factor, can I finally compare my BooleanQuery results?
> >
> > thanks
> >
> > >
> > >
> > >
> > > On 28 March 2011 10:11, Uwe Schindler <u...@thetaphi.de> wrote:
> > >
> > >> Hi Patrick,
> > >>
> > >> You can disable the coord factor in the constructor of BooleanQuery.
> > >>
> > >> Uwe
> > >>
> > >> -----
> > >> Uwe Schindler
> > >> H.-H.-Meier-Allee 63, D-28213 Bremen
> > >> http://www.thetaphi.de
> > >> eMail: u...@thetaphi.de
> > >>
> > >>
> > >> > -----Original Message-----
> > >> > From: Patrick Diviacco [mailto:patrick.divia...@gmail.com]
> > >> > Sent: Monday, March 28, 2011 10:09 AM
> > >> > To: java-user@lucene.apache.org
> > >> > Subject: Re: comparing lucene scores across queries
> > >> >
> > >> > Hi, thanks for the reply.
> > >> >
> > >> > Yeah, I've read the Similarity class documentation several times,
> > >> > but I need some tips.
> > >> >
> > >> > My queries are BooleanQueries, but they always have the same
> > >> > structure (the same structure as the docs; they are actually docs
> > >> > from the collection): 3 fields.
> > >> >
> > >> > What if I simplify the similarity scores by removing the coord
> > >> > factor and just leaving the cosine similarity, which is comparable?
> > >> >
> > >> > I want to underline the fact that my boolean queries are just a
> > >> > combination of "field:term" items, and I always have the same 3
> > >> > fields, obviously with different terms.
> > >> >
> > >> > Thanks
> > >> >
> > >> >
> > >> >
> > >> >
> > >> > On 28 March 2011 10:03, Uwe Schindler <u...@thetaphi.de> wrote:
> > >> >
> > >> > > No, scores are in general not comparable between different queries.
> > >> > > The problem lies in several things:
> > >> > > - Each query has a norm factor that makes scores more comparable
> > >> > > when the queries are sub-clauses of a BooleanQuery. But you are
> > >> > > right, this norm factor should be the same.
> > >> > > - Some queries, like FuzzyQuery, depend on the terms in the index
> > >> > > and on which of those match the query.
> > >> > > - Inside Boolean queries, there is also a coord factor involved.
> > >> > >
> > >> > > If you are always using the same simple type of query (e.g. a
> > >> > > simple TermQuery, only with a different term) on the same index,
> > >> > > you can compare the scores. As soon as you are using complex
> > >> > > queries (e.g. several terms combined in a BooleanQuery, as
> > >> > > QueryParser produces), the scores are no longer comparable.
> > >> > >
> > >> > > You can read more on all factors that are included in scoring:
> > >> > >
> > >> > > http://lucene.apache.org/java/3_0_3/api/core/org/apache/lucene/search/Similarity.html
> > >> > >
> > >> > > -----
> > >> > > Uwe Schindler
> > >> > > H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de
> > >> > > eMail: u...@thetaphi.de
> > >> > >
> > >> > >
> > >> > > > -----Original Message-----
> > >> > > > From: Patrick Diviacco [mailto:patrick.divia...@gmail.com]
> > >> > > > Sent: Monday, March 28, 2011 9:44 AM
> > >> > > > To: java-user@lucene.apache.org
> > >> > > > Subject: comparing lucene scores across queries
> > >> > > >
> > >> > > > Hi,
> > >> > > >
> > >> > > > Sorry, I already asked this a few days ago, but I got no reply
> > >> > > > and I really need some help with this.
> > >> > > >
> > >> > > > I'm running several queries against a doc collection. The queries
> > >> > > > are documents of the collection itself; I need to measure how
> > >> > > > similar each document is to the rest of the collection.
> > >> > > >
> > >> > > > Now, Lucene returns a score per query, but I've been told that
> > >> > > > such scores are not comparable across queries. Is this correct?
> > >> > > >
> > >> > > > For example, aren't these scores comparable?
> > >> > > > query1, score:8.324234
> > >> > > > query2, score:3.324238
> > >> > > >
> > >> > > > If so, why not? Isn't the score the cosine similarity between the
> > >> > > > query vector and the collection's doc vectors? I really need a
> > >> > > > comparable measure.
> > >> > > >
> > >> > > > thanks
> > >> > >
> > >> > >
> > >> > >
> > >> > > ---------------------------------------------------------------------
> > >> > > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> > >> > > For additional commands, e-mail: java-user-h...@lucene.apache.org
> > >> > >
> > >> > >
>
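
To make the effect of the coord factor discussed in this thread concrete, here is a small standalone model of it. It does not call Lucene. It assumes the default coord is the fraction of query clauses that matched (overlap / maxOverlap), that a disabled coord is the constant 1, and it ignores the query norm and every other scoring factor. The class name CoordDemo and the clause scores are hypothetical.

```java
// Standalone model of the coord factor; it does not call Lucene.
final class CoordDemo {

    // Default behaviour assumed here: matched clauses / total clauses.
    static float defaultCoord(int overlap, int maxOverlap) {
        return overlap / (float) maxOverlap;
    }

    // What disabling coord amounts to: the factor is always 1.
    static float disabledCoord(int overlap, int maxOverlap) {
        return 1.0f;
    }

    // Simplified boolean score: coord * sum of the matching clause scores.
    // A clause score of 0 stands for "this clause did not match".
    static float score(float[] clauseScores, boolean disableCoord) {
        float sum = 0f;
        int overlap = 0;
        for (float s : clauseScores) {
            if (s > 0f) {
                sum += s;
                overlap++;
            }
        }
        float coord = disableCoord
                ? disabledCoord(overlap, clauseScores.length)
                : defaultCoord(overlap, clauseScores.length);
        return coord * sum;
    }

    public static void main(String[] args) {
        // Hypothetical query with 3 clauses, 2 of which match the document.
        float[] clauses = {1.5f, 0f, 0.5f};
        System.out.println(score(clauses, false)); // scaled by 2/3
        System.out.println(score(clauses, true));  // plain sum of clause scores
    }
}
```

With coord enabled, the same clause scores give a different total depending on how many clauses matched, which is one of the reasons given above for scores not being comparable across queries. With coord disabled, only the clause scores themselves remain.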
