Re: Query Term Questions

Terry Steichen Wed, 21 Jan 2004 08:02:56 -0800

Erik,

Thanks for your response.  My specific comments (TS==>) are inserted below.
I should make clear that I'm using fairly complex, embedded queries - not
ones that the user is expected to enter.

Regards,

Terry

----- Original Message -----
From: "Erik Hatcher" <[EMAIL PROTECTED]>
To: "Lucene Users List" <[EMAIL PROTECTED]>
Sent: Wednesday, January 21, 2004 9:31 AM
Subject: Re: Query Term Questions

> On Jan 20, 2004, at 10:22 AM, Terry Steichen wrote:
> > 1) Is there a way to set the query boost factor depending not on the
> > presence of a term, but on the presence of two specific terms?  For
> > example, I may want to boost the relevance of a document that contains
> > both "iraq" and "clerics", but not boost the relevance of documents
> > that contain only one or the other terms. (The idea is better
> > discrimination than if I simply boosted both terms.)
>
> But doesn't the query itself take this into account?  If there are
> multiple matching terms then the overlap (coord) factor kicks in.

TS==>Except that I'd like to be able to choose to do this on a
query-by-query basis.  In other words, it's desirable that some specific
queries significantly increase their discrimination based on this multiple
matching, relative to the normal extra boost given by the coord factor.
However, I take it from your answer that there's not a way to do this in
the query itself (at least using the unmodified, standard Lucene version).
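
One approach that may be worth trying is to attach the boost to a nested
clause that requires both terms, and leave the individual terms as
unboosted optional clauses. The sketch below only builds such a query
string. It assumes the query parser in use accepts a boost on a
parenthesized group and treats unmarked terms as optional; neither has been
checked against a particular Lucene release. The class and method names are
made up for illustration.

```java
public class PairBoostQuery {

    /**
     * Builds a query string in which the boost applies only to the
     * sub-clause requiring BOTH terms. The trailing unboosted terms keep
     * single-term matches in the result set at their normal weight.
     * Assumption: the parser accepts "(...)^n" on a grouped clause.
     */
    static String pairBoost(String first, String second, int boost) {
        return "(+" + first + " +" + second + ")^" + boost
                + " " + first + " " + second;
    }

    public static void main(String[] args) {
        // Prints: (+iraq +clerics)^5 iraq clerics
        System.out.println(pairBoost("iraq", "clerics", 5));
    }
}
```

Because the boost is part of the query string, it can be varied or left out
on a query-by-query basis.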

>
> > 2) Is it possible to apply (or simulate) a negative query boost
> > factor?  For example, I may have a complex query with lots of terms
> > but want to reduce the relevance of a matching document that also
> > included the term "iowa". ( The idea is for an easier and more
> > discriminating way than simply increasing the relevance of all other
> > terms besides "iowa").
>
> Another reply mentioned negative boosting.  Is that not working as
> you'd like?

TS==>I've not been able to get negative boosting to work at all.  Maybe
there's a problem with my syntax.  If, for example, I do a search with
"green beret"^10, it works just fine.  But "green beret"^-2 gives me a
ParseException showing a lexical error.
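
The lexical error is consistent with a query grammar that accepts only an
unsigned number after "^", though that is an assumption about the parser
version in use. One way to simulate a negative boost without touching the
parser is to re-rank the hits after the search, scaling down the score of
any document that contains the unwanted term. The sketch below uses a
hypothetical Hit class and a crude substring test in place of real Lucene
hits and a real term lookup.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

public class DemoteTerm {

    /** Hypothetical stand-in for a search hit: id, raw score, document text. */
    static final class Hit {
        final String id;
        final double score;
        final String text;

        Hit(String id, double score, String text) {
            this.id = id;
            this.score = score;
            this.text = text;
        }
    }

    /**
     * Returns a re-ranked copy of the hits. Any hit whose text contains
     * the term has its score multiplied by penalty (0 < penalty < 1).
     * The substring test is a simplification, not real term matching.
     */
    static List<Hit> demote(List<Hit> hits, String term, double penalty) {
        String needle = term.toLowerCase(Locale.ROOT);
        List<Hit> out = new ArrayList<>();
        for (Hit h : hits) {
            boolean hasTerm = h.text.toLowerCase(Locale.ROOT).contains(needle);
            out.add(new Hit(h.id, hasTerm ? h.score * penalty : h.score, h.text));
        }
        // Highest adjusted score first.
        out.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
        return out;
    }

    public static void main(String[] args) {
        List<Hit> hits = new ArrayList<>();
        hits.add(new Hit("a", 0.9, "Iowa caucus results"));
        hits.add(new Hit("b", 0.6, "National poll results"));
        for (Hit h : demote(hits, "iowa", 0.5)) {
            System.out.println(h.id + " " + h.score);
        }
    }
}
```

The penalty factor plays the role of the negative boost: the closer it is
to zero, the further a document mentioning "iowa" drops in the ranking.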

>
> > 3) Is there a way to handle variants of a phrase without OR'ing
> > together the variants?  For example, I may want to find documents
> > dealing with North Korea; the terms might be "north korea" or "north
> > korean" or "north koreans" - is there a way to handle this with a
> > single term using wildcards?
>
> Sounds like what you're really after is fancier analysis.  This is one
> of the purposes of analysis, to do stemming.

TS==>Well, I hope I'm not trying to be fancy.  It's just that listing all
the different variants, particularly when (as in my case) I have to do this
for multiple fields, gets tedious and error-prone.  The example above is
simply one such case for a particular query - other queries may have
entirely different desired combinations.  Constructing a single stemmer to
handle all such cases would be (for me, at least) very difficult.  Besides,
I tend to stay away from stemming because I believe it can introduce some
rather unpredictable side-effects.
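
Since the queries are embedded rather than typed by users, the tedious part
can be generated instead of written by hand. The sketch below expands a
list of phrase variants across a list of fields into one OR'ed group. The
field names "headline" and "body" are hypothetical, and the output assumes
the usual field:"phrase" query syntax.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class VariantQuery {

    /**
     * Builds one parenthesized group that ORs together every
     * field/variant combination as a quoted phrase clause.
     */
    static String anyOf(List<String> fields, List<String> variants) {
        List<String> clauses = new ArrayList<>();
        for (String field : fields) {
            for (String variant : variants) {
                clauses.add(field + ":\"" + variant + "\"");
            }
        }
        return "(" + String.join(" OR ", clauses) + ")";
    }

    public static void main(String[] args) {
        // Field names are placeholders for whatever the index defines.
        System.out.println(anyOf(
                Arrays.asList("headline", "body"),
                Arrays.asList("north korea", "north korean", "north koreans")));
    }
}
```

The variants still have to be listed once per query, but the per-field
repetition, which is where most of the errors creep in, is handled by the
helper.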
