Chuck Williams wrote:
That approach does not work.  I could not find an approach that would
work with the built-in classes, although of course there might be one.
The problem has two components:  coord, and the fact that BooleanQuery
instances sum their clause scores to compute the final score.  The latter
is not easily overridden.  Specifically,

  title:(albino elephant)^4 description:(albino elephant)

still has the problem that a result with albino in the title and albino
in the description gets the same score as a result with albino in the
title and elephant in the description.

Perhaps I misunderstood what you desire. If you want a reward for albino and elephant both occurring in the document, regardless of field, then what you'd want is:


(title:albino description:albino) (title:elephant description:elephant)

with coord disabled on the *inner* queries, no? This way coord would explicitly boost documents which matched on both terms.
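
To make the difference between the two query shapes concrete, here is a toy Python model. It is not Lucene code and does not reproduce Lucene's scoring formula: it assumes every matching term scores 1.0 times its boost, and that coord is simply matched clauses divided by total clauses. The names boolean_score, term, per_field_query and per_term_query are made up for this sketch.

```python
# Toy model, NOT Lucene: each matching term is worth 1.0 * boost (assumption),
# and coord = matched clauses / total clauses (assumption).

def boolean_score(clause_scores, use_coord=True):
    """Sum the clause scores; optionally scale by the fraction of clauses matched."""
    matched = [s for s in clause_scores if s > 0]
    total = sum(matched)
    if use_coord and clause_scores:
        total *= len(matched) / len(clause_scores)
    return total

def term(doc, field, word, boost=1.0):
    """Return boost if the word occurs in the field, else 0.0."""
    return boost if word in doc.get(field, set()) else 0.0

def per_field_query(doc):
    # title:(albino elephant)^4 description:(albino elephant)
    title = 4.0 * boolean_score([term(doc, "title", "albino"),
                                 term(doc, "title", "elephant")])
    desc = boolean_score([term(doc, "description", "albino"),
                          term(doc, "description", "elephant")])
    return boolean_score([title, desc])

def per_term_query(doc):
    # (title:albino description:albino) (title:elephant description:elephant)
    # coord is disabled on the inner queries and left on for the outer one.
    albino = boolean_score([term(doc, "title", "albino"),
                            term(doc, "description", "albino")], use_coord=False)
    elephant = boolean_score([term(doc, "title", "elephant"),
                              term(doc, "description", "elephant")], use_coord=False)
    return boolean_score([albino, elephant])

# Two hypothetical documents from the example in this thread.
albino_only = {"title": {"albino"}, "description": {"albino"}}
albino_elephant = {"title": {"albino"}, "description": {"elephant"}}

for name, doc in (("albino only", albino_only), ("albino elephant", albino_elephant)):
    print(name, per_field_query(doc), per_term_query(doc))
```

Under these assumptions the per-field query gives both documents the same score, while the per-term query ranks the document containing both words higher, because only that document matches both outer clauses and so gets the full outer coord.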

FYI, MaxDisjunctionQuery has made an enormous improvement in the quality
of my query results, and I have strong reason to believe the same would
be true in most other domains (more on that coming in the idf^2
discussion).  In terms of the albino elephant example, the query above
was putting all the albino animals except elephants above the albino
elephants, while the query with an outer BooleanQuery and inner
MaxDisjunctionQuery clauses

    ( (title:albino^4 | description:albino)~0.1
      (title:elephant^4 | description:elephant)~0.1
    )

properly puts the albino elephants on top.
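
A small Python sketch of why taking the maximum per term can change the ranking. It assumes that the ~0.1 suffix means "best clause score plus 0.1 times the sum of the other matching clause scores", and it uses made-up term weights (albino 3.0, elephant 1.0) in place of real idf values. The outer query is a plain sum with no coord, to isolate the effect. None of the names below come from Lucene.

```python
# Hypothetical idf-like weights: "albino" is assumed rarer than "elephant".
WEIGHT = {"albino": 3.0, "elephant": 1.0}
TITLE_BOOST = 4.0
TIE_BREAKER = 0.1

def field_score(doc, field, word, boost=1.0):
    """Weight of the word (times boost) if it occurs in the field, else 0.0."""
    return boost * WEIGHT[word] if word in doc.get(field, set()) else 0.0

def max_disjunction(scores, tie_breaker=0.0):
    """Best clause score plus tie_breaker times the other matching scores."""
    matched = [s for s in scores if s > 0]
    if not matched:
        return 0.0
    best = max(matched)
    return best + tie_breaker * (sum(matched) - best)

def clause_scores(doc, word):
    return [field_score(doc, "title", word, TITLE_BOOST),
            field_score(doc, "description", word)]

def sum_query(doc):
    # Every field/term clause is simply added up.
    return sum(sum(clause_scores(doc, w)) for w in WEIGHT)

def max_disjunction_query(doc):
    # One max-disjunction per term, summed by the outer query.
    return sum(max_disjunction(clause_scores(doc, w), TIE_BREAKER) for w in WEIGHT)

# Two hypothetical documents.
albino_rhino = {"title": {"albino"}, "description": {"albino"}}
albino_elephant = {"title": {"albino"}, "description": {"elephant"}}

for name, doc in (("albino rhino", albino_rhino), ("albino elephant", albino_elephant)):
    print(name, sum_query(doc), max_disjunction_query(doc))
```

With these assumed weights the summed query ranks the albino rhino first, because the second occurrence of the high-weight term adds more than a single occurrence of the low-weight one. The max-disjunction query ranks the albino elephant first, since a repeat of the same term in another field contributes only the tie-breaker fraction.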

If "albino" is outscoring "elephant" then you could either reduce the impact of idf or increase the impact of coordination. Did you try, e.g., defining coord as (overlap/max)^2 or somesuch?


Or, perhaps take proximity into account, with "albino elephant"~10? Or simply use AND instead of OR? These days most web search engines use AND as the default operator and reward proximity. Is that wrong for your application? AND is effectively a coord of (overlap/max)^infinity.
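
The coord suggestions can be sketched in a few lines of Python. This is only the arithmetic, not a Lucene Similarity implementation, and the function name coord and its exponent parameter are chosen for this illustration.

```python
def coord(overlap, max_overlap, exponent=1.0):
    """Fraction of query clauses matched, raised to an adjustable power."""
    return (overlap / max_overlap) ** exponent

# A document matching 1 of 2 terms versus one matching 2 of 2,
# for increasing exponents.
for exponent in (1, 2, 8, 64):
    partial = coord(1, 2, exponent)
    full = coord(2, 2, exponent)
    print(exponent, partial, full)
```

A full match always keeps a factor of 1, while the factor for a partial match shrinks toward 0 as the exponent grows. In the limit only documents matching every term retain a nonzero score, which is the sense in which AND behaves like an infinite exponent.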

Doug
