Re: Compound word search (maybe DisMaxQueryPaser problem)

Tobias Dittrich Wed, 18 Mar 2009 03:28:55 -0700

Many thanks for your explanation. That really helped me a lot in understanding DisMax, and I finally realized that DisMax is not at all what I need. Actually, I do not want results where "blue" is in one field and "tooth" in another (imagine you search for a notebook with blue tooth and get some blue products that happen to have "tooth" in some field).

My feeling already was that I would have to come up with my own solution, mixing parts of DisMax (distributing the query among the fields) and FieldQParserPlugin. So now I will try that out.
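
One possible reading of that plan, as a minimal sketch: pass the whole query string to every field as a single phrase and combine the per-field results in one disjunction. The function name, the whitespace normalization, and the output format are assumptions for illustration; this is plain string building, not Solr or Lucene code, and it skips the per-field analysis that FieldQParserPlugin would apply.

```python
def per_field_phrase_structure(q, qf):
    """Build a debugQuery-style string: the whole query as a phrase in each field.

    Hypothetical helper, not a Solr API. A real implementation would run each
    field's analyzer over the input before building the phrase.
    """
    phrase = " ".join(q.split())  # assumption: only collapse whitespace
    alternatives = " | ".join(f'{field}:"{phrase}"' for field in qf)
    return f"({alternatives})"


print(per_field_phrase_structure("blue tooth", ["category", "name"]))
```

With this shape a document has to contain "blue tooth" together in a single field, so a match spread across two fields no longer counts.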


Many thanks
Tobi

Chris Hostetter wrote:

: My original assumption for the DisMax Handler was that it will just take the
: original query string and pass it to every field in its field list using the
: field's configured analyzer stack. Maybe in the end add some stuff for the
: special options and so on ... and then send the query to Lucene. Can you explain
: why this approach was not chosen?

because then it wouldn't be the DisMaxRequestHandler.
seriously: the point of dismax is to build up a DisjunctionMaxQuery for each "chunk" in the query string and populate those DisjunctionMaxQueries with the Queries produced by analyzing that "chunk" against each field in the qf -- then all of the DisjunctionMaxQueries are grouped into a BooleanQuery with a minNrShouldMatch.
if you look at the query toString from debugQuery (using a non-trivial qf param and a q string containing more than one "chunk") you can see what i mean. your example shows it pretty well actually...
: > : > : > ((category:blue | name:blue)~0.1 (category:tooth | name:tooth)~0.1)
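
The structure described above can be sketched in a few lines of Python. This is only an illustration of the shape of the query, not the actual dismax code: the function name is made up, chunks are assumed to be whitespace-separated tokens, and no per-field analysis is applied.

```python
def dismax_structure(q, qf, tie=0.1):
    """Build a debugQuery-style string: one disjunction per chunk, all grouped.

    Hypothetical helper. Assumes whitespace chunking and skips the per-field
    analyzers that the real handler would run on each chunk.
    """
    clauses = []
    for chunk in q.split():
        # one alternative per field in qf, combined as a disjunction
        alternatives = " | ".join(f"{field}:{chunk}" for field in qf)
        clauses.append(f"({alternatives})~{tie}")
    # the per-chunk disjunctions are grouped into one outer boolean query
    return "(" + " ".join(clauses) + ")"


print(dismax_structure("blue tooth", ["category", "name"]))
```

For q "blue tooth" and qf "category name" this reproduces the string quoted above.
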
the point is to build those DisjunctionMaxQueries -- so that each "chunk" only contributes significantly based on the highest scoring field that chunk appears in ... in your example someone typing "blue tooth" can get a match when a doc matches blue in one field and tooth in another -- that wouldn't be possible with the approach you describe. the Query structure also means that a doc where "tooth" appears in both the category and name fields but "blue" doesn't appear at all won't score as high as a doc that matches "blue" in category and "tooth" in name (although you have to look at the score explanations to really see what i mean by that)
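
A toy calculation makes the scoring point concrete. A DisjunctionMaxQuery scores a document as the best sub-score plus the tie breaker times the sum of the other sub-scores, and the outer BooleanQuery adds up its clauses. Everything else here is an assumption for illustration: every field match is given a flat score of 1.0 (real Lucene scores depend on term statistics and norms), the documents are invented, and the helper names are hypothetical.

```python
def dismax_score(field_scores, tie=0.1):
    """Score of one disjunction: best field score plus tie * the remaining ones."""
    if not field_scores:
        return 0.0
    best = max(field_scores)
    return best + tie * (sum(field_scores) - best)


def total_score(doc, chunks, fields, tie=0.1):
    """Sum the per-chunk disjunction scores, as the outer boolean query would."""
    total = 0.0
    for chunk in chunks:
        # assumption: a field "matches" if it contains the chunk as a word,
        # and every match is worth exactly 1.0
        matches = [1.0 for f in fields if chunk in doc.get(f, "").split()]
        total += dismax_score(matches, tie)
    return total


fields = ["category", "name"]
chunks = ["blue", "tooth"]

# "tooth" in both fields, "blue" nowhere
doc_a = {"category": "tooth care", "name": "tooth brush"}
# "blue" in category, "tooth" in name
doc_b = {"category": "blue accessories", "name": "tooth brush"}

score_a = total_score(doc_a, chunks, fields)
score_b = total_score(doc_b, chunks, fields)
print(score_a < score_b)
```

Under these assumptions the second field matching "tooth" only adds the tie-breaker fraction, so doc_a ends up well below doc_b, which matches each chunk once.
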
There are certainly a lot of improvements that could be made to dismax ... more customization in terms of how the query string is parsed before building up the DisjunctionMaxQueries and calling the individual field analyzers would certainly be one way it could improve ... but so far no one has attempted anything like that.
-Hoss
