Re: dismax parser, parens, what do they do exactly
Thanks Hoss, this is very helpful. Okay, so dismax is not intended to do anything with parens semantically; they're just like any other character, handled by the analyzers.

I think you're right that I cut and pasted the wrong query before. Just for the record, on 1.4.1:

qf=text
pf=
q=book (dog +(cat -frog))

<str name="parsedquery">+((DisjunctionMaxQuery((text:book)~0.01) DisjunctionMaxQuery((text:dog)~0.01) DisjunctionMaxQuery((text:cat)~0.01) -DisjunctionMaxQuery((text:frog)~0.01))~3) ()</str>
<str name="parsedquery_toString">+(((text:book)~0.01 (text:dog)~0.01 (text:cat)~0.01 -(text:frog)~0.01)~3) ()</str>
Re: dismax parser, parens, what do they do exactly
: It looks like Dismax query parser can somehow handle parens, used for
: applying, for instance, + or - to a group, distributing it. But I'm not
: sure what effect they have on the overall query.

parens are treated like any regular character -- they have no semantic meaning. what may be confusing you is what the *analyzer* you have configured for your query field then does with the paren.

For instance, using the example schema on trunk, try the same query...

  q = book (dog +(cat -frog))

..but using a qf param containing a string field (no analysis) ...

  /select?defType=dismax&q=book+(dog+%2B(cat+-frog))&tie=0.01&qf=text_s&debugQuery=true

It produces the following output (i've added some whitespace)...

  <str name="parsedquery">
    +(
      DisjunctionMaxQuery(( text_s:book)~0.01)
      DisjunctionMaxQuery(( text_s:(dog)~0.01)
     +DisjunctionMaxQuery(( text_s:(cat)~0.01)
     -DisjunctionMaxQuery(( text_s:frog)) )~0.01)
    ) ()
  </str>

...the parens from your query are being treated literally as characters in your terms. you just don't see them in the parsed query because it shows you what those terms look like after the analysis.

Incidentally...

: debugQuery shows:
:
: +((DisjunctionMaxQuery((text:book)~0.01)
: +DisjunctionMaxQuery((text:dog)~0.01)
: DisjunctionMaxQuery((text:cat)~0.01)
: -DisjunctionMaxQuery((text:frog)~0.01))~2) ()

...double check that, it doesn't seem to match the query string you posted (it shows dog being mandatory. i'm guessing you cut/paste the wrong example)

-Hoss
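To make the point above concrete, here is a minimal Python sketch. It is not Solr's parser, only an illustration under simplifying assumptions: the query is split on whitespace, a leading + or - is peeled off each chunk, and everything else (parens included) stays in the term. The names split_clauses and analyze are hypothetical, and analyze is a crude stand-in for a tokenizing analyzer that drops punctuation.

```python
import re

def split_clauses(query):
    """Split on whitespace and peel a leading + or - off each chunk.

    Parens get no special treatment: they stay in the term text.
    (Hypothetical helper; the real parser also handles quotes, escaping, etc.)
    """
    clauses = []
    for chunk in query.split():
        op = ""
        if len(chunk) > 1 and chunk[0] in "+-":
            op, chunk = chunk[0], chunk[1:]
        clauses.append((op, chunk))
    return clauses

def analyze(term):
    """Crude stand-in for a tokenizing analyzer: drop non-alphanumerics, lowercase."""
    return re.sub(r"[^0-9A-Za-z]+", "", term).lower()

query = "book (dog +(cat -frog))"

# What a string field (no analysis) would see: the parens are part of the terms.
raw = split_clauses(query)
print(raw)       # [('', 'book'), ('', '(dog'), ('+', '(cat'), ('-', 'frog))')]

# What an analyzed text field would see: the parens are gone after analysis.
analyzed = [(op, analyze(term)) for op, term in raw]
print(analyzed)  # [('', 'book'), ('', 'dog'), ('+', 'cat'), ('-', 'frog')]
```

The raw list lines up with the text_s terms in the debug output above (text_s:(dog, text_s:(cat, text_s:frog))), while the analyzed list shows why the parens seem to vanish when qf points at an analyzed field.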
dismax parser, parens, what do they do exactly
It looks like Dismax query parser can somehow handle parens, used for applying, for instance, + or - to a group, distributing it. But I'm not sure what effect they have on the overall query.

For instance, if I give dismax this:

book (dog +( cat -frog))

debugQuery shows:

+((DisjunctionMaxQuery((text:book)~0.01)
+DisjunctionMaxQuery((text:dog)~0.01)
DisjunctionMaxQuery((text:cat)~0.01)
-DisjunctionMaxQuery((text:frog)~0.01))~2) ()

How will that be treated by mm? Let's say I have an mm of 50%. Does that apply to the top level, so that either book needs to match or +(dog +( cat -frog)) needs to match? And for +(dog +( cat -frog)) to match, does just 50% of that subquery need to match... or is mm ignored there? Or something else entirely?

Can anyone clear this up?

Continuing to try to clear it up experimentally... it _looks_ like the mm actually applies to each _individual_ low-level query. So even though the semantics of:

book (dog +( cat -frog))

are respected, if mm is 50%, the nesting is irrelevant: exactly 50% of book, dog, +cat, and +-frog (distributing the operators through, I guess?) are required. I think. I'm getting confused even talking about it.
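One way to reason about the mm question is to treat the parsed query as the flat clause list that debugQuery prints. The Python sketch below is an illustration only, not Solr code. It assumes that mm is applied to the optional clauses alone (those with no + or -) and that a percentage is rounded down, which is how the Solr documentation describes mm. Only positive percentages are handled, and min_should_match is a hypothetical helper name.

```python
def min_should_match(clauses, mm_percent):
    """How many optional clauses must match for a positive percentage mm.

    Assumption: mandatory (+) and prohibited (-) clauses are not counted,
    and the percentage is rounded down.
    """
    optional = [term for op, term in clauses if op == ""]
    return len(optional) * mm_percent // 100

# Flat clause list as it appears in the debug output quoted above.
clauses = [("", "book"), ("+", "dog"), ("", "cat"), ("-", "frog")]

print(min_should_match(clauses, 100))  # 2
print(min_should_match(clauses, 50))   # 1
```

Under those assumptions, an mm of 100% gives 2, which is consistent with the ~2 in the debug output above, and an mm of 50% would require 1 of the 2 optional clauses, in addition to the mandatory and prohibited ones.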