Re: dismax parser, parens, what do they do exactly

2011-03-24 Thread Jonathan Rochkind
Thanks Hoss, this is very helpful. Okay, so dismax is not intended to 
give parens any semantic meaning; they're just like any other char, 
handled by the analyzers.


I think you're right, I cut and pasted the wrong query before. Just for 
the record, on 1.4.1:


qf=text
pf=
q=book (dog +(cat -frog))

<str name="parsedquery">
+((DisjunctionMaxQuery((text:book)~0.01)
DisjunctionMaxQuery((text:dog)~0.01)
DisjunctionMaxQuery((text:cat)~0.01)
-DisjunctionMaxQuery((text:frog)~0.01))~3) ()
</str>

<str name="parsedquery_toString">
+(((text:book)~0.01 (text:dog)~0.01 (text:cat)~0.01 -(text:frog)~0.01)~3) ()
</str>




Re: dismax parser, parens, what do they do exactly

2011-03-23 Thread Chris Hostetter

: It looks like Dismax query parser can somehow handle parens, used for
: applying, for instance, + or - to a group, distributing it. But I'm not
: sure what effect they have on the overall query.

parens are treated like any regular character -- they have no semantic 
meaning.

what may be confusing you is what the *analyzer* you have configured for 
your query field then does with the paren.

For instance, using the example schema on trunk, try the same query...

q = book (dog +(cat -frog))

..but using a qf param containing a string field (no analysis) ...

/select?defType=dismax&q=book+(dog+%2B(cat+-frog))&tie=0.01&qf=text_s&debugQuery=true

It produces the following output (i've added some whitespace)...

<str name="parsedquery">
 +(  DisjunctionMaxQuery((  text_s:book)~0.01) 
 DisjunctionMaxQuery((  text_s:(dog)~0.01) 
+DisjunctionMaxQuery((  text_s:(cat)~0.01) 
-DisjunctionMaxQuery((  text_s:frog))  )~0.01)
  ) 
  ()
</str>

...the parens from your query are being treated literally as characters in 
your terms.  you just don't see them in the parsed query because it shows 
you what those terms look like after the analysis.
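
A toy model of the behaviour described above may make it easier to see. This is only a sketch in Python, not Solr's parser: the function names are made up for illustration, the "text" analyzer is a crude stand-in that strips punctuation, and the splitting rule (whitespace, then a leading + or -) is an assumption chosen to mirror the trunk output quoted above.

```python
import re

def split_clauses(q):
    """Split a user query on whitespace and peel off a leading + or -.

    Parentheses are deliberately left alone: they stay part of the term.
    (Hypothetical helper, not part of Solr.)
    """
    clauses = []
    for chunk in q.split():
        occur = ""
        if chunk[0] in "+-" and len(chunk) > 1:
            occur, chunk = chunk[0], chunk[1:]
        clauses.append((occur, chunk))
    return clauses

def analyze_text(term):
    # Stand-in for a tokenizing analyzer: drops punctuation, lowercases.
    return re.sub(r"[^\w]", "", term).lower()

def analyze_string(term):
    # Stand-in for a string field: no analysis, the term is kept verbatim.
    return term

q = "book (dog +(cat -frog))"
for field, analyzer in (("text", analyze_text), ("text_s", analyze_string)):
    terms = [occur + analyzer(term) for occur, term in split_clauses(q)]
    print(field, terms)
```

With the string-like field the parens survive inside the terms; with the analyzed field they disappear, which is why the parsed query against "text" looks as if the parens had been interpreted.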

Incidentally...

: debugQuery shows:
: 
: +((DisjunctionMaxQuery((text:book)~0.01)
: +DisjunctionMaxQuery((text:dog)~0.01)
: DisjunctionMaxQuery((text:cat)~0.01)
: -DisjunctionMaxQuery((text:frog)~0.01))~2) ()

...double check that, it doesn't seem to match the query string you posted 
(it shows dog being mandatory.  i'm guessing you cut/paste the wrong 
example)


-Hoss


dismax parser, parens, what do they do exactly

2011-03-16 Thread Jonathan Rochkind

It looks like Dismax query parser can somehow handle parens, used for
applying, for instance, + or - to a group, distributing it. But I'm not
sure what effect they have on the overall query.

For instance, if I give dismax this:

book (dog +( cat -frog))

debugQuery shows:

+((DisjunctionMaxQuery((text:book)~0.01)
+DisjunctionMaxQuery((text:dog)~0.01)
DisjunctionMaxQuery((text:cat)~0.01)
-DisjunctionMaxQuery((text:frog)~0.01))~2) ()


How will that be treated by mm?  Let's say I have an mm of 50%.  Does
that apply to the top-level, like either book needs to match or
+(dog +( cat -frog)) needs to match?  And for +(dog +( cat -frog))
to match, do just 50% of that subquery need to match... or is mm ignored
there?  Or something else entirely?

Can anyone clear this up?  Continuing to try to clear it up experimentally,
it _looks_ like the mm actually applies to each _individual_ low-level query.
So even though the semantics of:
book (dog +( cat -frog))

are respected, if mm is 50%, the nesting is irrelevant: exactly 50% of book,
dog, +cat, and +-frog (distributing the operators through, I guess?) are
required. I think. I'm getting confused even talking about it.
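
One way to check the mm arithmetic against the debug output is sketched below in Python. It assumes what the Solr documentation says about mm: a percentage is taken over the optional clauses only (those without + or -) and rounded down. It also assumes the parsed queries quoted in this thread were produced with dismax's default mm of 100%. The function and variable names are illustrative only.

```python
def min_should_match(optional_clauses, mm_percent):
    """Number of optional clauses that must match for a non-negative percentage mm.

    Assumption: the percentage applies to optional clauses only and is
    rounded down. Negative and conditional mm specs are not handled here.
    """
    return (optional_clauses * mm_percent) // 100

# (occur, term) pairs as they appear in the two parsed queries in this thread
with_mandatory_dog = [("", "book"), ("+", "dog"), ("", "cat"), ("-", "frog")]
all_optional = [("", "book"), ("", "dog"), ("", "cat"), ("-", "frog")]

for clauses in (with_mandatory_dog, all_optional):
    optional = sum(1 for occur, _ in clauses if occur == "")
    print(optional, min_should_match(optional, 100), min_should_match(optional, 50))
```

Under these assumptions the 100% figures come out as 2 and 3, which agrees with the ~2 and ~3 in the parsed queries, and an mm of 50% would require 1 optional clause in either case, on top of any mandatory and prohibited clauses.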