Re: Can anyone explain this Solr query behavior?

2013-05-24 Thread Shankar Sundararaju
Hi Upayavira,

Thank you for your analysis. I thought 'AND' groupings were supported, as
per the documentation:

http://docs.lucidworks.com/display/solr/The+Extended+DisMax+Query+Parser
http://lucene.apache.org/core/old_versioned_docs/versions/3_5_0/queryparsersyntax.html#Grouping

But yes, q=doc-id:3000 AND (-text:[* TO *]) works as expected.

Thanks
-Shankar



On Thu, May 23, 2013 at 5:31 PM, Upayavira u...@odoko.co.uk wrote:

 (+(doc-id:3000 DisjunctionMaxQuery((Publisher:and^2.0 | text:and |
 Classification:and^2.0 | Contributors:and^2.0 |
 Title:and^3.0))))/no_coord

 You're using edismax, not lucene. So AND is being considered as a search
 term, not an operator, and the word 'and' probably exists in 631580
 documents.

 Why is it triggering dismax? Probably because field:() is not valid
 syntax, so edismax is dropping to dismax because it isn't a valid lucene
 query.

 What do you expect text:() to do?

 If you want to match any docs that have a value in the text field, use
 q=text:[* TO *]

 To match docs that *don't* have a value in the text field:
 q=-text:[* TO *]

 Upayavira


Re: Can anyone explain this Solr query behavior?

2013-05-24 Thread Shankar Sundararaju
Hi Jack Krupansky,

Thank you for your reply. How did you get the error logged? Is there a
special flag I have to turn on? I don't see it in my solr.log even after
switching the log level to DEBUG.

<str name="msg">org.apache.solr.search.SyntaxError: Cannot parse 'id:*
AND text:()': Encountered " ")" ") "" at line 1, column 15.

Thanks
-Shankar



Re: Can anyone explain this Solr query behavior?

2013-05-24 Thread Jack Krupansky
Oh, I simply changed the query parser type to lucene, with defType=lucene,
and then I see essentially the same error that edismax hits when it
internally tries to parse the query.

But it might be nice if DEBUG-level logging for edismax displayed the
error as well and told you what remediation it was performing.


-- Jack Krupansky
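
Since the edismax fallback is silent, one client-side way to notice it is to scan the parsedquery string from debug=query for operator words that ended up as terms. The following is only a heuristic sketch: the function name and the operator list are assumptions made for illustration, the sample string is the parsedquery reported in this thread with its closing parentheses restored, and a deliberate search for the word "and" would be flagged too.

```python
import re

# Boolean operator words that normally should not appear as analyzed terms.
OPERATOR_WORDS = {"and", "or", "not"}

def operator_terms(parsed_query):
    """Return (field, term) pairs whose term is a boolean operator word."""
    found = set()
    # Match field:term pairs such as "text:and" or "Title:and^3.0".
    for field, term in re.findall(r"([\w-]+):(\w+)", parsed_query):
        if term.lower() in OPERATOR_WORDS:
            found.add((field, term))
    return sorted(found)

# parsedquery from the thread (closing parentheses restored by assumption).
parsed = ("(+(doc-id:3000 DisjunctionMaxQuery((Publisher:and^2.0 | text:and | "
          "Classification:and^2.0 | Contributors:and^2.0 | "
          "Title:and^3.0))))/no_coord")

print(operator_terms(parsed))
```

A non-empty result suggests the operator was escaped into a term, which is a cue to retry the same query with defType=lucene and read the syntax error.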


Re: Can anyone explain this Solr query behavior?

2013-05-23 Thread Erick Erickson
Please post the results of adding debug=query to the URL.
That'll tell us what the query parser spits out which is much
easier to analyze.

Best
Erick
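
A minimal sketch of building such a request URL in Python. The host, port, and core name below are placeholders, not taken from this thread; debug=query, indent, and defType are ordinary Solr request parameters. The code only constructs the URL and does not contact a server.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: host, port, and core name are placeholders.
SOLR_SELECT = "http://localhost:8983/solr/collection1/select"

def debug_url(q, def_type=None, base=SOLR_SELECT):
    """Return a select URL that asks Solr to include query-parsing debug info."""
    params = {"q": q, "debug": "query", "indent": "true"}
    if def_type is not None:
        # Override the handler's default parser, e.g. "lucene" or "edismax".
        params["defType"] = def_type
    # urlencode escapes reserved characters (':' becomes %3A, spaces become '+').
    return base + "?" + urlencode(params)

print(debug_url("doc-id:3000 AND text:()"))
```

Building the query string with urlencode rather than by hand keeps the & separators and the escaping of parentheses and colons correct.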

On Wed, May 22, 2013 at 12:16 PM, Shankar Sundararaju
shan...@ebrary.com wrote:
 This query returns 0 documents: *q=(+Title:() +Classification:()
 +Contributors:() +text:())*

 This returns 1 document: *q=doc-id:3000*

 And this returns 631580 documents when I was expecting 0: *q=doc-id:3000
 AND (+Title:() +Classification:() +Contributors:() +text:())*

 Am I missing something here? Can someone please explain? I am using Solr
 4.2.1

 Thanks
 -Shankar


Re: Can anyone explain this Solr query behavior?

2013-05-23 Thread Shankar Sundararaju
Hi Erick,

Here's the output after turning on the debug flag:

*q=text:()&debug=query*

yields

<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">17</int>
<lst name="params">
<str name="indent">true</str>
<str name="q">text:()</str>
<str name="debug">query</str>
</lst>
</lst>
<result name="response" numFound="0" start="0" maxScore="0.0"></result>
<lst name="debug">
<str name="rawquerystring">text:()</str>
<str name="querystring">text:()</str>
<str name="parsedquery">(+())/no_coord</str>
<str name="parsedquery_toString">+()</str>
<str name="QParser">ExtendedDismaxQParser</str>
<null name="altquerystring"/>
<null name="boost_queries"/>
<arr name="parsed_boost_queries"/>
<null name="boostfuncs"/>
</lst>
</response>

*q=doc-id:3000&debug=query*

yields

<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">17</int>
<lst name="params">
<str name="q">doc-id:3000</str>
<str name="debug">query</str>
</lst>
</lst>
<result name="response" numFound="1" start="0" maxScore="11.682044">
<doc>
  :
  :
</doc>
</result>
<lst name="debug">
<str name="rawquerystring">doc-id:3000</str>
<str name="querystring">doc-id:3000</str>
<str name="parsedquery">(+doc-id:3000)/no_coord</str>
<str name="parsedquery_toString">+doc-id:`&#8;&#0;&#0;&#23;8</str>
<str name="QParser">ExtendedDismaxQParser</str>
<null name="altquerystring"/>
<null name="boost_queries"/>
<arr name="parsed_boost_queries"/>
<null name="boostfuncs"/>
</lst>
</response>

*q=doc-id:3000 AND text:()&debug=query*

yields

<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">23</int>
<lst name="params">
<str name="q">doc-id:3000 AND text:()</str>
<str name="debug">query</str>
</lst>
</lst>
<result name="response" numFound="631647" start="0" maxScore="8.056607">
<doc>
 :
</doc>
<doc>
 :
</doc>
<doc>
 :
</doc>
<doc>
 :
</doc>
<doc>
 :
</doc>
<doc>
 :
</doc>
</result>
<lst name="debug">
<str name="rawquerystring">doc-id:3000 AND text:()</str>
<str name="querystring">doc-id:3000 AND text:()</str>
<str name="parsedquery">
(+(doc-id:3000 DisjunctionMaxQuery((Publisher:and^2.0 | text:and |
Classification:and^2.0 | Contributors:and^2.0 | Title:and^3.0))))/no_coord
</str>
<str name="parsedquery_toString">
+(doc-id:`&#8;&#0;&#0;&#23;8 (Publisher:and^2.0 | text:and |
Classification:and^2.0 | Contributors:and^2.0 | Title:and^3.0))
</str>
<str name="QParser">ExtendedDismaxQParser</str>
<null name="altquerystring"/>
<null name="boost_queries"/>
<arr name="parsed_boost_queries"/>
<null name="boostfuncs"/>
</lst>
</response>

*solrconfig.xml:*
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <int name="rows">10</int>
    <str name="df">text</str>
    <str name="defType">edismax</str>
    <str name="qf">text^1.0 Title^3.0 Classification^2.0
      Contributors^2.0 Publisher^2.0</str>
  </lst>
</requestHandler>

*schema.xml:*
<field name="text" type="my_text" indexed="true" stored="false" required="false"/>
<dynamicField name="*" type="my_text" indexed="true" stored="true" multiValued="false"/>
<fieldType name="my_text" class="solr.TextField">
  <analyzer type="index" class="MyAnalyzer"/>
  <analyzer type="query" class="MyAnalyzer"/>
  <analyzer type="multiterm" class="MyAnalyzer"/>
</fieldType>

*Note:* MyAnalyzer, among a few other customizations, uses
WhitespaceTokenizer and LowerCaseFilter.

Thanks a lot.

-Shankar


On Thu, May 23, 2013 at 4:34 AM, Erick Erickson erickerick...@gmail.com wrote:

 Please post the results of adding debug=query to the URL.
 That'll tell us what the query parser spits out which is much
 easier to analyze.

 Best
 Erick

 On Wed, May 22, 2013 at 12:16 PM, Shankar Sundararaju
 shan...@ebrary.com wrote:
  This query returns 0 documents: *q=(+Title:() +Classification:()
  +Contributors:() +text:())*
 
  This returns 1 document: *q=doc-id:3000*
 
  And this returns 631580 documents when I was expecting 0: *q=doc-id:3000
  AND (+Title:() +Classification:() +Contributors:() +text:())*
 
  Am I missing something here? Can someone please explain? I am using Solr
  4.2.1
 
  Thanks
  -Shankar




-- 
Regards,
*Shankar Sundararaju
*Sr. Software Architect
ebrary, a ProQuest company
410 Cambridge Avenue, Palo Alto, CA 94306 USA
shan...@ebrary.com | www.ebrary.com | 650-475-8776 (w) | 408-426-3057 (c)


Re: Can anyone explain this Solr query behavior?

2013-05-23 Thread Upayavira
(+(doc-id:3000 DisjunctionMaxQuery((Publisher:and^2.0 | text:and |
Classification:and^2.0 | Contributors:and^2.0 |
Title:and^3.0))))/no_coord

You're using edismax, not lucene. So AND is being considered as a search
term, not an operator, and the word 'and' probably exists in 631580
documents.

Why is it triggering dismax? Probably because field:() is not valid
syntax, so edismax is dropping to dismax because it isn't a valid lucene
query.

What do you expect text:() to do?

If you want to match any docs that have a value in the text field, use
q=text:[* TO *]

To match docs that *don't* have a value in the text field:
q=-text:[* TO *]

Upayavira
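
The two range forms above can be wrapped in small helpers so that client code never emits an empty field:() clause. This is only a sketch: the helper names are made up for illustration, and the combined query it prints is the one reported elsewhere in this thread as working.

```python
def has_value(field):
    """Clause matching documents that have any value in `field`."""
    return f"{field}:[* TO *]"

def lacks_value(field):
    """Clause matching documents that have no value in `field`."""
    return f"-{field}:[* TO *]"

def and_all(*clauses):
    """Join clauses with the AND operator."""
    return " AND ".join(clauses)

print(and_all("doc-id:3000", "(" + lacks_value("text") + ")"))
# prints: doc-id:3000 AND (-text:[* TO *])
```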



Re: Can anyone explain this Solr query behavior?

2013-05-23 Thread Jack Krupansky
Okay... sorry I wasn't paying close enough attention. What is happening is 
that the empty parentheses are illegal in Lucene query syntax:


 <str name="msg">org.apache.solr.search.SyntaxError: Cannot parse 'id:* AND
text:()': Encountered " ")" ") "" at line 1, column 15.

Was expecting one of:
    &lt;NOT&gt; ...
    "+" ...
    "-" ...
    &lt;BAREOPER&gt; ...
    "(" ...
    "*" ...
    &lt;QUOTED&gt; ...
    &lt;TERM&gt; ...
    &lt;PREFIXTERM&gt; ...
    &lt;WILDTERM&gt; ...
    &lt;REGEXPTERM&gt; ...
    "[" ...
    "{" ...
    &lt;LPARAMS&gt; ...
    &lt;NUMBER&gt; ...
    &lt;TERM&gt; ...
    "*" ...
    </str>
 <int name="code">400</int>

Edismax traps such errors and then escapes the query so that Lucene will
no longer throw an error. In this case, it puts quotes around the AND
operator, which is why you see "and" included in the parsed query as if it
were a term. And I believe it turns text:() into text:"()", which makes
the original Lucene error go away, but the "()" analyzes to nothing and
generates no term in the query.


So, fix your syntax error and the anomaly should go away.

-- Jack Krupansky
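
One way to fix the syntax error on the client side is to leave out any field clause that has no terms, so that field:() is never sent. The sketch below assumes that an empty field means "no constraint" (the thread does not say what the empty clauses were meant to do); the function names are hypothetical, and the values are not escaped for Lucene special characters.

```python
def field_clause(field, values):
    """Return '+field:(v1 v2)', or None when there is nothing to search for."""
    terms = [v.strip() for v in values if v and v.strip()]
    if not terms:
        return None  # never emit 'field:()', which Lucene syntax rejects
    return f"+{field}:({' '.join(terms)})"

def build_query(doc_id, field_values):
    """Combine a doc-id restriction with the non-empty field clauses."""
    clauses = [f"doc-id:{doc_id}"]
    for field, values in field_values.items():
        clause = field_clause(field, values)
        if clause is not None:
            clauses.append(clause)
    if len(clauses) == 1:
        return clauses[0]
    return f"{clauses[0]} AND ({' '.join(clauses[1:])})"

print(build_query(3000, {"Title": [], "text": ["solr"]}))
# prints: doc-id:3000 AND (+text:(solr))
```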


Can anyone explain this Solr query behavior?

2013-05-22 Thread Shankar Sundararaju
This query returns 0 documents: *q=(+Title:() +Classification:()
+Contributors:() +text:())*

This returns 1 document: *q=doc-id:3000*

And this returns 631580 documents when I was expecting 0: *q=doc-id:3000
AND (+Title:() +Classification:() +Contributors:() +text:())*

Am I missing something here? Can someone please explain? I am using Solr
4.2.1

Thanks
-Shankar