I believe we are experiencing the same thing.

We recently upgraded our Drupal 8 sites to SOLR 8.3.1.  We are now getting 
reports of certain patterns of search terms resulting in an error that reads, 
“The website encountered an unexpected error. Please try again later.”



Below is a list of example terms that always result in this error and a similar 
list that works fine.  The problem pattern seems to be a search term that 
contains 2 or 3 characters followed by a space, followed by additional text.



To confirm that the problem is version 8 of SOLR, I updated our local and 
UAT sites with the latest Drupal updates, which included an update to the 
Search API Solr module, and tested the terms below under SOLR 7.7.2, 8.3.1, and 
8.4.1.  Under version 7.7.2 everything works fine. Under either 8.x version, 
the problem returns.



Thoughts?



Search terms that result in error

  *   w-2 agency directory
  *   agency w-2 directory
  *   w-2 agency
  *   w-2 directory
  *   w2 agency directory
  *   w2 agency
  *   w2 directory



Search terms that do not result in error

  *   w-22 agency directory
  *   agency directory w-2
  *   agency w-2directory
  *   agencyw-2 directory
  *   w-2
  *   w2
  *   agency directory
  *   agency
  *   directory
  *   -2 agency directory
  *   2 agency directory
  *   w-2agency directory
  *   w2agency directory
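
For anyone who wants to test more terms, here is a rough Python sketch of the pattern as I understand it. The regular expression is only my guess at the rule: the "starts with a letter" condition is an assumption I added so that "-2 agency directory" and "2 agency directory" land in the working group, and it says nothing about what SOLR actually does internally. The function name is made up for illustration.

```python
import re

# Guessed pattern (an assumption, not taken from SOLR): a token of 2 or 3
# characters that starts with a letter, sits at the start of the input or
# after whitespace, and is followed by a space and more text.
SUSPECT = re.compile(r"(?:^|\s)[A-Za-z][\w-]{1,2}\s+\S")


def looks_problematic(term: str) -> bool:
    """Return True if the term matches the guessed error pattern."""
    return SUSPECT.search(term) is not None


failing = [
    "w-2 agency directory", "agency w-2 directory", "w-2 agency",
    "w-2 directory", "w2 agency directory", "w2 agency", "w2 directory",
]
working = [
    "w-22 agency directory", "agency directory w-2", "agency w-2directory",
    "agencyw-2 directory", "w-2", "w2", "agency directory", "agency",
    "directory", "-2 agency directory", "2 agency directory",
    "w-2agency directory", "w2agency directory",
]

for term in failing + working:
    print(f"{term!r}: {'suspect' if looks_problematic(term) else 'ok'}")
```

The guess agrees with both lists above, but that only shows it is consistent with these 20 terms, not that it is the real rule.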




________________________________
From: Hongtai Xue <h...@yahoo-corp.jp>
Sent: Monday, March 2, 2020 3:45 AM
To: solr_user lucene_apache <solr-u...@lucene.apache.org>
Cc: dev@lucene.apache.org <dev@lucene.apache.org>
Subject: strange behavior of solr query parser


Hi,



Our team found some strange behavior in the Solr query parser.

In some specific cases, conditional clauses on an unindexed field are 
ignored.



for a query like q=A:1 OR B:1 OR A:2 OR B:2

if field B is not indexed (but has docValues="true"), "B:1" will be lost.



but if you write the query as q=A:1 OR A:2 OR B:1 OR B:2,

it works perfectly.



the only difference between the two queries is the order in which the clauses are written:

one is ABAB, the other is AABB.
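
As a possible client-side workaround, based only on the observation above that the AABB order works, the clauses could be regrouped by field before the query is sent. This is a minimal Python sketch, not a fix for the parser; the helper name is hypothetical and it assumes simple "field:value" clauses joined by OR.

```python
def group_clauses_by_field(clauses):
    """Reorder 'field:value' clauses so clauses on the same field are adjacent.

    Fields keep the order in which they first appear; the sort is stable,
    so clauses within one field keep their original relative order.
    """
    first_seen = {}
    for clause in clauses:
        field = clause.split(":", 1)[0]
        first_seen.setdefault(field, len(first_seen))
    return sorted(clauses, key=lambda c: first_seen[c.split(":", 1)[0]])


abab = ["A:1", "B:1", "A:2", "B:2"]
print(" OR ".join(group_clauses_by_field(abab)))
```

For the ABAB input this prints the AABB form, A:1 OR A:2 OR B:1 OR B:2.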



■reproduce steps and example explanation

you can easily reproduce this problem on a solr collection with _default 
configset and exampledocs/books.csv data.



1. create a _default collection

bin/solr create -c books -s 2 -rf 2



2. post books.csv.

bin/post -c books example/exampledocs/books.csv



3. run the following query.

http://localhost:8983/solr/books/select?q=%2B%28name_str%3AFoundation+OR+cat%3Abook+OR+name_str%3AJhereg+OR+cat%3Acd%29&debug=query





I printed the query parsing debug information.

you can see that "name_str:Foundation" is lost.



query: "name_str:Foundation OR cat:book OR name_str:Jhereg OR cat:cd"

(please note "Jhereg" is "4a 68 65 72 65 67" and "Foundation" is "46 6f 75 6e 
64 61 74 69 6f 6e")
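
The byte strings in that note can be checked with a few lines of Python. This is only a small sketch to help read the debug output; it assumes the values are UTF-8 encoded, and the helper name is made up for illustration.

```python
def hex_bytes(text: str) -> str:
    """Return the UTF-8 bytes of text as space-separated lowercase hex."""
    return " ".join(f"{b:02x}" for b in text.encode("utf-8"))


print(hex_bytes("Jhereg"))
print(hex_bytes("Foundation"))
```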

--------

  "debug":{

    "rawquerystring":"+(name_str:Foundation OR cat:book OR name_str:Jhereg OR cat:cd)",

    "querystring":"+(name_str:Foundation OR cat:book OR name_str:Jhereg OR cat:cd)",

    "parsedquery":"+(cat:book cat:cd (name_str:[[4a 68 65 72 65 67] TO [4a 68 65 72 65 67]]))",

    "parsedquery_toString":"+(cat:book cat:cd name_str:[[4a 68 65 72 65 67] TO [4a 68 65 72 65 67]])",

    "QParser":"LuceneQParser"}}

--------



but for query: "name_str:Foundation OR name_str:Jhereg OR cat:book OR cat:cd",

everything is OK. "name_str:Foundation" is not lost.

--------

  "debug":{

    "rawquerystring":"+(name_str:Foundation OR name_str:Jhereg OR cat:book OR cat:cd)",

    "querystring":"+(name_str:Foundation OR name_str:Jhereg OR cat:book OR cat:cd)",

    "parsedquery":"+(cat:book cat:cd ((name_str:[[46 6f 75 6e 64 61 74 69 6f 6e] TO [46 6f 75 6e 64 61 74 69 6f 6e]]) (name_str:[[4a 68 65 72 65 67] TO [4a 68 65 72 65 67]])))",

    "parsedquery_toString":"+(cat:book cat:cd (name_str:[[46 6f 75 6e 64 61 74 69 6f 6e] TO [46 6f 75 6e 64 61 74 69 6f 6e]] name_str:[[4a 68 65 72 65 67] TO [4a 68 65 72 65 67]]))",

    "QParser":"LuceneQParser"}}

--------

http://localhost:8983/solr/books/select?q=%2B%28name_str%3AFoundation+OR+name_str%3AJhereg+OR+cat%3Abook+OR+cat%3Acd%29&debug=query
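
The two request URLs used above can be rebuilt from the plain query strings, which makes it easier to try other clause orders. This is a small Python sketch that uses only the standard library; the host, port, and collection name are the ones from the steps above, and the function name is hypothetical.

```python
from urllib.parse import urlencode

# Select handler of the "books" collection created in step 1.
BASE = "http://localhost:8983/solr/books/select"


def debug_url(query: str) -> str:
    """Build a select URL for the query with debug=query enabled."""
    return BASE + "?" + urlencode({"q": query, "debug": "query"})


abab = "+(name_str:Foundation OR cat:book OR name_str:Jhereg OR cat:cd)"
aabb = "+(name_str:Foundation OR name_str:Jhereg OR cat:book OR cat:cd)"

print(debug_url(abab))
print(debug_url(aabb))
```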



we did a little research, and we wonder whether this is a bug in SolrQueryParser.

more specifically, we think the if statement here might be wrong.

https://github.com/apache/lucene-solr/blob/branch_8_4/solr/core/src/java/org/apache/solr/parser/SolrQueryParserBase.java#L711



Could you please tell us whether this is a bug, or whether our query is simply wrong?



Thanks,

Hongtai Xue
