[
https://issues.apache.org/jira/browse/LUCENE-800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12475457
]
Michael Busch commented on LUCENE-800:
--------------------------------------
Hi Dilip,
the backslash is the escape character in Lucene's queryparser syntax. So if you
want to search for a backslash you have to escape it. That means that the first
two examples you provided are working as expected:
item:\\ -> item:\ is correct
item:\\* -> item:\* is correct too
If you want to search for two backslashes you have to escape both, meaning you
have to put four backslashes in the query string:
item:\\\\* -> item:\\*
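To make the rule above concrete, here is a minimal sketch in plain Java. It does not use Lucene and is not the QueryParser's actual implementation; the class UnescapeSketch and the method unescape are hypothetical names chosen for illustration. It only shows the mapping described above: each escaping backslash is dropped and the character that follows it is kept.

```java
class UnescapeSketch {
    // Hypothetical helper (an assumption for illustration, not Lucene code):
    // removes each escaping backslash and keeps the character it escapes.
    public static String unescape(String query) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < query.length(); i++) {
            char c = query.charAt(i);
            if (c == '\\' && i + 1 < query.length()) {
                // Skip the escape character, keep the escaped one.
                out.append(query.charAt(++i));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Java string literals add their own level of escaping, so every
        // backslash in the query string is written twice in source code.
        String[] queries = {"item:\\\\", "item:\\\\*", "item:\\\\\\\\*"};
        for (String q : queries) {
            System.out.println(q + " -> " + unescape(q));
        }
    }
}
```

Running main prints the three mappings from this comment: item:\\ -> item:\, item:\\* -> item:\* and item:\\\\* -> item:\\*.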
But you indeed found two other problems. You are right, the last example should
not throw a ParseException.
In (item:\\ item:ABCD\\) the queryparser wrongly treats the closing
parenthesis as escaped, when in fact the backslash in front of it is itself
the escaped character. I will provide a patch for this problem soon.
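One way to tell the two cases apart, sketched here in plain Java, is to count the backslashes directly in front of the character: an odd count means the character is escaped, an even count means the backslashes escape each other. This is only an illustration of the idea, not the patch itself; EscapeCheckSketch and isEscaped are hypothetical names.

```java
class EscapeCheckSketch {
    // Hypothetical helper (an assumption for illustration, not Lucene code):
    // returns true if the character at 'index' is preceded by an odd number
    // of consecutive backslashes, i.e. if it is itself escaped.
    public static boolean isEscaped(String s, int index) {
        int backslashes = 0;
        for (int i = index - 1; i >= 0 && s.charAt(i) == '\\'; i--) {
            backslashes++;
        }
        return backslashes % 2 == 1;
    }

    public static void main(String[] args) {
        // Raw query: (item:\\ item:ABCD\\)
        String balanced = "(item:\\\\ item:ABCD\\\\)";
        // Raw query: (item:ABCD\)
        String escapedParen = "(item:ABCD\\)";
        System.out.println(isEscaped(balanced, balanced.length() - 1));
        System.out.println(isEscaped(escapedParen, escapedParen.length() - 1));
    }
}
```

In the first query the closing parenthesis follows two backslashes, so it is not escaped and closes the group. In the second it follows a single backslash, so it is escaped.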
And as you said, the third example should throw a ParseException because there
are too many closing parentheses. There is already a patch for this problem in
JIRA:
http://issues.apache.org/jira/browse/LUCENE-372
I will commit fixes for both problems soon.
Thanks again, Dilip! Good catches :-)
> Incorrect parsing by QueryParser.parse() when it encounters backslashes
> (always eats one backslash.)
> ----------------------------------------------------------------------------------------------------
>
> Key: LUCENE-800
> URL: https://issues.apache.org/jira/browse/LUCENE-800
> Project: Lucene - Java
> Issue Type: Bug
> Components: QueryParser
> Reporter: Dilip Nimkar
> Assigned To: Michael Busch
>
> Test code and output follow. Tested with Lucene 1.9 only. Affects those
> who would index/search for Lucene's reserved characters.
> Description: When an input search string has a sequence of N (java-escaped)
> backslashes, where N >= 2, the QueryParser will produce a query in which that
> sequence has N-1 backslashes.
> TEST CODE:
> Analyzer analyzer = new WhitespaceAnalyzer();
> String[] queryStrs = {"item:\\\\",
>                       "item:\\\\*",
>                       "(item:\\\\ item:ABCD\\\\))",
>                       "(item:\\\\ item:ABCD\\\\)"};
> for (String queryStr : queryStrs) {
>     System.out.println("--------------------------------------");
>     System.out.println("String queryStr = " + queryStr);
>     Query luceneQuery = null;
>     try {
>         luceneQuery = new QueryParser("_default_", analyzer).parse(queryStr);
>         System.out.println("luceneQuery.toString() = " +
>                            luceneQuery.toString());
>     } catch (Exception e) {
>         System.out.println(e.getClass().toString());
>     }
> }
> OUTPUT (with remarks in comment notation):
> --------------------------------------
> String queryStr = item:\\
> luceneQuery.toString() = item:\ //One backslash has disappeared.
> Searcher will fail on this query.
> --------------------------------------
> String queryStr = item:\\*
> luceneQuery.toString() = item:\* //One backslash has disappeared.
> This query will search for something unintended.
> --------------------------------------
> String queryStr = (item:\\ item:ABCD\\))
> luceneQuery.toString() = item:\ item:ABCD\) //This should have thrown a
> ParseException because of an unescaped ')'. It did not.
> --------------------------------------
> String queryStr = (item:\\ item:ABCD\\)
> class org.apache.lucene.queryParser.ParseException //...and this one
> should not have, but it did.