[ https://issues.apache.org/jira/browse/LUCENE-4382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Uwe Schindler updated LUCENE-4382:
----------------------------------
Fix Version/s: (was: 4.3)
4.4
> Unicode escape no longer works for non-suffix-only wildcard terms
> -----------------------------------------------------------------
>
> Key: LUCENE-4382
> URL: https://issues.apache.org/jira/browse/LUCENE-4382
> Project: Lucene - Core
> Issue Type: Bug
> Components: core/queryparser
> Affects Versions: 4.0-BETA
> Reporter: Jack Krupansky
> Fix For: 4.4
>
>
> LUCENE-588 added support for escaping of wildcard characters, but when the
> de-escaping logic was pushed down from the query parser (QueryParserBase)
> into WildcardQuery, support for Unicode escaping (backslash, "u", and the
> four-digit hex Unicode code) was not included.
> Two solutions:
> 1. Do the Unicode de-escaping in the query parser before calling
> getWildcardQuery.
> 2. Support Unicode de-escaping in WildcardQuery.
> A suffix-only wildcard does not exhibit this problem because full de-escaping
> is performed in the query parser before calling getPrefixQuery.
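
A minimal sketch of what solution 1 could look like, assuming the parser decodes only the \uXXXX sequences and passes every other backslash escape through untouched, so that WildcardQuery can still tell an escaped wildcard from a real one. The class and method names here are hypothetical and are not part of Lucene. The sketch also assumes that a code point which decodes to a wildcard or a backslash should stay literal, so it re-escapes it, and that a malformed escape is copied through rather than rejected.

```java
public final class UnicodeUnescaper {
    private UnicodeUnescaper() {}

    /**
     * Decodes backslash-u escapes (four hex digits) in a wildcard term.
     * Every other backslash pair is copied through unchanged so that a
     * later stage can still process escaped wildcard characters.
     */
    public static String decodeUnicodeEscapes(String input) {
        StringBuilder out = new StringBuilder(input.length());
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (c != '\\' || i + 1 >= input.length()) {
                // Ordinary character, or a trailing backslash: copy as-is.
                out.append(c);
                i++;
                continue;
            }
            char next = input.charAt(i + 1);
            if (next == 'u' && i + 6 <= input.length() && isHex(input, i + 2, i + 6)) {
                char decoded = (char) Integer.parseInt(input.substring(i + 2, i + 6), 16);
                // Assumption: an escaped code point is always meant literally,
                // so re-escape it if it would otherwise act as a metacharacter.
                if (decoded == '*' || decoded == '?' || decoded == '\\') {
                    out.append('\\');
                }
                out.append(decoded);
                i += 6;
            } else {
                // Any other escape pair (or a malformed one) is kept intact.
                out.append(c).append(next);
                i += 2;
            }
        }
        return out.toString();
    }

    private static boolean isHex(String s, int from, int to) {
        for (int k = from; k < to; k++) {
            if (Character.digit(s.charAt(k), 16) < 0) {
                return false;
            }
        }
        return true;
    }
}
```

Under these assumptions the term from the test cases in this issue, literal\:\u0063olo*n, would reach getWildcardQuery as literal\:colo*n, with the escaped colon and the real wildcard both preserved.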
> My test case, added at the beginning of
> TestExtendedDismaxParser.testFocusQueryParser:
> {code}
> assertQ("expected doc is missing (using escaped edismax w/field)",
> req("q", "t_special:literal\\:\\u0063olo*n",
> "defType", "edismax"),
> "//doc[1]/str[@name='id'][.='46']");
> {code}
> Note: That test case was only used to step into WildcardQuery in a debugger
> and confirm that the Unicode escape was not processed correctly. The
> assertion itself fails in all cases, but that is because of how the field
> type is analyzed.
> Here is a Lucene-level test case that can also be debugged to see that
> WildcardQuery is not processing the Unicode escape properly. I added it at
> the start of TestMultiAnalyzer.testMultiAnalyzer:
> {code}
> assertEquals("literal\\:\\u0063olo*n",
> qp.parse("literal\\:\\u0063olo*n").toString());
> {code}
> Note: This assertion always passes because it only checks the input pattern
> string given to WildcardQuery, not how the de-escaping is performed within
> WildcardQuery.
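
Since the toString() comparison above cannot expose the problem, a check on matching behavior may be more telling. The following is only a simplified stand-in for WildcardQuery, not its actual code: it assumes that a backslash makes the next character literal, that '*' matches any run of characters, and that '?' matches exactly one character. The class name is hypothetical. Under that model, an undecoded Unicode escape degrades to a literal "u" followed by the four hex digits.

```java
public final class WildcardEscapeDemo {
    private WildcardEscapeDemo() {}

    /** Returns true if the whole text matches the pattern under the simplified model. */
    public static boolean matches(String pattern, String text) {
        return matches(pattern, 0, text, 0);
    }

    private static boolean matches(String p, int pi, String t, int ti) {
        if (pi == p.length()) {
            return ti == t.length();
        }
        char c = p.charAt(pi);
        if (c == '*') {
            // Try every possible length for the run consumed by the wildcard.
            for (int k = ti; k <= t.length(); k++) {
                if (matches(p, pi + 1, t, k)) {
                    return true;
                }
            }
            return false;
        }
        if (c == '?') {
            return ti < t.length() && matches(p, pi + 1, t, ti + 1);
        }
        if (c == '\\' && pi + 1 < p.length()) {
            // Escape: the next character is taken literally, whatever it is.
            pi++;
            c = p.charAt(pi);
        }
        return ti < t.length() && t.charAt(ti) == c && matches(p, pi + 1, t, ti + 1);
    }
}
```

With this model, the pattern literal\:\u0063olo*n does not match the term "literal:colon" but does match "literal:u0063olon", while the fully de-escaped pattern literal\:colo*n matches "literal:colon". An assertion of that kind against the real WildcardQuery would distinguish the two behaviors, which the toString() comparison cannot do.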