[
https://issues.apache.org/jira/browse/LUCENE-4247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Uwe Schindler resolved LUCENE-4247.
-----------------------------------
Resolution: Invalid
Assignee: Uwe Schindler
This is not a bug. You are using a wildcard query (here a prefix query, because
of the trailing "*"), and the query parser does not pass wildcard terms through
the analyzer, because the analyzer would destroy the wildcard characters.
Without the "*" at the end, this query would be analyzed and parsed as you
expect.
Please ask such questions on the [email protected] mailing list
first; people there will help you with such things.
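As a rough illustration of one way around this (a sketch only, not something
the query parser does for you): fold the raw search text yourself before
handing it to the parser, so that the wildcard term already matches the folded
terms in the index. The class and method names below are hypothetical, and
java.text.Normalizer from the JDK stands in for ASCIIFoldingFilter here. It only
strips combining marks, so characters such as "ø" or "ß", which
ASCIIFoldingFilter also maps, would pass through unchanged.

```java
import java.text.Normalizer;
import java.util.Locale;

// Hypothetical helper, not part of Lucene. It assumes the input is a single
// search term (optionally with "*" or "?" wildcards), not a full query with
// operators such as AND/OR, because the whole string is lowercased.
final class WildcardTermFolding {

    private WildcardTermFolding() {
    }

    static String fold(String queryText) {
        // Decompose accented letters into base letter + combining mark(s).
        String decomposed = Normalizer.normalize(queryText, Normalizer.Form.NFD);
        // Drop the combining marks; "*" and "?" are not marks and are kept.
        String stripped = decomposed.replaceAll("\\p{M}+", "");
        // Mirror the LowerCaseFilter step of the index-time analyzer.
        return stripped.toLowerCase(Locale.ROOT);
    }
}
```

With the input from the report, WildcardTermFolding.fold("příLIš*") yields
"prilis*", which can then be passed to QueryParser.parse as before.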
> QueryParser doesn't call Analyzer
> ---------------------------------
>
> Key: LUCENE-4247
> URL: https://issues.apache.org/jira/browse/LUCENE-4247
> Project: Lucene - Core
> Issue Type: Bug
> Components: core/queryparser
> Affects Versions: 3.6
> Reporter: Zied Hamdi
> Assignee: Uwe Schindler
> Original Estimate: 72h
> Remaining Estimate: 72h
>
> I'm trying to fold Czech characters through the ASCIIFoldingFilter. This
> works fine at index time, since I can retrieve the non-diacritic version of
> the content I indexed. But searching with diacritics always returns 0
> results.
> In debug mode I can clearly see that the Analyzer wasn't called (in addition,
> I've put a breakpoint in my analyzer to check whether it is called later, and
> it is never hit).
> searchText = "příLIš*";
> Analyzer analyzer = (Analyzer) factory.getBean("analyzer");
> Query q = new QueryParser((Version) factory.getBean("version"),
> DestinationPlaceProperties.NAME, analyzer).parse(searchText);
> The query q has these values in debug:
> prefix Term (id=90)
> field "name" (id=100)
> text "příliš" (id=101)
> --- more details ----
> q PrefixQuery (id=65)
> boost 1.0
> numberOfTerms 0
> prefix Term (id=90)
> rewriteMethod MultiTermQuery$2 (id=92)
> ---------------------
> My analyzer is quite simple; I include its code just for reference:
> public class DestinationAnalyser extends Analyzer {
>
>     private final Version luceneVersion;
>
>     public DestinationAnalyser(Version lucene_version) {
>         super();
>         this.luceneVersion = lucene_version;
>     }
>
>     /*
>      * (non-Javadoc)
>      *
>      * @see org.apache.lucene.analysis.Analyzer#tokenStream(java.lang.String,
>      * java.io.Reader)
>      */
>     @Override
>     public TokenStream tokenStream(String fieldName, Reader reader) {
>         TokenStream result = new StandardTokenizer(luceneVersion, reader);
>         result = new StandardFilter(luceneVersion, result);
>         result = new LowerCaseFilter(luceneVersion, result);
>         result = new ASCIIFoldingFilter(result);
>         return result;
>     }
> }
> --------- WORKAROUND ---------
> To avoid the problem, I'm currently using this method to transform the search
> text:
> /**
>  * Uses {@link ASCIIFoldingFilter} to transform diacritical text to its
>  * ASCII counterpart.
>  *
>  * @param text
>  *            to transform
>  * @return ascii text
>  */
> public static String foldToASCII(String text) {
>     int length = text.length();
>     // One input char can fold to up to 4 ASCII chars, so size the output
>     // buffer accordingly and keep only the chars actually written.
>     char[] toReturn = new char[4 * length];
>     int outputLength = ASCIIFoldingFilter.foldToASCII(text.toCharArray(), 0,
>             toReturn, 0, length);
>     return new String(toReturn, 0, outputLength);
> }