I will be replying to my own question:
Looking at the source code of KeywordAnalyzer, I noticed that it does not
lowercase the indexed fields, and the index was not lowercased anyway,
so I suspected that the query parser was responsible.
Looking at the source code of QueryParser, I then found
setLowercaseExpandedTerms(boolean):
  /**
   * Whether terms of wildcard, prefix, fuzzy and range queries are to be
   * automatically lower-cased or not. Default is <code>true</code>.
   */
  @Override
  public void setLowercaseExpandedTerms(boolean lowercaseExpandedTerms) {
    this.lowercaseExpandedTerms = lowercaseExpandedTerms;
  }
The query parser lowercases query terms only for wildcard, prefix,
fuzzy, and range queries, and this can be turned off with
  parser.setLowercaseExpandedTerms(false);
This solved my problem.
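
To illustrate why the lowercased prefix finds nothing in a case-sensitive
index, here is a toy model in plain Java. It is not Lucene code: the class
PrefixCaseDemo, the method prefixSearch, and the sample user-agent strings
are all made up for this sketch, and the "index" is just a list of exact,
untouched terms, as KeywordAnalyzer would produce them.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

// Hypothetical toy model, NOT the Lucene API: it only mimics the effect of
// the lowercaseExpandedTerms flag on a prefix query.
final class PrefixCaseDemo {

    // Returns the indexed terms that start with the prefix. When
    // lowercaseExpandedTerms is true, the prefix is lowercased first,
    // while the indexed terms are compared as-is (case-sensitive).
    static List<String> prefixSearch(List<String> indexedTerms,
                                     String prefix,
                                     boolean lowercaseExpandedTerms) {
        String effective = lowercaseExpandedTerms
                ? prefix.toLowerCase(Locale.ROOT)
                : prefix;
        return indexedTerms.stream()
                .filter(term -> term.startsWith(effective))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Made-up sample values standing in for indexed userAgent fields.
        List<String> index = Arrays.asList("Mozilla/5.0", "curl/7.47.0");

        // Flag on (the default described above): "Moz" becomes "moz".
        System.out.println(prefixSearch(index, "Moz", true));   // prints []

        // Flag off: the prefix is used exactly as typed.
        System.out.println(prefixSearch(index, "Moz", false));  // prints [Mozilla/5.0]
    }
}
```

With the flag on, the search term no longer matches the case of the stored
term, which is the same mismatch I was seeing with userAgent:Moz*.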
Best regards,
C.
On Thu, Sep 22, 2016 at 5:01 PM, Cam Bazz wrote:
> Hello,
>
> I am indexing userAgent fields found in Apache logs. I am indexing and
> querying everything with KeywordAnalyzer, but I found something strange:
>
>
> IndexSearcher searcher = new IndexSearcher(reader);
> Analyzer q_analyzer = new KeywordAnalyzer();
> QueryParser parser = new QueryParser("userAgent", q_analyzer);
>
> System.out.println(queryStr);
> Query query = parser.parse(queryStr);
> System.out.println(query.toString());
>
>
> When searched for userAgent:Moz* the above code will output:
>
> Info: userAgent:Mo*
> Info: userAgent:mo*
>
> The keyword analyzer is clearly turning the query string into lower case.
>
> Is there a way to avoid it? The index is case sensitive (it won't find any
> documents starting with mo*),
> but the keyword analyzer turns everything into lower case.
>
> Best Regards,
> C.
>