I will be replying to my own question:

Looking at the source code of KeywordAnalyzer, I noticed that it was not
lowercasing the indexed fields, and the index did not contain lowercased
terms anyway, so I concluded that the query parser was responsible for this.

Looking at the source code of QueryParser next, I found
setLowercaseExpandedTerms(boolean):

  /**
   * Whether terms of wildcard, prefix, fuzzy and range queries are to be
   * automatically lower-cased or not.  Default is <code>true</code>.
   */
  @Override
  public void setLowercaseExpandedTerms(boolean lowercaseExpandedTerms) {
    this.lowercaseExpandedTerms = lowercaseExpandedTerms;
  }


The query parser lowercases terms only for wildcard, prefix, fuzzy and
range queries, and this can be turned off with
parser.setLowercaseExpandedTerms(false);

That solved my problem.
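
For anyone hitting the same thing, here is a small sketch of why the
lowercased prefix finds nothing in a case-sensitive index. It is not Lucene
code: the sorted set is only a stand-in for the term dictionary, and
PrefixCaseDemo, prefixMatches and the sample user agent strings are names
and data I made up for illustration. The boolean mirrors the parser flag
described above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.NavigableSet;
import java.util.TreeSet;

public class PrefixCaseDemo {

    // Hypothetical model of a prefix lookup against a case-sensitive,
    // sorted term dictionary. This is an assumption for illustration,
    // not how Lucene implements prefix queries internally.
    static List<String> prefixMatches(NavigableSet<String> terms, String prefix,
                                      boolean lowercaseExpandedTerms) {
        // Mimic the parser flag: lowercase the prefix before matching.
        String effective = lowercaseExpandedTerms
                ? prefix.toLowerCase(Locale.ROOT)
                : prefix;
        List<String> hits = new ArrayList<>();
        // Terms sharing a prefix are contiguous in sorted order, so start
        // at the prefix and stop at the first term that no longer matches.
        for (String term : terms.tailSet(effective, true)) {
            if (!term.startsWith(effective)) {
                break;
            }
            hits.add(term);
        }
        return hits;
    }

    public static void main(String[] args) {
        // Made-up sample terms, indexed as-is (no lowercasing), the way
        // KeywordAnalyzer leaves them.
        NavigableSet<String> terms = new TreeSet<>();
        terms.add("Mozilla/5.0 (X11; Linux x86_64)");
        terms.add("Mozilla/4.0 (compatible)");
        terms.add("curl/7.47.0");

        // Flag on: "Moz" becomes "moz", which matches no indexed term.
        System.out.println(prefixMatches(terms, "Moz", true));
        // Flag off: the prefix keeps its case and matches both Mozilla terms.
        System.out.println(prefixMatches(terms, "Moz", false));
    }
}
```

With the flag on, the prefix is compared as "moz" against terms that all
start with an uppercase "M", so the result is empty. With it off, both
Mozilla terms come back.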

Best regards,
C.

On Thu, Sep 22, 2016 at 5:01 PM, Cam Bazz <camb...@gmail.com> wrote:

> Hello,
>
> I am indexing userAgent fields found in apache logs. Indexing and querying
> everything with
> KeywordAnalyzer - But I found something strange:
>
>
>             IndexSearcher searcher = new IndexSearcher(reader);
>             Analyzer q_analyzer = new KeywordAnalyzer();
>             QueryParser parser = new QueryParser("userAgent", q_analyzer);
>
>             System.out.println(queryStr);
>             Query query = parser.parse(queryStr);
>             System.out.println(query.toString());
>
>
> When searched for userAgent:Moz* the above code will output:
>
> Info:   userAgent:Mo*
> Info:   userAgent:mo*
>
> The keyword analyzer is clearly turning the query string into lower case.
>
> Is there a way to avoid it? The index is case sensitive (it won't contain
> any documents starting with mo*)
> but the keyword analyzer turns everything into lower case.
>
> Best Regards,
> C.
>
