On Fri, May 6, 2011 at 10:35 PM, cyang2010 <ysxsu...@hotmail.com> wrote:
> When user-entered text contains special characters, can this be taken care
> of by the tokenizer/filter configured on the field?
>
> In application code, do I need to parse the user input string and add an
> escape in front of those special characters?  If so, will those special
> characters differ between languages, such as English versus Chinese?
>
> As of now, I don't parse those special characters, and I am getting
> inconsistent/strange behavior and errors.  For example:
>
> 1. search: title_name_en_US:(my! god)
> solr thinks the second term "god" is something NOT to include, why is that?

"!" is a synonym for the NOT operator in the Lucene query parser syntax,
so "my! god" is parsed as "my" followed by NOT "god".
The fact that it's treated as an operator even when followed by
whitespace is a bug.
This was fixed by LUCENE-2566 (the fix is in trunk, but not in 3.1).

One workaround is to escape the "!" with a backslash or to quote the term:
title_name_en_US:(my\! god)
title_name_en_US:("my!" god)
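
If you want to do the escaping in application code, something along
these lines might work. This is only a rough sketch: the helper names
are made up for illustration, and the set of special characters is my
assumption based on the Lucene query syntax docs, so check it against
the version you are running.

```python
# Characters with special meaning to the Lucene query parser.
# Assumed set; verify against the query syntax docs for your version.
LUCENE_SPECIAL_CHARS = set('\\+-!():^[]"{}~*?|&')


def escape_query_term(text):
    """Prefix every special character in text with a backslash."""
    return "".join(
        "\\" + ch if ch in LUCENE_SPECIAL_CHARS else ch for ch in text
    )


def build_field_query(field, user_input):
    """Build field:(term1 term2 ...) from raw user input.

    Hypothetical helper: splits on whitespace and escapes each term.
    """
    terms = [escape_query_term(term) for term in user_input.split()]
    return "%s:(%s)" % (field, " ".join(terms))


print(build_field_query("title_name_en_US", "my! god"))
# prints: title_name_en_US:(my\! god)
```

Note that this treats all user input as literal text, so users lose
the ability to type operators or wildcards on purpose.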

In general, the Lucene query parser isn't meant for directly handling
literal user queries, since it has a stricter syntax (like SQL).
A query parser like dismax or edismax may help (try adding
defType=dismax to your request).  They are designed to accept raw
user input and to try to never throw exceptions.
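
For illustration, a dismax request could be built roughly like this.
The host, port, and path are assumptions (a default local install),
and the helper name is hypothetical; "q", "defType", and "qf" are the
request parameters, with "qf" naming the fields dismax should search.

```python
from urllib.parse import urlencode

# Assumed location of the Solr select handler; adjust for your setup.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def dismax_url(user_input, query_fields):
    """Build a select URL that passes raw user input to dismax."""
    params = {
        "q": user_input,                # raw user text, no manual escaping
        "defType": "dismax",            # switch away from the Lucene parser
        "qf": " ".join(query_fields),   # fields to search
    }
    # urlencode handles URL encoding only; it is not query-syntax escaping.
    return SOLR_SELECT_URL + "?" + urlencode(params)


print(dismax_url("my! god", ["title_name_en_US"]))
```

With dismax the field goes in "qf" rather than in the query string
itself, so the user's text can be passed through as typed.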


-Yonik
http://www.lucenerevolution.org -- Lucene/Solr User Conference, May
25-26, San Francisco
