On Fri, May 6, 2011 at 10:35 PM, cyang2010 <ysxsu...@hotmail.com> wrote:
> When user entered text contains special character, can this being taken care
> by the tokenizer/filter configured at the field?
>
> In application code, Do i need to parse the user input string and add the
> escape in front of those special character? If so, will those special
> characters differ for different language, such as english versus chinese?
>
> As of now, I didn't parse those special character. i am getting this
> inconsistent/strange behavior/error. For example:
>
> 1. search: title_name_en_US:(my! god)
> solr thinks the second term "god" is something NOT to include, why is that?
"!" is a synonym for the NOT operator in lucene query parser syntax. The fact that it's treated as an operator even when followed by whitespace is a bug. This was fixed by LUCENE-2566 (which is in the trunk version, but not 3.1) One workaround is to escape the "!" or quote the term. title_name_en_US:(my\! god) title_name_en_US:("my!" god) In general, the lucene query parser isn't meant for directly handling literal user queries since it has a more strict syntax (like SQL). Something like the dismax or edismax may help (try adding defType=dismax to your request). They are designed to try and never throw exceptions. -Yonik http://www.lucenerevolution.org -- Lucene/Solr User Conference, May 25-26, San Francisco