The general idea is that tokenization can usually be done in a language-independent manner, whereas stemming, synonyms, stop words, etc. must be handled in a language-dependent manner.

So, yes, text_en is a better starting point than text_general for adding the more advanced language processing features such as stemming.
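
As a rough illustration of that split, here is a minimal sketch of an English field type for schema.xml. It is not the exact text_en definition shipped with Solr. The name text_en_sketch is hypothetical, and the stop word file path is an assumption based on the example configuration, so adjust both to your setup. The tokenizer is the language-independent part; the stop word and stemming filters are the English-specific part.

```xml
<!-- Hypothetical field type name; illustrative sketch only -->
<fieldType name="text_en_sketch" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Language-independent: split text into tokens -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Language-dependent: English stop words (file path is an assumption) -->
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_en.txt"/>
    <!-- Language-dependent: English stemming -->
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
</fieldType>
```

The stemmer is simply one more filter in the analyzer chain, so switching to a different stemmer or a different language means replacing the language-dependent filters while leaving the tokenizer alone. Documents need to be reindexed after the field type changes.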

-- Jack Krupansky

-----Original Message-----
From: Mysurf Mail
Sent: Monday, June 24, 2013 10:26 AM
To: solr-user@lucene.apache.org
Subject: What should be the definitions ( field type ) for a field that will be search with user free text

Currently I am using text_general.
I want to support free-text search by users, therefore I would like
tokenization, stemming, ...
How do I define stemmers?
Should I use text_en instead of text_general?
Thank you.
