Customizing Solr For a Certain Language

2013-05-02 Thread Furkan KAMACI
Hi folks;

I want to use Solr to index a language other than English, specifically
Turkish documents. I plan to implement some algorithms that are better
suited to Turkish than to English. Is there a wiki page that explains the
steps for this? In particular, what are the main parts of a customized
Analyzer: for example, a suitable stopwords.txt, a stemming algorithm, a
customized tokenizer for the language, customized token filters, and so
on?

Which steps should I follow?


Re: Customizing Solr For a Certain Language

2013-05-02 Thread Alexandre Rafalovitch
Have you looked at the main example that comes with Solr? It contains
a specific configuration for Turkish. Perhaps you could try that and
narrow the question to more precise issues?
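
For orientation, here is a minimal sketch in plain Java (not Solr or
Lucene code) of the stages such an analysis chain usually strings
together: a tokenizer, a Turkish-aware lowercase step, and a stop-word
filter. The class name, the method name, and the three-word stop list are
made up for illustration, and stemming is left out entirely. In Solr
itself these stages are declared in schema.xml as tokenizer and filter
factories rather than written by hand.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration class; not part of Solr or Lucene.
final class TurkishAnalysisSketch {

    private static final Locale TURKISH = Locale.forLanguageTag("tr-TR");

    // Tiny made-up stop list; a real stopwords.txt would be much longer.
    private static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("ve", "bir", "bu"));

    public static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        // Stage 1 (tokenizer): split on runs of non-letter, non-digit characters.
        for (String raw : text.split("[^\\p{L}\\p{Nd}]+")) {
            if (raw.isEmpty()) {
                continue;
            }
            // Stage 2 (lowercase filter): Turkish rules map 'I' to dotless
            // U+0131 and dotted U+0130 to plain 'i', unlike the default rules.
            String lower = raw.toLowerCase(TURKISH);
            // Stage 3 (stop filter): drop very common words.
            if (STOP_WORDS.contains(lower)) {
                continue;
            }
            tokens.add(lower);
        }
        return tokens;
    }

    public static void main(String[] args) {
        // "Bu ISIK ve Istanbul" written with Unicode escapes for the
        // Turkish letters so the source stays ASCII-only.
        System.out.println(analyze("Bu I\u015EIK ve \u0130stanbul"));
    }
}
```

The lowercase stage is the part most specific to Turkish: with
English-style lowercasing, "I" becomes "i" instead of the dotless form,
so query and index terms may stop matching.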

I don't remember any Turkish-specific discussions, but perhaps
something can be learned from searching for discussions on supporting
Chinese and German languages.

Regards,
   Alex.

