Re: how to Index and Search non-Eglish Text in solr

2011-06-10 Thread Erick Erickson
Well, no. Specifying both indexed and stored as false is essentially a no-op; you'd never find anything! But even with indexed=true, this solution has problems. It's essentially using a single field to store text from different languages. The problem is that tokenization, stemming, etc. behaves

Re: how to Index and Search non-Eglish Text in solr

2011-06-09 Thread Mohammad Shariq
Can I specify multiple languages in the filter tag in schema.xml? Like below: <fieldType name="text" class="solr.TextField" positionIncrementGap="100"> <analyzer type="index"> <tokenizer class="solr.WhitespaceTokenizerFactory"/> <filter class="solr.StopFilterFactory" ignoreCase="true"

Re: how to Index and Search non-Eglish Text in solr

2011-06-09 Thread Erick Erickson
No, you'd have to create multiple fieldTypes, one for each language. Best, Erick. On Thu, Jun 9, 2011 at 5:26 AM, Mohammad Shariq shariqn...@gmail.com wrote: Can I specify multiple languages in the filter tag in schema.xml? Like below: <fieldType name="text" class="solr.TextField"
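As a rough illustration of "one fieldType per language", a schema.xml fragment might look something like the sketch below. The type names text_en and text_cjk are made up for this example, the field names are borrowed from elsewhere in the thread, and the filter chains are only one plausible choice. It also assumes solr.CJKTokenizerFactory is available in your Solr version, so check the LanguageAnalysis wiki page before relying on it.

```xml
<!-- Sketch only: text_en and text_cjk are hypothetical type names. -->
<types>
  <!-- English: standard tokenization, lowercasing, stop words, stemming -->
  <fieldType name="text_en" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
      <filter class="solr.PorterStemFilterFactory"/>
    </analyzer>
  </fieldType>

  <!-- Chinese/Japanese/Korean: bigram-style tokenization, no stemming -->
  <fieldType name="text_cjk" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
      <tokenizer class="solr.CJKTokenizerFactory"/>
    </analyzer>
  </fieldType>
</types>

<fields>
  <!-- One field per language, each bound to its own analysis chain -->
  <field name="news_English" type="text_en" indexed="true" stored="true"/>
  <field name="news_Chinese" type="text_cjk" indexed="true" stored="true"/>
</fields>
```

Each document would then put its text into whichever field matches its language, so that the English stemmer never runs over CJK text and vice versa.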

Re: how to Index and Search non-Eglish Text in solr

2011-06-09 Thread Mohammad Shariq
Thanks, Erick, for your help. I have another silly question. Suppose I created multiple fieldTypes, e.g. news_English, news_Chinese, news_Japnese, etc. After creating these fields, can I copy all of them to a copyField destination, defaultquery, like below: <copyField source="news_English" dest="defaultquery"/> <copyField

Re: how to Index and Search non-Eglish Text in solr

2011-06-08 Thread Erick Erickson
This page is a handy reference for individual languages... http://wiki.apache.org/solr/LanguageAnalysis But the usual approach, especially for Chinese/Japanese/Korean (CJK), is to index the content in different fields with language-specific analyzers, then spread your search across the
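One way to spread a search across per-language fields, offered here only as a sketch, is a dismax request handler in solrconfig.xml whose qf parameter lists those fields. The handler name /search is hypothetical, and the field names are the ones proposed elsewhere in the thread; the message above is cut off, so this may not be exactly the setup it goes on to describe.

```xml
<!-- Sketch only: the handler name and field list are assumptions. -->
<requestHandler name="/search" class="solr.SearchHandler">
  <lst name="defaults">
    <!-- dismax applies the user's query to every field listed in qf -->
    <str name="defType">dismax</str>
    <str name="qf">news_English news_Chinese news_Japnese</str>
  </lst>
</requestHandler>
```

With this in place, each field analyzes the query text with its own language-specific chain at search time, which is the property a single catch-all field cannot provide.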