Thanks, Erick, for your help.
I have another silly question.
Suppose I create multiple fieldTypes, e.g. news_English, news_Chinese, news_Japnese, etc. After creating these fields, can I copy all of them to a copyField destination "defaultquery", like below:

<copyField source="news_English" dest="defaultquery"/>
<copyField source="news_Chinese" dest="defaultquery"/>
<copyField source="news_Japnese" dest="defaultquery"/>

and my "defaultquery" looks like:

<field name="defaultquery" type="query_text" indexed="false" stored="false" multiValued="true"/>

Is this the right way to deal with multi-language indexing and searching?

On 9 June 2011 19:06, Erick Erickson <erickerick...@gmail.com> wrote:

> No, you'd have to create multiple fieldTypes, one for each language....
>
> Best
> Erick
>
> On Thu, Jun 9, 2011 at 5:26 AM, Mohammad Shariq <shariqn...@gmail.com> wrote:
> > Can I specify multiple languages in the filter tag in schema.xml? Like below:
> >
> > <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
> >   <analyzer type="index">
> >     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> >     <filter class="solr.StopFilterFactory" ignoreCase="true"
> >             words="stopwords.txt" enablePositionIncrements="true"/>
> >     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
> >             generateNumberParts="1" catenateWords="1" catenateNumbers="1"
> >             catenateAll="0" splitOnCaseChange="1"/>
> >
> >     <filter class="solr.SnowballPorterFilterFactory" language="Dutch" />
> >     <filter class="solr.SnowballPorterFilterFactory" language="English" />
> >     <filter class="solr.SnowballPorterFilterFactory" language="Chinese" />
> >     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> >     <tokenizer class="solr.CJKTokenizerFactory"/>
> >
> >     <filter class="solr.LowerCaseFilterFactory"/>
> >     <filter class="solr.SnowballPorterFilterFactory" language="Hungarian" />
> >
> >
> > On 8 June 2011 18:47, Erick Erickson <erickerick...@gmail.com> wrote:
> >
> >> This page is a handy reference for individual languages...
> >> http://wiki.apache.org/solr/LanguageAnalysis
> >>
> >> But the usual approach, especially for Chinese/Japanese/Korean
> >> (CJK) is to index the content in different fields with language-specific
> >> analyzers then spread your search across the language-specific
> >> fields (e.g. title_en, title_fr, title_ar). Stemming and stopwords
> >> particularly give "surprising" results if you put words from different
> >> languages in the same field.
> >>
> >> Best
> >> Erick
> >>
> >> On Wed, Jun 8, 2011 at 8:34 AM, Mohammad Shariq <shariqn...@gmail.com> wrote:
> >> > Hi,
> >> > I had set up Solr (solr-1.4 on Ubuntu 10.10) for indexing news articles in
> >> > English, but my requirement extends to indexing news in other languages too.
> >> >
> >> > This is how my schema looks:
> >> > <field name="news" type="text" indexed="true" stored="false"
> >> >        required="false"/>
> >> >
> >> >
> >> > And the "text" fieldType in schema.xml looks like:
> >> >
> >> > <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
> >> >   <analyzer type="index">
> >> >     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> >> >     <filter class="solr.StopFilterFactory" ignoreCase="true"
> >> >             words="stopwords.txt" enablePositionIncrements="true"/>
> >> >     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
> >> >             generateNumberParts="1" catenateWords="1" catenateNumbers="1"
> >> >             catenateAll="0" splitOnCaseChange="1"/>
> >> >     <filter class="solr.LowerCaseFilterFactory"/>
> >> >     <filter class="solr.SnowballPorterFilterFactory" language="English"
> >> >             protected="protwords.txt"/>
> >> >   </analyzer>
> >> >   <analyzer type="query">
> >> >     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> >> >     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
> >> >             ignoreCase="true" expand="true"/>
> >> >     <filter class="solr.StopFilterFactory" ignoreCase="true"
> >> >             words="stopwords.txt" enablePositionIncrements="true"/>
> >> >     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
> >> >             generateNumberParts="1" catenateWords="0" catenateNumbers="0"
> >> >             catenateAll="0" splitOnCaseChange="1"/>
> >> >     <filter class="solr.LowerCaseFilterFactory"/>
> >> >     <filter class="solr.SnowballPorterFilterFactory" language="English"
> >> >             protected="protwords.txt"/>
> >> >   </analyzer>
> >> > </fieldType>
> >> >
> >> >
> >> > My problem is:
> >> > Now I want to index news articles in other languages too, e.g.
> >> > Chinese, Japanese.
> >> > How can I modify my text field so that I can index the news in other
> >> > languages too and make it searchable?
> >> >
> >> > Thanks
> >> > Shariq
> >> >
> >> > --
> >> > View this message in context:
> >> > http://lucene.472066.n3.nabble.com/how-to-Index-and-Search-non-Eglish-Text-in-solr-tp3038851p3038851.html
> >> > Sent from the Solr - User mailing list archive at Nabble.com.
> >> >
> >>
> >
> > --
> > Thanks and Regards
> > Mohammad Shariq
>

--
Thanks and Regards
Mohammad Shariq
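
P.S. For reference, the per-language layout discussed in this thread could look roughly like the schema.xml sketch below. The type names text_en and text_cjk are hypothetical, the analyzer chains are deliberately minimal, and using solr.CJKTokenizerFactory for both Chinese and Japanese is an assumption on my part, not something confirmed in the thread.

```xml
<!-- Sketch only. text_en and text_cjk are hypothetical type names. -->
<fieldType name="text_en" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- English stemming applies to this type only -->
    <filter class="solr.SnowballPorterFilterFactory" language="English"/>
  </analyzer>
</fieldType>

<fieldType name="text_cjk" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Assumption: one CJK tokenizer shared by Chinese and Japanese -->
    <tokenizer class="solr.CJKTokenizerFactory"/>
  </analyzer>
</fieldType>

<!-- One field per language, each bound to its own analyzer chain -->
<field name="news_English" type="text_en"  indexed="true" stored="false"/>
<field name="news_Chinese" type="text_cjk" indexed="true" stored="false"/>
<field name="news_Japnese" type="text_cjk" indexed="true" stored="false"/>
```

With a layout like this, a search would be spread across the three fields, for example through the dismax parser's qf parameter (qf=news_English news_Chinese news_Japnese). Keep in mind that copyField hands the raw source text to the destination field, which is then analyzed by the destination's own type only, and that a field has to be declared indexed="true" to be searchable at all.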