Re: Multi-language indexing and searching

Chris Hostetter Tue, 19 Jun 2007 11:19:51 -0700

: range wouldn't be a problem in this case. The real issue I can see in this
: approach, is related to Analyzers... How to make them deal with different
: languages properly using one Solr instance with the same set of fields being
: used by documents in different languages....


i would still use the same type of schema i suggested before ... one
fieldtype per language and dynamic fields per language ... it's just that
now you don't bother indexing text in both the english_title and
french_title fields ... you use one or the other depending on what the
language of this particular translation is, just as you guessed...

: Looks like my best alternative then is using dynamic fields having then a
: set of fields for each language. But anyway I think I'll still need a way to
: apply different analyzers at query time so I can deal with each language
: details. Is it correct?

No, you just need your client to query the right field based on the
language ... the analyzer comes from the fieldtype, so querying a spanish
field gets you spanish analysis at query time too ... if the end user
wants to search for "playa" in all "spanish" documents your client code
should query for something like "spanish_title:playa^3 spanish_body:playa"
... you could even have a dismax handler instance configured per language
to make this transparent to the client...

      q=playa&qt=spanish
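
To make the client side concrete, here is a minimal sketch in Python of both
options: building the field-scoped query by hand, and building the parameters
for a per-language handler. The field naming (<language>_title,
<language>_body), the default boosts, and both function names are assumptions
for illustration only; the sketch also does not escape query syntax characters
in the search term.

```python
from urllib.parse import urlencode


def build_language_query(term, language, boosts=None):
    """Build a query string that targets the per-language fields.

    Assumes fields are named <language>_<field>, e.g. spanish_title.
    The term is used as-is (no escaping of query syntax characters).
    """
    if boosts is None:
        # Hypothetical defaults mirroring the example in this thread.
        boosts = {"title": 3, "body": 1}
    clauses = []
    for field, boost in boosts.items():
        clause = f"{language}_{field}:{term}"
        if boost != 1:
            clause += f"^{boost}"
        clauses.append(clause)
    return " ".join(clauses)


def build_handler_params(term, language):
    """Build URL parameters for a request handler named after the language.

    Assumes a handler (e.g. a dismax instance) is registered under the
    language name, so the client only passes the raw term.
    """
    return urlencode({"q": term, "qt": language})
```

With the per-language handler, the boosts and field list live in the handler
configuration instead of the client, which is what makes the second form
shorter.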

...since you now only have one language per "document" the response docs
your client gets back are even easier to deal with, because you can have a
single stored field for each conceptual field for display purposes using
the canonical name ... the client can display the "title" field and the
"author" field and not have to know/remember that the search is in
spanish.
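
On the indexing side, a rough sketch of what the client might send for one
translation, again in Python. It assumes the canonical fields ("title",
"author") are the stored display fields and the <language>_* fields are the
indexed, language-analyzed ones; the function name, the "id" field, and the
exact field set are hypothetical and would depend on your schema.

```python
def build_document(doc_id, language, title, body, author):
    """Build one document for a single translation.

    Only the fields for this document's language are populated, so a
    spanish document never carries english_title or french_title.
    """
    return {
        "id": doc_id,
        # Canonical names, assumed stored, used by the client for display.
        "title": title,
        "author": author,
        # Language-specific names, assumed indexed with that language's analyzer.
        f"{language}_title": title,
        f"{language}_body": body,
    }
```

A client rendering results can then read doc["title"] and doc["author"]
regardless of which language handler produced the match.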



-Hoss
