Hi Hoss

One drawback of having language-specific fields (in my view) is that you
will have to re-index your content whenever you add a new language (some
people will start with one language and add others later). But OK, let's
say the indexing is done.

So using dynamic fields, and creating a language variation of every field
type that may need language-aware processing, could work. But this way you
end up with a different "interface", since the system will receive and
return a different set of fields in queries, won't it?
That could be avoided by transforming the request / response between a
language-aware and a language-unaware format:
requests: transforming  fieldName => fieldName_language
responses: transforming  fieldName_language => fieldName
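
To make the idea concrete, here is a minimal sketch of such a translation
layer in Python. Everything in it is an assumption for illustration only:
the helper names, the set of language-aware fields, and the "_" separator
are hypothetical and are not part of Solr.

```python
# Hypothetical set of fields that get a per-language variant in the index.
LANGUAGE_AWARE_FIELDS = {"title", "author", "body"}


def to_language_fields(params, language, fields=LANGUAGE_AWARE_FIELDS):
    """Request side: fieldName => fieldName_language."""
    return {
        (f"{name}_{language}" if name in fields else name): value
        for name, value in params.items()
    }


def from_language_fields(doc, language, fields=LANGUAGE_AWARE_FIELDS):
    """Response side: fieldName_language => fieldName."""
    suffix = f"_{language}"
    result = {}
    for name, value in doc.items():
        base = name[: -len(suffix)] if name.endswith(suffix) else name
        # Only strip the suffix from fields known to be language-aware;
        # everything else (ids, dates, ...) passes through unchanged.
        result[base if base in fields else name] = value
    return result
```

For example, to_language_fields({"title": "Disbelief", "id": "42"}, "english")
would produce title_english and id, and from_language_fields() reverses that
on each returned document, so the client only ever sees "title".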

And you still will not be able to search across all your documents... It may
be interesting to search for the most recently published content (no matter
which language that content is in)...

What do you think about it?

Regards,
Daniel

On 12/6/07 19:50, "Chris Hostetter" <[EMAIL PROTECTED]> wrote:

> 
> : Due to the proliferation of the number of fields.  Say, we want
> : to have the field "title" to have the title of the book in
> : its original language.  But because Solr has this implicit
> : assumption of one language per field, we would have to have
> : the artificial fields title_fr, title_de, title_en, title_es,
> : etc. etc. for the number of supported languages, only one of
> : which has a real value per document.  This sounds silly, doesn't it?
> 
> not really, i have indexes with *thousands* of fields ... if you turn
> field norms off it's extremely efficient, but even with norms: 50*n fields
> where n is the number of "real" fields you have (title, author, etc..)
> should work fine.
> 
> furthermore, declaration of these fields can be simple -- if you have a
> language you want to treat special, then presumably you have a special
> analyzer for it.  dynamicFields where the field name is the wildcard
> and the language is set can be used to handle all of the different
> "indexed" fields,
> 
> <dynamicField name="*english" type="english" />
> <dynamicField name="*french" type="french" />
> <dynamicField name="*spanish" type="spanish" />
> ...more like the above for each language you want to support...
> <copyField source="*_english" dest="english" />
> <copyField source="*_french" dest="french" />
> <copyField source="*_spanish" dest="spanish" />
> ...more like the above for each language you want to support...
> 
> and now you can index documents with fields like this...
> 
>    author_english = Mr. Chris Hostetter
>    author_spanish = Senor Cristobol Hostetter
>    body_english = I can't Believe It's not butter
>    body_spanish = No puedo creer que no es mantequilla
>    title_english = One Man's Disbelief
> 
> ...and you can search on english:Chris, spanish:Cristobol,
> author_spanish:Cristobol, etc...
> 
> you could even add dynamicFields with the field name set and the language
> wildcarded to handle any fields used solely for display with even less
> declaration (one per field instead of one per language) ...
> 
> <dynamicField name="display_title_*" type="string" />
> ...
> 
> 
> 
> 
> -Hoss
> 

