If the language is also known at search time, PerFieldAnalyzerWrapper seems like a nice third option: a single document per feed, with a separate field for each language and additional field(s) for the common data. Use PerFieldAnalyzerWrapper at both indexing and search time, and use a FieldSelector at search time to retrieve only the relevant field(s) of the matched documents. (I have never done this myself, though.)

- Doron
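
To make the per-field idea concrete, here is a minimal sketch in plain Java. It is not Lucene code: it only models what PerFieldAnalyzerWrapper does, namely picking an analyzer by field name and falling back to a default. The class name, the field names (content_en, content_fr, date) and the tiny stop-word lists are all hypothetical, chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

// Plain-Java model of per-field analysis. This is NOT Lucene's API;
// an "analyzer" here is simply a function from raw text to tokens.
class PerFieldAnalysisSketch {
    private final Function<String, List<String>> defaultAnalyzer;
    private final Map<String, Function<String, List<String>>> perField = new HashMap<>();

    PerFieldAnalysisSketch(Function<String, List<String>> defaultAnalyzer) {
        this.defaultAnalyzer = defaultAnalyzer;
    }

    // Register a field-specific analyzer (hypothetical helper).
    void addAnalyzer(String field, Function<String, List<String>> analyzer) {
        perField.put(field, analyzer);
    }

    // Use the field's own analyzer if one is registered, else the default.
    List<String> analyze(String field, String text) {
        return perField.getOrDefault(field, defaultAnalyzer).apply(text);
    }

    // Toy language analyzer: lower-case, split on non-alphanumerics,
    // drop the given stop words.
    static Function<String, List<String>> stopWordAnalyzer(Locale locale, Set<String> stopWords) {
        return text -> {
            List<String> tokens = new ArrayList<>();
            for (String t : text.toLowerCase(locale).split("[^\\p{L}\\p{Nd}]+")) {
                if (!t.isEmpty() && !stopWords.contains(t)) {
                    tokens.add(t);
                }
            }
            return tokens;
        };
    }

    // Keyword-style analyzer for common data: the whole value is one token.
    static List<String> keyword(String text) {
        return Collections.singletonList(text);
    }

    // One instance, meant to be shared by indexing and search so that
    // query terms are analyzed exactly like the indexed text.
    static PerFieldAnalysisSketch demo() {
        PerFieldAnalysisSketch a = new PerFieldAnalysisSketch(PerFieldAnalysisSketch::keyword);
        a.addAnalyzer("content_en", stopWordAnalyzer(Locale.ENGLISH,
                new HashSet<>(Arrays.asList("the", "of", "a", "and"))));
        a.addAnalyzer("content_fr", stopWordAnalyzer(Locale.FRENCH,
                new HashSet<>(Arrays.asList("le", "la", "les", "du", "de", "des"))));
        return a;
    }

    public static void main(String[] args) {
        PerFieldAnalysisSketch a = demo();
        System.out.println(a.analyze("content_en", "The feeds of the day"));
        System.out.println(a.analyze("content_fr", "Les flux du jour"));
        System.out.println(a.analyze("date", "2007-03-21"));
    }
}
```

The point of sharing one instance is the same as in the suggestion above: a query against content_fr goes through the French analyzer, while common fields such as date fall through to the default. In Lucene itself the wrapper would hold real Analyzer objects instead of these toy functions.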
"Melanie Langlois" <[EMAIL PROTECTED]> wrote on 21/03/2007 23:03:03:

> Hi,
>
> I saw that there are many posts on the mailing list about indexing in
> multiple languages, so I will try not to post a duplicate question. In
> my case, I want to index RSS feeds, so one feed contains several
> items in different languages, and some common data for all the items
> (date, source...). After reading the different posts, I think I will
> create a document per item, index them in the same index using a
> language-specific analyzer each time, and store a lang field for
> language-specific search. But I'm wondering how I should handle the
> common fields. It seems I have two options:
>
> 1: Store the common data in each item. What happens if duplicate
> information is entered? Is it duplicated in the index?
>
> 2: Create a separate document for the common data. In this case I
> will need to link this data to all the underlying items by storing
> some ids. The issue is that I would need to search the index twice if
> the search is done only by date, because I would need to retrieve the
> items' contents.
>
> Thanks in advance for your help.
>
> Mélanie