Hi,
I'm using Nutch with the language-identifier plugin enabled to detect the
language of the HTML pages. For indexing I use a Solr server.
So far everything works, but there's one problem: I don't know how to map
multilingual fields to their corresponding Solr fields.
The mapping file solrindex-mapping.xml contains the following:
<field dest="lang" source="lang"/>
<field dest="title" source="title"/>
But what I would like to have is the following:
<field dest="lang" source="lang"/>
<field dest="title" source="title" multilingual="true" language="lang"/>
or maybe
<field dest="lang" source="lang"/>
<field dest="title_${lang}" source="title" />
so that the title field gets mapped to title_en for English pages and
title_fr for French pages.
I found the SolrWriter and SolrMappingReader classes in the source code, and
it should be easy to integrate it there.
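
To make the second variant concrete, here is a minimal sketch of how a
dest value such as title_${lang} could be expanded per document. It is only
an illustration under assumptions: the class DestFieldResolver, the method
resolve, and the fallback parameter are hypothetical names and not part of
Nutch, and the document fields are modelled as a plain Map rather than the
real Nutch document type.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, NOT an existing Nutch class.
final class DestFieldResolver {
    // Matches placeholders of the form ${fieldName}.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^}]+)\\}");

    private DestFieldResolver() {}

    /**
     * Expands every ${name} in destTemplate with the value of that field
     * in docFields. A missing or empty field is replaced by fallback
     * (an assumption of this sketch, e.g. a default language code).
     */
    static String resolve(String destTemplate, Map<String, String> docFields, String fallback) {
        Matcher m = PLACEHOLDER.matcher(destTemplate);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = docFields.get(m.group(1));
            if (value == null || value.isEmpty()) {
                value = fallback;
            }
            // quoteReplacement keeps '$' and '\' in the value literal.
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

With a document whose lang field is "fr", resolve("title_${lang}", doc, "en")
gives title_fr, and a dest value without a placeholder is returned unchanged,
so existing mapping files would keep working. On the Solr side this assumes
that fields like title_en and title_fr (or a matching dynamic field) are
defined in the schema.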
What do you think? Could this be useful also to others?
Or are there any other solutions out there?
Thanks
Matthias