[
https://issues.apache.org/jira/browse/NUTCH-1607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julien Nioche resolved NUTCH-1607.
----------------------------------
Resolution: Not A Problem
Sorry for the late reply. A simple workaround is to modify the schema and mark
that field as multivalued. Adding a specific behaviour to enforce what you are
describing seems a bit heavy-handed. After all, Nutch processes the document
correctly. If you really want to enforce a single value, you could also do that
in a separate indexing filter.
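
To illustrate the second option, here is a rough sketch of the "keep only the
first value" logic. It is deliberately plain Java with no Nutch dependencies:
the Map stands in for the indexed document, and the class name
SingleValueEnforcer and its methods are hypothetical, not part of Nutch. In a
real plugin this logic would presumably sit inside an indexing filter and
operate on the Nutch document object instead.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not a Nutch class: trims fields that the Solr schema
// declares single-valued down to their first value.
public class SingleValueEnforcer {

    // Assumption: the caller lists the single-valued field names by hand,
    // mirroring what schema.xml declares.
    private final Set<String> singleValuedFields;

    public SingleValueEnforcer(Set<String> singleValuedFields) {
        this.singleValuedFields = singleValuedFields;
    }

    // Returns a copy of the document in which every listed field keeps only
    // its first value; all other fields are copied unchanged.
    public Map<String, List<String>> enforce(Map<String, List<String>> doc) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> entry : doc.entrySet()) {
            List<String> values = entry.getValue();
            if (singleValuedFields.contains(entry.getKey()) && values.size() > 1) {
                // Keep the first occurrence and ignore the rest.
                result.put(entry.getKey(), new ArrayList<>(values.subList(0, 1)));
            } else {
                result.put(entry.getKey(), new ArrayList<>(values));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<String>> doc = new LinkedHashMap<>();
        // A page that declares its description twice, as in the report.
        doc.put("description", Arrays.asList("first", "second"));
        doc.put("title", Arrays.asList("Example"));

        SingleValueEnforcer enforcer =
                new SingleValueEnforcer(new HashSet<>(Arrays.asList("description")));
        System.out.println(enforcer.enforce(doc));
    }
}
```

Keeping the first value is an arbitrary choice made to match the request in
the issue; joining the values or keeping the longest one would work just as
well, which is part of why this seems better left to a site-specific filter
than built into the indexer.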
> Make inproper multiValued field configurable
> --------------------------------------------
>
> Key: NUTCH-1607
> URL: https://issues.apache.org/jira/browse/NUTCH-1607
> Project: Nutch
> Issue Type: Bug
> Components: indexer
> Affects Versions: 1.7
> Reporter: -christian
> Fix For: 1.8
>
>
> After using 1.7, some errors came up when committing to Solr:
> Nutch is indexing a specific URL where it parses a field that,
> according to the schema.xml for Solr, is a single-valued field, e.g. description.
> For that particular site the description is declared twice, which is beyond
> our control, leading to an error when committing those documents to Solr. Is
> there any workaround for that, e.g. telling Nutch to take only the first
> occurrence and ignore the rest?
> Maybe this is a regression from
> https://issues.apache.org/jira/browse/NUTCH-1560