Can you not just add multiValued="true" to your content field and then
specify the correct mapping? That would map strippedContent to content
as well as content to content.
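
Roughly what I have in mind is below. Treat it as an untested sketch: the
attribute values are copied from your schema.xml, and I am assuming that
solrindex-mapping.xml accepts two sources pointing at the same dest.

```xml
<!-- schema.xml: allow more than one value in the content field -->
<field name="content" type="text" indexed="true" stored="true"
       termVectors="true" multiValued="true"/>

<!-- solrindex-mapping.xml: send both Nutch fields to Solr's content field
     (assumption: multiple sources may share one dest) -->
<field dest="content" source="content"/>
<field dest="content" source="strippedContent"/>
```

With this, content would hold both the original and the stripped text as
separate values. If the mapping does what I expect, strippedContent should
no longer be written as its own field.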

On Thu, Jul 26, 2012 at 2:35 AM, Matt Poff <[email protected]> wrote:
> Hi,
>
> I'm using the blacklist|whitelist plug-in with Nutch 1.3 provided with the 
> patch at: https://issues.apache.org/jira/browse/NUTCH-585. The plug-in strips 
> out content from HTML pages identified by HTML element:class or element:id 
> descriptors. The ReadMe instructions note that the following needs to be 
> added to the schema.xml file.
>
> <!-- fields for the blacklist/whitelist plugin -->
> <field name="strippedContent" type="text" stored="true" indexed="true"/>
>
> I've done this for the schema file in both Solr and Nutch and also added this 
> line to solrindex-mapping.xml:
>
> <field  dest="strippedContent" source="strippedContent"/>
>
>
> A crawl with this config works great. I can see the new field containing the 
> stripped content in the index. Problem is I want to target the contents of 
> strippedContent into the Content field but all attempts are resulting in this 
> error:
>
> Jul 26, 2012 1:09:04 PM org.apache.solr.common.SolrException log
> SEVERE: org.apache.solr.common.SolrException: ERROR: multiple values 
> encountered for non multiValued copy field content: .....
>
> In schema.xml (both Nutch's and Solr's) I have:
>
>     <field name="content" type="text" indexed="true" stored="true" termVectors="true"/>
>     ...
>     <!-- fields for the blacklist/whitelist plugin -->
>     <field name="strippedContent" type="text" stored="true" indexed="true"/>
>     ...
>     <copyField source="strippedContent" dest="content"/>
>
>
> In Nutch's solrindex-mapping.xml file I have no directives for either the 
> content or strippedContent fields. Can anyone point me to where I'm going 
> wrong with the config? My ideal state is to write the strippedContent field 
> into the content field and not keep a copy of the strippedContent field in 
> the index at all.
>
> Thanks in advance,
> Matt
>
> .headfirst
> WEB DEVELOPERS .ENGAGING .USEFUL .WORKS
> web:www.headfirst.co.nz
> email:[email protected]
> phone:(04) 498 5737
> mobile:022 384 3874
>



-- 
Lewis
