[ 
https://issues.apache.org/jira/browse/NUTCH-564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Antony Bowesman updated NUTCH-564:
----------------------------------

    Fix Version/s: 0.9.0
      Description: 
When an external component generates text that is returned to the external
parser, the parser always converts it using the default character set
(os.toString()). For example, the returned text may be UTF-8, but it will not
be converted to a String correctly unless the default character set is also
UTF-8.

I added an <encoding> attribute to the <implementation> XML in plugin.xml;
it is then used to convert the text.
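
The difference between the two conversions can be sketched roughly as follows.
This is not the attached patch: the class name EncodingSketch and the helper
decode() are hypothetical, and it assumes the external command's output has
been collected in a ByteArrayOutputStream named os, as the os.toString() call
above suggests.

```java
import java.io.ByteArrayOutputStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;

public class EncodingSketch {

    // Hypothetical helper: decode the captured bytes with the configured
    // encoding, falling back to the platform default when none is given.
    static String decode(ByteArrayOutputStream os, String encoding) {
        if (encoding == null || encoding.isEmpty()) {
            // Old behaviour: platform default character set.
            return os.toString();
        }
        try {
            // New behaviour: honour the configured encoding.
            return os.toString(encoding);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown encoding: " + encoding, e);
        }
    }

    public static void main(String[] args) {
        // Simulate an external component that emits UTF-8 bytes.
        ByteArrayOutputStream os = new ByteArrayOutputStream();
        byte[] utf8 = "caf\u00e9".getBytes(StandardCharsets.UTF_8);
        os.write(utf8, 0, utf8.length);

        // Decoding with the declared encoding round-trips the text.
        System.out.println(decode(os, "UTF-8").equals("caf\u00e9"));
    }
}
```

With no encoding supplied, the result depends on the JVM's default character
set, which is the behaviour this issue describes.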

I have tested my original fix on my local 0.9 and include a patch for it; I
have also made an untested patch for trunk.





  was:
When an external component generates text, which is returned to the external 
parser, it always converts the text using the default character set.  
(os.toString()).  For example, the returned text may be utf-8, but will not be 
converted to a String correctly.

I added the attribute <encoding> to the <implementation> XML in plugin.xml and 
this is then used to convert the text.

I have made my original fix to my local 0.9, but have made a patch based on the 
trunk.





> External parser supports encoding attribute
> -------------------------------------------
>
>                 Key: NUTCH-564
>                 URL: https://issues.apache.org/jira/browse/NUTCH-564
>             Project: Nutch
>          Issue Type: Improvement
>          Components: indexer
>    Affects Versions: 0.9.0
>         Environment: All
>            Reporter: Antony Bowesman
>            Priority: Minor
>             Fix For: 0.9.0, 1.0.0
>
>         Attachments: ExtParser_0.9.0.patch, ExtParser_1.0.0.patch
>
>
> When an external component generates text that is returned to the external
> parser, the parser always converts it using the default character set
> (os.toString()). For example, the returned text may be UTF-8, but it will not
> be converted to a String correctly unless the default character set is also
> UTF-8.
> I added an <encoding> attribute to the <implementation> XML in plugin.xml;
> it is then used to convert the text.
> I have tested my original fix on my local 0.9 and include a patch for it; I
> have also made an untested patch for trunk.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
