[ https://issues.apache.org/jira/browse/NUTCH-564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chris A. Mattmann updated NUTCH-564:
------------------------------------

    Patch Info: [Patch Available]
    Fix Version/s:     (was: 1.1)

- pushing this out per http://bit.ly/c7tBv9

> External parser supports encoding attribute
> -------------------------------------------
>
>                 Key: NUTCH-564
>                 URL: https://issues.apache.org/jira/browse/NUTCH-564
>             Project: Nutch
>          Issue Type: Improvement
>          Components: indexer
>    Affects Versions: 0.9.0
>         Environment: All
>            Reporter: Antony Bowesman
>            Priority: Minor
>         Attachments: ExtParser_0.9.0.patch, ExtParser_1.0.0.patch
>
>
> When an external component generates text that is returned to the external
> parser, the parser always converts the text using the default character set
> (os.toString()). For example, the returned text may be UTF-8, but it will not
> be converted to a String correctly.
> I added the attribute <encoding> to the <implementation> XML in plugin.xml,
> and this is then used to convert the text.
> I have tested my original fix on my local 0.9 and include a patch, but have
> also made an untested patch for trunk.
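
The behaviour described in the issue can be illustrated with a minimal Java sketch. It is not the code from the attached patches: the class name EncodingSketch and the helper decode() are hypothetical, and the only assumption taken from the report is that the external command's output is collected in a ByteArrayOutputStream named os and converted with os.toString(). The sketch decodes with an explicitly configured encoding when one is given and otherwise falls back to the platform default, which is the behaviour the report complains about.

```java
import java.io.ByteArrayOutputStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;

public class EncodingSketch {

    // Hypothetical helper (not from the patch): decode the captured output
    // with the configured encoding, or with the platform default charset
    // when no encoding has been configured.
    public static String decode(ByteArrayOutputStream os, String encoding)
            throws UnsupportedEncodingException {
        if (encoding == null || encoding.isEmpty()) {
            // Original behaviour: depends on the JVM's default charset.
            return os.toString();
        }
        // Proposed behaviour: honour the encoding named in the configuration.
        return os.toString(encoding);
    }

    public static void main(String[] args) throws Exception {
        // Simulate an external component that emits UTF-8 bytes.
        ByteArrayOutputStream os = new ByteArrayOutputStream();
        byte[] utf8 = "caf\u00e9".getBytes(StandardCharsets.UTF_8);
        os.write(utf8, 0, utf8.length);

        // Decoding with the matching charset recovers the original text.
        System.out.println(decode(os, "UTF-8").equals("caf\u00e9"));

        // Decoding the same bytes with a different charset changes the
        // result, which is what happens when the default charset is not UTF-8.
        System.out.println(decode(os, "ISO-8859-1").equals("caf\u00e9"));
    }
}
```

Passing null or an empty string keeps the existing default-charset behaviour, so a plugin.xml without the new attribute would be unaffected under this sketch.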