[ https://issues.apache.org/jira/browse/NUTCH-487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12530797 ]
Hudson commented on NUTCH-487:
------------------------------

Integrated in Nutch-Nightly #219 (See [http://lucene.zones.apache.org:8080/hudson/job/Nutch-Nightly/219/])

> Neko HTML parser goes on default settings.
> ------------------------------------------
>
>                 Key: NUTCH-487
>                 URL: https://issues.apache.org/jira/browse/NUTCH-487
>             Project: Nutch
>          Issue Type: Bug
>          Components: fetcher
>    Affects Versions: 0.9.0
>         Environment: Linux, Java 1.5.0.
>            Reporter: Marcin Okraszewski
>             Fix For: 1.0.0
>
>         Attachments: neko_setup.patch
>
>
> The Neko HTML parser setup is done in a silent try/catch statement (Nutch
> 0.9: HtmlParser.java:248-259). The problem is that the first feature being
> set throws an exception, so the whole setup block is skipped. The catch
> statement does nothing, so probably nobody noticed this.
> I attach a patch which fixes this. It was done on Nutch 0.9, but SVN trunk
> contains the same code.
> The patch:
> 1. Fixes the augmentations feature.
> 2. Removes the include-comments feature, because I couldn't find anything
> similar at http://people.apache.org/~andyc/neko/doc/html/settings.html
> 3. Prints a warning message when an exception is caught.
> Please note that a lot of messages now go to the console (not the log4j
> log), because the "report-errors" feature is being set. Shouldn't it be
> removed?
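
For readers who want to see the failure mode from the issue description in isolation, here is a minimal sketch. It is not the attached neko_setup.patch and it does not use the real Neko classes: StubParser, the "example:..." feature names, and both setup methods are hypothetical stand-ins invented for illustration. The only assumption carried over from the report is that the first feature set inside a single try block is rejected.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import org.xml.sax.SAXNotRecognizedException;

final class NekoSetupSketch {

    /** Hypothetical stand-in for the HTML parser; not the real Neko class. */
    static final class StubParser {
        final Set<String> enabled = new LinkedHashSet<>();

        void setFeature(String name, boolean value) throws SAXNotRecognizedException {
            // The stub rejects exactly one feature name to mimic the reported failure.
            if (name.equals("example:unknown-feature")) {
                throw new SAXNotRecognizedException(name);
            }
            if (value) {
                enabled.add(name);
            }
        }
    }

    // Placeholder feature names, invented for this sketch.
    static final String[] FEATURES = {
        "example:unknown-feature", // rejected, like the first feature in the report
        "example:feature-b",
        "example:feature-c",
    };

    /** One try block around all features, with an empty catch. */
    static StubParser silentSetup() {
        StubParser parser = new StubParser();
        try {
            for (String feature : FEATURES) {
                parser.setFeature(feature, true);
            }
        } catch (SAXNotRecognizedException e) {
            // Swallowed: nothing reports that the remaining features were skipped.
        }
        return parser;
    }

    /** Each feature guarded on its own; failures are recorded as warnings. */
    static StubParser guardedSetup(List<String> warnings) {
        StubParser parser = new StubParser();
        for (String feature : FEATURES) {
            try {
                parser.setFeature(feature, true);
            } catch (SAXNotRecognizedException e) {
                warnings.add("Could not set feature: " + e.getMessage());
            }
        }
        return parser;
    }

    public static void main(String[] args) {
        System.out.println("silent setup enabled:  " + silentSetup().enabled);
        List<String> warnings = new ArrayList<>();
        System.out.println("guarded setup enabled: " + guardedSetup(warnings).enabled);
        warnings.forEach(System.out::println);
    }
}
```

With the stub above, the silent variant leaves the parser with no features enabled and gives no hint why, which is the "goes on default settings" symptom. The guarded variant still applies the two accepted features and records one warning. In Nutch itself the warning would presumably go through the project's logger rather than a list.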