Hi,

On 5/21/07, Marcin Okraszewski <[EMAIL PROTECTED]> wrote:
> Hi,
>
> The Neko HTML parser setup is done in a silent try/catch statement (Nutch
> 0.9: HtmlParser.java:248-259). The problem is that the first feature being
> set throws an exception, so the whole setup block is skipped. The catch
> statement does nothing, so probably nobody noticed this.
>
> I attach a patch which fixes this. It was done on Nutch 0.9, but SVN trunk
> contains the same code.
>
> The patch:
> 1. Fixes the augmentations feature.
> 2. Removes the include-comments feature, because I couldn't find anything similar
>    at http://people.apache.org/~andyc/neko/doc/html/settings.html
> 3. Prints a warning message when an exception is caught.
>
> Please note that a lot of messages now go to the console (not the log4j
> log), because the "report-errors" feature is being set. Shouldn't it be removed?
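
To illustrate the failure mode described above (one throwing feature silently skipping every feature set after it), here is a minimal sketch of setting each feature in its own try/catch and logging the failure. It is not the attached patch and does not use NekoHTML: it assumes the JDK's built-in SAX XMLReader as a stand-in, and the class name, the method name and the unrecognized feature URI are hypothetical, chosen only for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.logging.Logger;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;

// Hypothetical helper, not part of Nutch or NekoHTML.
class ParserFeatureSetup {
  private static final Logger LOG =
      Logger.getLogger(ParserFeatureSetup.class.getName());

  /**
   * Sets every feature in its own try/catch, so one rejected feature
   * does not prevent the remaining ones from being applied.
   * Returns the names of the features the parser rejected.
   */
  static List<String> applyFeatures(XMLReader reader,
                                    Map<String, Boolean> features) {
    List<String> failed = new ArrayList<>();
    for (Map.Entry<String, Boolean> e : features.entrySet()) {
      try {
        reader.setFeature(e.getKey(), e.getValue());
      } catch (SAXException ex) {
        // SAXNotRecognizedException and SAXNotSupportedException
        // both extend SAXException. Log instead of swallowing.
        LOG.warning("Could not set parser feature " + e.getKey() + ": " + ex);
        failed.add(e.getKey());
      }
    }
    return failed;
  }

  public static void main(String[] args) throws Exception {
    XMLReader reader =
        SAXParserFactory.newInstance().newSAXParser().getXMLReader();

    // Insertion order matters here: the bad feature comes first,
    // mirroring the situation reported for HtmlParser.java.
    Map<String, Boolean> features = new LinkedHashMap<>();
    features.put("http://example.invalid/features/not-a-real-feature", true); // assumed-invalid URI
    features.put("http://xml.org/sax/features/namespaces", true);

    List<String> failed = applyFeatures(reader, features);
    System.out.println("Rejected features: " + failed);
    System.out.println("namespaces enabled: "
        + reader.getFeature("http://xml.org/sax/features/namespaces"));
  }
}
```

With a single try block around both setFeature calls, the second feature would never be applied once the first one throws. The per-feature variant trades a little verbosity for a parser that is still configured as far as possible, plus a log line for each rejected feature.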

I would suggest that you open a JIRA issue and attach the patch there. There is already a similar issue (with a patch) at NUTCH-369.

>
> Cheers,
> Marcin
>

--
Doğacan Güney

_______________________________________________
Nutch-developers mailing list
Nutch-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nutch-developers