Hi,

On 5/21/07, Marcin Okraszewski <[EMAIL PROTECTED]> wrote:
> Hi,
> The Neko HTML parser setup is done inside a silent try/catch statement (Nutch 
> 0.9: HtmlParser.java:248-259). The problem is that the first feature being 
> set throws an exception, so the whole setup block is skipped. The catch 
> statement does nothing, so probably nobody has noticed this.
>
> I attach a patch which fixes this. It was done on Nutch 0.9, but SVN trunk 
> contains the same code.
>
> The patch:
> 1. Fixes the augmentations feature.
> 2. Removes the include-comments feature, because I couldn't find anything similar 
> at http://people.apache.org/~andyc/neko/doc/html/settings.html
> 3. Prints a warning message when an exception is caught.
>
> Please note that a lot of messages now go to the console (not the log4j 
> log), because the "report-errors" feature is being set. Shouldn't it be removed?

I would suggest that you open a JIRA issue and attach the patch there.
In this case, there is a similar issue (with a patch) at NUTCH-369.
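
For anyone following the thread, the failure mode described in the quoted
message can be sketched roughly as below. This is not the code from
HtmlParser.java or from the patch: FakeParser is a hypothetical stand-in for
the Neko parser, the feature ids are shortened placeholders rather than real
feature URIs, and the per-feature try/catch in loggedSetup is just one
possible variation on "print a warning when an exception is caught".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.logging.Logger;

class FeatureSetupSketch {
    private static final Logger LOG = Logger.getLogger("FeatureSetupSketch");

    /** Hypothetical parser stand-in: rejects any feature id it does not know. */
    static class FakeParser {
        private final Set<String> known;
        final Map<String, Boolean> enabled = new LinkedHashMap<>();

        FakeParser(Set<String> known) {
            this.known = known;
        }

        void setFeature(String id, boolean value) throws Exception {
            if (!known.contains(id)) {
                throw new Exception("Feature not recognized: " + id);
            }
            enabled.put(id, value);
        }
    }

    /** One silent try/catch around the whole block: the first failure skips every later feature. */
    static void silentSetup(FakeParser parser, Map<String, Boolean> features) {
        try {
            for (Map.Entry<String, Boolean> f : features.entrySet()) {
                parser.setFeature(f.getKey(), f.getValue());
            }
        } catch (Exception e) {
            // Swallowed on purpose to reproduce the problem: nothing is reported.
        }
    }

    /** One try/catch per feature, with a warning; returns the ids that were rejected. */
    static List<String> loggedSetup(FakeParser parser, Map<String, Boolean> features) {
        List<String> rejected = new ArrayList<>();
        for (Map.Entry<String, Boolean> f : features.entrySet()) {
            try {
                parser.setFeature(f.getKey(), f.getValue());
            } catch (Exception e) {
                LOG.warning("Could not set parser feature " + f.getKey() + ": " + e.getMessage());
                rejected.add(f.getKey());
            }
        }
        return rejected;
    }

    public static void main(String[] args) {
        // Placeholder ids; the first one is deliberately unknown to the fake parser.
        Map<String, Boolean> features = new LinkedHashMap<>();
        features.put("misspelled-augmentations", true);
        features.put("report-errors", true);
        Set<String> known = new HashSet<>(Arrays.asList("augmentations", "report-errors"));

        FakeParser silent = new FakeParser(known);
        silentSetup(silent, features);
        System.out.println("silent setup enabled: " + silent.enabled.keySet());

        FakeParser logged = new FakeParser(known);
        List<String> rejected = loggedSetup(logged, features);
        System.out.println("logged setup enabled: " + logged.enabled.keySet());
        System.out.println("logged setup rejected: " + rejected);
    }
}
```

With the silent variant nothing after the first rejected feature is applied
and nothing is reported; with the logged variant the remaining features are
still set and the rejected one shows up in the log.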

>
> Cheers,
> Marcin
>


-- 
Doğacan Güney
_______________________________________________
Nutch-developers mailing list
Nutch-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nutch-developers
