What to do with items for which there is no parser?

Markus Jelsma Tue, 03 Jan 2012 09:19:03 -0800

Hi,

Right now the state in the crawldb is set to success even for items that 
have no parser and throw:


Exception in thread "main" org.apache.nutch.parse.ParseException: parser not found for contentType=video/x-flv url=
        at org.apache.nutch.parse.ParseUtil.parse(ParseUtil.java:78)
        at org.apache.nutch.parse.ParserChecker.run(ParserChecker.java:101)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.nutch.parse.ParserChecker.main(ParserChecker.java:138)

Should we do that at all? It doesn't seem right. I, for instance, am not 
interested in retrying such a URL for a very long time.
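
To make the question concrete, here is a minimal sketch of one possible 
alternative: give "no parser" its own state and a much longer refetch 
interval. Everything in it is an assumption for illustration only. The class, 
the enums, the state names and the interval values are hypothetical and are 
not Nutch APIs or Nutch defaults.

```java
/** Hypothetical sketch only; none of these names exist in Nutch. */
final class NoParserPolicy {

    /** Hypothetical outcomes of a parse attempt. */
    enum ParseOutcome { SUCCESS, NO_PARSER, FAILED }

    /** Hypothetical states an item could be given in the db. */
    enum DbState { FETCHED, UNPARSEABLE, RETRY }

    // Assumed values, chosen only to show the idea.
    static final long DEFAULT_INTERVAL_SECONDS = 30L * 24 * 60 * 60;    // 30 days
    static final long NO_PARSER_INTERVAL_SECONDS = 365L * 24 * 60 * 60; // 365 days
    static final long RETRY_INTERVAL_SECONDS = 24L * 60 * 60;           // 1 day

    /** Maps a parse outcome to a state, so that "no parser" is no longer a plain success. */
    static DbState stateFor(ParseOutcome outcome) {
        switch (outcome) {
            case SUCCESS:
                return DbState.FETCHED;
            case NO_PARSER:
                return DbState.UNPARSEABLE;
            case FAILED:
                return DbState.RETRY;
            default:
                throw new IllegalArgumentException("unknown outcome: " + outcome);
        }
    }

    /** Returns how long to wait before the item is fetched again. */
    static long fetchIntervalSeconds(DbState state) {
        switch (state) {
            case FETCHED:
                return DEFAULT_INTERVAL_SECONDS;
            case UNPARSEABLE:
                return NO_PARSER_INTERVAL_SECONDS;
            case RETRY:
                return RETRY_INTERVAL_SECONDS;
            default:
                throw new IllegalArgumentException("unknown state: " + state);
        }
    }

    public static void main(String[] args) {
        // Print the state and refetch interval chosen for each outcome.
        for (ParseOutcome outcome : ParseOutcome.values()) {
            DbState state = stateFor(outcome);
            System.out.printf("%s -> %s, refetch after %d s%n",
                    outcome, state, fetchIntervalSeconds(state));
        }
    }
}
```

With a separate state like this, a video/x-flv item would not be counted as 
a success and would not be retried for a long time, while ordinary parse 
failures could still be retried soon.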

Thoughts?
Thanks
