Hi,

Right now the state in the crawldb is set to success for items that have no 
parser and therefore throw:

Exception in thread "main" org.apache.nutch.parse.ParseException: parser not found for contentType=video/x-flv url=
        at org.apache.nutch.parse.ParseUtil.parse(ParseUtil.java:78)
        at org.apache.nutch.parse.ParserChecker.run(ParserChecker.java:101)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.nutch.parse.ParserChecker.main(ParserChecker.java:138)

Should we do that at all? It doesn't seem right. I, for instance, am not 
interested in retrying such a URL for a very long time.

Thoughts?
Thanks
