What to do with items for which is no parser?

2012-01-03 Thread Markus Jelsma

Hi,

Right now the crawldb state is set to success for items that have no parser and throw:

Exception in thread "main" org.apache.nutch.parse.ParseException: parser not found for contentType=video/x-flv url=
    at org.apache.nutch.parse.ParseUtil.parse(ParseUtil.java:78)
    at

Re: What to do with items for which is no parser?

2012-01-03 Thread Markus Jelsma

It's a good point, Markus. I would imagine that we would wish to do one of two things:

1) Create a parser for the contentType in question (not the aim of Nutch, but geared more towards a Tika contribution...)
2) As you mention, use a parser implementation which stores this contentType as
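
The issue raised at the top of the thread, where an item with no matching parser still ends up marked as success, can be illustrated with a small standalone sketch. Everything here is hypothetical and only meant to show the idea: the `Status` enum, the `PARSERS` registry, and the local `ParseException` class are stand-ins written for this example, not Nutch's actual classes or crawldb status codes.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.function.Function;

class NoParserSketch {
    /** Hypothetical per-item outcomes; these are not Nutch's real status codes. */
    enum Status { SUCCESS, NO_PARSER }

    /** Local stand-in for org.apache.nutch.parse.ParseException so the sketch compiles alone. */
    static class ParseException extends Exception {
        ParseException(String message) { super(message); }
    }

    /** Hypothetical registry of parsers keyed by content type. */
    static final Map<String, Function<byte[], String>> PARSERS =
        Map.of("text/plain", bytes -> new String(bytes, StandardCharsets.UTF_8));

    /** Parses the content, or throws if no parser is registered for the content type. */
    static String parse(String contentType, byte[] content) throws ParseException {
        Function<byte[], String> parser = PARSERS.get(contentType);
        if (parser == null) {
            throw new ParseException("parser not found for contentType=" + contentType);
        }
        return parser.apply(content);
    }

    /** Records a distinct status when no parser exists, instead of reporting success. */
    static Status statusFor(String contentType, byte[] content) {
        try {
            parse(contentType, content);
            return Status.SUCCESS;
        } catch (ParseException e) {
            return Status.NO_PARSER;
        }
    }

    public static void main(String[] args) {
        System.out.println(statusFor("text/plain", "hello".getBytes(StandardCharsets.UTF_8)));
        System.out.println(statusFor("video/x-flv", new byte[0]));
    }
}
```

Running `main` prints `SUCCESS` for the plain-text item and `NO_PARSER` for the `video/x-flv` item. Whether a real fix should record a separate status like this, or store the item through a fallback parser as suggested in the reply, is left open by the thread.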