[
https://issues.apache.org/jira/browse/NUTCH-1138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13178719#comment-13178719
]
Lewis John McGibbney commented on NUTCH-1138:
-
Hey Markus, it's been committed
Hi,
Right now the state of the crawldb is set to success for items without a
parser that throw:
Exception in thread "main" org.apache.nutch.parse.ParseException: parser not
found for contentType=video/x-flv url=
at org.apache.nutch.parse.ParseUtil.parse(ParseUtil.java:78)
at
It's a good point, Markus. I would imagine that we would want to do one
of two things:
1) Create a parser to fetch the contentType in question (not the aim
of Nutch but geared more towards Tika contribution...)
2) As you mention, use a parser implementation which stores this
contentType as
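One possible reading of the second option, as a minimal sketch only. Every name here (ParseStatusSketch, Status, Outcome, FAILED_NO_PARSER) is hypothetical and is not Nutch or Tika API. The idea is that when no parser is registered for a content type, the lookup returns an outcome that records the content type with a non-success status instead of throwing, so the caller does not end up marking the item as a success.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: none of these types are Nutch or Tika classes.
final class ParseStatusSketch {

    // Illustrative status values; real crawl statuses are not modeled here.
    enum Status { SUCCESS, FAILED_NO_PARSER }

    // Result of one parse attempt, including the content type that was seen.
    static final class Outcome {
        final Status status;
        final String contentType;
        final String text;

        Outcome(Status status, String contentType, String text) {
            this.status = status;
            this.contentType = contentType;
            this.text = text;
        }
    }

    // Assumed registry: content type -> function that extracts text.
    private final Map<String, Function<byte[], String>> parsers;

    ParseStatusSketch(Map<String, Function<byte[], String>> parsers) {
        this.parsers = parsers;
    }

    // Returns a non-success outcome instead of throwing when no parser exists.
    Outcome parse(String contentType, byte[] content) {
        Function<byte[], String> parser = parsers.get(contentType);
        if (parser == null) {
            return new Outcome(Status.FAILED_NO_PARSER, contentType, "");
        }
        return new Outcome(Status.SUCCESS, contentType, parser.apply(content));
    }

    public static void main(String[] args) {
        Map<String, Function<byte[], String>> parsers =
            Map.of("text/plain", bytes -> new String(bytes, StandardCharsets.UTF_8));
        ParseStatusSketch sketch = new ParseStatusSketch(parsers);

        Outcome flv = sketch.parse("video/x-flv", new byte[0]);
        System.out.println(flv.contentType + " -> " + flv.status);
    }
}
```

Whether the status should be a failure or some other "unparsed" state is a design choice this sketch does not settle; it only shows the content type being kept alongside a status that is not success.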
See https://builds.apache.org/job/Nutch-trunk/1714/
--
[...truncated 2386 lines...]
resolve-default:
[ivy:resolve] :: loading settings :: file =
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Nutch-trunk/trunk/ivy/ivysettings.xml
compile: