[
https://issues.apache.org/jira/browse/NUTCH-1108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13102130#comment-13102130
]
hadi commented on NUTCH-1108:
-----------------------------
I can parse formats such as pdf, doc, zip, and txt with Nutch, and I have set
file.content.limit to -1.
I also added the following configuration to nutch-site.xml:
<property>
<name>plugin.includes</name>
<value>nutch-extensionpoints|protocol-file|protocol-http|urlfilter-regex|parse-(html|tika|pdf|zip|avi)|index-(basic|anchor)|scoring-opic|urlnormalizer-(pass|regex|basic)</value>
</property>
But the same error still occurs.
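For reference, the file.content.limit setting mentioned above looks roughly like this in nutch-site.xml. This is only a sketch of that one property: the description text is my own wording, not copied from nutch-default.xml.

```xml
<property>
  <name>file.content.limit</name>
  <!-- -1 is the value described above; it is meant to disable truncation of fetched file content -->
  <value>-1</value>
  <description>Illustrative wording only: maximum length in bytes of content
  fetched through the file protocol; a negative value means no limit.</description>
</property>
```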
> Index image and video format with nutch 1.3
> -------------------------------------------
>
> Key: NUTCH-1108
> URL: https://issues.apache.org/jira/browse/NUTCH-1108
> Project: Nutch
> Issue Type: Bug
> Reporter: hadi
> Labels: image, index, nuch1.3,index, video
>
> When I try to index a video file with Nutch 1.3, I get the following error:
> Error parsing: file:///D:/film.avi: failed(2,0): Can't retrieve Tika parser for mime-type video/x-msvideo