[ 
https://issues.apache.org/jira/browse/NUTCH-1108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13102130#comment-13102130
 ] 

hadi commented on NUTCH-1108:
-----------------------------

I can parse formats such as PDF, DOC, ZIP and TXT with Nutch, and I have set
file.content.limit to -1.
I also added the following configuration to nutch-site.xml:
<property>
  <name>plugin.includes</name>
  <value>nutch-extensionpoints|protocol-file|protocol-http|urlfilter-regex|parse-(html|tika|pdf|zip|avi)|index-(basic|anchor)|scoring-opic|urlnormalizer-(pass|regex|basic)</value>
</property>

However, the same error still occurs.
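
For reference, here is a minimal sketch of how the plugin.includes value above selects plugins. It assumes Nutch treats the value as a regular expression that must match a plugin's directory name in full. The list of installed plugins is hypothetical and not taken from a real install. The point it illustrates is that adding an alternative such as avi to the pattern only permits a plugin of that name to load; it does not create one.

```python
import re

# Value of plugin.includes, copied from the configuration above.
PLUGIN_INCLUDES = (
    "nutch-extensionpoints|protocol-file|protocol-http|urlfilter-regex|"
    "parse-(html|tika|pdf|zip|avi)|index-(basic|anchor)|scoring-opic|"
    "urlnormalizer-(pass|regex|basic)"
)

# Hypothetical contents of the plugins directory (an assumption for illustration).
installed = ["parse-html", "parse-tika", "parse-zip", "protocol-file", "index-basic"]


def selected(pattern, plugin_ids):
    """Return the plugin ids that the pattern matches in full."""
    regex = re.compile(pattern)
    return [p for p in plugin_ids if regex.fullmatch(p)]


def missing(requested, plugin_ids):
    """Return requested plugin ids that are not present among plugin_ids."""
    return [r for r in requested if r not in plugin_ids]


# Plugins that are both installed and allowed by the pattern.
print(selected(PLUGIN_INCLUDES, installed))

# Names the pattern allows but that the hypothetical install does not contain.
print(missing(["parse-pdf", "parse-avi"], installed))
```

Under these assumptions, a name that appears in the second list is accepted by the pattern but has nothing behind it to load.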

> Index image and video format with nutch 1.3
> -------------------------------------------
>
>                 Key: NUTCH-1108
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1108
>             Project: Nutch
>          Issue Type: Bug
>            Reporter: hadi
>              Labels: image, index, nuch1.3,index, video
>
> When I try to index a video file with Nutch 1.3, I get the following error:
> Error parsing: file:///D:/film.avi: failed(2,0): Can't retrieve Tika parser for mime-type video/x-msvideo

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
