According to [0], Tika 1.2 can parse only Flash video (FLV) and no other
video file formats, which is why no Tika parser is found for
video/x-ms-wmv.

[0] : http://tika.apache.org/1.2/formats.html#Video_formats
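
Separately, the suffix rule in the regex-urlfilter.txt quoted below does not
list wmv or mp4, so that rule by itself should not be what blocks those URLs.
Here is a minimal sketch to check this outside of Nutch. It assumes the rule
begins with "-\.(" as in the stock Nutch file and that a "-" rule rejects a
URL when the pattern is found anywhere in it. The class SuffixFilterCheck and
the sample.mp4 and sample.mov URLs are made up for illustration; they are not
part of Nutch or of the original report.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Nutch: it only mimics a "-" suffix rule.
class SuffixFilterCheck {
    // Extension list copied from the quoted regex-urlfilter.txt rule,
    // assuming the rule starts with "-\.(" as in the stock Nutch file.
    static final Pattern EXCLUDED = Pattern.compile(
        "\\.(gif|GIF|jpg|JPG|png|PNG|ico|ICO|css|CSS|sit|SIT|eps|EPS|wmf|WMF"
        + "|zip|ZIP|mpg|MPG|xls|XLS|gz|GZ|rpm|RPM|tgz|TGZ|mov|MOV|exe|EXE"
        + "|jpeg|JPEG|bmp|BMP|js|JS)$");

    // Assumption: a "-" rule rejects a URL when the pattern is found in it.
    static boolean isExcluded(String url) {
        return EXCLUDED.matcher(url).find();
    }

    public static void main(String[] args) {
        String[] urls = {
            "http://www.server-abc.com/Darpa_Video_Final.wmv",
            "http://www.server-abc.com/sample.mp4",  // made-up URL
            "http://www.server-abc.com/sample.mov"   // made-up URL
        };
        for (String url : urls) {
            System.out.println((isExcluded(url) ? "rejected " : "passes   ") + url);
        }
    }
}
```

If the wmv and mp4 URLs pass this check, the suffix rule is not what keeps
them out, and the remaining problem is the missing Tika parser described
above.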


On Mon, Mar 4, 2013 at 1:29 PM, <[email protected]> wrote:

> Hi,
>
> I am using Nutch 1.5.1 and I am trying to crawl and parse video/mp4 and
> video/x-ms-wmv. I do not see any mp4 files being fetched or parsed, and I
> am getting the following error for a wmv file in the logs:
>
> Error parsing: http://www.server-abc.com/Darpa_Video_Final.wmv:
> failed(2,0): Can't retrieve Tika parser for mime-type video/x-ms-wmv
>
> Here is my regex-urlfilter.txt configuration file:
>
> -\.(gif|GIF|jpg|JPG|png|PNG|ico|ICO|css|CSS|sit|SIT|eps|EPS|wmf|WMF|zip|ZIP|mpg|MPG|xls|XLS|gz|GZ|rpm|RPM|tgz|TGZ|mov|MOV|exe|EXE|jpeg|JPEG|bmp|BMP|js|JS)$
>
> parse-plugins.xml has the following:
>
> <mimeType name="video/x-ms-wmv">
>    <plugin id="parse-tika" />
> </mimeType>
>
> <mimeType name="video/mp4">
>    <plugin id="parse-tika" />
> </mimeType>
>
> Is there anything else I need to check, or anything I am missing? Does the
> http.accept property need to list all the MIME types that can be accepted?
> I am going to try adding it next, after my current crawl finishes. Any help
> will be greatly appreciated.
>
> Thanks,
> Madhvi
>
>
>
