Thank you so much, Tejas. That explains the wmv parsing error. I thought video/mp4 files could be played through Adobe Flash, but I am not sure, so I am checking with our company's media expert. Since Tika only parses Flash video, is there another plugin we can use for the other video formats?
On 3/4/13 11:04 PM, "Tejas Patil" <[email protected]> wrote:

>[0] says that Tika 1.2 can only parse flash videos and no other video file
>formats.
>
>[0] : http://tika.apache.org/1.2/formats.html#Video_formats
>
>
>On Mon, Mar 4, 2013 at 1:29 PM, <[email protected]> wrote:
>
>> Hi,
>>
>> I am using Nutch 1.5.1 and I am trying to crawl and parse video/mp4,
>> video/x-ms-wmv. I do not see any mp4 files being fetched or parsed and I
>> am getting following error for a wmv file in the logs:
>>
>> Error parsing: http://www.server-abc.com/Darpa_Video_Final.wmv:
>> failed(2,0): Can't retrieve Tika parser for mime-type video/x-ms-wmv
>>
>> Here is my regex-urlfilter.txt configuration file:
>>
>> -\(gif|GIF|jpg|JPG|png|PNG|ico|ICO|css|CSS|sit|SIT|eps|EPS|wmf|WMF|zip|ZIP|mpg|MPG|xls|XLS|gz|GZ|rpm|RPM|tgz|TGZ|mov|MOV|exe|EXE|jpeg|JPEG|bmp|BMP|js|JS)$
>>
>> Parse-plugins.xml has following:
>>
>> <mimeType name="video/x-ms-wmv">
>> <plugin id="parse-tika" />
>> </mimeType>
>>
>> <mimeType name="video/mp4">
>> <plugin id="parse-tika" />
>> </mimeType>
>>
>> Is there anything else I need to check or missing? Does the http.accept
>> property need to have all the mime types that can be accepted? I am going
>> to try and add it next after my current crawl finishes. Any help will be
>> greatly appreciated.
>>
>> Thanks,
>> Madhvi

