Shall we open a new JIRA issue for this?

On 9 August 2012 23:30, Markus Jelsma <[email protected]> wrote:

> Hmm, I'm not sure, but maybe we don't include all the Tika parser
> dependencies in our build.xml?
>
>
>
> -----Original message-----
> > From:Sebastian Nagel <[email protected]>
> > Sent: Thu 09-Aug-2012 23:18
> > To: [email protected]
> > Subject: Re: CHM Files and Tika
> >
> > Hi Jan,
> >
> > Confirmed: Nutch cannot parse CHM files, while Tika (the same version
> > used by Nutch) can. The CHM parsers are in tika-parser*.jar, which is
> > included in the Nutch package.
> >
> > Any ideas?
> >
> > Sebastian
> >
> > On 08/08/2012 12:03 PM, Jan Riewe wrote:
> > > Hey there,
> > >
> > > I am trying to parse CHM (Microsoft Help) files with Nutch, but I get:
> > >
> > > Can't retrieve Tika parser for mime-type application/vnd.ms-htmlhelp
> > >
> > > I've tried Nutch 1.4 (Tika 0.10) and Nutch 1.5.1 (Tika 1.1), which
> > > should be able to parse those files:
> > > https://issues.apache.org/jira/browse/TIKA-245
> > >
> > > In tika-mimetypes.xml I do find an entry for
> > > application/vnd.ms-htmlhelp
> > >
> > > Has anyone run into the same issue and found a way to fix it?
> > >
> > > Bye
> > > Jan
> > >
> >
> >
>



-- 
Open Source Solutions for Text Engineering

http://digitalpebble.blogspot.com/
http://www.digitalpebble.com
http://twitter.com/digitalpebble
