Hi Chris,

On Tue, Apr 17, 2012 at 3:20 PM, Mattmann, Chris A (388J) <
[email protected]> wrote:

>
> I think it would be super awesome to add the Any23 parsing functionality
> as a Tika parser, and potentially
> an extension to the MIME repository to detect microformats, etc. Then in
> Nutch, we could take advantage of
> the any23 parser with the existing tika-parser interface.
>
> Thoughts?
>

Well, on top of what I've managed to ramble on about elsewhere, I think this
utopian vision is something I will definitely be pushing for. It makes
perfect sense, but I think it's a case of Any23 maturing within the
incubator before we can push it up to the Tika PMC for this stuff. It would
be a win-win situation, as Nutch would benefit as well.

For the time being, I think a step back, a (Tika wrapper) plugin implementing
the Any23 functionality, would be a good start. I'll be making headway on
this over the next while, so I will keep you all up to date with it.

Thanks
Lewis
