Have you tried using BodyContentHandler? For example:

...
ContentHandler handler = new BodyContentHandler();
parser.parse(inputStream, handler, metadata, context);
// ByteArrayInputStream needs a byte[], so encode the extracted text explicitly
InputStream charStream = new ByteArrayInputStream(handler.toString().getBytes("UTF-8"));
...
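
If the documents are too large to hold as a String in memory, another option is to pipe the characters() events straight into a stream. Below is a rough sketch of that idea, not something that ships with Tika: PipedTextHandler is a name I made up, it uses only JDK classes, and the UTF-8 encoding is my assumption. The parser has to run on a different thread than the code that reads the stream, otherwise the pipe can block once its buffer fills up.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStreamWriter;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper (not part of Tika): forwards SAX character events
// to a pipe so that legacy code can consume the text as an InputStream.
final class PipedTextHandler extends DefaultHandler {

    private final PipedInputStream in = new PipedInputStream();
    private final Writer out;

    PipedTextHandler() throws IOException {
        // Connect the writing end to the reading end; UTF-8 is an assumption.
        this.out = new OutputStreamWriter(new PipedOutputStream(in), StandardCharsets.UTF_8);
    }

    // The stream to hand to the legacy consumer.
    InputStream getInputStream() {
        return in;
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        try {
            out.write(ch, start, length);
        } catch (IOException e) {
            throw new SAXException(e);
        }
    }

    @Override
    public void endDocument() throws SAXException {
        try {
            // Flushes buffered text and signals end-of-stream to the reader.
            out.close();
        } catch (IOException e) {
            throw new SAXException(e);
        }
    }
}
```

To use it, you would pass the handler to parser.parse(...) on a worker thread and give getInputStream() to the legacy code on the consuming thread. If parsing fails before endDocument() is called, the pipe is never closed, so the worker should close it in its own error handling. Note that this forwards every characters() event, not just the body text that BodyContentHandler selects.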



Regards,
Wade




On Wed, Apr 25, 2012 at 12:08 PM, Alec Swan <[email protected]> wrote:

> Hello,
>
> We are replacing another text extraction library with Tika. We have
> legacy code which expects document text to be output as an
> InputStream. I understand that this is not directly related to Tika,
> but I am assuming that other Tika users already solved this problem.
>
> Does anybody have any sample code or ideas that will help us pipe
> chars in ContentHandler#characters(..) method to a stream? Is there an
> existing ContentHandler implementation that does this already?
>
> Thanks,
>
> Alec
>
