On Feb 5, 2009, at 1:22 AM, Jukka Zitting wrote:

Hi,

On Thu, Feb 5, 2009 at 3:02 AM, Jonathan Koren <jonat...@soe.ucsc.edu> wrote:
What I really want is for someone to tell me how to get back a usable stream of plaintext. Whether this involves a radical change to Tika's ContentHandler class or some trick with Java, I really don't care, as long as it's single
thread safe.

Have you looked at the ParsingReader class? It seems like a perfect
match to your needs. The ParsingReader class fires a background thread
to do the parsing and pipes the output so you can control when and how
you want to read the extracted text.
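
The pattern described above (a background thread pushes text into a pipe, and the caller pulls from it at its own pace) can be sketched with plain java.io classes. This is a minimal illustration of the idea, not Tika's actual implementation: PipedTextDemo, TextProducer and asReader are hypothetical names, and the lambda in main stands in for a real parser. ParsingReader is itself a java.io.Reader, so code that consumes it would look much like the read loop here.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.PipedReader;
import java.io.PipedWriter;
import java.io.Reader;
import java.io.Writer;

public class PipedTextDemo {

    /** Hypothetical stand-in for a push-style parser that writes extracted text. */
    public interface TextProducer {
        void produce(Writer out) throws IOException;
    }

    /** Runs the producer on a background thread and returns a Reader over its output. */
    public static Reader asReader(TextProducer producer) throws IOException {
        PipedReader reader = new PipedReader();
        PipedWriter writer = new PipedWriter(reader);
        Thread worker = new Thread(() -> {
            // Closing the writer signals end-of-stream to the reading side.
            try (Writer out = writer) {
                producer.produce(out);
            } catch (IOException e) {
                // Assumption for this sketch: errors are dropped. A real
                // implementation would surface them to the reader.
            }
        }, "background-parser");
        worker.setDaemon(true);
        worker.start();
        return reader;
    }

    public static void main(String[] args) throws IOException {
        // The caller decides when and how much to read; the producer blocks
        // whenever the pipe's buffer is full.
        try (BufferedReader in = new BufferedReader(
                asReader(out -> out.write("first line\nsecond line\n")))) {
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}
```

The pipe's bounded buffer is what gives the caller control: the background thread cannot run far ahead of the reader, so memory use stays flat even for large documents.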

I had no idea that class existed.  Thanks.

--
Jonathan Koren
jonat...@soe.ucsc.edu
http://www.soe.ucsc.edu/~jonathan/

