Hello,

Can someone point me in the right direction for streaming the structured
XHTML output from a Tika Parser? The closest I have gotten is using a
BodyContentHandler, as below.

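        Parser parser = tika.getParser();
        ParseContext context = new ParseContext();
        context.set(Locale.class, Locale.ENGLISH);
        PrintStream printer = new PrintStream(System.out);
        // Writes the character content of the XHTML body to the stream.
        ContentHandler handler = new BodyContentHandler(printer);
        Metadata mtdt = new Metadata();
        // Close the input stream when parsing ends, even on failure.
        try (InputStream in = new FileInputStream(f)) {
            parser.parse(in, handler, mtdt, context);
        }
        // Flush instead of close: closing would also close System.out.
        printer.flush();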

Is there a ContentHandler that can do this easily? I apologize that my
comprehension of the SAX API is minimal at best.
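
One possible direction, sketched with JDK classes only: a
TransformerHandler from javax.xml.transform is itself a ContentHandler
and serializes the SAX events it receives as markup, so it could
presumably be passed to parser.parse in place of the BodyContentHandler
above. The class name XhtmlStreamSketch, the helper method names, and the
hand-written sample events (standing in for what a parser would emit) are
illustrative assumptions, not part of Tika.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.ContentHandler;
import org.xml.sax.helpers.AttributesImpl;

// Hypothetical sketch class; only JDK APIs are used.
class XhtmlStreamSketch {

    static final String XHTML = "http://www.w3.org/1999/xhtml";

    // Builds a ContentHandler that writes incoming SAX events to `out` as markup.
    static TransformerHandler newXhtmlHandler(OutputStream out) throws Exception {
        SAXTransformerFactory factory =
                (SAXTransformerFactory) TransformerFactory.newInstance();
        TransformerHandler handler = factory.newTransformerHandler();
        handler.getTransformer().setOutputProperty(OutputKeys.METHOD, "xml");
        handler.getTransformer().setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        handler.setResult(new StreamResult(out));
        return handler;
    }

    // Stand-in for parser.parse(...): emits a tiny XHTML document as SAX events.
    static void emitSampleEvents(ContentHandler h) throws Exception {
        AttributesImpl none = new AttributesImpl();
        h.startDocument();
        h.startPrefixMapping("", XHTML);
        h.startElement(XHTML, "html", "html", none);
        h.startElement(XHTML, "body", "body", none);
        h.startElement(XHTML, "p", "p", none);
        char[] text = "hello".toCharArray();
        h.characters(text, 0, text.length);
        h.endElement(XHTML, "p", "p");
        h.endElement(XHTML, "body", "body");
        h.endElement(XHTML, "html", "html");
        h.endPrefixMapping("");
        h.endDocument();
    }

    // Renders the sample events into a String so the output can be inspected.
    static String render() throws Exception {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        emitSampleEvents(newXhtmlHandler(buffer));
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        // With Tika, the handler would replace the BodyContentHandler, e.g.
        // parser.parse(in, newXhtmlHandler(System.out), mtdt, context);
        emitSampleEvents(newXhtmlHandler(System.out));
        System.out.flush();
    }
}
```

Because the handler writes each event to the stream as it arrives, the
markup is not buffered in memory first. Whether the result is exactly the
structure wanted would still need checking against a real document.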

Thanks,

Don
