Re: XHTML Bean and corresponding content handler

Jukka Zitting Thu, 06 Aug 2009 08:19:06 -0700

Hi,

On Tue, Aug 4, 2009 at 9:30 AM, Michael Wechner <michael.wech...@wyona.com> wrote:
> String XHTMLBean.getHead().getMeta(XHTMLBean.DESCRIPTION)
> String XHTMLBean.getHead().getTitle()


You can get both of these from the Metadata object that the parser fills in.

> String[] XHTMLBean.getBody().getParagraphs();

This is a bit troublesome, as not all parsers produce paragraphs of
content. For example, the Excel parser produces XHTML tables.
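
To illustrate the problem, here is a minimal sketch that uses only the
JDK's SAX classes, not Tika itself. The class name ParagraphCollector and
its collect() helper are hypothetical; the helper parses an XHTML string
with the JDK parser as a stand-in for the SAX events a real parser would
emit. It gathers the text of each <p> element, so output that consists
only of tables yields an empty list.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects the text content of every <p> element.
class ParagraphCollector extends DefaultHandler {
    private final List<String> paragraphs = new ArrayList<>();
    private StringBuilder current; // non-null only while inside a <p>

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if ("p".equals(localName)) {
            current = new StringBuilder();
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (current != null) {
            current.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("p".equals(localName) && current != null) {
            paragraphs.add(current.toString().trim());
            current = null;
        }
    }

    List<String> getParagraphs() {
        return paragraphs;
    }

    // Stand-in for a document parser: feeds an XHTML string through the
    // JDK SAX parser and returns whatever paragraphs were seen.
    static List<String> collect(String xhtml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // so localName is populated
        XMLReader reader = factory.newSAXParser().getXMLReader();
        ParagraphCollector collector = new ParagraphCollector();
        reader.setContentHandler(collector);
        reader.parse(new InputSource(new StringReader(xhtml)));
        return collector.getParagraphs();
    }
}
```

A handler like this works for paragraph-oriented output, but returns
nothing for a spreadsheet whose content arrives as table rows and cells.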

You can either get just the plain character stream using tools like
BodyContentHandler, or the full XHTML output as SAX events (which you
can serialize to a byte stream if you want). I'm not sure if there's
any reasonable intermediate content abstraction.
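
For the serialization route, a minimal sketch using only the JDK's
javax.xml.transform classes might look like this. The class SaxToBytes
and its methods are hypothetical names, and the SAX events are fed in by
hand as a stand-in for what a parser would emit; in practice you would
pass the TransformerHandler to the parser as its ContentHandler.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.helpers.AttributesImpl;

// Hypothetical helper: turns a stream of SAX events into serialized bytes.
class SaxToBytes {

    // Builds a ContentHandler that writes incoming SAX events to 'out' as XML.
    static TransformerHandler newSerializer(OutputStream out) throws Exception {
        SAXTransformerFactory factory =
                (SAXTransformerFactory) SAXTransformerFactory.newInstance();
        TransformerHandler handler = factory.newTransformerHandler();
        Transformer transformer = handler.getTransformer();
        transformer.setOutputProperty(OutputKeys.METHOD, "xml");
        transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        handler.setResult(new StreamResult(out));
        return handler;
    }

    // Feeds a few hand-written events through the serializer and
    // returns the resulting bytes decoded as a string.
    static String demo() throws Exception {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        TransformerHandler handler = newSerializer(out);

        handler.startDocument();
        handler.startElement("", "p", "p", new AttributesImpl());
        char[] text = "Hello".toCharArray();
        handler.characters(text, 0, text.length);
        handler.endElement("", "p", "p");
        handler.endDocument();

        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

The same handler can write to a file or socket by handing a different
OutputStream to newSerializer().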

BR,

Jukka Zitting
