On Mon, 28 Mar 2011, Withanage, Dulip wrote:
> We are interested in extracting the raw metadata (not formatted in XHTML as SAX events) from the files using Tika.
Generally speaking, all of the metadata that is extracted is placed into the Metadata object you supply when parsing. The SAX events are generated for the textual content of the file.
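To make that split concrete, here is a minimal sketch using only the JDK's SAX classes. It is a toy stand-in, not Tika code: the `MetadataVsContent` class, its `parse` method, the `TextCollector` handler and the "Key: value" header format are all made up for illustration. It only mirrors the shape of Tika's `Parser.parse(InputStream, ContentHandler, Metadata, ParseContext)` call, where the caller hands in both a handler and a metadata object and reads them afterwards.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;

import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class MetadataVsContent {

    /**
     * Hypothetical toy parser (not part of Tika). Input format assumed here:
     * "Key: value" header lines, one blank line, then the body text.
     */
    static void parse(String document, ContentHandler handler, Map<String, String> metadata)
            throws IOException, SAXException {
        BufferedReader reader = new BufferedReader(new StringReader(document));
        String line;

        // Header section: stored in the caller-supplied metadata map, no SAX events.
        while ((line = reader.readLine()) != null && !line.isEmpty()) {
            int colon = line.indexOf(':');
            if (colon > 0) {
                metadata.put(line.substring(0, colon).trim(),
                             line.substring(colon + 1).trim());
            }
        }

        // Body section: reported to the handler only as SAX character events.
        handler.startDocument();
        while ((line = reader.readLine()) != null) {
            char[] chars = (line + "\n").toCharArray();
            handler.characters(chars, 0, chars.length);
        }
        handler.endDocument();
    }

    /** Collects character events into a string and ignores everything else. */
    static class TextCollector extends DefaultHandler {
        final StringBuilder text = new StringBuilder();

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }
    }

    public static void main(String[] args) throws Exception {
        String document =
                "Author: Jane Doe\nContent-Type: text/plain\n\nHello from the body.\n";
        Map<String, String> metadata = new LinkedHashMap<>();
        TextCollector handler = new TextCollector();

        parse(document, handler, metadata);

        // The metadata is read from the map, never from the SAX stream.
        for (Map.Entry<String, String> entry : metadata.entrySet()) {
            System.out.println(entry.getKey() + " = " + entry.getValue());
        }
        System.out.print(handler.text);
    }
}
```

With real Tika you would pass a `Metadata` instance instead of the map and then loop over `metadata.names()`, calling `metadata.get(name)` for each, once the parse has finished.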
I think you might find that Tika already does what you need. If not, I'd suggest you come back with a concrete example: a file, its metadata, the XML you get, and what you'd really hoped for!
Nick
