On Wed, 23 Oct 2013, Samuel Desseaux wrote:
What language are you writing the rest of your solution in? How are you
planning to transform and filter the metadata to get your xml?
With Java, I think.
The simplest way to get started then would be to use the Tika facade helper
class:
http://tika.apache.org/1.4/api/org/apache/tika/Tika.html
That provides a simple way to call Tika and get back text + metadata.
Later, you might want to call the parser(s) directly, but the above should
get you going very quickly.
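
As a rough sketch of what that could look like (not tested against your
documents): the class name TikaFacadeSketch and the extract() helper are
made up for illustration, and it assumes tika-core plus tika-parsers are on
the classpath. It uses the facade's parseToString(InputStream, Metadata)
call to get the text and fill in the metadata in one go.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

import org.apache.tika.Tika;
import org.apache.tika.exception.TikaException;
import org.apache.tika.metadata.Metadata;

// Hypothetical class name, for illustration only.
public class TikaFacadeSketch {

    // Hypothetical helper: returns the extracted text and fills the
    // supplied Metadata object as a side effect.
    static String extract(InputStream in, Metadata metadata)
            throws IOException, TikaException {
        Tika tika = new Tika();
        // The facade auto-detects the document type and closes the stream.
        return tika.parseToString(in, metadata);
    }

    public static void main(String[] args) throws Exception {
        // Assumes the path of the document is passed as the first argument.
        Metadata metadata = new Metadata();
        String text = extract(new FileInputStream(args[0]), metadata);

        // Dump every metadata key/value pair that the parser reported.
        for (String name : metadata.names()) {
            System.out.println(name + " = " + metadata.get(name));
        }
        System.out.println("Extracted " + text.length() + " characters of text");
    }
}
```

From there, filtering the metadata names you care about and writing them
out as XML is up to your own code.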
Nick