On Wed, 6 Apr 2011, Withanage, Dulip wrote:
1. I tried calling the --metadata option, and it gives me output in the form metadataname:value. This looks promising to me, if I could format the above output as XML. What is your advice on the best way to do it?

You'll probably want to write some Java code at this point, rather than just calling the Tika app on the command line. Get the Metadata object back from the parse, then loop over its entries and output them as XML in whatever format you want.
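
As a rough sketch of the XML step only, something like the following might do, using nothing but the JDK's StAX writer. The class name MetadataXml and the element names (metadata, entry, value) are made up for illustration and are not part of Tika. The map is a stand-in for Tika's Metadata object, on the assumption that you copy the entries across with metadata.names() and metadata.getValues(name) after parsing.

```java
import java.io.StringWriter;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical helper: the class name and XML layout are illustrative, not part of Tika.
class MetadataXml {

    // Writes each metadata name and its values as
    // <metadata><entry name="..."><value>...</value></entry></metadata>.
    static String toXml(Map<String, String[]> entries) {
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter xml = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            xml.writeStartDocument();
            xml.writeStartElement("metadata");
            for (Map.Entry<String, String[]> entry : entries.entrySet()) {
                xml.writeStartElement("entry");
                xml.writeAttribute("name", entry.getKey());
                // A metadata name can carry several values, so emit one element per value.
                for (String value : entry.getValue()) {
                    xml.writeStartElement("value");
                    xml.writeCharacters(value); // the writer escapes &, < and >
                    xml.writeEndElement();
                }
                xml.writeEndElement();
            }
            xml.writeEndElement();
            xml.writeEndDocument();
            xml.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not write metadata XML", e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Stand-in data. With Tika you would fill the map from the Metadata object:
        //   for (String name : metadata.names()) entries.put(name, metadata.getValues(name));
        Map<String, String[]> entries = new LinkedHashMap<>();
        entries.put("Content-Type", new String[] {"image/jpeg"});
        entries.put("Keywords", new String[] {"cats & dogs", "holiday"});
        System.out.println(toXml(entries));
    }
}
```

Going through an XML writer rather than concatenating strings means values containing characters such as & or < are escaped for you, which matters because metadata values are free text.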

2. I have seen the XMP class org.apache.tika.parser.image.xmp.XMPPacketScanner. Is there any way to use this as the default parser for JPEG and TIFF? Can it be done by configuring it in tika-mimetypes.xml?

Nope, tika-mimetypes.xml is only used for detection; choosing which parser handles a type is a separate mechanism.

XMPPacketScanner isn't a parser, though. Instead, it is called by the existing parsers when they detect XMP metadata within a file. The existing JPEG and TIFF parsers already do this for you.

Nick
