Extract metadata from PDF files with Tika

Hi,

I'm fairly new to Tika and would appreciate some help.

I have many PDF files from which I would like to extract metadata, in order to produce an XML file that conforms to Dublin Core.
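
To make the target concrete, here is a rough sketch of the kind of output I'm after. It assumes the metadata has already been extracted into a map whose keys are Dublin Core element names; the class DublinCoreSketch, the <metadata> wrapper element and the sample values are my own placeholders and not part of Tika.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Tika: serializes already-extracted
// metadata into a simple Dublin Core XML record.
class DublinCoreSketch {

    // Escape the characters that are special in XML text content.
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;")
                .replace("'", "&apos;");
    }

    // Keys are assumed to be Dublin Core element names such as "title" or "creator".
    static String toXml(Map<String, String> fields) {
        StringBuilder sb = new StringBuilder();
        sb.append("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        // The wrapper element name is an assumption; only the dc: namespace is standard.
        sb.append("<metadata xmlns:dc=\"http://purl.org/dc/elements/1.1/\">\n");
        for (Map.Entry<String, String> e : fields.entrySet()) {
            sb.append("  <dc:").append(e.getKey()).append('>')
              .append(escape(e.getValue()))
              .append("</dc:").append(e.getKey()).append(">\n");
        }
        sb.append("</metadata>\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        // Sample values for illustration only.
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("title", "Annual Report");
        fields.put("creator", "Jane Doe");
        fields.put("format", "application/pdf");
        System.out.print(toXml(fields));
    }
}
```

What I don't know is how to get Tika to fill such a map (or produce the XML directly) for each PDF.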

I've followed these links: http://www.hascode.com/2012/12/content-detection-metadata-and-content-extraction-with-apache-tika/#Extracting_Metadata_from_a_PDF_using_a_concrete_Parser and http://tika.apache.org/0.8/api/org/apache/tika/metadata/Metadata.html


Do I have to write a program with Tika to do this?

If so, how could I do that?

Best regards

Samuel

