Hi, I am attempting to extract metadata from a PDF into JSON with the following command:

    java -jar tika-app-1.5.jar -j example.pdf

This appears to print the output twice. For example, the PDF linked below produces the output shown, which my JSON parser rejects. Am I doing something wrong?

Thanks.

Example PDF: http://www.dadsgarage.com/~/media/Files/example.ashx

Output:

    { "Author":null,
    "Content-Length":194007,
    "Content-Type":"application/pdf",
    "Keywords":null,
    "cp:subject":null,
    "creator":null,
    "dc:creator":null,
    "dc:subject":null,
    "dc:title":null,
    "meta:author":null,
    "meta:keyword":null,
    "producer":"dvips + GNU Ghostscript 7.05",
    "resourceName":"example.pdf",
    "subject":null,
    "title":null,
    "xmp:CreatorTool":"LaTeX with hyperref package",
    "xmpTPg:NPages":10 }{ "Author":null,
    "Content-Length":194007,
    "Content-Type":"application/pdf",
    "Keywords":null,
    "cp:subject":null,
    "creator":null,
    "dc:creator":null,
    "dc:subject":null,
    "dc:title":null,
    "meta:author":null,
    "meta:keyword":null,
    "producer":"dvips + GNU Ghostscript 7.05",
    "resourceName":"example.pdf",
    "subject":null,
    "title":null,
    "xmp:CreatorTool":"LaTeX with hyperref package",
    "xmpTPg:NPages":10 }
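To show what goes wrong on the parsing side, here is a minimal sketch in Python. It only stands in for my actual parser, and the sample string is a shortened, hypothetical version of the output above. The helper names `parse_strict` and `first_object` are made up for this illustration. A strict parser rejects two objects printed back to back, while `json.JSONDecoder.raw_decode` can read just the first object and report where it ends, which might serve as a stopgap.

```python
import json

# Shortened stand-in for the tool's output: the same object printed twice,
# back to back, with nothing separating the two copies.
doubled = (
    '{ "Content-Type":"application/pdf", "xmpTPg:NPages":10 }'
    '{ "Content-Type":"application/pdf", "xmpTPg:NPages":10 }'
)


def parse_strict(text):
    """Return True if the whole text is exactly one valid JSON document."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


def first_object(text):
    """Decode only the first JSON value; return it and the offset where it ends."""
    decoder = json.JSONDecoder()
    # raw_decode does not skip leading whitespace, so strip it first.
    return decoder.raw_decode(text.lstrip())


if __name__ == "__main__":
    print("strict parse succeeds:", parse_strict(doubled))
    metadata, end = first_object(doubled)
    print("first object:", metadata)
    print("leftover text:", doubled[end:])
```

This only works around the symptom. It does not explain why the output is printed twice in the first place.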
