Hi,

I am attempting to extract metadata from a PDF as JSON with the following 
command:

java -jar tika-app-1.5.jar -j example.pdf

This appears to print the output twice.  For example, the PDF linked below 
produces the output shown below, which breaks my JSON parser.  Am I doing 
something wrong?

Thanks.

Example PDF

http://www.dadsgarage.com/~/media/Files/example.ashx

Output

{ "Author":null,
"Content-Length":194007,
"Content-Type":"application/pdf",
"Keywords":null,
"cp:subject":null,
"creator":null,
"dc:creator":null,
"dc:subject":null,
"dc:title":null,
"meta:author":null,
"meta:keyword":null,
"producer":"dvips + GNU Ghostscript 7.05",
"resourceName":"example.pdf",
"subject":null,
"title":null,
"xmp:CreatorTool":"LaTeX with hyperref package",
"xmpTPg:NPages":10 }{ "Author":null,
"Content-Length":194007,
"Content-Type":"application/pdf",
"Keywords":null,
"cp:subject":null,
"creator":null,
"dc:creator":null,
"dc:subject":null,
"dc:title":null,
"meta:author":null,
"meta:keyword":null,
"producer":"dvips + GNU Ghostscript 7.05",
"resourceName":"example.pdf",
"subject":null,
"title":null,
"xmp:CreatorTool":"LaTeX with hyperref package",
"xmpTPg:NPages":10 }
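
In the meantime, here is a minimal sketch of a workaround on the consuming side, assuming the duplicated output is always two or more JSON objects written back to back. It uses Python's standard json module to decode only the first object and ignore whatever follows. The function name first_json_object and the shortened sample string are my own illustrations, not part of Tika.

```python
import json


def first_json_object(text):
    """Decode only the first JSON value in text, ignoring anything after it."""
    decoder = json.JSONDecoder()
    # raw_decode does not skip leading whitespace, so strip it first.
    # It returns the decoded value and the index where that value ended.
    obj, _end = decoder.raw_decode(text.lstrip())
    return obj


# Shortened stand-in for the duplicated output above (hypothetical sample).
sample = (
    '{ "Content-Type":"application/pdf",\n"xmpTPg:NPages":10 }'
    '{ "Content-Type":"application/pdf",\n"xmpTPg:NPages":10 }'
)

metadata = first_json_object(sample)
print(metadata["xmpTPg:NPages"])
```

This only hides the symptom. It does not explain why the object is printed twice.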
