Re: ExtractMetadata error
Figured it out, and fixed it. The problem was not with xmpbox, but with the fact that the project build was bringing in JDOM and Jaxen as part of another package. Moving Jaxen from the parent project to the package where it's needed took it out of my image and PDF handling package, and fixed the exception.

Bonus! Removing Jaxen also fixed an error I was experiencing with JPEG metadata extraction using Drew Noakes' metadata-extractor (https://github.com/drewnoakes/metadata-extractor).

Eventually all these jars will have to live in a webapp in the same WEB-INF/lib directory, so I may not be out of the woods yet, but at least I know where the problem is coming from.

On Thu, Mar 9, 2017 at 1:33 PM, Thad Humphries wrote:
> Yes, I can take a stab at that in a few days, after the crunch of my
> current project abates. I'll let you know when it's on GitHub. Thanks.

--
"Hell hath no limits, nor is circumscrib'd In one self-place; but where we are is hell, And where hell is, there must we ever be" --Christopher Marlowe, *Doctor Faustus* (v. 121-24)
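For anyone hitting the same AbstractMethodError: the thread does not say which jar supplied the stale XML parser classes, so the following is only a diagnostic sketch. It assumes the cause is an older JAXP API or implementation jar arriving on the classpath as a transitive dependency, and it prints which DocumentBuilderFactory implementation is selected and where it was loaded from. The class name XmlFactoryLocator is made up for illustration.

```java
import java.security.CodeSource;
import javax.xml.parsers.DocumentBuilderFactory;

// Hypothetical helper, not part of PDFBox or xmpbox.
final class XmlFactoryLocator {

    /** Describes where a class was loaded from; classes from the JDK itself may report no code source. */
    static String locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return "JDK runtime (no code source)";
        }
        return source.getLocation().toString();
    }

    /** Reports the DocumentBuilderFactory implementation that JAXP selects on this classpath. */
    static String describeFactory() {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        Class<?> impl = factory.getClass();
        return impl.getName() + " loaded from " + locationOf(impl);
    }

    public static void main(String[] args) {
        // The abstract API class and the concrete implementation can come from different jars.
        System.out.println("API:  " + locationOf(DocumentBuilderFactory.class));
        System.out.println("Impl: " + describeFactory());
    }
}
```

Running it once from the command line and once from the failing launch configuration should show whether the two classpaths resolve to different jars.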
Re: ExtractMetadata error
Yes, I can take a stab at that in a few days, after the crunch of my current project abates. I'll let you know when it's on GitHub. Thanks.

On Thu, Mar 9, 2017 at 12:43 PM, Tilman Hausherr wrote:
> Can you create a minimal but fully working project with maven? I.e. we'd
> need code with main, and a pom. I mention this because an additional lib is
> needed, unless I misunderstood.
>
> Tilman
Re: ExtractMetadata error
Can you create a minimal but fully working project with Maven? I.e. we'd need code with main, and a pom. I mention this because an additional lib is needed, unless I misunderstood.

Tilman

On 09.03.2017 at 16:51, Thad Humphries wrote:
> Here's my code. As I said, it is throwing an exception at "new
> DomXmpParser()" and I have no idea why:
> [...]

To unsubscribe, e-mail: users-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: users-h...@pdfbox.apache.org
Re: ExtractMetadata error
Here's my code. As I said, it is throwing an exception at "new DomXmpParser()" and I have no idea why:

    protected JSONObject getPdfMetadata(byte[] buffer)
            throws IOException, XmpParsingException, JSONException {
        ByteArrayInputStream bais = new ByteArrayInputStream(buffer);

        JSONObject json = new JSONObject();
        PDDocument document = null;
        try {
            document = PDDocument.load(bais);
            PDDocumentCatalog catalog = document.getDocumentCatalog();
            PDMetadata meta = catalog.getMetadata();

            if (meta != null) {
                DomXmpParser xmpParser = new DomXmpParser(); // throws exception
                XMPMetadata metadata = xmpParser.parse(meta.createInputStream());

                DublinCoreSchema dc = metadata.getDublinCoreSchema();
                if (dc != null) {
                    JSONObject dcj = new JSONObject();
                    dcj.put("Title", dc.getTitle());
                    dcj.put("Description", dc.getDescription());
                    ...
                    json.put("Dublin", dcj);
                }
                ...

My goal is to return a JSON-formatted string to a browser and display the formatted metadata to the user. So for now I'm getting around this DomXmpParser exception by simply converting the metadata to JSON with JSON-java (https://github.com/stleary/JSON-java), and untangling the namespace, etc. on the browser side:

    PDMetadata meta = catalog.getMetadata();

    if (meta != null) {
        // Copy the raw XMP packet into memory, then convert the XML to JSON.
        InputStream is = meta.exportXMPMetadata();
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        int read = 0;
        byte[] bytes = new byte[8 * 1024];
        while ((read = is.read(bytes)) != -1) {
            baos.write(bytes, 0, read);
        }
        String string = new String(baos.toByteArray());
        json = XML.toJSONObject(string);
        ...

On Wed, Mar 8, 2017 at 10:11 PM, Thad Humphries wrote:
> When I run the org.apache.pdfbox.examples.pdmodel.ExtractMetadata
> example, it works. However when I put the same code into my class, it
> throws an exception when I call "DomXmpParser xmpParser = new
> DomXmpParser();"
> [...]
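One note on the workaround's copy loop: new String(byte[]) decodes with the platform default charset, which can differ between a desktop JVM and a server. Below is a minimal sketch of the same loop with an explicit charset. It assumes the XMP packet is UTF-8, which is common but not guaranteed (XMP also permits UTF-16 and UTF-32). The class name XmpText and the sample packet are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; the UTF-8 choice is an assumption about the packet encoding.
final class XmpText {

    /** Reads the whole stream into memory, closes it, and decodes the bytes as UTF-8. */
    static String readUtf8(InputStream in) throws IOException {
        try (InputStream source = in) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[8 * 1024];
            int read;
            while ((read = source.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the stream returned by the PDF metadata object.
        byte[] sample = "<x:xmpmeta xmlns:x=\"adobe:ns:meta/\"/>".getBytes(StandardCharsets.UTF_8);
        System.out.println(readUtf8(new ByteArrayInputStream(sample)));
    }
}
```

The try-with-resources block also closes the stream, which the original loop leaves open.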
ExtractMetadata error
When I run the org.apache.pdfbox.examples.pdmodel.ExtractMetadata example, it works. However, when I put the same code into my class, it throws an exception when I call "DomXmpParser xmpParser = new DomXmpParser();" The trace is:

    java.lang.AbstractMethodError: javax.xml.parsers.DocumentBuilderFactory.setFeature(Ljava/lang/String;Z)V
        at org.apache.xmpbox.xml.DomXmpParser.<init>(DomXmpParser.java:81)
        at com.jthad.util.image.MetadataExtractor.getPdfMetadata(MetadataExtractor.java:170)
        at com.jthad.util.image.TestMetadataExtractor.testPdf0(TestMetadataExtractor.java:41)
        ...

Line 81 in DomXmpParser.java is

    dbFactory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);

I am at a loss to understand how "new DomXmpParser()" works from the command line but fails when called by a JUnit test in Eclipse.
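The call on line 81 can also be tried in isolation, without PDFBox or xmpbox on the classpath. The sketch below does only that. It assumes the difference between the two launches lies in which DocumentBuilderFactory implementation JAXP picks up, and it reports one of three outcomes: the feature is accepted, it is rejected with a ParserConfigurationException, or the implementation does not implement setFeature at all (the AbstractMethodError from the trace). DoctypeFeatureProbe is a hypothetical name.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

// Hypothetical probe; it repeats the setFeature call quoted from DomXmpParser line 81.
final class DoctypeFeatureProbe {

    static final String DISALLOW_DOCTYPE =
            "http://apache.org/xml/features/disallow-doctype-decl";

    /** Tries the feature on whatever factory JAXP selects and summarizes the outcome. */
    static String probe() {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        String impl = factory.getClass().getName();
        try {
            factory.setFeature(DISALLOW_DOCTYPE, true);
            return "OK: " + impl;
        } catch (ParserConfigurationException e) {
            // The implementation has setFeature but does not recognize this feature.
            return "UNSUPPORTED: " + impl;
        } catch (AbstractMethodError e) {
            // The implementation class does not provide setFeature(String, boolean).
            return "TOO_OLD: " + impl;
        }
    }

    public static void main(String[] args) {
        System.out.println(probe());
    }
}
```

If the command-line run and the JUnit run in Eclipse print different implementation class names, the classpath of the failing launch is the place to look.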