Re: ExtractMetadata error

2017-03-10 Thread Thad Humphries
Figured it out and fixed it. The problem was not with xmpbox, but with the
fact that the project build was bringing in JDOM and Jaxen as part of another
package. Moving Jaxen from the parent project to the package where it's
needed took it out of my image and PDF handling package, and fixed the
exception.

Bonus! Removing Jaxen also fixed an error I was experiencing with JPEG
metadata extraction using Drew Noakes' metadata-extractor (
https://github.com/drewnoakes/metadata-extractor).

Eventually all these jars will have to live in a webapp in the same
WEB-INF/lib directory, so I may not be out of the woods yet, but at least I
know where the problem is coming from.
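
For anyone chasing a similar classpath conflict, here is a minimal sketch of one way to see which DocumentBuilderFactory implementation is actually picked up and which jar it comes from. It uses only JDK classes; the class name WhichDomFactory and the helper locationOf are made up for illustration. It assumes the conflict comes from an older XML parser arriving with another dependency, which is one plausible reading of the fix above, not something confirmed in this thread.

```java
import java.security.CodeSource;
import javax.xml.parsers.DocumentBuilderFactory;

// Hypothetical diagnostic class; not part of PDFBox or xmpbox.
class WhichDomFactory {

    /** Describes where a class was loaded from; JDK built-in classes may have no code source. */
    static String locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return "JDK runtime (no code source)";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) {
        // newInstance() applies the normal JAXP lookup, so this is the same
        // implementation any other code on this classpath would receive.
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        System.out.println("Implementation: " + factory.getClass().getName());
        System.out.println("Loaded from:    " + locationOf(factory.getClass()));
    }
}
```

Running this once from the command line and once from the failing test's classpath should show whether the two environments resolve to different factory classes.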

-- 
"Hell hath no limits, nor is circumscrib'd In one self-place; but where we
are is hell, And where hell is, there must we ever be" --Christopher
Marlowe, *Doctor Faustus* (v. 121-24)


Re: ExtractMetadata error

2017-03-09 Thread Thad Humphries
Yes, I can take a stab at that in a few days, after the crunch of my
current project abates. I'll let you know when it's on GitHub. Thanks.


Re: ExtractMetadata error

2017-03-09 Thread Tilman Hausherr
Can you create a minimal but fully working project with maven? I.e. we'd 
need code with main, and a pom. I mention this because an additional lib 
is needed, unless I misunderstood.


Tilman

-
To unsubscribe, e-mail: users-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: users-h...@pdfbox.apache.org



Re: ExtractMetadata error

2017-03-09 Thread Thad Humphries
Here's my code. As I said, it is throwing an exception at "new
DomXmpParser()" and I have no idea why:

  protected JSONObject getPdfMetadata(byte[] buffer)
      throws IOException, XmpParsingException, JSONException {
    ByteArrayInputStream bais = new ByteArrayInputStream(buffer);

    JSONObject json = new JSONObject();
    PDDocument document = null;
    try {
      document = PDDocument.load(bais);
      PDDocumentCatalog catalog = document.getDocumentCatalog();
      PDMetadata meta = catalog.getMetadata();

      if (meta != null) {
        DomXmpParser xmpParser = new DomXmpParser();  // throws exception
        XMPMetadata metadata = xmpParser.parse(meta.createInputStream());

        DublinCoreSchema dc = metadata.getDublinCoreSchema();
        if (dc != null) {
          JSONObject dcj = new JSONObject();
          dcj.put("Title", dc.getTitle());
          dcj.put("Description", dc.getDescription());
          ...
          json.put("Dublin", dcj);
        }
        ...

My goal is to return a JSON-formatted string to a browser and display the
formatted metadata to the user. So for now I'm getting around this
DomXmpParser exception by simply converting the metadata
to JSON with JSON-java (https://github.com/stleary/JSON-java) and
untangling the namespace, etc. on the browser side:

  PDMetadata meta = catalog.getMetadata();

  if (meta != null) {
    InputStream is = meta.exportXMPMetadata();
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    int read = 0;
    byte[] bytes = new byte[8*1024];
    while ((read = is.read(bytes)) != -1) {
      baos.write(bytes, 0, read);
    }
    String string = new String(baos.toByteArray());
    json = XML.toJSONObject(string);
    ...
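
The copy loop in the workaround can be sketched on its own with just the JDK. The version below differs from the snippet above in two ways: it closes the stream when done, and it decodes with an explicit charset rather than the platform default. UTF-8 is an assumption here (an XMP packet may use another encoding), and the names StreamToString and readAll are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; not part of PDFBox or JSON-java.
class StreamToString {

    /** Drains the stream, closes it, and decodes the bytes as UTF-8 (assumed encoding). */
    static String readAll(InputStream in) {
        try (InputStream input = in) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[8 * 1024];
            int read;
            while ((read = input.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            // Rethrown unchecked so callers without an IOException clause can use the helper.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Stand-in bytes for illustration only; real input would be the metadata stream.
        byte[] sample = "<x:xmpmeta xmlns:x=\"adobe:ns:meta/\"/>".getBytes(StandardCharsets.UTF_8);
        System.out.println(readAll(new ByteArrayInputStream(sample)));
    }
}
```

The resulting string could then be handed to XML.toJSONObject as in the snippet above.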



ExtractMetadata error

2017-03-08 Thread Thad Humphries
When I run the org.apache.pdfbox.examples.pdmodel.ExtractMetadata example,
it works. However, when I put the same code into my class, it throws an
exception when I call "DomXmpParser xmpParser = new DomXmpParser();". The
trace is:

java.lang.AbstractMethodError: javax.xml.parsers.DocumentBuilderFactory.setFeature(Ljava/lang/String;Z)V
    at org.apache.xmpbox.xml.DomXmpParser.<init>(DomXmpParser.java:81)
    at com.jthad.util.image.MetadataExtractor.getPdfMetadata(MetadataExtractor.java:170)
    at com.jthad.util.image.TestMetadataExtractor.testPdf0(TestMetadataExtractor.java:41)
    ...

Line 81 in DomXmpParser.java is

dbFactory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);

I am at a loss to understand how "new DomXmpParser()" works from the
command line but fails when called by a JUnit test in Eclipse.
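
A minimal sketch for reproducing the failing call without xmpbox, using only the JDK: it makes the same setFeature call on whatever factory the current classpath provides. An AbstractMethodError at this point usually means the implementation class was built against an older DocumentBuilderFactory that did not yet declare setFeature, so the running classpath matters more than the calling code. The class name SetFeatureCheck and the method check are hypothetical.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

// Hypothetical check class; not part of PDFBox or xmpbox.
class SetFeatureCheck {

    // The feature named in the report above.
    static final String DISALLOW_DOCTYPE =
            "http://apache.org/xml/features/disallow-doctype-decl";

    /** Tries the setFeature call and reports how the given factory reacts. */
    static String check(DocumentBuilderFactory factory) {
        try {
            factory.setFeature(DISALLOW_DOCTYPE, true);
            return "supported";
        } catch (ParserConfigurationException e) {
            // The factory implements setFeature but does not know this feature.
            return "feature rejected: " + e.getMessage();
        } catch (AbstractMethodError e) {
            // The factory class does not implement setFeature at all.
            return "setFeature not implemented by " + factory.getClass().getName();
        }
    }

    public static void main(String[] args) {
        System.out.println(check(DocumentBuilderFactory.newInstance()));
    }
}
```

Running it with the same classpath as the JUnit test and again with the command-line classpath should show whether the two environments behave differently.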
