Hi,

> On 09.07.2015 at 18:13, Tilman Hausherr <[email protected]> wrote:
>
> On 09.07.2015 at 15:35, Allison, Timothy B. wrote:
>> From my perspective, it would be great to have a general xmp parser that
>> also allows for some variance from spec (PDFBOX-2855). We've been using
>> jempbox for pdfs as well as images over on Tika, and it has worked well for
>> us.
>>
>> I'd prefer to continue using your xmp parser, but I understand if you need
>> to limit what you're willing to take on.
>>
>> I'll take a look at xmlgraphics, and I'll discuss the fallback option with
>> Tika devs about moving jempbox into Tika.
>
> I had a quick look at xmlgraphics xmp, it would also require extra
> implementation.
>
> I don't mind having it in xmpbox (we have some non-PDF stuff at other places
> too), we just need a schema definition. Or the most complex possible file
> with that namespace. "All" there is to do then is to add a file in
> org.apache.xmpbox.schema.

Would it be possible to get the XMP files causing the exception so we have
something to test with?

BR
Maruan

>
> Tilman
>
>>
>> Thank you.
>>
>> Cheers,
>>
>> Tim
>>
>> -----Original Message-----
>> From: Maruan Sahyoun [mailto:[email protected]]
>> Sent: Thursday, July 09, 2015 4:56 AM
>> To: [email protected]
>> Subject: Re: DomXmpParser: namespace not found
>>
>> Hi,
>>
>>> On 08.07.2015 at 22:42, Tilman Hausherr <[email protected]> wrote:
>>>
>>> On 08.07.2015 at 17:22, Allison, Timothy B. wrote:
>>>> All,
>>>> Apologies for the idiocy I'm about to reveal (well, that won't be a
>>>> revelation to anyone, really), but is there an obvious solution for this
>>>> kind of error:
>>>>
>>>> Caused by: org.apache.xmpbox.xml.XmpParsingException: Cannot find a
>>>> definition for the namespace http://ns.adobe.com/lightroom/1.0/
>>>>     at org.apache.xmpbox.xml.DomXmpParser.checkPropertyDefinition(DomXmpParser.java:848)
>>>>     at org.apache.xmpbox.xml.DomXmpParser.parseChildrenAsProperties(DomXmpParser.java:290)
>>>>     at org.apache.xmpbox.xml.DomXmpParser.parseDescriptionRoot(DomXmpParser.java:234)
>>>>     at org.apache.xmpbox.xml.DomXmpParser.parse(DomXmpParser.java:198)
>>>>     at org.apache.xmpbox.xml.DomXmpParser.parse(DomXmpParser.java:105)
>>>>     at org.apache.tika.parser.image.xmp.JempboxExtractor.parse(JempboxExtractor.java:59)
>>>>
>>>> On a handful of image files in our test docs on Tika, I'm getting this
>>>> with:
>>>>
>>>> http://ns.adobe.com/lightroom/1.0/
>>>> http://ns.adobe.com/exif/1.0/aux/
>>>>
>>> These namespaces are not supported by xmpbox. We've had this problem with
>>> another namespace (I can't remember which one), and it wasn't possible to
>>> support it because we couldn't find a schema definition.
>>>
>>> But you say these are image files. So this isn't about pdf xmp.
>> xmpbox is targeted at PDF/A-1.
>> So I'd think we should discuss extending it to support other PDF standard
>> metadata requirements as well as generic XMP use cases, to again have a
>> generic XMP library. OTOH there is org.apache.xmlgraphics.xmp
>>
>> WDYT?
>>
>> BR
>> Maruan
>>
>>
>>> Tilman
>>>
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: [email protected]
>>> For additional commands, e-mail: [email protected]
>>>

