[ 
https://issues.apache.org/jira/browse/PDFBOX-2913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17937694#comment-17937694
 ] 

Tilman Hausherr edited comment on PDFBOX-2913 at 3/23/25 2:15 PM:
------------------------------------------------------------------

Wow, this issue is now almost 10 years old. I've tried a few things over the 
years without success, so I should at least write down my thoughts and 
observations.

Unlike other xmpbox changes I made over the years, this won't be a few lines. 
This rdf thing is partly supported, but not as a schema. "<rdf:value>" isn't 
supported at all. 

[^xmp673189-ok.xml] is another file with "<rdf:value>" that doesn't fail. But 
it doesn't work properly either. When debugging why this worked, I looked at 
this part:
{code:xml}
         <desc:FileName rdf:parseType="Resource">
            <rdf:value>E:\Pam_Ward\INS Forms-EB-2004\WIP XFT\I-102_v5.xft</rdf:value>
            <desc:ref>/template/subform[1]</desc:ref>
         </desc:FileName>
{code}
and it returns 
"E:\Pam_Ward\INS Forms-EB-2004\WIP XFT\I-102_v5.xft/template/subform[1]", 
i.e. the text of "<rdf:value>" and "<desc:ref>" concatenated. This happens 
because this line is called:
{code:java}
manageSimpleType(xmp, property, Types.Text, container);
{code}
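
The concatenation can be reproduced outside xmpbox. The following is only a 
sketch using the JDK DOM API, assuming the simple-type path ends up reading the 
DOM text content of the whole element. The class name TextContentDemo and the 
"desc" namespace URI are made up for illustration; the real URI is not shown in 
this comment.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;

// Illustration only: plain JDK DOM, no xmpbox classes involved.
class TextContentDemo {
    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

    // Same structure as the snippet above; the desc namespace URI is hypothetical.
    static final String XML =
        "<desc:FileName xmlns:desc='http://example.com/desc/' "
        + "xmlns:rdf='" + RDF_NS + "' rdf:parseType='Resource'>"
        + "<rdf:value>E:\\Pam_Ward\\INS Forms-EB-2004\\WIP XFT\\I-102_v5.xft</rdf:value>"
        + "<desc:ref>/template/subform[1]</desc:ref>"
        + "</desc:FileName>";

    static Element parseRoot(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Text content of the whole element: the text of all children, concatenated.
    static String flattened() {
        return parseRoot(XML).getTextContent();
    }

    // Text content of the rdf:value child only.
    static String rdfValueOnly() {
        return parseRoot(XML)
                .getElementsByTagNameNS(RDF_NS, "value")
                .item(0)
                .getTextContent();
    }

    public static void main(String[] args) {
        System.out.println("whole element: " + flattened());
        System.out.println("rdf:value only: " + rdfValueOnly());
    }
}
```

Treating the element as simple text therefore loses the distinction between the 
value and its sibling element.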
If I delete the "<rdf:value>A</rdf:value>" from the XMP in the issue 
description, it will still fail, because xmpidq:Scheme isn't implemented.
It's mentioned here: 
https://pdfa.org/wp-content/uploads/2011/08/tn0008_predefined_xmp_properties_in_pdfa-1_2008-03-20.pdf
I tried to add it as a schema, but this doesn't work; it has to be an 
AbstractSimpleProperty.
I had a look at all the 250000 files of the digitalcorpora corpus; none of 
them has xmpidq. We could try to implement it, but I'm not sure how: this isn't 
a full schema, it's a single property. Implementing it as a property (similar 
to the GPS property) made it fail elsewhere. Implementing a minimal schema file 
also didn't help.
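
For reference, this is what a correct parse of the qualified "rdf:li" from the 
issue description would have to produce: in a parseType="Resource" element that 
contains "rdf:value", that child carries the property value and the other 
children are qualifiers. The sketch below uses only the JDK DOM API and is not 
xmpbox code; QualifiedLiSketch and QualifiedValue are hypothetical names, and 
how such a result would map onto the xmpbox type system is exactly the open 
question.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Illustration only: separates rdf:value from its qualifiers with plain DOM.
class QualifiedLiSketch {
    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

    // The array item from the issue description, reduced to the rdf:li element.
    static final String XML =
        "<rdf:li xmlns:rdf='" + RDF_NS + "' "
        + "xmlns:xmpidq='http://ns.adobe.com/xmp/Identifier/qual/1.0/' "
        + "rdf:parseType='Resource'>\n"
        + "  <rdf:value>A</rdf:value>\n"
        + "  <xmpidq:Scheme>http://archyvai.lt/pdf-ltud/2013/level/</xmpidq:Scheme>\n"
        + "</rdf:li>";

    // Hypothetical holder: the value plus qualifiers keyed by "{namespace}localName".
    static final class QualifiedValue {
        final String value;
        final Map<String, String> qualifiers;

        QualifiedValue(String value, Map<String, String> qualifiers) {
            this.value = value;
            this.qualifiers = qualifiers;
        }
    }

    static QualifiedValue parse(String xml) {
        Element li;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            li = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        String value = null;
        Map<String, String> qualifiers = new LinkedHashMap<>();
        // Walk the direct children; whitespace text nodes are skipped.
        for (Node n = li.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (!(n instanceof Element)) {
                continue;
            }
            if (RDF_NS.equals(n.getNamespaceURI()) && "value".equals(n.getLocalName())) {
                value = n.getTextContent();
            } else {
                qualifiers.put("{" + n.getNamespaceURI() + "}" + n.getLocalName(),
                        n.getTextContent());
            }
        }
        return new QualifiedValue(value, qualifiers);
    }

    public static void main(String[] args) {
        QualifiedValue qv = parse(XML);
        System.out.println("value: " + qv.value);
        System.out.println("qualifiers: " + qv.qualifiers);
    }
}
```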



> DomXmpParser fails on property containing qualifier
> ---------------------------------------------------
>
>                 Key: PDFBOX-2913
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-2913
>             Project: PDFBox
>          Issue Type: Bug
>          Components: XmpBox
>    Affects Versions: 1.8.10
>            Reporter: Petras
>            Priority: Major
>         Attachments: qualified_li.xmp, screenshot-1.png, xmp673189-ok.xml
>
>
> According to the XMP specification, properties may have qualifiers. In our 
> scenario we used the {{xmp:Identifier}} element from the XMP Basic Schema, 
> holding an array of text strings. An array item may be qualified with 
> {{xmpidq:Scheme}}:
> {code:xml}
> <rdf:Description rdf:about=""
>                  xmlns:xmp="http://ns.adobe.com/xap/1.0/"
>                  xmlns:xmpidq="http://ns.adobe.com/xmp/Identifier/qual/1.0/">
>   <xmp:Identifier>
>     <rdf:Bag>
>       <rdf:li rdf:parseType="Resource">
>         <rdf:value>A</rdf:value>
>         <xmpidq:Scheme>http://archyvai.lt/pdf-ltud/2013/level/</xmpidq:Scheme>
>       </rdf:li>
>     </rdf:Bag>
>   </xmp:Identifier>
> </rdf:Description>
> {code}
> {{DomXmpParser}} fails when parsing XMP containing such qualifiers:
> {code}
> org.apache.xmpbox.xml.XmpParsingException: Schema is not set in this document : http://www.w3.org/1999/02/22-rdf-syntax-ns#
>       at org.apache.xmpbox.xml.DomXmpParser.checkPropertyDefinition(DomXmpParser.java:787)
>       at org.apache.xmpbox.xml.DomXmpParser.parseLiDescription(DomXmpParser.java:508)
>       at org.apache.xmpbox.xml.DomXmpParser.parseLiElement(DomXmpParser.java:449)
>       at org.apache.xmpbox.xml.DomXmpParser.manageArray(DomXmpParser.java:407)
>       at org.apache.xmpbox.xml.DomXmpParser.createProperty(DomXmpParser.java:309)
>       at org.apache.xmpbox.xml.DomXmpParser.parseDescriptionRoot(DomXmpParser.java:267)
>       at org.apache.xmpbox.xml.DomXmpParser.parse(DomXmpParser.java:199)
>       at org.apache.xmpbox.TestXMPWithDefinedSchemas.main(TestXMPWithDefinedSchemas.java:66)
> ...
> {code}
> It appears to have failed on the {{rdf:value}} element, as the 
> {{org.apache.xmpbox.type.TypeMapping}} class is not aware of the standard 
> {{http://www.w3.org/1999/02/22-rdf-syntax-ns#}} namespace.


