[ 
https://issues.apache.org/jira/browse/PDFBOX-6106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr updated PDFBOX-6106:
------------------------------------
    Description: 
To verify my work in PDFBOX-6099 I did a mass test with parsing XMP, 
serializing the data if the parsing worked, and then trying to parse again.

This was successfully parsed:
{code:xml}
<?xpacket begin='' id='W5M0MpCehiHzreSzNTczkc9d' bytes='647'?>
<rdf:RDF xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'
         xmlns:iX='http://ns.adobe.com/iX/1.0/'>
        <rdf:Description about=''
                         xmlns='http://ns.adobe.com/pdf/1.3/'
                         xmlns:pdf='http://ns.adobe.com/pdf/1.3/'
                         pdf:CreationDate='2004-01-30T17:21:50Z'
                         pdf:ModDate='2004-01-30T17:21:50Z'
                         pdf:Producer='Acrobat Distiller 5.0.5 (Windows)'/>
        <rdf:Description about=''
                         xmlns='http://ns.adobe.com/xap/1.0/'
                         xmlns:xap='http://ns.adobe.com/xap/1.0/'
                         xap:CreateDate='2004-01-30T17:21:50Z'
                         xap:ModifyDate='2004-01-30T17:21:50Z'
                         xap:MetadataDate='2004-01-30T17:21:50Z'/>
</rdf:RDF><?xpacket end='r'?>
{code}
The serialized result wasn't parsed:
{code:xml}
<?xpacket begin="" id="W5M0MpCehiHzreSzNTczkc9d"?>
<x:xmpmeta xmlns:x="adobe:ns:meta/">
        <rdf:RDF xmlns:pdf="http://ns.adobe.com/pdf/1.3/"
                 xmlns:xap="http://ns.adobe.com/xap/1.0/"
                 xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
                <rdf:Description xmlns="http://ns.adobe.com/pdf/1.3/"
                                 about=""
                                 rdf:about="">
                        <pdf:CreationDate>2004-01-30T17:21:50Z</pdf:CreationDate>
                        <pdf:ModDate>2004-01-30T17:21:50Z</pdf:ModDate>
                        <pdf:Producer>Acrobat Distiller 5.0.5 (Windows)</pdf:Producer>
                </rdf:Description>
                <rdf:Description xmlns="http://ns.adobe.com/xap/1.0/"
                                 about=""
                                 rdf:about="">
                        <xap:CreateDate>2004-01-30T17:21:50+00:00</xap:CreateDate>
                        <xap:MetadataDate>2004-01-30T17:21:50+00:00</xap:MetadataDate>
                        <xap:ModifyDate>2004-01-30T17:21:50+00:00</xap:ModifyDate>
                </rdf:Description>
        </rdf:RDF>
</x:xmpmeta><?xpacket end="r"?>
{code}
with the error "No type defined for 
{http://ns.adobe.com/pdf/1.3/}CreationDate", and yes, the error message is 
correct. So why wasn't it detected in the first file? The cause is in 
PDFBOX-2318: the item is accepted as a text type. In the second parsing a 
different check is used and the problem is caught when in strict mode. 
Solution: throw an exception in the first parsing when the property is an XML 
attribute and not a child.
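
To make the attribute-versus-child distinction concrete, here is a minimal sketch that uses only the JDK's DOM parser and not XMPBox itself. The class AttributePropertyFinder and its find method are hypothetical names invented for this illustration. They only list the properties written as XML attributes on rdf:Description, which is the form that slipped through in the first file. Where such a check would sit inside the XMPBox parser, and how it would interact with strict mode, is not shown here.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

// Hypothetical helper, not part of XMPBox: finds XMP properties that are
// written as attributes of rdf:Description instead of as child elements.
final class AttributePropertyFinder {
    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

    // Returns names such as "{http://ns.adobe.com/pdf/1.3/}CreationDate".
    // The order of the returned names is not guaranteed.
    static List<String> find(String xmp) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(
                    new ByteArrayInputStream(xmp.getBytes(StandardCharsets.UTF_8)));

            List<String> found = new ArrayList<>();
            NodeList descriptions = doc.getElementsByTagNameNS(RDF_NS, "Description");
            for (int i = 0; i < descriptions.getLength(); i++) {
                NamedNodeMap attrs = ((Element) descriptions.item(i)).getAttributes();
                for (int j = 0; j < attrs.getLength(); j++) {
                    Attr attr = (Attr) attrs.item(j);
                    String name = attr.getName();
                    String ns = attr.getNamespaceURI();
                    // Skip namespace declarations (xmlns, xmlns:prefix).
                    if (name.equals("xmlns") || name.startsWith("xmlns:")
                            || XMLConstants.XMLNS_ATTRIBUTE_NS_URI.equals(ns)) {
                        continue;
                    }
                    // Skip unqualified attributes (about='') and rdf:about.
                    if (ns == null || RDF_NS.equals(ns)) {
                        continue;
                    }
                    found.add("{" + ns + "}" + attr.getLocalName());
                }
            }
            return found;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Input is not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        String xmp = "<rdf:RDF xmlns:rdf='" + RDF_NS + "'>"
                + "<rdf:Description about='' xmlns:pdf='http://ns.adobe.com/pdf/1.3/'"
                + " pdf:CreationDate='2004-01-30T17:21:50Z'/>"
                + "</rdf:RDF>";
        // Each name printed here is a property in attribute form.
        find(xmp).forEach(System.out::println);
    }
}
```

Run against the first file, a helper like this would report the pdf: and xap: attributes. Run against the serialized result it would report nothing, because there the same properties are child elements.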

> No type defined for {http://ns.adobe.com/pdf/1.3/}CreationDate not always 
> detected
> -----------------------------------------------------------------------------------
>
>                 Key: PDFBOX-6106
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-6106
>             Project: PDFBox
>          Issue Type: Bug
>          Components: XmpBox
>    Affects Versions: 2.0.35, 3.0.6 PDFBox
>            Reporter: Tilman Hausherr
>            Assignee: Tilman Hausherr
>            Priority: Minor
>             Fix For: 2.0.36, 3.0.7 PDFBox, 4.0.0
>


