[ 
https://issues.apache.org/jira/browse/PDFBOX-2378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18040857#comment-18040857
 ] 

Tilman Hausherr edited comment on PDFBOX-2378 at 11/26/25 4:49 PM:
-------------------------------------------------------------------

After a ChatGPT session that got me nowhere ("In java, when using 
javax.xml.Transformer to output xmp, how can I make sure that every namespace 
is in the output?"), I realized that the initial rdf element ({{rdfRdf}}) is 
somewhat "detached" from the rest, i.e. it is not used later. To check what it 
holds, I added this near the end of {{DomXmpParser.parse()}}:
{code:java}
NamedNodeMap attributes = rdfRdf.getAttributes();
for (int i = 0; i < attributes.getLength(); ++i)
{
    Node item = attributes.item(i);
    if (XMLConstants.XMLNS_ATTRIBUTE_NS_URI.equals(item.getNamespaceURI()))
    {
        System.out.println("NamespaceURI0: " + " " + item.getLocalName() + " "
                + item.getNodeName() + " " + item.getNodeValue());
    }
}
{code}
The output:
{noformat}
NamespaceURI0:  pdfaExtension xmlns:pdfaExtension http://www.aiim.org/pdfa/ns/extension/
NamespaceURI0:  pdfaProperty xmlns:pdfaProperty http://www.aiim.org/pdfa/ns/property#
NamespaceURI0:  pdfaSchema xmlns:pdfaSchema http://www.aiim.org/pdfa/ns/schema#
NamespaceURI0:  rdf xmlns:rdf http://www.w3.org/1999/02/22-rdf-syntax-ns#
{noformat}
So this does contain the two missing namespace URIs. The next step will be to 
keep these and add them in {{XmpSerializer.createRdfElement()}} with 
{{xmpmeta.setAttributeNS()}}.
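
The "keep and re-add" idea can be tried outside of XMPBox with plain DOM calls. The following is only a minimal sketch, not the actual fix: the class {{NamespaceKeeper}} and its methods {{collect()}}, {{apply()}} and {{parse()}} are hypothetical names made up for illustration, and the input is a hand-written {{rdf:RDF}} element rather than the attached file. It assumes a namespace-aware parser, since otherwise {{getNamespaceURI()}} returns null for the {{xmlns:*}} attributes.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;

// Hypothetical helper for illustration only; not part of XMPBox.
final class NamespaceKeeper
{
    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

    // Collects qualified name -> URI for every namespace declaration on the element,
    // e.g. "xmlns:pdfaSchema" -> "http://www.aiim.org/pdfa/ns/schema#".
    static Map<String, String> collect(Element element)
    {
        Map<String, String> result = new LinkedHashMap<>();
        NamedNodeMap attributes = element.getAttributes();
        for (int i = 0; i < attributes.getLength(); ++i)
        {
            Node item = attributes.item(i);
            if (XMLConstants.XMLNS_ATTRIBUTE_NS_URI.equals(item.getNamespaceURI()))
            {
                result.put(item.getNodeName(), item.getNodeValue());
            }
        }
        return result;
    }

    // Re-declares the collected namespaces on another element.
    static void apply(Element target, Map<String, String> namespaces)
    {
        for (Map.Entry<String, String> entry : namespaces.entrySet())
        {
            target.setAttributeNS(XMLConstants.XMLNS_ATTRIBUTE_NS_URI,
                    entry.getKey(), entry.getValue());
        }
    }

    // Parses an XML string with namespace awareness switched on.
    static Document parse(String xml)
    {
        try
        {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        }
        catch (Exception e)
        {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args)
    {
        Document doc = parse("<rdf:RDF xmlns:rdf=\"" + RDF_NS + "\""
                + " xmlns:pdfaSchema=\"http://www.aiim.org/pdfa/ns/schema#\""
                + " xmlns:pdfaProperty=\"http://www.aiim.org/pdfa/ns/property#\"/>");
        Map<String, String> kept = collect(doc.getDocumentElement());
        // Stand-in for the element the serializer would create later.
        Element fresh = doc.createElementNS(RDF_NS, "rdf:RDF");
        apply(fresh, kept);
        System.out.println(kept);
    }
}
```

Storing the qualified attribute name instead of the bare prefix means a default namespace declaration (plain {{xmlns}}) can be passed to {{setAttributeNS()}} unchanged. Where the map would live between parsing and serialization in XMPBox is left open here.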



was (Author: tilman):
After a ChatGPT session that got me nowhere ("In java, when using 
javax.xml.Transformer to output xmp, how can I make sure that every namespace 
is in the output?"), I realized that the initial rdf element ({{rdfRdf}}) is 
somewhat "detached" from the rest, i.e. it is not used later. To check what it 
holds, I added this near the end of {{DomXmpParser.parse()}}:
{code:java}
System.out.println("xmp.getAllSchemas(): " + xmp.getAllSchemas());
NamedNodeMap attributes = rdfRdf.getAttributes();
for (int i = 0; i < attributes.getLength(); ++i)
{
    Node item = attributes.item(i);
    if (XMLConstants.XMLNS_ATTRIBUTE_NS_URI.equals(item.getNamespaceURI()))
    {
        System.out.println("NamespaceURI0: " + " " + item.getLocalName() + " "
                + item.getNodeName() + " " + item.getNodeValue());
    }
}
{code}
The output:
{noformat}
NamespaceURI0:  pdfaExtension xmlns:pdfaExtension http://www.aiim.org/pdfa/ns/extension/
NamespaceURI0:  pdfaProperty xmlns:pdfaProperty http://www.aiim.org/pdfa/ns/property#
NamespaceURI0:  pdfaSchema xmlns:pdfaSchema http://www.aiim.org/pdfa/ns/schema#
NamespaceURI0:  rdf xmlns:rdf http://www.w3.org/1999/02/22-rdf-syntax-ns#
{noformat}
So this does contain the two missing namespace URIs. The next step will be to 
keep these and add them in {{XmpSerializer.createRdfElement()}} with 
{{xmpmeta.setAttributeNS()}}.


> XMPBox removes namespaces on serialization
> ------------------------------------------
>
>                 Key: PDFBOX-2378
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-2378
>             Project: PDFBox
>          Issue Type: Bug
>          Components: XmpBox
>    Affects Versions: 1.8.7, 2.0.29, 3.0.0 PDFBox, 4.0.0
>            Reporter: Vadimo
>            Assignee: Tilman Hausherr
>            Priority: Minor
>             Fix For: 2.0.36, 3.0.7 PDFBox, 4.0.0
>
>         Attachments: zf_extension.pdfbox.xmp
>
>
> {code:title=Bar.java|borderStyle=none}
> InputStream zfExtensionIs = getClass().getResourceAsStream("/zf_extension.pdfbox.xmp");
> DomXmpParser builder = new DomXmpParser();
> zfDefaultXmp = builder.parse(zfExtensionIs);
> PdfaExtensionHelper.populateSchemaMapping(zfDefaultXmp);
> new XmpSerializer().serialize(zfDefaultXmp, new FileOutputStream("target/out.xmp.xml"), true);
> {code}
> The incoming file:
> {code:xml|title=incoming.xml|borderStyle=none}
>       <rdf:Description xmlns:pdfaExtension="http://www.aiim.org/pdfa/ns/extension/"
>          xmlns:pdfaSchema="http://www.aiim.org/pdfa/ns/schema#"
>          xmlns:pdfaProperty="http://www.aiim.org/pdfa/ns/property#" rdf:about="">
> {code}
> The outgoing file:
> {code:xml|title=resulting.xml|borderStyle=none}
> <rdf:Description xmlns:pdfaExtension="http://www.aiim.org/pdfa/ns/extension/" rdf:about="">
> {code}
> Why are the two namespaces ({{pdfaSchema}} and {{pdfaProperty}}) gone?


