[ https://issues.apache.org/jira/browse/PDFBOX-2378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18040857#comment-18040857 ]
Tilman Hausherr edited comment on PDFBOX-2378 at 11/26/25 4:49 PM:
-------------------------------------------------------------------
After a ChatGPT session that got me nowhere ("In java, when using
javax.xml.Transformer to output xmp, how can I make sure that every namespace
is in the output?"), I realized that the initial {{rdf:RDF}} element is
somewhat "detached" from the rest, i.e. it is not used later. I added this
near the end of {{DomXmpParser.parse()}}:
{code:java}
NamedNodeMap attributes = rdfRdf.getAttributes();
for (int i = 0; i < attributes.getLength(); ++i)
{
    Node item = attributes.item(i);
    if (XMLConstants.XMLNS_ATTRIBUTE_NS_URI.equals(item.getNamespaceURI()))
    {
        System.out.println("NamespaceURI0: " + " " + item.getLocalName() + " "
                + item.getNodeName() + " " + item.getNodeValue());
    }
}
{code}
The output:
{noformat}
NamespaceURI0: pdfaExtension xmlns:pdfaExtension http://www.aiim.org/pdfa/ns/extension/
NamespaceURI0: pdfaProperty xmlns:pdfaProperty http://www.aiim.org/pdfa/ns/property#
NamespaceURI0: pdfaSchema xmlns:pdfaSchema http://www.aiim.org/pdfa/ns/schema#
NamespaceURI0: rdf xmlns:rdf http://www.w3.org/1999/02/22-rdf-syntax-ns#
{noformat}
So the attributes of this element do contain the two missing namespace URIs.
The next step will be to keep these declarations and add them in
{{XmpSerializer.createRdfElement()}} with {{xmpmeta.setAttributeNS()}}.
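
For illustration, here is a minimal sketch of that "keep and re-add" idea using only the JDK's DOM and Transformer APIs, not XMPBox itself. The class and method names ({{NamespaceCarryOver}}, {{collectDeclarations}}, {{applyDeclarations}}, {{roundTrip}}) are hypothetical and not part of PDFBox; the real change would presumably live in {{DomXmpParser}} and {{XmpSerializer}} and may look quite different.

```java
import java.io.ByteArrayInputStream;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;

// Hypothetical helper, not part of XMPBox.
class NamespaceCarryOver
{
    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

    // Collect qualified attribute name -> URI for every namespace declaration
    // on the element (same filter as in the debugging loop above).
    static Map<String, String> collectDeclarations(Element element)
    {
        Map<String, String> declarations = new LinkedHashMap<>();
        NamedNodeMap attributes = element.getAttributes();
        for (int i = 0; i < attributes.getLength(); ++i)
        {
            Node item = attributes.item(i);
            if (XMLConstants.XMLNS_ATTRIBUTE_NS_URI.equals(item.getNamespaceURI()))
            {
                declarations.put(item.getNodeName(), item.getNodeValue());
            }
        }
        return declarations;
    }

    // Re-declare the collected namespaces on the target element.
    static void applyDeclarations(Element target, Map<String, String> declarations)
    {
        for (Map.Entry<String, String> entry : declarations.entrySet())
        {
            target.setAttributeNS(XMLConstants.XMLNS_ATTRIBUTE_NS_URI,
                    entry.getKey(), entry.getValue());
        }
    }

    // Parse the input, build a fresh rdf:RDF element, carry the declarations
    // over and serialize the new document with a Transformer.
    static String roundTrip(String xml)
    {
        try
        {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document parsed = builder.parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Map<String, String> declarations =
                    collectDeclarations(parsed.getDocumentElement());

            Document output = builder.newDocument();
            Element rdf = output.createElementNS(RDF_NS, "rdf:RDF");
            output.appendChild(rdf);
            applyDeclarations(rdf, declarations);

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter writer = new StringWriter();
            transformer.transform(new DOMSource(output), new StreamResult(writer));
            return writer.toString();
        }
        catch (Exception e)
        {
            throw new IllegalStateException(e);
        }
    }
}
```

The sketch stores the qualified attribute name (e.g. {{xmlns:pdfaSchema}}) rather than only the prefix, so a default {{xmlns}} declaration would be carried over unchanged as well.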
> XMPBox removes namespaces on serialization
> ------------------------------------------
>
> Key: PDFBOX-2378
> URL: https://issues.apache.org/jira/browse/PDFBOX-2378
> Project: PDFBox
> Issue Type: Bug
> Components: XmpBox
> Affects Versions: 1.8.7, 2.0.29, 3.0.0 PDFBox, 4.0.0
> Reporter: Vadimo
> Assignee: Tilman Hausherr
> Priority: Minor
> Fix For: 2.0.36, 3.0.7 PDFBox, 4.0.0
>
> Attachments: zf_extension.pdfbox.xmp
>
>
> {code:title=Bar.java|borderStyle=none}
> InputStream zfExtensionIs = getClass().getResourceAsStream("/zf_extension.pdfbox.xmp");
> DomXmpParser builder = new DomXmpParser();
> zfDefaultXmp = builder.parse(zfExtensionIs);
> PdfaExtensionHelper.populateSchemaMapping(zfDefaultXmp);
> new XmpSerializer().serialize(zfDefaultXmp, new FileOutputStream("target/out.xmp.xml"), true);
> {code}
> the incoming file
> {code:xml|title=incoming.xml|borderStyle=none}
> <rdf:Description
> xmlns:pdfaExtension="http://www.aiim.org/pdfa/ns/extension/"
> xmlns:pdfaSchema="http://www.aiim.org/pdfa/ns/schema#"
> xmlns:pdfaProperty="http://www.aiim.org/pdfa/ns/property#" rdf:about="">
> {code}
> outgoing file
> {code:xml|title=resulting.xml|borderStyle=none}
> <rdf:Description xmlns:pdfaExtension="http://www.aiim.org/pdfa/ns/extension/"
> rdf:about="">
> {code}
> Why are the two namespaces gone?