A new build will be available here soon (check the date):
https://repository.apache.org/content/groups/snapshots/org/apache/pdfbox/preflight-app/2.0.21-SNAPSHOT/
I have also created an issue with VeraPDF:
https://github.com/veraPDF/veraPDF-library/issues/1085
Tilman
On 04.06.2020 at 08:02, Tilman Hausherr wrote:
On 03.06.2020 at 22:41, Ralf Heydenreich wrote:
Hi Tilman,
thanks for your fast answer. So, should I create a new bug report or is
this done by you? Thanks also for the workaround, I'll try it ASAP.
I've created one:
https://issues.apache.org/jira/browse/PDFBOX-4860
Tilman
Regards,
Ralf.
On 03.06.2020 at 18:59, Tilman Hausherr wrote:
Hi,
The metadata is there. PDFBox preflight says it is OK. VeraPDF and
PDF-Tools say it isn't (so this is a new PDFBox bug).
CreatorTool, CreateDate and Producer each appear twice. A solution
could be to get the existing schema instead of creating a new one.
<?xpacket begin="" id="W5M0MpCehiHzreSzNTczkc9d"?><x:xmpmeta
xmlns:x="adobe:ns:meta/">
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<rdf:Description xmlns:pdfaid="http://www.aiim.org/pdfa/ns/id/"
rdf:about="">
<pdfaid:part>1</pdfaid:part>
<pdfaid:conformance>A</pdfaid:conformance>
</rdf:Description>
<rdf:Description xmlns:pdf="http://ns.adobe.com/pdf/1.3/"
rdf:about="">
<pdf:Producer>LibreOffice 5.4</pdf:Producer>
</rdf:Description>
<rdf:Description xmlns:xmp="http://ns.adobe.com/xap/1.0/"
rdf:about="">
<xmp:CreatorTool>Writer</xmp:CreatorTool>
<xmp:CreateDate>2020-05-31T20:54:08+02:00</xmp:CreateDate>
</rdf:Description>
<rdf:Description xmlns:pdf="http://ns.adobe.com/pdf/1.3/"
rdf:about="">
<pdf:Keywords/>
<pdf:Producer>LibreOffice 5.4</pdf:Producer>
</rdf:Description>
<rdf:Description xmlns:xmp="http://ns.adobe.com/xap/1.0/"
rdf:about="">
<xmp:ModifyDate>2020-06-03T08:38:15+02:00</xmp:ModifyDate>
<xmp:CreateDate>2020-05-31T20:54:08+02:00</xmp:CreateDate>
<xmp:CreatorTool>Writer</xmp:CreatorTool>
<xmp:MetadataDate>2020-06-03T08:38:15+02:00</xmp:MetadataDate>
</rdf:Description>
<rdf:Description xmlns:dc="http://purl.org/dc/elements/1.1/"
rdf:about="">
<dc:title>
<rdf:Alt>
<rdf:li xml:lang="x-default">Default Title</rdf:li>
</rdf:Alt>
</dc:title>
<dc:creator>
<rdf:Seq>
<rdf:li>PDFBox</rdf:li>
</rdf:Seq>
</dc:creator>
<dc:description>
<rdf:Alt>
<rdf:li xml:lang="x-default">Default Subject</rdf:li>
</rdf:Alt>
</dc:description>
</rdf:Description>
</rdf:RDF>
</x:xmpmeta><?xpacket end="w"?>
PDF-Tools:
https://www.pdf-online.com/osa/validate.aspx
Validating file "Testdok-modified-003.pdf" for conformance level pdfa-1a
Multiple occurrences of property 'pdf:Producer'.
Multiple occurrences of property 'xmp:CreateDate'.
Multiple occurrences of property 'xmp:CreatorTool'.
The value of the key ToUnicode is invalid but must be compliant to the ToUnicode CMap format.
The document does not conform to the requested standard.
The document contains fonts without appropriate character to unicode mapping information (ToUnicode maps).
The document's meta data is either missing or inconsistent or corrupt.
The document does not conform to the PDF/A-1a standard.
Done.
PDF-Tools also considers the original file not to be valid.
Tilman
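The duplicate-property findings reported by the validator above can be reproduced without a validator. What follows is a minimal sketch, not part of PDFBox or of any validator: it uses only the JDK's DOM parser, and the names XmpDuplicateCheck and findDuplicates are made up for illustration. It counts the direct children of every rdf:Description element and keeps the ones that occur more than once. It assumes a flat packet like the one above, in which each namespace always uses the same prefix.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper, for illustration only.
final class XmpDuplicateCheck {
    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

    /** Returns the qualified names of properties that occur more than once, with their counts. */
    static Map<String, Integer> findDuplicates(String xmp) {
        Document doc;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            // XMP packets never need a DOCTYPE, so reject them outright.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            doc = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xmp.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("XMP packet could not be parsed", e);
        }

        Map<String, Integer> counts = new LinkedHashMap<>();
        NodeList descriptions = doc.getElementsByTagNameNS(RDF_NS, "Description");
        for (int i = 0; i < descriptions.getLength(); i++) {
            // Each direct child element of rdf:Description is one property.
            for (Node child = descriptions.item(i).getFirstChild();
                    child != null; child = child.getNextSibling()) {
                if (child.getNodeType() == Node.ELEMENT_NODE) {
                    counts.merge(child.getNodeName(), 1, Integer::sum);
                }
            }
        }
        // Keep only the properties seen at least twice.
        counts.values().removeIf(n -> n < 2);
        return counts;
    }

    public static void main(String[] args) {
        // Shortened version of the packet quoted in the thread.
        String xmp = "<?xpacket begin='' id='W5M0MpCehiHzreSzNTczkc9d'?>"
                + "<x:xmpmeta xmlns:x='adobe:ns:meta/'>"
                + "<rdf:RDF xmlns:rdf='" + RDF_NS + "'>"
                + "<rdf:Description xmlns:pdf='http://ns.adobe.com/pdf/1.3/' rdf:about=''>"
                + "<pdf:Producer>LibreOffice 5.4</pdf:Producer></rdf:Description>"
                + "<rdf:Description xmlns:pdf='http://ns.adobe.com/pdf/1.3/' rdf:about=''>"
                + "<pdf:Keywords/><pdf:Producer>LibreOffice 5.4</pdf:Producer></rdf:Description>"
                + "</rdf:RDF></x:xmpmeta><?xpacket end='w'?>";
        System.out.println(findDuplicates(xmp)); // prints {pdf:Producer=2}
    }
}
```

An empty result only means that this particular check found nothing; it says nothing about the other PDF/A requirements the validators test.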
On 03.06.2020 at 08:43, Ralf Heydenreich wrote:
Hi Tilman,
the xpacket is written now, but the PDF isn't recognized as PDF/A and
has no metadata. Meanwhile, I've tried the CreatePDFA example, which
works. But there a new PDF is created; maybe this is the difference.
The newly modified PDF is available at
https://files.fakturama.info/rhefiles/Testdok-modified-003.pdf.
Regards,
Ralf.
On 03.06.2020 at 08:35, Tilman Hausherr wrote:
Hi,
The position doesn't matter, but the xpacket wrapper does.
The example has
serializer.serialize(metadata, baos, false);
while the CreatePDFA.java example has
serializer.serialize(xmp, baos, true);
Please try changing that.
Tilman
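To check quickly whether a serialized XMP stream carries the xpacket wrapper discussed above, a plain string test is enough. This is a rough sketch using only the JDK; XpacketCheck and hasXpacketWrapper are hypothetical names, and the test merely looks for the header and trailer processing instructions instead of validating the packet.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
final class XpacketCheck {
    /** True if the bytes start with an xpacket header and also contain an xpacket trailer. */
    static boolean hasXpacketWrapper(byte[] xmp) {
        String text = new String(xmp, StandardCharsets.UTF_8).trim();
        return text.startsWith("<?xpacket begin=") && text.contains("<?xpacket end=");
    }

    public static void main(String[] args) {
        byte[] wrapped = ("<?xpacket begin='' id='W5M0MpCehiHzreSzNTczkc9d'?>"
                + "<x:xmpmeta xmlns:x='adobe:ns:meta/'/>"
                + "<?xpacket end='w'?>").getBytes(StandardCharsets.UTF_8);
        byte[] bare = "<x:xmpmeta xmlns:x='adobe:ns:meta/'/>".getBytes(StandardCharsets.UTF_8);

        System.out.println(hasXpacketWrapper(wrapped)); // prints true
        System.out.println(hasXpacketWrapper(bare));    // prints false
    }
}
```

Running it on the bytes written to the ByteArrayOutputStream before they go into the PDF shows which of the two serialize variants was actually used.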
On 03.06.2020 at 08:23, Ralf Heydenreich wrote:
Hi Tilman,
now the metadata is written to the PDF, but maybe at the wrong
position. Compared to the original PDF, the XMP stream is written at
the beginning of the document. The original contains the stream at the
end, but starting with an "xpacket" tag. This tag is missing from the
modified document. I don't know if that's relevant. At least my PDF
viewer shows the modified document without metadata and doesn't
recognize it as a PDF/A. I've uploaded the newly modified document to
https://files.fakturama.info/rhefiles/Testdok-modified-002.pdf.
Regards,
Ralf.
On 31.05.2020 at 21:56, Tilman Hausherr wrote:
Hi,
In your "after" file the existing metadata is lost. From what I see,
the example creates metadata from scratch instead of changing the
existing metadata.
What might work is something like
PDMetadata meta = catalog.getMetadata();
DomXmpParser xmpParser = new DomXmpParser();
XMPMetadata metadata = xmpParser.parse(meta.createInputStream());
(I took this from the ExtractMetadata example)
then use this "metadata" object. Please try that and tell me whether
it works. If not, please share the "after" file again.
Please also give feedback if it does work; I'm thinking of adding a
comment to the example.
Tilman
On 31.05.2020 at 21:31, Ralf Heydenreich wrote:
Hi,
the before file can be found here:
https://files.fakturama.info/rhefiles/Testdok.pdf
the modified file is this:
https://files.fakturama.info/rhefiles/Testdok-modified.pdf
Cheers,
Ralf.
On 31.05.2020 at 21:27, Tilman Hausherr wrote:
Please share the "before" and the "after" PDF; upload them to a
file-sharing host.
Tilman
On 31.05.2020 at 21:22, Ralf Heydenreich wrote:
Hi all,
I've just checked the AddMetadataFromDocInfo example from the PDFBox
repository (see
https://svn.apache.org/viewvc/pdfbox/trunk/examples/src/main/java/org/apache/pdfbox/examples/pdmodel/AddMetadataFromDocInfo.java?revision=1792647&view=markup).
I've created a PDF/A-3 document with LibreOffice. Then I used that PDF
as input for the example program. The resulting document contains the
changed metadata, but it's not a PDF/A-3 anymore. What can I do to
keep the document as PDF/A-3?
Thanks in advance,
Ralf.
---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: users-h...@pdfbox.apache.org