On 28.03.21 at 19:27, sahy...@fileaffairs.de wrote:
On Sunday, 28.03.2021 at 18:47 +0200, Tilman Hausherr wrote:
On 28.03.2021 at 18:44, sahy...@fileaffairs.de wrote:
On Sunday, 28.03.2021 at 16:36 +0200, Tilman Hausherr wrote:
I don't have an opinion on XMP because I don't use it.
As XMP is needed for getting/setting metadata, esp. since PDF 2.0, there
needs to be support for it - not necessarily from us directly, i.e. we
could integrate a different lib.
I'll revert the work done in PDFBOX-5128 and we get back to it after
3.0 - WDYT?
No, why revert? As far as I understand it, it makes it possible to still
parse XMPs with non-standard schemas so that people can retrieve the
standard stuff, so it is very useful.
It's still very limited - I can keep it, but as long as the XMP doesn't
conform to the (strict) initial parsing rules it will still fail. The
idea behind reverting was to gain time to work on it (if we decide to do
so), or otherwise to keep it in the state it was in before, i.e.
targeted at PDF/A-1 conforming XMPs.
I'm going to start a vote about the future of preflight after the release
of the first RC for 3.0.0. Depending on the outcome we should think about
a vote about the future of xmpbox as well.
Let us see what happens and decide afterwards.
Andreas
BR
Maruan
Tilman
BR
Maruan
Re preflight, I agree with you. It was great but it has hit a dead end,
and VeraPDF is better because it is more flexible.
Tilman
On 28.03.2021 at 15:52, Andreas Lehmkuehler wrote:
On 28.03.21 at 15:00, sahy...@fileaffairs.de wrote:
Fellow colleagues,
there was some discussion about the ability of XMPBox to parse
arbitrary XMP, which led to PDFBOX-5128.
Now, after digging into the code and reading through the various specs
for XMP and PDF/A: as it stands, XMPBox in its current implementation is
too restricted from the start. By default (although there is a way
around it) it only supports parsing predefined XMP schemas, restricted
to the ones defined in PDF/A-1, and it also does some validation in the
parsing phase.
Exactly the point where I stopped some time ago, when trying to just
expand the parser ;-)
Now, in order to get to an implementation for arbitrary XMP, that needs
to change, with the validation for PDF/A-1 put on top. We could use the
existing implementation in a generalized way, use an existing Java XMP
parser such as Adobe's XMPCore, or approach it in a layered fashion
(XML -> RDF -> XMP) with supporting libs for that.
The other option would be to keep XMPBox as is and, for general purpose
use, add a general parser to the project or simply refer to XMPCore.
That leads me to the question: what is the benefit of having a general
purpose (ASL licensed) XMP lib as part of PDFBox? Thoughts?
It replaced JempBox when preflight was added to PDFBox; that said, it
was a more or less historical reason.
I myself never needed that XMP-stuff. It is used by TIKA and preflight
and maybe others.
I have to admit that I already thought about the future of preflight.
I had planned to bring up that topic after releasing 3.0.0, but why
wait.
Preflight is part of PDFBox but is practically not maintained.
Preflight support is limited to PDF/A-1b and I don't see anybody who
plans to extend it. VeraPDF has a lot more to offer and is open source
as well, so maybe a better alternative ...
How about removing preflight with 4.0.0? This would remove the one and
only hard dependency on XMPBox, so that it would be easier to decide
if we really need to maintain our own XMP lib.
Andreas
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org