Re: about Scripts.txt

2024-03-31 Thread Andreas Lehmkühler




On 31.03.24 at 14:19, Tilman Hausherr wrote:
I ran a test a few days ago, the build worked, but I'm wondering what
should be the effect?
New Unicode versions just add new Unicode mappings. I guess most of them
are unrelated to OTF but who knows, maybe some are useful.


However, I don't expect any real improvement, just an up-to-date version
of the scripts file.


Andreas



Tilman

On 31.03.2024 14:01, Andreas Lehmkühler wrote:

Hi,

thanks for the pointer.

As far as I understand Unicode, this should be just an update with
additional values, so IMHO there isn't any reason not to update the file.


WDYT?

Andreas

On 25.03.24 at 19:38, Dieter von Holten wrote:

hi there



while browsing through the sources I came across OpenTypeScript.java,
which loads the resource file Scripts.txt

The file contains a list of circa 2700 Unicode codepoints.

The file is version 10.0.0 of 2017-03-11 .



A reference points to a newer version of this file:



http://www.unicode.org/Public/UCD/latest/ucd/Scripts.txt



which is version 15.1.0 of 2023-07-28, it contains circa 3000 Unicode
codepoints.



I propose to investigate whether this newer file can be included in
PDFBox and works for older JDK versions, as supported by PDFBox 2.
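
A minimal sketch of how the two Scripts.txt versions could be compared, assuming the usual UCD line layout (`start..end ; Script # comment`, or a single code point without `..`). The class and method names are hypothetical and this is not the code in OpenTypeScript.java; it sticks to Java 7 level syntax because of the older JDKs mentioned above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not PDFBox's OpenTypeScript code. */
class ScriptsTxtSketch
{
    /** One data line of Scripts.txt: an inclusive code point range and its script name. */
    static final class ScriptRange
    {
        final int start;
        final int end;
        final String script;

        ScriptRange(int start, int end, String script)
        {
            this.start = start;
            this.end = end;
            this.script = script;
        }
    }

    /** Parses one line; returns null for blank lines and comment-only lines. */
    static ScriptRange parseLine(String line)
    {
        // everything after '#' is a comment
        int hash = line.indexOf('#');
        String data = (hash >= 0 ? line.substring(0, hash) : line).trim();
        if (data.isEmpty())
        {
            return null;
        }
        String[] fields = data.split(";");
        if (fields.length < 2)
        {
            return null;
        }
        // the first field is either "XXXX" or "XXXX..YYYY" in hexadecimal
        String[] range = fields[0].trim().split("\\.\\.");
        int start = Integer.parseInt(range[0], 16);
        int end = range.length > 1 ? Integer.parseInt(range[1], 16) : start;
        return new ScriptRange(start, end, fields[1].trim());
    }

    /** Parses all lines and keeps only the data entries. */
    static List<ScriptRange> parseAll(List<String> lines)
    {
        List<ScriptRange> result = new ArrayList<ScriptRange>();
        for (String line : lines)
        {
            ScriptRange range = parseLine(line);
            if (range != null)
            {
                result.add(range);
            }
        }
        return result;
    }
}
```

Running parseAll() over the lines of both files and comparing the number of entries, or the entries themselves, would show whether the newer file only adds values.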



Kind regards

DvH










-
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org







Re: about Scripts.txt

2024-03-31 Thread Andreas Lehmkühler

Hi,

thanks for the pointer.

As far as I understand Unicode, this should be just an update with
additional values, so IMHO there isn't any reason not to update the file.


WDYT?

Andreas

On 25.03.24 at 19:38, Dieter von Holten wrote:

hi there



while browsing through the sources I came across OpenTypeScript.java,
which loads the resource file Scripts.txt

The file contains a list of circa 2700 Unicode codepoints.

The file is version 10.0.0 of 2017-03-11 .



A reference points to a newer version of this file:



 http://www.unicode.org/Public/UCD/latest/ucd/Scripts.txt



which is version 15.1.0 of 2023-07-28, it contains circa 3000 Unicode
codepoints.



I propose to investigate whether this newer file can be included in
PDFBox and works for older JDK versions, as supported by PDFBox 2.



Kind regards

DvH













Re: 2.0.30 vs 2.0.31 reloaded

2024-03-28 Thread Andreas Lehmkühler

Cool!

@Tilman thanks again

On 27.03.24 at 20:47, Tilman Hausherr wrote:
During the tika regression tests it turned out that there is a longer 
PDF list, so I ran the tests again with that longer list. The good news 
is that all is perfect, no new exceptions, and no loss of content.


Tilman

https://home.snafu.de/tilman/tmp/reports_pdfbox_2.0.30_vs_2.0.31_1.tar.xz





Re: NVD update hangs during release build

2024-03-24 Thread Andreas Lehmkühler

Looks like they solved their issue. I've reactivated the plugin.

Andreas

On 21.03.24 at 18:55, Andreas Lehmkühler wrote:
I've simply deactivated the plugin as proposed, so that we can do the
release.

I don't like the idea of going back to an old version. I'm pretty sure
someone will fix that issue as we aren't the only ones using that plugin.

Andreas

On 21.03.24 at 18:13, sahy...@fileaffairs.de wrote:

OK - can replicate the issue too. Works for me locally up to
dependency-check-maven 8.4.3 - would that be an option?

BR
Maruan

On Thursday, 21.03.2024 at 17:38 +0100, Tilman Hausherr wrote:

add

-Ppedantic

Tilman

On 21.03.2024 17:28, sahy...@fileaffairs.de wrote:

Which mvn cmd do I need to issue to trigger the check? mvn clean
install didn't for me. Am I missing something?

BR
Maruan

On Thursday, 21.03.2024 at 17:24 +0100, Tilman Hausherr wrote:

Jeremy Long wrote something that I haven't really understood. Maybe it
means building the NVD archive on a separate system and then
transferring it.

https://github.com/jeremylong/DependencyCheck/issues/6515#issuecomment-2011824975

However, a later message in the same issue made more sense, I'm testing
locally with

https://dependency-check.github.io/DependencyCheck_Builder/nvd_cache/

Tilman

On 21.03.2024 09:48, sahy...@fileaffairs.de wrote:

Mhmm - is there a way to build locally and test the NVD update?

Ran it on a different project I have for a client locally and the NVD
update worked without issues and without an API key.

BR
Maruan

On Thursday, 21.03.2024 at 08:36 +0100, Tilman Hausherr wrote:

I meant adding true to the  part.

Something isn't ok with NVD, maybe it got worse since then:
https://blog.fefe.de/?ts=9b0740e0
https://www.heise.de/news/Sicherheitsforscher-genervt-Luecken-Datenbank-NVD-seit-Wochen-unvollstaendig-9656574.html

Tilman

On 20.03.2024 22:05, Andreas Lehmkühler wrote:

On 20.03.24 at 21:16, Tilman Hausherr wrote:

If you still have the time, you could add a "skip" for that plugin;
the last successful build was this morning and no library changes
were made since then. (and we still have a few days to find out if
any libraries are now considered risky)

Good idea, but -Ddependency-check.skip=true doesn't work either, it
still tries to update :-(

I'm going to continue tomorrow

Andreas

Tilman

On 20.03.2024 21:13, Tilman Hausherr wrote:

Seems it's a general problem:
https://github.com/jeremylong/DependencyCheck/issues/6515#issuecomment-2009879851

it also hangs on my local machine now, I don't have an API key.

Tilman

On 20.03.2024 20:57, Andreas Lehmkühler wrote:

Hi,

I'm trying to cut the 2.0.31 release but it always hangs when the
build tries to update the NVD data.

Last week when I built the 3.0.2 release I had a similar effect.
The update was very slow but in the end it came to an end and worked.

Now, nothing happens, the last words are

[INFO] [WARNING] An NVD API Key was not provided - it is highly
recommended to use an NVD API key as the update can take a VERY
long time without an API Key

nothing more after that. It simply hangs

I've requested an API key, got one and now I'm trying to get it to
work, but it doesn't.

I've tried

* the mvn option -DnvdApiKey=
* define a server "nvd" in .m2/settings.xml including the key and
  add -DnvdApiServerId=nvd to the commandline
* define the environment variable NVD_API_KEY and add
  -DnvdApiKeyEnvironmentVariable=NVD_API_KEY to the commandline

Nothing works, I've always got those famous words: An NVD API Key
was not provide

Any idea to get around this?

Andreas

P.S.: I'm on linux using corretto-8.332 and mvn 3.9.3



[ANNOUNCE] Apache PDFBox 2.0.31 released

2024-03-24 Thread Andreas Lehmkühler

The Apache PDFBox community is pleased to announce the release of
Apache PDFBox version 2.0.31. The release is available for download at:

https://pdfbox.apache.org/download.html

See the full release notes below for details about this release.

Release Notes -- Apache PDFBox -- Version 2.0.31

Introduction


The Apache PDFBox library is an open source Java tool for working with
PDF documents.

This is an incremental bugfix release based on the earlier 2.0.30
release. It contains a couple of fixes and small improvements.

For more details on these changes and all the other fixes and improvements
included in this release, please refer to the following issues on the
PDFBox issue tracker at https://issues.apache.org/jira/browse/PDFBOX.

Bug

[PDFBOX-2725] - [PATCH] Split pdf lose accessibility tags
[PDFBOX-5375] - Allow creating of PDFXObjectImage without accessing to the image stream
[PDFBOX-5713] - PfbParser fails to parse PFB font with multiple binary records.
[PDFBOX-5715] - Lines vanish when printing on MacOS
[PDFBOX-5718] - java.lang.IllegalArgumentException: Provided dictionary is not of type 'COSName{OCG}'
[PDFBOX-5721] - The embedded font DroidSansFallbackFull reports an error when parsing, and finally uses lastResortFont, resulting in garbled fonts.
[PDFBOX-5723] - COSName caches already cached hashCode
[PDFBOX-5727] - Font operation takes a long time with 3.0.1
[PDFBOX-5728] - NullPointerException in TTFSubsetter.buildPostTable()
[PDFBOX-5732] - Problem converting PDF to image (java.awt.color.CMMException: Can not access specified profile)
[PDFBOX-5735] - Set the default value for PDNonTerminalField
[PDFBOX-5737] - java.lang.ArrayIndexOutOfBoundsException Bug Report
[PDFBOX-5738] - Wrong colors in PDF since PDFBOX-5488
[PDFBOX-5740] - Java 7 support on 2.0
[PDFBOX-5751] - Convert to image exception
[PDFBOX-5754] - PDF conversion in this format is very slow. Is there any room for optimization?
[PDFBOX-5763] - IllegalArgumentException: -Infinity is not a finite number
[PDFBOX-5772] - Inconsistent signature page handling when signing in existing signature fields
[PDFBOX-5773] - Add leading "0" for octal values in MacOSRomanEncoding
[PDFBOX-5776] - DataFormatException: invalid distance too far back
[PDFBOX-5778] - Grayscale JPEG rendered multicolor
[PDFBOX-5781] - OutOfMemoryError in FileSystemFontsProvider.scanFonts
[PDFBOX-5782] - NPE in PageDrawer.getPaint()
[PDFBOX-5785] - Issue with embedded Font and descendant Font
[PDFBOX-5787] - LCMS error 13: Mismatched alpha channels

New Feature

[PDFBOX-5768] - Enable Native Markdown Extraction in Apache PDFBox

Improvement

[PDFBOX-5762] - When splitting, keep page destinations that are part of target document(s)
[PDFBOX-5783] - Replace Exception with some repair attempt

Task

[PDFBOX-5739] - Add test for PDFBOX-3347
[PDFBOX-5741] - Add test for PDFBOX-4106

Release Contents


This release consists of a single source archive packaged as a zip file.
The archive can be unpacked with the jar tool from your JDK installation.
See the README.txt file for instructions on how to build this release.

The source archive is accompanied by a SHA512 checksum and a PGP signature
that you can use to verify the authenticity of your download.
The public key used for the PGP signature can be found at
https://www.apache.org/dist/pdfbox/KEYS.
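
A minimal sketch of the checksum part of that verification, using the JDK's MessageDigest. The class and method names are hypothetical; the expected value would be the hex string published next to the archive. The PGP signature check is not covered here.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper for illustration; not part of PDFBox. */
class Sha512Sketch
{
    /** Streams the input through SHA-512 and returns the digest as lowercase hex. */
    static String sha512Hex(InputStream in) throws IOException, NoSuchAlgorithmException
    {
        MessageDigest digest = MessageDigest.getInstance("SHA-512");
        byte[] buffer = new byte[8192];
        int n;
        while ((n = in.read(buffer)) != -1)
        {
            digest.update(buffer, 0, n);
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest())
        {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    /** Compares the digest of the file with the published checksum, ignoring case and whitespace. */
    static boolean matches(Path archive, String expectedHex)
            throws IOException, NoSuchAlgorithmException
    {
        try (InputStream in = Files.newInputStream(archive))
        {
            return sha512Hex(in).equalsIgnoreCase(expectedHex.trim());
        }
    }
}
```

Calling matches() with the path of the downloaded zip and the published checksum returns true only if the download is byte-identical to the released archive.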

About Apache PDFBox
---

Apache PDFBox is an open source Java library for working with PDF documents.
This project allows creation of new PDF documents, manipulation of existing
documents and the ability to extract content from documents. Apache PDFBox
also includes several command line utilities. Apache PDFBox is published
under the Apache License, Version 2.0.

For more information, visit https://pdfbox.apache.org/

About The Apache Software Foundation


Established in 1999, The Apache Software Foundation provides organizational,
legal, and financial support for more than 100 freely-available,
collaboratively-developed Open Source projects. The pragmatic Apache License
enables individual and commercial users to easily deploy Apache software;
the Foundation's intellectual property framework limits the legal exposure
of its 2,500+ contributors.

For more information, visit https://www.apache.org/




[RESULT][VOTE] Release Apache PDFBox 2.0.31

2024-03-24 Thread Andreas Lehmkühler

On 21.03.24 at 18:51, Andreas Lehmkühler wrote:

Please vote on releasing this package as Apache PDFBox 2.0.31.


   +1 Tilman Hausherr
   +1 Maruan Sahyoun
   +1 Timo Allison
   +1 Andreas Lehmkühler

Thanks for your support and help!! I'm going to push the release out.

Andreas




Re: [VOTE] Release Apache PDFBox 2.0.31

2024-03-23 Thread Andreas Lehmkühler



On 22.03.24 at 08:29, Tilman Hausherr wrote:

On 22.03.2024 06:53, Andreas Lehmkühler wrote:
Is this a showstopper, shall I cancel the release? 


No

Or do we just live with another/the last release with that issue? 


I prefer it to be fixed


Of course. I was referring to the former releases, which are broken too
w.r.t. the sources zip.


Andreas



Tilman





Re: [VOTE] Release Apache PDFBox 2.0.31

2024-03-21 Thread Andreas Lehmkühler




On 21.03.24 at 20:07, Tim Allison wrote:

In the parent pom.xml in the zip file, there's a "release" submodule
specified. However, there's no release directory in the src zip that would
match: https://svn.apache.org/repos/asf/pdfbox/tags/2.0.31/release/

Is that expected?

Hmmm, of course not. Thanks for the pointer.

I've rearranged the structure in [1] and never realized that the empty 
"release" subproject won't show up in the sources-zip. Obviously nobody 
tried to build one of the last releases from the sources-zip.
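
A minimal sketch of a check that would have caught this, assuming the unpacked sources zip has a pom.xml at its root: it lists every <module> entry whose directory is missing. The class and method names are hypothetical, and the check also picks up modules declared inside profiles.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

/** Hypothetical helper for illustration; not part of the PDFBox build. */
class ModuleCheckSketch
{
    /** Returns the names of all modules listed in pom.xml that have no matching directory. */
    static List<String> missingModules(File projectDir) throws Exception
    {
        Document pom = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new File(projectDir, "pom.xml"));
        // the default factory is not namespace aware, so the plain tag name matches
        NodeList modules = pom.getElementsByTagName("module");
        List<String> missing = new ArrayList<String>();
        for (int i = 0; i < modules.getLength(); i++)
        {
            String name = modules.item(i).getTextContent().trim();
            if (!new File(projectDir, name).isDirectory())
            {
                missing.add(name);
            }
        }
        return missing;
    }
}
```

For the layout described above, the returned list would be expected to contain "release".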


However, I'm going to look into this.

Is this a showstopper, shall I cancel the release? Or do we just live 
with another/the last release with that issue?



[1] https://issues.apache.org/jira/browse/PDFBOX-5699




On Thu, Mar 21, 2024 at 1:53 PM Andreas Lehmkühler 
wrote:


Hi,

a candidate for the PDFBox 2.0.31 release is available at:

  https://dist.apache.org/repos/dist/dev/pdfbox/2.0.31/

The release candidate is a zip archive of the sources in:

  https://svn.apache.org/repos/asf/pdfbox/tags/2.0.31/

The SHA-512 checksum of the archive is

c231ccebf918b8aa0dc80d3162fc88ff4ab78d586bcead0ef0cc44a6cab4f6d455112497ad866901e3948a6c76320d19487c3be7e7c1e66c5e2733de82fe3f09.

Please vote on releasing this package as Apache PDFBox 2.0.31.
The vote is open for the next 72 hours and passes if a majority of at
least three +1 PDFBox PMC votes are cast.

  [ ] +1 Release this package as Apache PDFBox 2.0.31
  [ ] -1 Do not release this package because...


Here is my +1

Andreas




Re: NVD update hangs during release build

2024-03-21 Thread Andreas Lehmkühler
I've simply deactivated the plugin as proposed, so that we can do the
release.

I don't like the idea of going back to an old version. I'm pretty sure
someone will fix that issue as we aren't the only ones using that plugin.

Andreas

On 21.03.24 at 18:13, sahy...@fileaffairs.de wrote:

OK - can replicate the issue too. Works for me locally up to
dependency-check-maven 8.4.3 - would that be an option?

BR
Maruan

On Thursday, 21.03.2024 at 17:38 +0100, Tilman Hausherr wrote:

add

-Ppedantic

Tilman

On 21.03.2024 17:28, sahy...@fileaffairs.de wrote:

Which mvn cmd do I need to issue to trigger the check? mvn clean
install didn't for me. Am I missing something?

BR
Maruan

On Thursday, 21.03.2024 at 17:24 +0100, Tilman Hausherr wrote:

Jeremy Long wrote something that I haven't really understood. Maybe it
means building the NVD archive on a separate system and then
transferring it.

https://github.com/jeremylong/DependencyCheck/issues/6515#issuecomment-2011824975

However, a later message in the same issue made more sense, I'm testing
locally with

https://dependency-check.github.io/DependencyCheck_Builder/nvd_cache/

Tilman

On 21.03.2024 09:48, sahy...@fileaffairs.de wrote:

Mhmm - is there a way to build locally and test the NVD update?

Ran it on a different project I have for a client locally and the NVD
update worked without issues and without an API key.

BR
Maruan

On Thursday, 21.03.2024 at 08:36 +0100, Tilman Hausherr wrote:

I meant adding true to the  part.

Something isn't ok with NVD, maybe it got worse since then:
https://blog.fefe.de/?ts=9b0740e0
https://www.heise.de/news/Sicherheitsforscher-genervt-Luecken-Datenbank-NVD-seit-Wochen-unvollstaendig-9656574.html

Tilman

On 20.03.2024 22:05, Andreas Lehmkühler wrote:

On 20.03.24 at 21:16, Tilman Hausherr wrote:

If you still have the time, you could add a "skip" for that plugin;
the last successful build was this morning and no library changes
were made since then. (and we still have a few days to find out if
any libraries are now considered risky)

Good idea, but -Ddependency-check.skip=true doesn't work either, it
still tries to update :-(

I'm going to continue tomorrow

Andreas

Tilman

On 20.03.2024 21:13, Tilman Hausherr wrote:

Seems it's a general problem:
https://github.com/jeremylong/DependencyCheck/issues/6515#issuecomment-2009879851

it also hangs on my local machine now, I don't have an API key.

Tilman

On 20.03.2024 20:57, Andreas Lehmkühler wrote:

Hi,

I'm trying to cut the 2.0.31 release but it always hangs when the
build tries to update the NVD data.

Last week when I built the 3.0.2 release I had a similar effect.
The update was very slow but in the end it came to an end and worked.

Now, nothing happens, the last words are

[INFO] [WARNING] An NVD API Key was not provided - it is highly
recommended to use an NVD API key as the update can take a VERY
long time without an API Key

nothing more after that. It simply hangs

I've requested an API key, got one and now I'm trying to get it to
work, but it doesn't.

I've tried

* the mvn option -DnvdApiKey=
* define a server "nvd" in .m2/settings.xml including the key and
  add -DnvdApiServerId=nvd to the commandline
* define the environment variable NVD_API_KEY and add
  -DnvdApiKeyEnvironmentVariable=NVD_API_KEY to the commandline

Nothing works, I've always got those famous words: An NVD API Key
was not provide

Any idea to get around this?

Andreas

P.S.: I'm on linux using corretto-8.332 and mvn 3.9.3



[VOTE] Release Apache PDFBox 2.0.31

2024-03-21 Thread Andreas Lehmkühler

Hi,

a candidate for the PDFBox 2.0.31 release is available at:

https://dist.apache.org/repos/dist/dev/pdfbox/2.0.31/

The release candidate is a zip archive of the sources in:

https://svn.apache.org/repos/asf/pdfbox/tags/2.0.31/

The SHA-512 checksum of the archive is 
c231ccebf918b8aa0dc80d3162fc88ff4ab78d586bcead0ef0cc44a6cab4f6d455112497ad866901e3948a6c76320d19487c3be7e7c1e66c5e2733de82fe3f09.


Please vote on releasing this package as Apache PDFBox 2.0.31.
The vote is open for the next 72 hours and passes if a majority of at
least three +1 PDFBox PMC votes are cast.

[ ] +1 Release this package as Apache PDFBox 2.0.31
[ ] -1 Do not release this package because...
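
A minimal sketch of the tally rule stated above, read here as "at least three +1 votes and more +1 than -1 votes". That reading is an assumption, and the class and method names are hypothetical.

```java
/** Hypothetical helper for illustration; the rule is one reading of the vote text above. */
class VoteTallySketch
{
    /**
     * @param votes one entry per binding PMC vote: +1, 0 or -1
     * @return true if the vote passes under the assumed rule
     */
    static boolean passes(int[] votes)
    {
        int plus = 0;
        int minus = 0;
        for (int v : votes)
        {
            if (v > 0)
            {
                plus++;
            }
            else if (v < 0)
            {
                minus++;
            }
        }
        // at least three +1 votes, and the +1 votes must be in the majority
        return plus >= 3 && plus > minus;
    }
}
```

Four +1 votes and no -1 votes pass under this rule, while two +1 votes alone do not.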


Here is my +1

Andreas




Re: NVD update hangs during release build

2024-03-20 Thread Andreas Lehmkühler




On 20.03.24 at 21:16, Tilman Hausherr wrote:
If you still have the time, you could add a "skip" for that plugin; the 
last successful build was this morning and no library changes were made 
since then. (and we still have a few days to find out if any libraries 
are now considered risky)
Good idea, but -Ddependency-check.skip=true doesn't work either, it 
still tries to update :-(


I'm going to continue tomorrow 

Andreas



Tilman

On 20.03.2024 21:13, Tilman Hausherr wrote:

Seems it's a general problem:
https://github.com/jeremylong/DependencyCheck/issues/6515#issuecomment-2009879851

it also hangs on my local machine now, I don't have an API key.

Tilman


On 20.03.2024 20:57, Andreas Lehmkühler wrote:

Hi,

I'm trying to cut the 2.0.31 release but it always hangs when the 
build tries to update the NVD data.


Last week when I built the 3.0.2 release I had a similar effect. The
update was very slow but in the end it came to an end and worked.


Now, nothing happens, the last words are

[INFO] [WARNING] An NVD API Key was not provided - it is highly 
recommended to use an NVD API key as the update can take a VERY long 
time without an API Key


nothing more after that. It simply hangs

I've requested an API key, got one and now I'm trying to get it to work,
but it doesn't.


I've tried

* the mvn option -DnvdApiKey=
* define a server "nvd" in .m2/settings.xml including the key and add 
-DnvdApiServerId=nvd  to the commandline
* define the environment variable NVD_API_KEY and add 
-DnvdApiKeyEnvironmentVariable=NVD_API_KEY to the commandline


Nothing works, I've always got those famous words: An NVD API Key was 
not provide 



Any idea to get around this?

Andreas

P.S.: I'm on linux using corretto-8.332 and mvn 3.9.3





NVD update hangs during release build

2024-03-20 Thread Andreas Lehmkühler

Hi,

I'm trying to cut the 2.0.31 release but it always hangs when the build 
tries to update the NVD data.


Last week when I built the 3.0.2 release I had a similar effect. The
update was very slow but in the end it came to an end and worked.


Now, nothing happens, the last words are

[INFO] [WARNING] An NVD API Key was not provided - it is highly 
recommended to use an NVD API key as the update can take a VERY long 
time without an API Key


nothing more after that. It simply hangs

I've requested an API key, got one and now I'm trying to get it to work,
but it doesn't.


I've tried

* the mvn option -DnvdApiKey=
* define a server "nvd" in .m2/settings.xml including the key and add 
-DnvdApiServerId=nvd  to the commandline
* define the environment variable NVD_API_KEY and add 
-DnvdApiKeyEnvironmentVariable=NVD_API_KEY to the commandline


Nothing works, I've always got those famous words: An NVD API Key was 
not provide 



Any idea to get around this?

Andreas

P.S.: I'm on linux using corretto-8.332 and mvn 3.9.3





Re: PDFBox 2.0.31 release

2024-03-19 Thread Andreas Lehmkühler

@Tilman, thanks again for running the regression tests.

I'm going to cut the release tomorrow, in about 24 hours from now.

Andreas

On 15.03.24 at 19:34, Tilman Hausherr wrote:

Regression tests result:
https://home.snafu.de/tilman/tmp/reports_pdfbox_2.0.30_vs_2.0.31.tar.xz

Nothing to do, only improvements.

Tilman

On 14.03.2024 22:06, Andreas Lehmkühler wrote:

Hi,

now that 3.0.2 is out of the door I'd like to continue with a new 2.0 
release.


How about cutting a 2.0.31 release next Wednesday?

Any objections or is there something we should add/fix first?

Andreas




PDFBox 2.0.31 release

2024-03-14 Thread Andreas Lehmkühler

Hi,

now that 3.0.2 is out of the door I'd like to continue with a new 2.0 
release.


How about cutting a 2.0.31 release next Wednesday?

Any objections or is there something we should add/fix first?

Andreas




[ANNOUNCE] Apache PDFBox 3.0.2 released

2024-03-14 Thread Andreas Lehmkühler

The Apache PDFBox community is pleased to announce the release of
Apache PDFBox version 3.0.2. The release is available for download at:

https://pdfbox.apache.org/download.html

See the full release notes below for details about this release.

Release Notes -- Apache PDFBox -- Version 3.0.2

Introduction


The Apache PDFBox library is an open source Java tool for working with
PDF documents.

This is an incremental bugfix release based on the earlier 3.0.1
release. It contains a couple of fixes and small improvements.

A migration guide is available at
https://pdfbox.apache.org/3.0/migration.html. It is still a work in
progress and we are happy to include any valuable feedback from our
community.

For more details on these changes and all the other fixes and improvements
included in this release, please refer to the following issues on the
PDFBox issue tracker at https://issues.apache.org/jira/browse/PDFBOX.

Bug

[PDFBOX-2725] - [PATCH] Split pdf lose accessibility tags
[PDFBOX-5375] - Allow creating of PDFXObjectImage without accessing to the image stream
[PDFBOX-5704] - char not rendered
[PDFBOX-5714] - PDFBox 3.0 regression: duplicate references in dictionary values
[PDFBOX-5715] - Lines vanish when printing on MacOS
[PDFBOX-5717] - NullPointerException calling saveIncrementalForExternalSigning
[PDFBOX-5721] - The embedded font DroidSansFallbackFull reports an error when parsing, and finally uses lastResortFont, resulting in garbled fonts.
[PDFBOX-5722] - Wrong scope for maven dependencies
[PDFBOX-5723] - COSName caches already cached hashCode
[PDFBOX-5724] - CharStringCommand.equals() does not conform to the contract of Object.equals
[PDFBOX-5727] - Font operation takes a long time with 3.0.1
[PDFBOX-5728] - NullPointerException in TTFSubsetter.buildPostTable()
[PDFBOX-5730] - The expected SubstFormat for ExtensionSubstFormat1 subtable is 108 but should be 1
[PDFBOX-5732] - Problem converting PDF to image (java.awt.color.CMMException: Can not access specified profile)
[PDFBOX-5733] - lookupType is to be replaced by extensionLookupType in type 7 lookup table
[PDFBOX-5735] - Set the default value for PDNonTerminalField
[PDFBOX-5737] - java.lang.ArrayIndexOutOfBoundsException Bug Report
[PDFBOX-5738] - Wrong colors in PDF since PDFBOX-5488
[PDFBOX-5742] - Split result PDFs broken
[PDFBOX-5744] - EOFException while readMultipleSubstitutionSubtable()
[PDFBOX-5745] - EOFException while readSingleLookupSubTable()
[PDFBOX-5748] - Cannot get overlayPDF working on command line interface
[PDFBOX-5751] - Convert to image exception
[PDFBOX-5752] - Font errors after copying a page to another document
[PDFBOX-5754] - PDF conversion in this format is very slow. Is there any room for optimization?
[PDFBOX-5757] - streamCacheCreateFunction not passed to PDFParser
[PDFBOX-5758] - ExceptionInInitializerError when unmapping is not supported
[PDFBOX-5760] - NPE in FIlter.decode() when called with empty list
[PDFBOX-5763] - IllegalArgumentException: -Infinity is not a finite number
[PDFBOX-5764] - Wrong chunksize when using a ByteBuffer to initialize a RandomAccessReadBuffer
[PDFBOX-5772] - Inconsistent signature page handling when signing in existing signature fields
[PDFBOX-5773] - Add leading "0" for octal values in MacOSRomanEncoding
[PDFBOX-5775] - importPage destroys annotations
[PDFBOX-5776] - DataFormatException: invalid distance too far back
[PDFBOX-5778] - Grayscale JPEG rendered multicolor
[PDFBOX-5781] - OutOfMemoryError in FileSystemFontsProvider.scanFonts
[PDFBOX-5782] - NPE in PageDrawer.getPaint()

New Feature

[PDFBOX-5768] - Enable Native Markdown Extraction in Apache PDFBox

Improvement

[PDFBOX-5729] - GsubWorkerForDevanagari and GsubWorkerForGujarati created
[PDFBOX-5762] - When splitting, keep page destinations that are part of target document(s)
[PDFBOX-5783] - Replace Exception with some repair attempt

Task

[PDFBOX-5739] - Add test for PDFBOX-3347
[PDFBOX-5741] - Add test for PDFBOX-4106

Release Contents


This release consists of a single source archive packaged as a zip file.
The archive can be unpacked with the jar tool from your JDK installation.
See the README.txt file for instructions on how to build this release.

The source archive is accompanied by SHA512 checksums and a PGP signature
that you can use to verify the authenticity of your download.
The public key used for the PGP signature can be found at
https://www.apache.org/dist/pdfbox/KEYS.

About Apache PDFBox
---

Apache PDFBox is an open source Java library for working with PDF documents.
This project allows creation of new PDF documents, manipulation of existing
documents and the ability to extract content from documents. Apache PDFBox
also includes several command line utilities. Apache PDFBox is published
under the Apache License, Version 2.0.

For more information, visit https://pdfbox.apache.org/

About The Apache Software Foundation

Re: [VOTE] Release Apache PDFBox 3.0.2

2024-03-14 Thread Andreas Lehmkühler



On 11.03.24 at 20:24, Andreas Lehmkühler wrote:

Please vote on releasing this package as Apache PDFBox 3.0.2.


   +1 Tilman Hausherr
   +1 Maruan Sahyoun
   +1 Timo Boehme
   +1 Andreas Lehmkühler

Thanks for your support and help!! I'm going to push the release out.

Andreas





[VOTE] Release Apache PDFBox 3.0.2

2024-03-11 Thread Andreas Lehmkühler

Hi,

a candidate for the PDFBox 3.0.2 release is available at:

https://dist.apache.org/repos/dist/dev/pdfbox/3.0.2/

The release candidate is a zip archive of the sources in:

https://svn.apache.org/repos/asf/pdfbox/tags/3.0.2/

The SHA-512 checksum of the archive is 
d2eaaa4e7a139b00d79d7518ca66ee2c33300dbeed11c05554413e478b2a76814a7404a9467cb2dc3502840259188965a3483342c7d44e3280b68649aec670f8.


Please vote on releasing this package as Apache PDFBox 3.0.2.
The vote is open for the next 72 hours and passes if a majority of at
least three +1 PDFBox PMC votes are cast.

[ ] +1 Release this package as Apache PDFBox 3.0.2
[ ] -1 Do not release this package because...

Here is my +1

Andreas

-
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org



Re: PDFBox 3.0.2 release?

2024-03-10 Thread Andreas Lehmkühler

Hi,

as there aren't any objections, I'm going to cut the release tomorrow or 
the day after tomorrow.


Andreas

Am 04.03.24 um 07:54 schrieb Andreas Lehmkühler:

Hi,

the import content issue seems to be solved, see PDFBOX-5752 and 
PDFBOX-5775.


How about cutting a 3.0.2 release in a week from now?

Any objections or is there something we should add/fix first?

Andreas

-
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org



Re: PDFBox 3.0.2 release?

2024-03-09 Thread Andreas Lehmkühler




Am 08.03.24 um 13:34 schrieb Tilman Hausherr:

On 08.03.2024 07:13, Tilman Hausherr wrote:

regression test result:

https://home.snafu.de/tilman/tmp/reports_pdfbox_3.0.1_vs_3.0.2.tar.xz

Thanks for running the regression tests.


Re exceptions:

- The OOM can't be reproduced

- The two others are related to the zip bomb protection and (probably) a 
recent change (PDFBOX-5704)

I've found a solution for that case, see PDFBOX-5783


Andreas


Re text extraction:

commoncrawl3/TQ/TQVMNMW5ACPU3CZL46OBNGWMPSSXC5MO: that file is a mess 
anyway


commoncrawl3/Y2/Y2PVHNL43FBNKZRAJTSX5J5BLLHMCNLY: same

bug_trackers/pdf.js/pdf.js-11651-0.pdf: might be related to the 
exception I mentioned, the stack trace looks similar. The result is that 
a broken font is no longer replaced. It can be fixed by catching the 
exception when fontFile.createView() is called in PDFontFactory and 
returning null.


bug_trackers/poppler-gitlab/poppler-748-0.tgz-1.pdf: messy file. But 
there is an NPE on page 2, that can be fixed easily


commoncrawl3/JP/JPO3LX6ABADSDNC5BIX3KZJBRFT5BIEQ: messy file

commoncrawl3/4L/4L2UKWSZNPXPSGS3OTXQZBBKJH6XF7G4: same

Tilman




-
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org



PDFBox 3.0.2 release?

2024-03-03 Thread Andreas Lehmkühler

Hi,

the import content issue seems to be solved, see PDFBOX-5752 and 
PDFBOX-5775.


How about cutting a 3.0.2 release in a week from now?

Any objections or is there something we should add/fix first?

Andreas

-
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org



Re: PDFBox 3.0.1 / RandomAccessReadBuffer bug?

2024-02-08 Thread Andreas Lehmkühler

Done, I've fixed the issue.

@David thanks for the report

Andreas

Am 08.02.24 um 07:58 schrieb Andreas Lehmkühler:

Hi David,

thanks for the bug report. You are right and the proposed solution seems 
to be a valid fix.


I've created https://issues.apache.org/jira/browse/PDFBOX-5764 to handle 
it.


Andreas

Am 08.02.24 um 00:00 schrieb david.kl...@atlas.cz:

Hello


I think that this is not correct in some cases:


   public RandomAccessReadBuffer(ByteBuffer input) {

 chunkSize = input.capacity();


IMHO input.limit() should be used instead of input.capacity().


When it matters: I have a ByteArrayOutputStream to which a PDF document is
written. Later I want to open the PDF document using PDFBox again. If I use
part of the internal buffer directly (without copying), e.g.
ByteBuffer.wrap(bos.getInternalBuffer(), 0, bos.size()), I get an exception
like this:


java.lang.IllegalArgumentException: newPosition > limit: (31556 > 20960)
    at java.base/java.nio.Buffer.createPositionException(Buffer.java:352)
    at java.base/java.nio.Buffer.position(Buffer.java:327)
    at java.base/java.nio.ByteBuffer.position(ByteBuffer.java:1551)
    at java.base/java.nio.ByteBuffer.position(ByteBuffer.java:285)
    at org.apache.pdfbox.io.RandomAccessReadBuffer.seek(RandomAccessReadBuffer.java:187)
    at org.apache.pdfbox.pdfparser.COSParser.getStartxrefOffset(COSParser.java:506)
    at org.apache.pdfbox.pdfparser.COSParser.retrieveTrailer(COSParser.java:259)
    at org.apache.pdfbox.pdfparser.PDFParser.initialParse(PDFParser.java:107)
    at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:171)
    at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:136)
    at org.apache.pdfbox.Loader.loadPDF(Loader.java:466)
    at org.apache.pdfbox.Loader.loadPDF(Loader.java:369)


31556 is the buffer capacity, 20960 is its limit


I think the buffer should not be read beyond its limit.
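
The difference between capacity and limit can be reproduced with java.nio alone, without PDFBox. The sketch below is only an illustration: the class BufferLimitDemo and its helpers are hypothetical, and a plain byte array stands in for the stream's internal buffer. The sizes are the ones from the exception message above.

```java
import java.nio.ByteBuffer;

// Hypothetical demo class, not PDFBox code.
public class BufferLimitDemo {

    /** Wraps only the first 'used' bytes of a larger backing array. */
    public static ByteBuffer wrapUsedPart(byte[] backing, int used) {
        // capacity() becomes backing.length, limit() becomes 0 + used
        return ByteBuffer.wrap(backing, 0, used);
    }

    /** Reports whether a seek to 'position' is legal, without moving the buffer. */
    public static boolean canSeekTo(ByteBuffer buffer, int position) {
        try {
            // duplicate() shares the content but has its own position and limit
            buffer.duplicate().position(position);
            return true;
        } catch (IllegalArgumentException e) {
            // thrown when position is negative or greater than limit()
            return false;
        }
    }

    public static void main(String[] args) {
        ByteBuffer buffer = wrapUsedPart(new byte[31556], 20960);
        System.out.println("capacity = " + buffer.capacity());
        System.out.println("limit    = " + buffer.limit());
        System.out.println("seek to limit ok:    " + canSeekTo(buffer, buffer.limit()));
        System.out.println("seek to capacity ok: " + canSeekTo(buffer, buffer.capacity()));
    }
}
```

Any size derived from capacity() can therefore lead to a seek past the readable region when the buffer wraps only part of its array, which matches the reported exception.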


Regards

David Klika




-
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org



Re: PDFBox 3.0.1 / RandomAccessReadBuffer bug?

2024-02-07 Thread Andreas Lehmkühler

Hi David,

thanks for the bug report. You are right and the proposed solution seems 
to be a valid fix.


I've created https://issues.apache.org/jira/browse/PDFBOX-5764 to handle it.

Andreas

Am 08.02.24 um 00:00 schrieb david.kl...@atlas.cz:

Hello

  


I think that this is not correct in some cases:

  


   public RandomAccessReadBuffer(ByteBuffer input) {

 chunkSize = input.capacity();

  


IMHO input.limit() should be used instead of input.capacity().

  


When it matters: I have a ByteArrayOutputStream to which a PDF document is
written. Later I want to open the PDF document using PDFBox again. If I use
part of the internal buffer directly (without copying), e.g.
ByteBuffer.wrap(bos.getInternalBuffer(), 0, bos.size()), I get an exception
like this:

  


java.lang.IllegalArgumentException: newPosition > limit: (31556 > 20960)
    at java.base/java.nio.Buffer.createPositionException(Buffer.java:352)
    at java.base/java.nio.Buffer.position(Buffer.java:327)
    at java.base/java.nio.ByteBuffer.position(ByteBuffer.java:1551)
    at java.base/java.nio.ByteBuffer.position(ByteBuffer.java:285)
    at org.apache.pdfbox.io.RandomAccessReadBuffer.seek(RandomAccessReadBuffer.java:187)
    at org.apache.pdfbox.pdfparser.COSParser.getStartxrefOffset(COSParser.java:506)
    at org.apache.pdfbox.pdfparser.COSParser.retrieveTrailer(COSParser.java:259)
    at org.apache.pdfbox.pdfparser.PDFParser.initialParse(PDFParser.java:107)
    at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:171)
    at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:136)
    at org.apache.pdfbox.Loader.loadPDF(Loader.java:466)
    at org.apache.pdfbox.Loader.loadPDF(Loader.java:369)

  


31556 is the buffer capacity, 20960 is its limit

  


I think the buffer should not be read beyond its limit.

  


Regards

David Klika




-
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org



Re: About https://issues.apache.org/jira/browse/PDFBOX-5704

2024-01-21 Thread Andreas Lehmkühler

I've fixed the issue in the trunk and the 3.0-branch.

The fix is limited to incorrect CID font definitions.

Let us know if there are any other cases and provide a sample.

Andreas

Am 21.01.24 um 11:23 schrieb Andreas Lehmkühler:
I had a look and found a solution based on the pdf.js implementation. 
I'm going to commit it once I've improved the code; for now it is still 
somewhat hacky.


And yes, PDFBOX-5704 is related to this proposal.

Please follow up on PDFBOX-5704

@Mike thanks for the valuable input

Andreas

Am 19.01.24 um 08:03 schrieb Andreas Lehmkühler:

Hi,

I'm not sure if both issues are similar. However, your proposal is an 
interesting idea and I guess it shouldn't be that hard to implement it.



Thanks for the input, I'm going to have a look.

Andreas

Am 19.01.24 um 04:49 schrieb Mike Li:

Hello team,

I recently ran into a problem where PDFBox cannot render Chinese text. 
The problem is very similar to 
https://issues.apache.org/jira/browse/PDFBOX-5704.


In this case, the attached PDF file embeds a CFF font file. The 
correct font type/subtype would be /CIDFontType0 and /CIDFontType0C, 
and the font should declare the /FontFile3 property. Instead, the file 
wrongly declares the subfont as TrueType, which makes PDFBox use the 
TTF parser on the font file stream, based on the declared type.


According to the spec, PDFBox does it right, but from a user's 
perspective this looks more like a "bug", since the file displays 
fine in the most widely used PDF readers (Adobe, Foxit, pdf.js etc.)


I have many years of working experience in PDF generation (iText, 
PDFBox, etc.), and I know that once a PDF is generated, it is 
considered correct as long as it displays correctly in Adobe Reader. 
If another program cannot display it correctly, that is considered 
a bug in the other program. It's not fair, but it's reality. 
Many low-quality PDF generation tools/libraries are still widely used.


pdf.js parses the font file first and prefers the font type found in 
the font file over the type declared in the font dictionary.

https://github.com/mozilla/pdf.js/blob/1cdbcfef821c7f6e81ea22fe68a8b815bca01c4e/src/core/fonts.js#L1052
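
To illustrate the idea of trusting the font program over the declared type, here is a rough sketch that guesses the font format from the first four bytes of the embedded stream. It is not the PDFBox fix and not a port of the pdf.js code; the class FontSniffer, the enum and the method are hypothetical, and the checks are only the well-known header signatures (sfnt version 0x00010000 or "true" for TrueType, "OTTO" for OpenType with CFF outlines, 0x80 0x01 or "%!" for Type 1, and a bare CFF header with major version 1 and an offSize between 1 and 4).

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of PDFBox: header-based font format guess.
public class FontSniffer {

    public enum Kind { TRUETYPE, OPENTYPE_CFF, BARE_CFF, TYPE1, UNKNOWN }

    public static Kind sniff(byte[] data) {
        if (data == null || data.length < 4) {
            return Kind.UNKNOWN;
        }
        int b0 = data[0] & 0xFF;
        int b1 = data[1] & 0xFF;
        int b2 = data[2] & 0xFF;
        int b3 = data[3] & 0xFF;
        String tag = new String(data, 0, 4, StandardCharsets.ISO_8859_1);

        if (tag.equals("OTTO")) {
            return Kind.OPENTYPE_CFF;   // sfnt wrapper around CFF outlines
        }
        if ((b0 == 0 && b1 == 1 && b2 == 0 && b3 == 0) || tag.equals("true")) {
            return Kind.TRUETYPE;       // sfnt with TrueType outlines
        }
        if ((b0 == 0x80 && b1 == 0x01) || tag.startsWith("%!")) {
            return Kind.TYPE1;          // PFB segment marker or PostScript header
        }
        // Bare CFF header: major, minor, hdrSize, offSize (one byte each).
        if (b0 == 1 && b1 == 0 && b2 >= 4 && b3 >= 1 && b3 <= 4) {
            return Kind.BARE_CFF;
        }
        return Kind.UNKNOWN;
    }
}
```

The bare CFF check is deliberately the last one because it is the weakest signature; a caller would presumably fall back to the declared type when the result is UNKNOWN.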

So my question is: "Is it possible for PDFBox to provide some font 
processing workaround logic to handle such cases?"


Thanks
Mike




-
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org



Re: About https://issues.apache.org/jira/browse/PDFBOX-5704

2024-01-21 Thread Andreas Lehmkühler
I had a look and found a solution based on the pdf.js implementation. 
I'm going to commit it once I've improved the code; for now it is still 
somewhat hacky.


And yes, PDFBOX-5704 is related to this proposal.

Please follow up on PDFBOX-5704

@Mike thanks for the valuable input

Andreas

Am 19.01.24 um 08:03 schrieb Andreas Lehmkühler:

Hi,

I'm not sure if both issues are similar. However, your proposal is an 
interesting idea and I guess it shouldn't be that hard to implement it.



Thanks for the input, I'm going to have a look.

Andreas

Am 19.01.24 um 04:49 schrieb Mike Li:

Hello team,

I recently ran into a problem where PDFBox cannot render Chinese text. 
The problem is very similar to 
https://issues.apache.org/jira/browse/PDFBOX-5704.


In this case, the attached PDF file embeds a CFF font file. The 
correct font type/subtype would be /CIDFontType0 and /CIDFontType0C, 
and the font should declare the /FontFile3 property. Instead, the file 
wrongly declares the subfont as TrueType, which makes PDFBox use the 
TTF parser on the font file stream, based on the declared type.


According to the spec, PDFBox does it right, but from a user's 
perspective this looks more like a "bug", since the file displays 
fine in the most widely used PDF readers (Adobe, Foxit, pdf.js etc.)


I have many years of working experience in PDF generation (iText, 
PDFBox, etc.), and I know that once a PDF is generated, it is 
considered correct as long as it displays correctly in Adobe Reader. 
If another program cannot display it correctly, that is considered 
a bug in the other program. It's not fair, but it's reality. 
Many low-quality PDF generation tools/libraries are still widely used.


pdf.js parses the font file first and prefers the font type found in 
the font file over the type declared in the font dictionary.

https://github.com/mozilla/pdf.js/blob/1cdbcfef821c7f6e81ea22fe68a8b815bca01c4e/src/core/fonts.js#L1052

So my question is: "Is it possible for PDFBox to provide some font 
processing workaround logic to handle such cases?"


Thanks
Mike




-
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org



Re: About https://issues.apache.org/jira/browse/PDFBOX-5704

2024-01-18 Thread Andreas Lehmkühler

Hi,

I'm not sure if both issues are similar. However, your proposal is an 
interesting idea and I guess it shouldn't be that hard to implement it.



Thanks for the input, I'm going to have a look.

Andreas

Am 19.01.24 um 04:49 schrieb Mike Li:

Hello team,

I recently ran into a problem where PDFBox cannot render Chinese text. The 
problem is very similar to https://issues.apache.org/jira/browse/PDFBOX-5704.

In this case, the attached PDF file embeds a CFF font file. The correct font 
type/subtype would be /CIDFontType0 and /CIDFontType0C, and the font should 
declare the /FontFile3 property. Instead, the file wrongly declares the subfont 
as TrueType, which makes PDFBox use the TTF parser on the font file stream, 
based on the declared type.

According to the spec, PDFBox does it right, but from a user's perspective this 
looks more like a "bug", since the file displays fine in the most widely used 
PDF readers (Adobe, Foxit, pdf.js etc.)

I have many years of working experience in PDF generation (iText, PDFBox, 
etc.), and I know that once a PDF is generated, it is considered correct as 
long as it displays correctly in Adobe Reader. If another program cannot 
display it correctly, that is considered a bug in the other program. It's 
not fair, but it's reality. Many low-quality PDF generation tools/libraries are 
still widely used.

pdf.js parses the font file first and prefers the font type found in the font 
file over the type declared in the font dictionary.
https://github.com/mozilla/pdf.js/blob/1cdbcfef821c7f6e81ea22fe68a8b815bca01c4e/src/core/fonts.js#L1052

So my question is: "Is it possible for PDFBox to provide some font processing 
workaround logic to handle such cases?"

Thanks
Mike




-
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org



Re: Apache PDFBox Board Report January 2024 due

2024-01-09 Thread Andreas Lehmkühler

I've submitted the report as is.

Thanks for the reviews.

Andreas

Am 08.01.24 um 08:14 schrieb Andreas Lehmkühler:

Hi,

find attached a quick draft of the board report we're expected to submit 
this month. It's based upon the report wizard template which can be 
found at [1]


Any comments or additions are appreciated ...


## Description:
The mission of PDFBox is the creation and maintenance of software 
related to Java library for working with PDF documents

## Project Status:
Current project status: ongoing with moderate activity
Issues for the board: none

## Membership Data:
Apache PDFBox was founded 2009-10-21 (14 years ago)
There are currently 21 committers and 21 PMC members in this project.
The Committer-to-PMC ratio is 1:1.

Community changes, past quarter:
- No new PMC members. Last addition was Matthäus Mayer on 2017-10-16.
- No new committers. Last addition was Joerg O. Henne on 2017-10-09.

## Project Activity:
Recent releases:

     3.0.1 was released on 2023-11-30.
     2.0.30 was released on 2023-11-04.
     3.0.0 was released on 2023-08-17.

## Community Health:
- there is a steady stream of contributions, bug reports and questions 
on the mailing lists
- we released the first minor release of our new 3.0.x line to fix some 
regression issues. A couple of improvements and further fixes were 
included as well.
- the development of the current trunk version 4.0.0 is an ongoing 
effort, e.g. we switched to Log4j2 and did some major refactorings



-
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org



Apache PDFBox Board Report January 2024 due

2024-01-07 Thread Andreas Lehmkühler

Hi,

find attached a quick draft of the board report we're expected to submit 
this month. It's based upon the report wizard template which can be 
found at [1]


Any comments or additions are appreciated ...


## Description:
The mission of PDFBox is the creation and maintenance of software 
related to Java library for working with PDF documents

## Project Status:
Current project status: ongoing with moderate activity
Issues for the board: none

## Membership Data:
Apache PDFBox was founded 2009-10-21 (14 years ago)
There are currently 21 committers and 21 PMC members in this project.
The Committer-to-PMC ratio is 1:1.

Community changes, past quarter:
- No new PMC members. Last addition was Matthäus Mayer on 2017-10-16.
- No new committers. Last addition was Joerg O. Henne on 2017-10-09.

## Project Activity:
Recent releases:

3.0.1 was released on 2023-11-30.
2.0.30 was released on 2023-11-04.
3.0.0 was released on 2023-08-17.

## Community Health:
- there is a steady stream of contributions, bug reports and questions 
on the mailing lists
- we released the first minor release of our new 3.0.x line to fix some 
regression issues. A couple of improvements and further fixes were 
included as well.
- the development of the current trunk version 4.0.0 is an ongoing 
effort, e.g. we switched to Log4j2 and did some major refactorings



-
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org



[ANNOUNCE] Apache PDFBox 3.0.1 released

2023-11-30 Thread Andreas Lehmkühler

The Apache PDFBox community is pleased to announce the release of
Apache PDFBox version 3.0.1. The release is available for download at:

https://pdfbox.apache.org/download.html

See the full release notes below for details about this release.

Release Notes -- Apache PDFBox -- Version 3.0.1

Introduction


The Apache PDFBox library is an open source Java tool for working with 
PDF documents.


This is an incremental bugfix release based on the earlier 3.0.0 
release. It contains a couple of fixes and small improvements.


A migration guide is available at 
https://pdfbox.apache.org/3.0/migration.html. It is still a work in 
progress and we are happy to include any valuable feedback from our 
community.


For more details on these changes and all the other fixes and 
improvements included in this release, please refer to the following 
issues on the PDFBox issue tracker at 
https://issues.apache.org/jira/browse/PDFBOX.


Sub-task
[PDFBOX-5663] - Implement "about" dialog

Bug
[PDFBOX-5350] - Regression unicode mapping in Korean document
[PDFBOX-5649] - NPE in DomXmpParser.parseLiDescription
[PDFBOX-5654] - Avoid NPE when processing CFF2 based fonts
[PDFBOX-5658] - IllegalArgumentException: Dimensions (width=458477041 height=26) are too large
[PDFBOX-5662] - Can not see checkbox check
[PDFBOX-5665] - NPE when converting pdf to image.
[PDFBOX-5666] - error encountered in splitting pdf using ver 3.0.0
[PDFBOX-5668] - NullPointerException in XMPMetadata.getSchema()
[PDFBOX-5672] - PDFToImage might not correctly detect unsupported image formats
[PDFBOX-5673] - Refactor Stream operations and operations on collections
[PDFBOX-5681] - ConcurrentModificationException in getObjectsByType() in 3.x
[PDFBOX-5682] - Long/permanent hang in PDFBox 3.x
[PDFBOX-5684] - Font cache isn't effective on my machine, always rebuilds
[PDFBOX-5687] - PDFBox 3.0 OSGi bundle requires sun.java2d.cmm.kcms package
[PDFBOX-5689] - Many new warnings "newGlyph ... newValue: ... is trying to override the oldValue" after upgrade to V3.0.0
[PDFBOX-5694] - PDF to Image conversion results in different converted image
[PDFBOX-5696] - COSStream lost, becomes a COSDictionary
[PDFBOX-5702] - Text in a certain font is lost when converting pdf to image
[PDFBOX-5706] - Incorrect colors in image from PDFs (DCTDecode)
[PDFBOX-5707] - Avoid NPE when accessing the elements of a COSArray
[PDFBOX-5712] - Stackoverflow in split
[PDFBOX-5713] - PfbParser fails to parse PFB font with multiple binary records.
[PDFBOX-5718] - java.lang.IllegalArgumentException: Provided dictionary is not of type 'COSName{OCG}'


New Feature

[PDFBOX-5670] - Allow repeatable subcommands in the command line tools
[PDFBOX-5683] - Inconsistent/incomplete PDF rendering

Improvement

[PDFBOX-4892] - Improve code quality (4)
[PDFBOX-5664] - 3.0.0: PDFCloneUtility needs a protected constructor to be useable outside of PDFBox when using Java 9 JPMS
[PDFBOX-5685] - Reduce number of copies to lower memory footprint
[PDFBOX-5693] - Consolidate bouncycastle configuration
[PDFBOX-5699] - Consistent scm.url values for pom.xml
[PDFBOX-5703] - use comparison operators for enums
[PDFBOX-5705] - update log4j dependency to 2.21.0
[PDFBOX-5711] - Loader: add support for java.nio.file.Path

Test

[PDFBOX-5667] - Can't create test for ExtractText command line tool

Release Contents


This release consists of a single source archive packaged as a zip file.
The archive can be unpacked with the jar tool from your JDK installation.
See the README.txt file for instructions on how to build this release.

The source archive is accompanied by SHA512 checksums and a PGP signature
that you can use to verify the authenticity of your download.
The public key used for the PGP signature can be found at
https://www.apache.org/dist/pdfbox/KEYS.

About Apache PDFBox
---

Apache PDFBox is an open source Java library for working with PDF documents.
This project allows creation of new PDF documents, manipulation of existing
documents and the ability to extract content from documents. Apache PDFBox
also includes several command line utilities. Apache PDFBox is published
under the Apache License, Version 2.0.

For more information, visit https://pdfbox.apache.org/

About The Apache Software Foundation


Established in 1999, The Apache Software Foundation provides organizational,
legal, and financial support for more than 100 freely-available,
collaboratively-developed Open Source projects. The pragmatic Apache License
enables individual and commercial users to easily deploy Apache software;
the Foundation's intellectual property framework limits the legal exposure
of its 2,500+ contributors.

For more information, visit https://www.apache.org/

-
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org



[RESULT][VOTE] Release Apache PDFBox 3.0.1

2023-11-30 Thread Andreas Lehmkühler



Am 27.11.23 um 17:46 schrieb Andreas Lehmkühler:

Please vote on releasing this package as Apache PDFBox 3.0.1.


   +1 Tilman Hausherr
   +1 Maruan Sahyoun
   +1 Timo Boehme
   +1 Tim Allison
   +1 Andreas Lehmkühler

Thanks for your support and help!! I'm going to push the release out.

Andreas

-
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org



[VOTE] Release Apache PDFBox 3.0.1

2023-11-27 Thread Andreas Lehmkühler

Hi,

a candidate for the PDFBox 3.0.1 release is available at:

https://dist.apache.org/repos/dist/dev/pdfbox/3.0.1/

The release candidate is a zip archive of the sources in:

https://svn.apache.org/repos/asf/pdfbox/tags/3.0.1/

The SHA-512 checksum of the archive is 
8ca8f3297ec04efaa23ab6d9ca421c1b39d8fb2de392e0f7b5aa6e7053eac75066e8b2872dc6b6847a0194b557aa8570de7f1d1a122fcf3888bf9ed21eae0257.


Please vote on releasing this package as Apache PDFBox 3.0.1.
The vote is open for the next 72 hours and passes if a majority of at
least three +1 PDFBox PMC votes are cast.

[ ] +1 Release this package as Apache PDFBox 3.0.1
[ ] -1 Do not release this package because...

Here is my +1

Andreas

-
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org



Re: PDFBox 3.0.1 release?

2023-11-26 Thread Andreas Lehmkühler




Am 26.11.23 um 09:45 schrieb Tilman Hausherr:

I looked at some suspicious differences:

N76NZUPHNGNM6TCEHWSLSDA5UKNH5C7D.pdf page 20: this one got better 
(akunavām instead of akunav m)

172096.pdf page 5: also better (bullet points)

poppler-11994-0.pdf: I couldn't reproduce the OOM. Maybe it's temporary, 
maybe it's a tika bug.

I wasn't able to reproduce the OOM either





I'm going to cut the release tomorrow

@Tilman thanks for running the tests again

Andreas



Tilman



On 26.11.2023 04:41, Tilman Hausherr wrote:

Done:

https://home.snafu.de/tilman/tmp/reports_pdfbox_3.0.0_vs_3.0.1.tar.xz

Tilman

On 22.11.2023 08:08, Andreas Lehmkühler wrote:

Hi,

after fixing the latest regressions I'd like to cut the 3.0.1 release 
next Monday/Tuesday.


WDYT?

@Tim, @Tilman do you have the time to run the extraction tests 3.0.0 
vs 3.0.1 ?


Andreas


-
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org




PDFBox 3.0.1 release?

2023-11-21 Thread Andreas Lehmkühler

Hi,

after fixing the latest regressions I'd like to cut the 3.0.1 release 
next Monday/Tuesday.


WDYT?

@Tim, @Tilman do you have the time to run the extraction tests 3.0.0 vs 
3.0.1 ?


Andreas


-
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org



Re: Converting obfuscate PDFs to PS and back to PDF

2023-11-07 Thread Andreas Lehmkühler




Am 03.11.23 um 17:24 schrieb Tilman Hausherr:

https://www.danisch.de/blog/2023/10/31/aktennotiz-zu-pdftotext-bei-vermurksten-zeichensaetzen/

The text is in German, but he says that he was able to extract text 
from obfuscated PDFs by converting them to PostScript and then back to 
PDF. I didn't test this myself, but I suspect that the conversion to 
PostScript dumps the /ToUnicode stream, and that it is rebuilt from the 
font itself when the conversion back to PDF is done.

The information has to be somewhere, otherwise such a "conversion" won't work.

@Tilman did you try to contact the author to ask for an example?

Andreas


Tilman


-
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org



[ANNOUNCE] Apache PDFBox 2.0.30 released

2023-11-05 Thread Andreas Lehmkühler

The Apache PDFBox community is pleased to announce the release of
Apache PDFBox version 2.0.30. The release is available for download at:

https://pdfbox.apache.org/download.html

See the full release notes below for details about this release.

Release Notes -- Apache PDFBox -- Version 2.0.30

Introduction


The Apache PDFBox library is an open source Java tool for working with 
PDF documents.


This is an incremental bugfix release based on the earlier 2.0.29 
release. It contains a couple of fixes and small improvements.

For more details on these changes and all the other fixes and improvements
included in this release, please refer to the following issues on the
PDFBox issue tracker at https://issues.apache.org/jira/browse/PDFBOX.

Bug

[PDFBOX-5350] - Regression unicode mapping in Korean document
[PDFBOX-5359] - Operators "q" and "Q" should also preserve text matrices
[PDFBOX-5623] - Signature Image not Rendered starting with PDFBox 2.0.23 + patch provided
[PDFBOX-5627] - Fonts are not subsetted when saving incrementally
[PDFBOX-5628] - Bug in PDFMergerUtility#mergeFields
[PDFBOX-5639] - Password protected PDF opens in GUI apps but PDFbox says invalid password
[PDFBOX-5642] - Wrong error message "2.4.1 : Invalid Color space, The operator "rg" can't be used with CMYK Profile"
[PDFBOX-5644] - Make FDF annotations more compliant with the specification
[PDFBOX-5649] - NPE in DomXmpParser.parseLiDescription
[PDFBOX-5651] - Regression: NoSuchElementException in PDFXrefStreamParser
[PDFBOX-5653] - The PageDrawer.strokePath method is blocked, and cpu100%
[PDFBOX-5654] - Avoid NPE when processing CFF2 based fonts
[PDFBOX-5658] - IllegalArgumentException: Dimensions (width=458477041 height=26) are too large
[PDFBOX-5662] - Can not see checkbox check
[PDFBOX-5665] - NPE when converting pdf to image.
[PDFBOX-5668] - NullPointerException in XMPMetadata.getSchema()
[PDFBOX-5672] - PDFToImage might not correctly detect unsupported image formats
[PDFBOX-5684] - Font cache isn't effective on my machine, always rebuilds
[PDFBOX-5694] - PDF to Image conversion results in different converted image
[PDFBOX-5702] - Text in a certain font is lost when converting pdf to image
[PDFBOX-5706] - Incorrect colors in image from PDFs (DCTDecode)

New Feature

[PDFBOX-5683] - Inconsistent/incomplete PDF rendering

Improvement

[PDFBOX-4892] - Improve code quality (4)
[PDFBOX-5630] - Add PDRectangle#TABLOID paper size
[PDFBOX-5631] - Support version 0.5 of MaximumProfileTable
[PDFBOX-5632] - loca-table isn't mandatory for TTF/OTF-fonts using CFF outlines
[PDFBOX-5636] - Implement PDF 2.0 dash phase clarification
[PDFBOX-5637] - Add getter and setter for the CO array under PDAcroForm
[PDFBOX-5645] - Make UTC timezone static
[PDFBOX-5650] - Facilitate migration to PDFBox 3.0
[PDFBOX-5693] - Consolidate bouncycastle configuration
[PDFBOX-5699] - Consistent scm.url values for pom.xml
[PDFBOX-5703] - use comparison operators for enums

Release Contents


This release consists of a single source archive packaged as a zip file.
The archive can be unpacked with the jar tool from your JDK installation.
See the README.txt file for instructions on how to build this release.

The source archive is accompanied by a SHA512 checksum and a PGP signature
that you can use to verify the authenticity of your download.
The public key used for the PGP signature can be found at
https://www.apache.org/dist/pdfbox/KEYS.

About Apache PDFBox
---

Apache PDFBox is an open source Java library for working with PDF documents.
This project allows creation of new PDF documents, manipulation of existing
documents and the ability to extract content from documents. Apache PDFBox
also includes several command line utilities. Apache PDFBox is published
under the Apache License, Version 2.0.

For more information, visit https://pdfbox.apache.org/

About The Apache Software Foundation


Established in 1999, The Apache Software Foundation provides organizational,
legal, and financial support for more than 100 freely-available,
collaboratively-developed Open Source projects. The pragmatic Apache License
enables individual and commercial users to easily deploy Apache software;
the Foundation's intellectual property framework limits the legal exposure
of its 2,500+ contributors.

For more information, visit https://www.apache.org/

-
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org



[RESULT][VOTE] Release Apache PDFBox 2.0.30

2023-11-05 Thread Andreas Lehmkühler



Am 01.11.23 um 20:23 schrieb Andreas Lehmkühler:

Please vote on releasing this package as Apache PDFBox 2.0.30.


   +1 Tilman Hausherr
   +1 Maruan Sahyoun
   +1 Andreas Lehmkühler

Thanks for your support and help!! I'm going to push the release out.

Andreas

-
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org



Re: [VOTE] Release Apache PDFBox 2.0.30

2023-11-04 Thread Andreas Lehmkühler

Hi,

just a friendly reminder, the vote ends in about 12 hours from now.

Andreas

Am 01.11.23 um 20:23 schrieb Andreas Lehmkühler:

Hi,

a candidate for the PDFBox 2.0.30 release is available at:

     https://dist.apache.org/repos/dist/dev/pdfbox/2.0.30/

The release candidate is a zip archive of the sources in:

     https://svn.apache.org/repos/asf/pdfbox/tags/2.0.30/

The SHA-512 checksum of the archive is 
c1e66695af16396f6a36d02972270651a4630b36799e1fe13262c5748b18cfcbb46829c847ab4993832018f5f8a0546eb468cafdb36019314e275351569d52cc.


Please vote on releasing this package as Apache PDFBox 2.0.30.
The vote is open for the next 72 hours and passes if a majority of at
least three +1 PDFBox PMC votes are cast.

     [ ] +1 Release this package as Apache PDFBox 2.0.30
     [ ] -1 Do not release this package because...


Here is my +1

Andreas

-
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org






Re: 2.0.30 build issues

2023-11-01 Thread Andreas Lehmkühler
I did it. I fixed most of the issues, so I was able to cut the 
2.0.30 release. One issue is left: the antrun plugin still 
doesn't run automatically, so I had to trigger it manually at the 
end of the process.


I'm going to investigate that one later; for today I'm done. I'll port 
those changes to the 3.0 branch and the trunk version as well, but not 
today ...


Sorry for the svn noise.

Andreas

On 01.11.23 at 14:32, Andreas Lehmkühler wrote:
I've fixed an issue due to a major update of the maven-antrun-plugin and 
ran into the next one :-(


Stay tuned ...

On 01.11.23 at 13:32, Andreas Lehmkühler wrote:

Hi,

I've solved the build issue and restarted the release process. It 
looks good ...


The changes from PDFBOX-5699 introduced the issue. Moving the scm 
definition to the parent pom was correct, but the maven release plugin 
stumbled upon the fact that we were holding the parent pom in its own 
subdirectory, so the tagging of the release failed.


I've fixed that by moving everything from the parent pom to the main 
pom in the root directory. Finally the prepare step of the release works.


But it looks like the second step doesn't :-(

I'll have a look ...

On 30.10.23 at 20:02, Andreas Lehmkühler wrote:

Hi,

I'm experiencing some issues with the release build for 2.0.30. I 
have an idea on how to fix it, but it will take some time, so 
I'm going to postpone the release for a couple of days.


Sorry for the svn noise.

Cheers
Andreas

-
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org






[VOTE] Release Apache PDFBox 2.0.30

2023-11-01 Thread Andreas Lehmkühler

Hi,

a candidate for the PDFBox 2.0.30 release is available at:

https://dist.apache.org/repos/dist/dev/pdfbox/2.0.30/

The release candidate is a zip archive of the sources in:

https://svn.apache.org/repos/asf/pdfbox/tags/2.0.30/

The SHA-512 checksum of the archive is 
c1e66695af16396f6a36d02972270651a4630b36799e1fe13262c5748b18cfcbb46829c847ab4993832018f5f8a0546eb468cafdb36019314e275351569d52cc.


Please vote on releasing this package as Apache PDFBox 2.0.30.
The vote is open for the next 72 hours and passes if a majority of at
least three +1 PDFBox PMC votes are cast.

[ ] +1 Release this package as Apache PDFBox 2.0.30
[ ] -1 Do not release this package because...


Here is my +1

Andreas

-
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org



Re: 2.0.30 build issues

2023-11-01 Thread Andreas Lehmkühler
I've fixed an issue due to a major update of the maven-antrun-plugin and 
ran into the next one :-(


Stay tuned ...

On 01.11.23 at 13:32, Andreas Lehmkühler wrote:

Hi,

I've solved the build issue and restarted the release process. It looks 
good ...


The changes from PDFBOX-5699 introduced the issue. Moving the scm 
definition to the parent pom was correct, but the maven release plugin 
stumbled upon the fact that we were holding the parent pom in its own 
subdirectory, so the tagging of the release failed.


I've fixed that by moving everything from the parent pom to the main pom 
in the root directory. Finally the prepare step of the release works.


But it looks like the second step doesn't :-(

I'll have a look ...

On 30.10.23 at 20:02, Andreas Lehmkühler wrote:

Hi,

I'm experiencing some issues with the release build for 2.0.30. I 
have an idea on how to fix it, but it will take some time, so I'm 
going to postpone the release for a couple of days.


Sorry for the svn noise.

Cheers
Andreas

-
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org






Re: 2.0.30 build issues

2023-11-01 Thread Andreas Lehmkühler

Hi,

I've solved the build issue and restarted the release process. It looks 
good ...


The changes from PDFBOX-5699 introduced the issue. Moving the scm 
definition to the parent pom was correct, but the maven release plugin 
stumbled upon the fact that we were holding the parent pom in its own 
subdirectory, so the tagging of the release failed.


I've fixed that by moving everything from the parent pom to the main pom 
in the root directory. Finally the prepare step of the release works.


But it looks like the second step doesn't :-(

I'll have a look ...

On 30.10.23 at 20:02, Andreas Lehmkühler wrote:

Hi,

I'm experiencing some issues with the release build for 2.0.30. I have 
an idea on how to fix it, but it will take some time, so I'm going 
to postpone the release for a couple of days.


Sorry for the svn noise.

Cheers
Andreas

-
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org






2.0.30 build issues

2023-10-30 Thread Andreas Lehmkühler

Hi,

I'm experiencing some issues with the release build for 2.0.30. I have 
an idea on how to fix it, but it will take some time, so I'm going 
to postpone the release for a couple of days.


Sorry for the svn noise.

Cheers
Andreas

-
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org



Re: PDFBox 2.0.30/3.0.1 release?

2023-10-29 Thread Andreas Lehmkühler




On 28.10.23 at 18:10, Tilman Hausherr wrote:

It's really just the name - I tested 2.0.29 against 2.0.30.SNAPSHOT.
Thanks for the confirmation. I can't see any issue in the results, so 
I'm planning to cut the 2.0.30 release tomorrow.



Btw this new SO question looks like a bug to me:
https://stackoverflow.com/questions/77376559/pdfbox-version-3-0-0-splitter-class-nullpointerexception-at-org-apache-pdfbox-co
Yes, that is an issue, but it is limited to the 3.0 branch and the trunk 
version. I've created https://issues.apache.org/jira/browse/PDFBOX-5707 
to deal with it.


Andreas



Tilman


On 28.10.2023 17:02, Andreas Lehmkühler wrote:



On 28.10.23 at 13:12, Tilman Hausherr wrote:

https://home.snafu.de/tilman/tmp/reports_pdfbox_2.0.29_vs_3.0.0.tar.xz
Thanks for running the test, but I'm a little bit puzzled about the 
file name. According to the stacktraces in A and B I guess you've 
compared 2.0.29 and 2.0.30 and not 2.0.29 and 3.0.0?






Tilman

On 23.10.2023 19:11, Tilman Hausherr wrote:

+1 for both.

I can do a regression test for 2.0.29 / 2.0.30. Not today, but 
hopefully I'll start by Saturday.


I don't expect any surprises because I did a regression test not 
long ago in connection with the extraction of Korean documents.


Tilman

On 22.10.2023 19:37, Andreas Lehmkühler wrote:

Hi,

I'd like to cut the 2.0.30 release in a week from now, on Monday or 
Tuesday.


A week later I'd like to go for the first 3.0 bugfix release 3.0.1

WDYT?

@Tim, @Tilman do you have the time to run the extraction tests?

Andreas 





-
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org






Re: PDFBox 2.0.30/3.0.1 release?

2023-10-28 Thread Andreas Lehmkühler




On 28.10.23 at 13:12, Tilman Hausherr wrote:

https://home.snafu.de/tilman/tmp/reports_pdfbox_2.0.29_vs_3.0.0.tar.xz
Thanks for running the test, but I'm a little bit puzzled about the file 
name. According to the stacktraces in A and B I guess you've compared 
2.0.29 and 2.0.30 and not 2.0.29 and 3.0.0?






Tilman

On 23.10.2023 19:11, Tilman Hausherr wrote:

+1 for both.

I can do a regression test for 2.0.29 / 2.0.30. Not today, but 
hopefully I'll start by Saturday.


I don't expect any surprises because I did a regression test not long 
ago in connection with the extraction of Korean documents.


Tilman

On 22.10.2023 19:37, Andreas Lehmkühler wrote:

Hi,

I'd like to cut the 2.0.30 release in a week from now, on Monday or 
Tuesday.


A week later I'd like to go for the first 3.0 bugfix release 3.0.1

WDYT?

@Tim, @Tilman do you have the time to run the extraction tests?

Andreas


-
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org







Re: PDFBox 2.0.30/3.0.1 release?

2023-10-25 Thread Andreas Lehmkühler




On 23.10.23 at 19:11, Tilman Hausherr wrote:

+1 for both.

I can do a regression test for 2.0.29 / 2.0.30. Not today, but 
hopefully I'll start by Saturday.

I'm not in a hurry, take your time.


I don't expect any surprises because I did a regression test not long 
ago in connection with the extraction of Korean documents.


Tilman

On 22.10.2023 19:37, Andreas Lehmkühler wrote:

Hi,

I'd like to cut the 2.0.30 release in a week from now, on Monday or 
Tuesday.


A week later I'd like to go for the first 3.0 bugfix release 3.0.1

WDYT?

@Tim, @Tilman do you have the time to run the extraction tests?

Andreas


-
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org







PDFBox 2.0.30/3.0.1 release?

2023-10-22 Thread Andreas Lehmkühler

Hi,

I'd like to cut the 2.0.30 release in a week from now, on Monday or Tuesday.

A week later I'd like to go for the first 3.0 bugfix release 3.0.1

WDYT?

@Tim, @Tilman do you have the time to run the extraction tests?

Andreas


-
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org



Re: log4j2 revert

2023-10-21 Thread Andreas Lehmkühler

@Axel thanks for the patch @Tilman thanks for applying it.

Be aware that I changed the builds for the trunk and for 3.0 so that the 
absence of the mentioned log no longer breaks the build. There should be 
a warning instead if the log is missing.


Andreas

On 21.10.23 at 14:13, Tilman Hausherr wrote:
Thanks, I'm testing it and will commit after that. Nice find. I'm 
wondering why this worked for so many years.


Tilman

On 21.10.2023 08:02, axh wrote:

Hi,

I opened PDFBOX-5705 
<https://issues.apache.org/jira/browse/PDFBOX-5705> for this and 
created a patch. Everything seems to work, but please verify in 
Jenkins. Sorry that Jira somehow changed priority to „important“ 
without me noticing. My internet connection is somewhat unreliable 
today and every click in Jira takes several tries to get through so I 
will just leave it at that if it’s ok for you.


Axel


On 20.10.2023 at 09:52, axh wrote:

Ah, this is weird. Log4J2 currently isn’t used at all in the PDFBox 
code base. But when running tests and examples, log4j-core and 
log4j-jcl are included to reroute commons logging to log4j2 which is 
then used to set the output format and create the log file you 
mentioned. It seems that with the updated version, the commons 
logging to log4j bridge simply isn’t loaded anymore.


This is also something that will get better once we switch directly 
to log4j.


I’ll keep you updated when I find out why log4j-jcl isn’t loaded.

Axel


On 20.10.2023 at 08:11, Tilman Hausherr wrote:

Yes, although the log file isn't part of the distribution (or is 
it?) I wondered why it wasn't there. And then I noticed that the 
logging didn't work anymore, i.e. the typical output format wasn't 
there in the console. And not in the file either. And the same 
happened at work with another software of mine.


@Axel the "Files differ" lines are not a problem, this always 
happens. I check these manually or with a modified code and my own 
"expected" files.


Tilman

On 20.10.2023 07:55, Andreas Lehmkühler wrote:


On 20.10.23 at 07:17, axh wrote:
Hm… I just did a clean checkout of trunk and did mvn clean verify 
and everything passes, both with log4j2.version set to 2.20.0 and 
2.21.0. I can however see file differences reported in the log 
like this:
The build itself works fine after the update. The Jenkins build 
adds another step to the end which fails. An expected log file is 
missing:


ERROR: Step ‘Archive the artifacts’ failed: No artifacts found that 
match the file pattern "pdfbox/target/pdfbox.log". Configuration 
error?



See [1] for further details

[1] 
https://ci-builds.apache.org/job/PDFBox/job/PDFBox-trunk/1823/console



[INFO] Running org.apache.pdfbox.pdmodel.interactive.form.PDAcroFormTest
Files differ: /Users/axelhowind/IdeaProjects/pdfbox/pdfbox/src/test/resources/org/apache/pdfbox/pdmodel/interactive/form/MultilineFields.pdf-1.png
/Users/axelhowind/IdeaProjects/pdfbox/pdfbox/target/test-output/MultilineFields.pdf-1.png
Rendering of target/test-output/MultilineFields.pdf failed or is not identical to expected rendering in src/test/resources/org/apache/pdfbox/pdmodel/interactive/form directory
[INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.027 s -- in org.apache.pdfbox.pdmodel.interactive.form.MultilineFieldsTest


But these are not reported as test failures. In the test code, I 
can see that this is by design:


// compare rendering
if (!TestPDFToImage.doTestFile(pdf, IN_DIR.getAbsolutePath(), OUT_DIR.getAbsolutePath()))
{
    // don't fail, rendering is different on different systems, result must be viewed manually
    System.err.println("Rendering of " + pdf + " failed or is not identical to expected rendering in " + IN_DIR + " directory");
}
What exactly does "it no longer works" mean? Is it related to the 
above, or is it the build failures reported by Jenkins on the list?


Axel


On 20.10.2023 at 06:50, axh wrote:

Hi,

just saw your message here. As I just started on replacing 
commons-logging by log4j, I will also look into this. I also 
overlooked that there’s already a property for the log4j version. 
Will update the patch I just submitted and then see if I can find 
out what’s causing the test failure with 2.21.0.


Axel

On 19.10.2023 at 19:06, Tilman Hausherr wrote:


I have reverted the change to the log4j2 version. It no longer 
works. I'll wait a bit to see if there is an issue about it, there was 
nothing on the mailing list today.


Tilman


-
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org







---

Re: log4j2 revert

2023-10-19 Thread Andreas Lehmkühler




On 20.10.23 at 07:17, axh wrote:

Hm… I just did a clean checkout of trunk and did mvn clean verify and 
everything passes, both with log4j2.version set to 2.20.0 and 2.21.0. I can 
however see file differences reported in the log like this:
The build itself works fine after the update. The Jenkins build adds 
another step to the end which fails. An expected log file is missing:


ERROR: Step ‘Archive the artifacts’ failed: No artifacts found that 
match the file pattern "pdfbox/target/pdfbox.log". Configuration error?



See [1] for further details

[1] https://ci-builds.apache.org/job/PDFBox/job/PDFBox-trunk/1823/console




[INFO] Running org.apache.pdfbox.pdmodel.interactive.form.PDAcroFormTest
Files differ: /Users/axelhowind/IdeaProjects/pdfbox/pdfbox/src/test/resources/org/apache/pdfbox/pdmodel/interactive/form/MultilineFields.pdf-1.png
/Users/axelhowind/IdeaProjects/pdfbox/pdfbox/target/test-output/MultilineFields.pdf-1.png
Rendering of target/test-output/MultilineFields.pdf failed or is not identical to expected rendering in src/test/resources/org/apache/pdfbox/pdmodel/interactive/form directory
[INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.027 s -- in org.apache.pdfbox.pdmodel.interactive.form.MultilineFieldsTest

But these are not reported as test failures. In the test code, I can see that 
this is by design:

// compare rendering
if (!TestPDFToImage.doTestFile(pdf, IN_DIR.getAbsolutePath(), OUT_DIR.getAbsolutePath()))
{
    // don't fail, rendering is different on different systems, result must be viewed manually
    System.err.println("Rendering of " + pdf + " failed or is not identical to expected rendering in " + IN_DIR + " directory");
}
What exactly does "it no longer works" mean? Is it related to the above, or is 
it the build failures reported by Jenkins on the list?

Axel


On 20.10.2023 at 06:50, axh wrote:

Hi,

just saw your message here. As I just started on replacing commons-logging by 
log4j, I will also look into this. I also overlooked that there’s already a 
property for the log4j version. Will update the patch I just submitted and then 
see if I can find out what’s causing the test failure with 2.21.0.

Axel


On 19.10.2023 at 19:06, Tilman Hausherr wrote:

I have reverted the change to the log4j2 version. It no longer works. I'll wait 
a bit to see if there is an issue about it, there was nothing on the mailing list 
today.

Tilman


-
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org







Re: PDFBox 4.0 and development plans

2023-10-19 Thread Andreas Lehmkühler



@Maruan, thanks for starting this initiative :-)



On 11.10.23 at 07:53, sahy...@fileaffairs.de wrote:

Dear colleagues,

with 3.0 being released and 4.0 being started I'd like to start
discussing what the major plans are for 4.0. And maybe in a way that
the release can be made faster than what we had for 3.0. (maybe size it
in a way that we can do the dev stuff by spring 2024 and then release
in summer 2024 followed by a 4.1 release to add to that instead of
doing a big bang like 3.0)

Sounds good to me.



Shall we share some ideas via the mailing list or start a page on our
website (I think ml is easier to do). We can still document the major
initiatives as soon as we have agreed in a blog post.
I agree, we need some sort of plan for the next version to avoid another 
big bang release. It doesn't have to be that formal, but we should agree on 
the bigger changes to be added to the next major release.



Here are my current thoughts (some of which might also be backported to
3.0) in no particular order

- appearance stream handlers for interactive form widgets (similar to
what we have for annotations) also allowing one to add their own
handler
- replacement or at least new base for XMPBox (current thought is to
have a new base parser and add if possible XMPBox current end user api
on top - might be able to reuse xmlgraphics XMP lib). Would allow to
better deal with XMPs which are not standard and make it easier to add
to existing XMPs low level.
IMHO XMP-support is not essential but optional so that it is a good idea 
to use some existing lib instead of implementing our own one.



- then we had the discussion about an event handler/listener similar to
what fop provides so one can listen to corrections/repairs done under
the hood (I know that we can only lay the ground for that as this is a
major undertaking given all the places where we correct things)

That might be a big thing ...


- enhance the parsing to keep the information about incremental
versions (better debugging, trace of changes done ...)

I'm not sure which details might be important, but let us start a discussion


- review and add some more PDF 2.0 capabilities

In most cases this can be done in little steps


- better text formatting/language support (maybe by including fop parts
or looking into using HarfBuzz)
- I'd also like to discuss reaching out to fop to look at integrating
some of their font handling into fontbox

Good ideas as well 


...

That list is already long and I think would be too much given above
idea of release planning.

;-)


With regards to versioning I'd like to propose that we have 2.0 as LTS
and 4.x being the next LTS.
First of all, what is your definition of an LTS version? Of course it is a 
long-term version, but what counts as long, and when does such a version reach EOL?


Why did you choose 2.0 as LTS? 2.0.0 was released in 2016, doesn't that 
already qualify as LTS? 2.0 requires Java 6, a very old version.
Why not choose 3.0 as LTS? It requires Java 8, a more or less old 
version but still widely used and the last version before they started 
removing APIs. 3.0 is the last version including preflight.
We should discuss that in a separate thread, I just wanted to share my 
thoughts as a starter.





Thoughts
BR
Maruan



-
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org






Re: PDFBox 4.0 and development plans

2023-10-13 Thread Andreas Lehmkühler




On 13.10.23 at 04:40, axh wrote:

Hi,

I suggest to also revisit logging. Last week I opened an issue for that 
(PDFBOX-5695), but it seems everybody is tired of this subject and no one 
even looked at it. Nonetheless, please take a look. The last time a switch to 
a logging facade was proposed (and rejected) was 10 years ago. I think it is 
worth reconsidering, and a new major release would be the right time to make 
a change like that. More details in the issue.

Please don't give up too early on us. We are all volunteers with limited 
time and different priorities.



Whatever the project decides, I am willing to contribute the required patch(es).

We highly appreciate that.

I personally don't feel any pressure to switch the logging framework, but 
I see that it is long overdue to overhaul that part of PDFBox.


I tend to agree with Tilman and I'd like to use log4j2. I hope I'll find 
some time to comment on your proposal at the next weekend.



Andreas



Cheers,
Axel


On 11.10.2023 at 07:53, sahy...@fileaffairs.de wrote:

Dear colleagues,

with 3.0 being released and 4.0 being started I'd like to start
discussing what the major plans are for 4.0. And maybe in a way that
the release can be made faster than what we had for 3.0. (maybe size it
in a way that we can do the dev stuff by spring 2024 and then release
in summer 2024 followed by a 4.1 release to add to that instead of
doing a big bang like 3.0)

Shall we share some ideas via the mailing list or start a page on our
website (I think ml is easier to do). We can still document the major
initiatives as soon as we have agreed in a blog post.

Here are my current thoughts (some of which might also be backported to
3.0) in no particular order

- appearance stream handlers for interactive form widgets (similar to
what we have for annotations) also allowing one to add their own
handler
- replacement or at least new base for XMPBox (current thought is to
have a new base parser and add if possible XMPBox current end user api
on top - might be able to reuse xmlgraphics XMP lib). Would allow to
better deal with XMPs which are not standard and make it easier to add
to existing XMPs low level.
- then we had the discussion about an event handler/listener similar to
what fop provides so one can listen to corrections/repairs done under
the hood (I know that we can only lay the ground for that as this is a
major undertaking given all the places where we correct things)
- enhance the parsing to keep the information about incremental
versions (better debugging, trace of changes done ...)
- review and add some more PDF 2.0 capabilities
- better text formatting/language support (maybe by including fop parts
or looking into using HarfBuzz)
- I'd also like to discuss reaching out to fop to look at integrating
some of their font handling into fontbox
...

That list is already long and I think would be too much given above
idea of release planning.

With regards to versioning I'd like to propose that we have 2.0 as LTS
and 4.x being the next LTS.

Thoughts
BR
Maruan



-
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org









Re: Apache PDFBox Board Report October 2023 due

2023-10-10 Thread Andreas Lehmkühler
Thanks for the feedback. Maybe "minor" is the wrong word. I had severe 
issues in my mind which prevent users from using the new version.


I'm going to paraphrase that sentence

Andreas

On 10.10.23 at 19:40, Tilman Hausherr wrote:
+1 although I see PDFBOX-5696 and PDFBOX-5666 as more than "minor" 
because these are so weird and surprising.


Tilman

On 08.10.2023 18:53, Andreas Lehmkühler wrote:

Hi,

find attached a quick draft of the board report we're expected to 
submit this month. It's based upon the report wizard template which 
can be found at [1]


Any comments or additions are appreciated ...


## Description:
The mission of PDFBox is the creation and maintenance of software related to
Java library for working with PDF documents

## Project Status:
Current project status: Ongoing with moderate activity
Issues for the board: There are no issues requiring board attention at 
this time



## Membership Data:
Apache PDFBox was founded 2009-10-21 (14 years ago)
There are currently 21 committers and 21 PMC members in this project.
The Committer-to-PMC ratio is 1:1.

Community changes, past quarter:
- No new PMC members. Last addition was Matthäus Mayer on 2017-10-16.
- No new committers. Last addition was Joerg O. Henne on 2017-10-09.

## Project Activity:
Recent releases:

    3.0.0 was released on 2023-08-17.
    2.0.29 was released on 2023-07-01.
    2.0.28 was released on 2023-04-13.

## Community Health:
- there is a steady stream of contributions, bug reports and questions 
on the mailing lists
- finally the new major release 3.0.0 was released after 7 years of 
development
- there are some minor issues with the release but nothing serious. I 
expect the first bugfix release 3.0.1 in a couple of weeks
- the development of 4.0.0 already started with two fundamental 
changes. We switched to java 11 as minimum requirement and removed the 
sub project preflight due to inactivity





-
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org








Recent build failures

2023-10-01 Thread Andreas Lehmkühler

Hi,

I've downgraded the version of the download-maven-plugin in the trunk to 
see if it is related to the recent build failures.


I can't reproduce the issue at home. The expected sha512 hash is still 
correct so that I assume an issue with the plugin itself.


Andreas

-
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org



Re: Multithreading PDFRenderer

2023-09-20 Thread Andreas Lehmkühler




On 18.09.23 at 14:00, Arno Dietsche wrote:

Hi,

We are using pdfbox 3.0.0 as part of our project, which aims at finding 
discrepancies between two similar documents created by external services. One 
thing we use it for is to render the pages of those documents to images and 
compare the rendered images. Those documents can be very large, and therefore we 
are trying to optimize our resource usage. So we want to parallelize the page 
rendering if possible. This leads to my question in relation to the PDFRenderer 
class (v3.0.0):

Officially, PDFBox is not thread safe, but we removed some 
of the limitations and tried to make new features thread safe.



In the past we could observe problems with this multithreaded approach. And I 
understand that PDDocument is not thread safe, but what if I get all the PDPage 
objects first and then render them multithreaded? Essentially, if the method 
PDFRenderer.renderImage(int pageIndex, float scale, ImageType imageType, 
RenderDestination destination) were passed the PDPage object directly and not 
the pageIndex, it would not need to get the PDPage object from the 
PDPageTree. Do you know of possible limitations regarding multithreading the 
remainder of this renderImage method?

I guess that adding the PDPage instance to that method won't change that 
much, as 3.0.0 uses an on-demand parser and most likely the related PDPage 
objects aren't fully loaded, so the parser has to dereference most of 
the objects in question during rendering. But the good news is that this part 
should be thread safe.
Our own debugger is multithreaded, and at the beginning of the 
implementation of the on-demand parser I stumbled upon that and had to make 
the new IO classes thread safe.


Having said that, I'd like to encourage you to give it a try, but there is no 
guarantee from our side ;-)
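
A rough sketch of what such an experiment could look like: one task per page index is submitted to a fixed thread pool and the results are collected in page order. The ParallelPages class and the renderPage callback are hypothetical names for illustration only; nothing here is PDFBox API, and as said above thread safety of the rendering itself is not guaranteed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.IntFunction;

// Hypothetical helper, not part of PDFBox.
final class ParallelPages {

    // Runs renderPage for every page index on a fixed pool and returns the
    // results in page order. An exception thrown by a page task surfaces as
    // an ExecutionException from Future.get().
    static <T> List<T> renderAll(int pageCount, int threads, IntFunction<T> renderPage)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<T>> pending = new ArrayList<>();
            for (int i = 0; i < pageCount; i++) {
                final int pageIndex = i;
                Callable<T> task = () -> renderPage.apply(pageIndex);
                pending.add(pool.submit(task));
            }
            List<T> results = new ArrayList<>();
            for (Future<T> future : pending) {
                results.add(future.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }
}
```

In the scenario from the question, the callback would presumably wrap a call to the renderer for the given page index and turn its checked IOException into an unchecked one. Whether a single PDFRenderer can be shared across threads or one is needed per task is exactly the part that has to be tested.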



Andreas


To clarify I am currently testing this with a subclass of PDFRenderer so I 
could add this method: renderImage(PDPage page, float scale, ImageType 
imageType, RenderDestination destination)

Thank you very much for your time and help


Best regards / Mit freundlichen Grüßen
Arno Dietsche

brainsphere informationworks GmbH
Elsenheimerstrasse 41
80687 Muenchen
Germany

Telefon:  +49 89 203004-830
Telefax:  +49 89 203004-849

Sitz der Gesellschaft: Muenchen
Registergericht: Amtsgericht Muenchen HRB 154535
Geschaeftsfuehrer: Hans-Joerg Kamm, Volker Mattes

This e-mail may contain confidential and/or privileged information. If you are 
not the intended recipient (or have received this e-mail in error) please 
notify the sender immediately and destroy this e-mail. Any unauthorized 
copying, disclosure or distribution of the material in this e-mail is strictly 
forbidden.




-
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org



Re: CipherInputStream may not be closed

2023-09-10 Thread Andreas Lehmkühler




On 08.09.23 at 17:32, axh wrote:

Hi Anna-Katharina,

what version are you using? In the current 3.0, the stream is closed 
(implicitly) by using the try-with-resources syntax 
(https://docs.oracle.com/javase/tutorial/essential/exceptions/tryResourceClose.html):

try (CipherInputStream cis = new CipherInputStream(data, cipher))
{
 …
}
According to Git Blame, try-with-resources has been used at that point since 
2017, so there should be no problem. Disclaimer: I am not a maintainer, I just 
sometimes contribute code.

In PDFBox 2.0.x the stream is closed in a finally block.

I guess we are fine here.
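
For illustration, here is a minimal, self-contained sketch of the try-with-resources pattern mentioned above. It is not the code from SecurityHandler: the class CipherStreamDemo, the helper copyThroughCipher and the all-zero AES-128 demo key are assumptions made up for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.security.GeneralSecurityException;
import javax.crypto.Cipher;
import javax.crypto.CipherInputStream;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical demo class, not part of PDFBox.
final class CipherStreamDemo {

    // Copies data through the cipher. The CipherInputStream (and with it the
    // wrapped input stream) is closed on every path, including exceptions.
    static void copyThroughCipher(InputStream data, OutputStream output, Cipher cipher)
            throws IOException {
        try (CipherInputStream cis = new CipherInputStream(data, cipher)) {
            byte[] buffer = new byte[256];
            int n;
            while ((n = cis.read(buffer)) != -1) {
                output.write(buffer, 0, n);
            }
        }
    }

    // Encrypts and decrypts again, using the helper above for both directions.
    static byte[] roundTrip(byte[] plain) throws GeneralSecurityException, IOException {
        // All-zero key and IV: for demonstration only, never for real data.
        SecretKeySpec key = new SecretKeySpec(new byte[16], "AES");
        IvParameterSpec iv = new IvParameterSpec(new byte[16]);

        Cipher enc = Cipher.getInstance("AES/CBC/PKCS5Padding");
        enc.init(Cipher.ENCRYPT_MODE, key, iv);
        ByteArrayOutputStream encrypted = new ByteArrayOutputStream();
        copyThroughCipher(new ByteArrayInputStream(plain), encrypted, enc);

        Cipher dec = Cipher.getInstance("AES/CBC/PKCS5Padding");
        dec.init(Cipher.DECRYPT_MODE, key, iv);
        ByteArrayOutputStream decrypted = new ByteArrayOutputStream();
        copyThroughCipher(new ByteArrayInputStream(encrypted.toByteArray()), decrypted, dec);
        return decrypted.toByteArray();
    }
}
```

The explicit finally block used in 2.0.x achieves the same effect; try-with-resources is just the shorter form available since Java 7.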

Andreas



Axel



On 08.09.2023 at 14:08, Anna-Katharina Wickert wrote:

Hei dear maintainers,

For a benchmark [1], we randomly sampled JCA usages to decide if the API usage 
is a violation of any API usage constraint.
We believe we found one for the JCA class CipherInputStream.
The call to *close* is missing for the call sequence to *CipherInputStream*. 
Thus, the input stream, including the resources of the stream, is not released. 
[More Details in the JDK 17 
documentation](https://docs.oracle.com/en/java/javase/17/docs/api/java.base/javax/crypto/CipherInputStream.html)
The instance that we sampled is located in:
- file: 
pdfbox/pdfbox/src/main/java/org/apache/pdfbox/pdmodel/encryption/SecurityHandler.java
- method: private void encryptDataAES256(InputStream data, OutputStream output, 
boolean decrypt) throws IOException
- line: 379

To the best of my knowledge, this JCA usage does not result in a vulnerability 
(directly). However, it violates the API constraint discussed above. Therefore, 
we consider adding this usage as a violation into the benchmark.

Best,
Anna-Katharina Wickert
For the CamBench team

[1] https://github.com/CROSSINGTUD/CamBench








[ANNOUNCE] Apache PDFBox 1.8.x End-Of-Life (EOL) Announcement

2023-08-19 Thread Andreas Lehmkühler

The Apache PDFBox Team would like to inform you that PDFBox 1.8.17
is the last release of the 1.8 branch, which has reached its end of life 
and will no longer be officially supported.


The current community mainly maintains the 2.0.x branch and the brand 
new 3.0.x branch. We recommend everyone to upgrade at least to the 2.0.x 
branch for the best experience.


[1] https://pdfbox.apache.org/2.0/migration.html
[2] https://pdfbox.apache.org/3.0/migration.html


Thanks,
The Apache PDFBox Team




[ANNOUNCE] Apache PDFBox 3.0.0 released

2023-08-18 Thread Andreas Lehmkühler
The Apache PDFBox community is pleased to announce the release of Apache 
PDFBox 3.0.0. It is available for download at:


https://pdfbox.apache.org/download.html

The Apache PDFBox library is an open source Java tool for working with 
PDF documents.


This is the new major release 3.0.0 of PDFBox. This release contains a 
lot of improvements, fixes and refactorings. The API is supposed to be 
stable.


A migration guide is available at

https://pdfbox.apache.org/3.0/migration.html.

It is still a work in progress and we are happy to include any valuable 
feedback from our community.


For more details on these changes and all the other fixes and 
improvements included in this release, please refer to the following 
issues on the PDFBox issue tracker at


https://issues.apache.org/jira/browse/PDFBOX.

The full release notes are available at:

https://www.apache.org/dist/pdfbox/3.0.0/RELEASE-NOTES.txt


The Apache PDFBox website can be found at:

https://pdfbox.apache.org/




[RESULT][VOTE] Release Apache PDFBox 3.0.0

2023-08-17 Thread Andreas Lehmkühler




On 14.08.23 at 20:29, Andreas Lehmkühler wrote:

Please vote on releasing this package as Apache PDFBox 3.0.0.


   +1 Tilman Hausherr
   +1 Maruan Sahyoun
   +1 Timo Boehme
   +1 Andreas Lehmkühler

Thanks for your support and help!! I'm going to push the release out.

Andreas




Re: New PDFBox 3.0 branch

2023-08-15 Thread Andreas Lehmkühler
Thanks for the hint. The benchmark package isn't part of the reactor 
build and wasn't updated. I've fixed that.


Andreas

On 15.08.23 at 20:00, Tilman Hausherr wrote:

One small thing re trunk: the benchmark package still has the old version.

Tilman





Re: [VOTE] Release Apache PDFBox 3.0.0

2023-08-15 Thread Andreas Lehmkühler

Hi,

when updating the trunk I just realized that pdfbox-io is missing in the 
dist area within svn. This is our fifth 3.0 release and interestingly no 
one missed that part before :-o


However, IMHO that is no show stopper. The artifact was created during 
the build and deployed to nexus. I've added the missing piece to the 
dist area including the signature and hash.


Saying that, the vote is still open unless someone objects to my conclusion.

Andreas

On 14.08.23 at 20:29, Andreas Lehmkühler wrote:

Hi,

a candidate for the PDFBox 3.0.0 release is available at:

     https://dist.apache.org/repos/dist/dev/pdfbox/3.0.0/

The release candidate is a zip archive of the sources in:

     https://svn.apache.org/repos/asf/pdfbox/tags/3.0.0/

The SHA-512 checksum of the archive is 
279f283f8f97e3adb5e58546f6242b495eef26dacfc256129f790064a73934f16ceb0a7a9164293d506fc0fff462783d296b844611ed18e12b9de0f1724294b5.


Please vote on releasing this package as Apache PDFBox 3.0.0.
The vote is open for the next 72 hours and passes if a majority of at
least three +1 PDFBox PMC votes are cast.

     [ ] +1 Release this package as Apache PDFBox 3.0.0
     [ ] -1 Do not release this package because...


Here is my +1

Andreas




New PDFBox 3.0 branch

2023-08-14 Thread Andreas Lehmkühler

Hi,

due to the preparations for the final release of PDFBox 3.0.0 I've 
created a new branch "3.0" in svn [1].


I've created a job in jenkins to build that branch as well [2]

Andreas

[1] https://svn.apache.org/viewvc/pdfbox/branches/3.0/
[2] https://ci-builds.apache.org/job/PDFBox/job/PDFBox-3.0.x/




Re: PDFBox 3.0.0 rebuild

2023-08-14 Thread Andreas Lehmkühler
Sorry for the noise, I struggled with the scm connection. There were 
two and I changed the one which wasn't used.


However, I've finally built the release and the vote is open.

Thanks for your patience

Andreas

On 14.08.23 at 19:28, Andreas Lehmkühler wrote:

Hi,

something went totally wrong when building the final release. I ended up 
with a 4.0.0-SNAPSHOT release ?!?!?


I have no idea what went wrong. However, I'm going to roll back the 
release and rebuild the whole thing.


Stay tuned ;-)

Andreas




[VOTE] Release Apache PDFBox 3.0.0

2023-08-14 Thread Andreas Lehmkühler

Hi,

a candidate for the PDFBox 3.0.0 release is available at:

https://dist.apache.org/repos/dist/dev/pdfbox/3.0.0/

The release candidate is a zip archive of the sources in:

https://svn.apache.org/repos/asf/pdfbox/tags/3.0.0/

The SHA-512 checksum of the archive is 
279f283f8f97e3adb5e58546f6242b495eef26dacfc256129f790064a73934f16ceb0a7a9164293d506fc0fff462783d296b844611ed18e12b9de0f1724294b5.
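
For anyone who wants to check the archive against the checksum above before voting, here is a small sketch using only the JDK. The class Sha512Check and its method names are made up for this example, and the path of the downloaded archive is whatever you saved it as.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of the PDFBox release tooling.
final class Sha512Check {

    // Streams the input through SHA-512 and returns the digest as lowercase hex.
    static String sha512Hex(InputStream in) throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("SHA-512");
        byte[] buf = new byte[8192];
        int n;
        while ((n = in.read(buf)) != -1) {
            md.update(buf, 0, n);
        }
        StringBuilder hex = new StringBuilder(128);
        for (byte b : md.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    // Compares the digest of a local file with the published checksum,
    // ignoring case and surrounding whitespace.
    static boolean matches(Path archive, String expectedHex)
            throws IOException, NoSuchAlgorithmException {
        try (InputStream in = Files.newInputStream(archive)) {
            return sha512Hex(in).equalsIgnoreCase(expectedHex.trim());
        }
    }
}
```

The checksum only shows that the download is intact; the PGP signature is still needed to verify who built the archive.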


Please vote on releasing this package as Apache PDFBox 3.0.0.
The vote is open for the next 72 hours and passes if a majority of at
least three +1 PDFBox PMC votes are cast.

[ ] +1 Release this package as Apache PDFBox 3.0.0
[ ] -1 Do not release this package because...


Here is my +1

Andreas




PDFBox 3.0.0 rebuild

2023-08-14 Thread Andreas Lehmkühler

Hi,

something went totally wrong when building the final release. I ended up 
with a 4.0.0-SNAPSHOT release ?!?!?


I have no idea what went wrong. However, I'm going to roll back the 
release and rebuild the whole thing.


Stay tuned ;-)

Andreas




Re: PDFBox 3.0.0 final release

2023-08-13 Thread Andreas Lehmkühler

I've had a look at the remaining 4 exceptions.

I can't reproduce the OutOfMemory and the 
ConcurrentModificationException. Maybe both are somehow related to TIKA 
or the test environment.


The IllegalArgumentException is thrown due to an issue with a font. I'm 
not sure if it is really a font bug or some issue with the code.


However, IMHO those issues aren't show stoppers for the planned final 
3.0 release :-)


Andreas




On 13.08.23 at 13:19, Andreas Lehmkühler wrote:

@Tilman thanks for running the regression tests

I had a look at the new exceptions.

6 out of 10 files were throwing the same NoSuchElementException in 
PDFXrefStreamParser. It's a regression and both 2.x and 3.x were 
affected. I've applied a fix, see [1]


I'm going to have a look at the remaining 4 exceptions.

Andreas

[1] https://issues.apache.org/jira/browse/PDFBOX-5651

On 13.08.23 at 09:24, Tilman Hausherr wrote:

https://home.snafu.de/tilman/tmp/reports_pdfbox_2.0.29_vs_3.0.0.tar.xz

I had only a short look but I'm optimistic. Some differences may be 
because of the XMP bug.


Tilman






Re: PDFBox 3.0.0 final release

2023-08-13 Thread Andreas Lehmkühler

@Tilman thanks for running the regression tests

I had a look at the new exceptions.

6 out of 10 files were throwing the same NoSuchElementException in 
PDFXrefStreamParser. It's a regression and both 2.x and 3.x were 
affected. I've applied a fix, see [1]


I'm going to have a look at the remaining 4 exceptions.

Andreas

[1] https://issues.apache.org/jira/browse/PDFBOX-5651

On 13.08.23 at 09:24, Tilman Hausherr wrote:

https://home.snafu.de/tilman/tmp/reports_pdfbox_2.0.29_vs_3.0.0.tar.xz

I had only a short look but I'm optimistic. Some differences may be 
because of the XMP bug.


Tilman






Re: TimeZone + Calendar

2023-08-09 Thread Andreas Lehmkühler

Hi,

see inline

On 08.08.23 at 19:30, Daniel Gredler wrote:

Hi,

I think I've said this before, but thanks again for such a great library!
I'm thinking about submitting a patch to improve PDFBox, but wanted to get
the team's thoughts first.

What I've seen is that sometimes under very heavy load across multiple
threads, PDF creation stalls due to lock contention, because
`TTFDataStream.readInternationalDate()` uses
`TimeZone.getTimeZone(String)`, which is synchronized. I assume that the
inverse operation,
`TTFSubsetter.writeLongDateTime(DataOutputStream,Calendar)`, has a similar
issue, though I have not seen it myself.
Good catch. PDFBox is not supposed to be thread-safe, but it is always a 
good idea to eliminate obvious issues.



If these classes were using the newer `java.time` APIs, they could simply
use the `ZoneOffset.UTC` constant... but they are using the older
`Calendar` API (and exposing this fact), so `TimeZone` must be used. So a
few questions from my side:

Is there a plan to move off of the older time APIs and onto the newer
`java.time` APIs, perhaps as part of version 3.0? (e.g. `Calendar` ->
`ZonedDateTime`) Either way, I'm pretty sure such a breaking change is not
contemplated for the 2.x release series, correct?
No plan so far. There were some discussions in the past, but other things 
were more important or more interesting ;-) There were some small 
changes in the trunk.


We already released a beta of 3.0, which implies a stable API, so we 
must not change that. Furthermore, I'm planning to cut the final release 
next Monday and I guess there isn't enough time to make such changes at all.
2.x is a no-go as well, as it relies on Java 6 and the java.time API 
requires Java 8.
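
Just to make the java.time idea concrete (nothing like this is planned for 2.x or 3.0, see above): a TrueType LONGDATETIME counts seconds since 1904-01-01 00:00 UTC, so a conversion based on ZoneOffset.UTC needs no TimeZone lookup at all. The class LongDateTime and its method names are invented for this sketch and are not PDFBox API.

```java
import java.time.Duration;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;

// Hypothetical sketch, not part of PDFBox; requires Java 8 or later.
final class LongDateTime {

    // Epoch of the TrueType LONGDATETIME type: midnight, 1 January 1904, UTC.
    private static final ZonedDateTime EPOCH_1904 =
            ZonedDateTime.of(1904, 1, 1, 0, 0, 0, 0, ZoneOffset.UTC);

    // Immutable result, no synchronized time zone lookup involved.
    static ZonedDateTime fromSecondsSince1904(long seconds) {
        return EPOCH_1904.plusSeconds(seconds);
    }

    // Inverse operation, as a writer of font tables would need it.
    static long toSecondsSince1904(ZonedDateTime dateTime) {
        return Duration.between(EPOCH_1904, dateTime).getSeconds();
    }
}
```

Since ZonedDateTime is immutable, the shared epoch constant is safe to use from any number of threads.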



`TimeZone` is not technically thread-safe, but there are only a couple of
rarely-used dangerous mutators, AFAIK (`setRawOffset(int)`,
`setID(String)`). Would it be OK to keep a static final UTC `TimeZone` and
just reuse that, even though `TimeZone` is only 95% immutable?
I concur with Tilman, this would be a small but good solution for 3.0 
and 2.x as well
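
A rough sketch of what reusing a static final UTC `TimeZone` could look like, using only the old Calendar API so it would also compile on old Java versions. The class UtcCalendars and its method are invented for illustration; this is not the actual TTFDataStream code.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.TimeZone;

// Hypothetical sketch, not the PDFBox implementation.
final class UtcCalendars {

    // Looked up once when the class is initialised, so TimeZone.getTimeZone(String)
    // is no longer called for every date that is read.
    // Assumption: nobody calls the mutators (setRawOffset, setID) on this instance.
    private static final TimeZone UTC = TimeZone.getTimeZone("UTC");

    // Converts a TrueType LONGDATETIME (seconds since 1904-01-01 UTC) to a Calendar.
    static Calendar fromSecondsSince1904(long secondsSince1904) {
        Calendar cal = new GregorianCalendar(UTC);
        cal.clear();                                  // drop the current time, millis = 0
        cal.set(1904, Calendar.JANUARY, 1, 0, 0, 0);  // the 1904 epoch in UTC
        cal.setTimeInMillis(cal.getTimeInMillis() + secondsSince1904 * 1000L);
        return cal;
    }
}
```

Each call still creates its own Calendar, which is mutable and must not be shared between threads; only the time zone constant is shared.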


Let's discuss the topic after releasing 3.0. Maybe it is a good idea to 
target 4.x for such a change.



If not, what about keeping a static final UTC `TimeZone` constant and
creating a clone each time it is needed, to avoid the synchronization?
(`Calendar.getTimeZone()` makes liberal use of `clone()`, for example).

If that doesn't work, what about subclassing `TimeZone` and overriding the
mutators with methods that just throw `UnsupportedOperationException`s, and
using the subclass for the static final UTC constant? Commons Lang 3 does
something similar with `GmtTimeZone` and the `FastTimeZone` utility class
here:

https://github.com/apache/commons-lang/blob/bf5865ae915ececcdbfa7a473b0d708e3e235bcf/src/main/java/org/apache/commons/lang3/time/FastTimeZone.java#L38
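
As a sketch of the subclassing variant: a fixed UTC `TimeZone` whose mutators throw, so the shared constant cannot be changed by accident. This is written from scratch for illustration, it is neither the Commons Lang class nor anything that exists in PDFBox, and the name ImmutableUtcTimeZone is invented.

```java
import java.util.Date;
import java.util.TimeZone;

// Hypothetical sketch of an immutable UTC zone, not part of PDFBox or Commons Lang.
final class ImmutableUtcTimeZone extends TimeZone {

    private static final long serialVersionUID = 1L;

    // The single shared instance.
    static final TimeZone INSTANCE = new ImmutableUtcTimeZone();

    private ImmutableUtcTimeZone() {
        // Set the ID once through the superclass; the override below blocks later changes.
        super.setID("UTC");
    }

    @Override
    public int getOffset(int era, int year, int month, int day, int dayOfWeek, int millis) {
        return 0;
    }

    @Override
    public int getRawOffset() {
        return 0;
    }

    @Override
    public boolean useDaylightTime() {
        return false;
    }

    @Override
    public boolean inDaylightTime(Date date) {
        return false;
    }

    @Override
    public void setRawOffset(int offsetMillis) {
        throw new UnsupportedOperationException("immutable UTC time zone");
    }

    @Override
    public void setID(String id) {
        throw new UnsupportedOperationException("immutable UTC time zone");
    }
}
```

The instance can be passed to `new GregorianCalendar(ImmutableUtcTimeZone.INSTANCE)` like any other `TimeZone`.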

Thanks for the feedback!

Daniel






Re: PDFBox 3.0.0 final release

2023-08-06 Thread Andreas Lehmkühler

Hi,

@Tilman thanks for the feedback

I'm planning to cut the final release next Monday, a week from now.

Andreas

On 25.07.23 at 20:16, Tilman Hausherr wrote:

I'm surprised that there hasn't been any feedback.

But 3 weeks from now would be ok. 3.0.0.beta-1 was released 2 weeks ago, 
that would mean 5 weeks total, and most people don't go on vacation for 
more than 3 weeks.


+1

Tilman

On 24.07.2023 19:22, Andreas Lehmkühler wrote:

Hi,

the first beta of PDFBox 3.0.0 is out of the door and I'm wondering 
when to do the final release.


I'm in favor of doing the final release soon, let's say in 3 weeks 
from now. Or should we wait a little bit longer for some feedback on 
the beta version. Is there maybe anything you want to do first?


WDYT?

Andreas




PDFBox 3.0.0 final release

2023-07-24 Thread Andreas Lehmkühler

Hi,

the first beta of PDFBox 3.0.0 is out of the door and I'm wondering when 
to do the final release.


I'm in favor of doing the final release soon, let's say in 3 weeks from 
now. Or should we wait a little bit longer for some feedback on the beta 
version. Is there maybe anything you want to do first?


WDYT?

Andreas




[ANNOUNCE] Apache PDFBox 3.0.0-beta1 released

2023-07-14 Thread Andreas Lehmkühler
The Apache PDFBox community is pleased to announce the release of the 
first beta release for Apache PDFBox 3.0.0. It is available for download at:


https://pdfbox.apache.org/download.html

The Apache PDFBox library is an open source Java tool for working with 
PDF documents.


This is the first beta release candidate for the upcoming major release 
3.0.0 of PDFBox. This release contains a lot of improvements, fixes and 
refactorings. The API is supposed to be stable.


A migration guide is available at 
https://pdfbox.apache.org/3.0/migration.html. It is still a work in 
progress and we are happy to include any valuable feedback from our 
community.


For more details on these changes and all the other fixes and 
improvements included in this release, please refer to the following 
issues on the PDFBox issue tracker at 
https://issues.apache.org/jira/browse/PDFBOX.



The full release notes are available at:

https://www.apache.org/dist/pdfbox/3.0.0-beta1/RELEASE-NOTES.txt


The Apache PDFBox website can be found at:

https://pdfbox.apache.org/


-
To unsubscribe, e-mail: users-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: users-h...@pdfbox.apache.org





[RESULT][VOTE] Release Apache PDFBox 3.0.0-beta1

2023-07-14 Thread Andreas Lehmkühler



On 11.07.23 at 07:55, Andreas Lehmkühler wrote:

Please vote on releasing this package as Apache PDFBox 3.0.0-beta1.


   +1 Tilman Hausherr
   +1 Maruan Sahyoun
   +1 Andreas Lehmkühler

Thanks for your support and help!! I'm going to push the release out.

Andreas




Re: Apache PDFBox Board Report July 2023 due

2023-07-12 Thread Andreas Lehmkühler

Hi,

I've submitted the report as proposed, thanks for the reviews

Andreas

On 11.07.23 at 08:16, Andreas Lehmkühler wrote:

Hi,

find attached a quick draft of the board report we're expected to submit 
this month. It's based upon the report wizard template which can be 
found at [1]


Any comments or additions are appreciated ...


## Description:
The mission of PDFBox is the creation and maintenance of software 
related to Java library for working with PDF documents


## Project Status:
Current project status: Ongoing with moderate activity
Issues for the board: There are no issues requiring board attention at 
this time


## Membership Data:
Apache PDFBox was founded 2009-10-21 (14 years ago)
There are currently 21 committers and 21 PMC members in this project.
The Committer-to-PMC ratio is 1:1.

Community changes, past quarter:
- No new PMC members. Last addition was Matthäus Mayer on 2017-10-16.
- No new committers. Last addition was Joerg O. Henne on 2017-10-09.

## Project Activity:
Recent releases:

     2.0.29 was released on 2023-07-01.
     2.0.28 was released on 2023-04-13.
     2.0.27 was released on 2022-09-29.

## Community Health:
- there is a steady stream of contributions, bug reports and questions 
on the mailing lists

- there are a lot of refactorings, improvements and bugfixes
- 2.0.29 was released a few days ago
- the new release consists of small improvements and bug fixes. Two of 
the latter fix two regressions introduced/revealed in the former 2.0.28 
release

- a vote for the first beta version of PDFBox 3.0.0 is ongoing




[VOTE] Release Apache PDFBox 3.0.0-beta1

2023-07-10 Thread Andreas Lehmkühler

Hi,

a candidate for the PDFBox 3.0.0-beta1 release is available at:

https://dist.apache.org/repos/dist/dev/pdfbox/3.0.0-beta1/

The release candidate is a zip archive of the sources in:

https://svn.apache.org/repos/asf/pdfbox/tags/3.0.0-beta1/

The SHA-512 checksum of the archive is 
07a697c6d31854a74eb0452b792644da33fe5e0f3954040465498869059d8a47b11285e6c1472ab8f7c0be76373b86cfd0d1d5963fc1ed9c08ffbad1aadc5651.


Please vote on releasing this package as Apache PDFBox 3.0.0-beta1.
The vote is open for the next 72 hours and passes if a majority of at
least three +1 PDFBox PMC votes are cast.

[ ] +1 Release this package as Apache PDFBox 3.0.0-beta1
[ ] -1 Do not release this package because...


Here is my +1

Andreas




Re: PDFBox 3.0.0-beta1 release, next attempt

2023-07-10 Thread Andreas Lehmkühler

Hi,

I have some issues with my build environment: the signing doesn't work, and 
I have no idea why. It worked fine the other day when building 2.0.29.


I have to investigate first and will therefore postpone the release for another 
day.


Andreas


On 06.07.23 at 19:56, Andreas Lehmkühler wrote:

Hi,

now that the 2.0.29 is out I'd like to cut the first beta of 3.0.0.

How about next Monday? Or is there anything we have to do first and 
maybe wait another week or two?


WDYT?

Andreas




PDFBox 3.0.0-beta1 release, next attempt

2023-07-06 Thread Andreas Lehmkühler

Hi,

now that the 2.0.29 is out I'd like to cut the first beta of 3.0.0.

How about next Monday? Or is there anything we have to do first and 
maybe wait another week or two?


WDYT?

Andreas




[ANNOUNCE] Apache PDFBox 2.0.29 released

2023-07-01 Thread Andreas Lehmkühler

The Apache PDFBox community is pleased to announce the release of
Apache PDFBox version 2.0.29. The release is available for download at:

https://pdfbox.apache.org/download.html

See the full release notes below for details about this release.

Release Notes -- Apache PDFBox -- Version 2.0.29

Introduction


The Apache PDFBox library is an open source Java tool for working with 
PDF documents.


This is an incremental bugfix release based on the earlier 2.0.28 
release. It contains a couple of fixes and small improvements.

For more details on these changes and all the other fixes and improvements
included in this release, please refer to the following issues on the
PDFBox issue tracker at https://issues.apache.org/jira/browse/PDFBOX.

Bug

[PDFBOX-4010] - A (rotated) barcode is missing from a pdf when printed
[PDFBOX-5587] - NullPointerException in PDTrueTypeFont.java getPath( )
[PDFBOX-5591] - Parsing of XMP metadata without optional xmpmeta element
[PDFBOX-5593] - Avoid division by 0 in shading function interpolation
[PDFBOX-5596] - MyPageDrawer#getPaint may produce UnsupportedOperationException
[PDFBOX-5601] - Barcode corrupted when printing document
[PDFBOX-5604] - The text in some fonts is lost when converting pdf to image
[PDFBOX-5606] - PDFTextStripper runs out of memory in 2.0.28 but not in 2.0.27 same code
[PDFBOX-5609] - all values in the signature dictionary shall be direct objects
[PDFBOX-5611] - Glyphs not rendered
[PDFBOX-5612] - PDF with mangled font rendering in some environments
[PDFBOX-5614] - RadioButtons disappear when printing PDF
[PDFBOX-5620] - BitsPerComponent 16 not allowed in PDF/A-1b
[PDFBOX-5621] - NullPointerException in PDFStreamEngine.showText
[PDFBOX-5624] - Infinite loop when parsing Type1 font

Improvement

[PDFBOX-5571] - Add duplex and tray parameters to PrintPDF
[PDFBOX-5598] - Create command line utility to extract XMP data
[PDFBOX-5605] - Improve Opaque PDFRenderer example

Task

[PDFBOX-4932] - Implement /RunLengthDecode encoder
[PDFBOX-5595] - Slight regression on corrupt bug tracker file
[PDFBOX-5625] - move and update bc from jdk15on to jdk15to18

Release Contents


This release consists of a single source archive packaged as a zip file.
The archive can be unpacked with the jar tool from your JDK installation.
See the README.txt file for instructions on how to build this release.

The source archive is accompanied by a SHA512 checksum and a PGP signature
that you can use to verify the authenticity of your download.
The public key used for the PGP signature can be found at
https://www.apache.org/dist/pdfbox/KEYS.

About Apache PDFBox
---

Apache PDFBox is an open source Java library for working with PDF documents.
This project allows creation of new PDF documents, manipulation of existing
documents and the ability to extract content from documents. Apache PDFBox
also includes several command line utilities. Apache PDFBox is published
under the Apache License, Version 2.0.

For more information, visit https://pdfbox.apache.org/

About The Apache Software Foundation


Established in 1999, The Apache Software Foundation provides organizational,
legal, and financial support for more than 100 freely-available,
collaboratively-developed Open Source projects. The pragmatic Apache License
enables individual and commercial users to easily deploy Apache software;
the Foundation's intellectual property framework limits the legal exposure
of its 2,500+ contributors.

For more information, visit https://www.apache.org/




[RESULT][VOTE] Release Apache PDFBox 2.0.29

2023-07-01 Thread Andreas Lehmkühler



On 28.06.23 at 18:54, Andreas Lehmkühler wrote:

Please vote on releasing this package as Apache PDFBox 2.0.29.



   +1 Tilman Hausherr
   +1 Maruan Sahyoun
   +1 Andreas Lehmkühler

Thanks for your support and help!! I'm going to push the release out.

Andreas






Re: [VOTE] Release Apache PDFBox 2.0.29

2023-06-30 Thread Andreas Lehmkühler

Hi,

is there anybody else who is able to spend some cycles on looking into 
this release? There is at least one vote missing and about 24 hours to 
go ...


Thanks in advance

Andreas

On 28.06.23 at 18:54, Andreas Lehmkühler wrote:

Hi,

a candidate for the PDFBox 2.0.29 release is available at:

    https://dist.apache.org/repos/dist/dev/pdfbox/2.0.29/

The release candidate is a zip archive of the sources in:

    https://svn.apache.org/repos/asf/pdfbox/tags/2.0.29/

The SHA-512 checksum of the archive is 
d33146e9c9a74de57e9a24a1bbf1967a145f6b4883814533b003115ff0c65930a4a4bac427be3af18b07ce08a7afa08bf19d1dbc7b0a79c788bb02429de38d77.


Please vote on releasing this package as Apache PDFBox 2.0.29.
The vote is open for the next 72 hours and passes if a majority of at
least three +1 PDFBox PMC votes are cast.

    [ ] +1 Release this package as Apache PDFBox 2.0.29
    [ ] -1 Do not release this package because...


Here is my +1

Andreas






[VOTE] Release Apache PDFBox 2.0.29

2023-06-28 Thread Andreas Lehmkühler

Hi,

a candidate for the PDFBox 2.0.29 release is available at:

    https://dist.apache.org/repos/dist/dev/pdfbox/2.0.29/

The release candidate is a zip archive of the sources in:

    https://svn.apache.org/repos/asf/pdfbox/tags/2.0.29/

The SHA-512 checksum of the archive is 
d33146e9c9a74de57e9a24a1bbf1967a145f6b4883814533b003115ff0c65930a4a4bac427be3af18b07ce08a7afa08bf19d1dbc7b0a79c788bb02429de38d77.


Please vote on releasing this package as Apache PDFBox 2.0.29.
The vote is open for the next 72 hours and passes if a majority of at
least three +1 PDFBox PMC votes are cast.

    [ ] +1 Release this package as Apache PDFBox 2.0.29
    [ ] -1 Do not release this package because...


Here is my +1

Andreas






Re: PDFBox 2.0.29 release?

2023-06-25 Thread Andreas Lehmkühler

OK,


I'm planning to cut the release next Wednesday, in 3 days from now, if 
nobody objects.



Andreas


On 25.06.23 at 19:05, Tilman Hausherr wrote:

No need to IMHO

Tilman



--- Original Message ---
From: Andreas Lehmkühler
Subject: Re: PDFBox 2.0.29 release?
Date: 25 June 2023, 15:16
To: dev@pdfbox.apache.org




@Tilman thanks for fixing this

Should we run another test before cutting the release?

Andreas

On 03.06.23 at 05:53, Tilman Hausherr wrote:

Thank you. This is related to PDFBOX-5606. parseNextToken() is closing
the content stream if an error occurs, but it sometimes calls itself.
Because of the closed content stream the method returns null, which is
reported with the position. Trying to get the position on a closed
stream throws the exception.
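
The "Stream closed" failure itself is easy to reproduce with plain JDK classes. The sketch below is not the PDFBox parser and not the actual fix for PDFBOX-5606; ClosedStreamDemo and GuardedSource are invented names that only illustrate why touching a closed PushbackInputStream throws, and how a simple closed flag avoids it.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.PushbackInputStream;

// Hypothetical demo, not PDFBox code.
final class ClosedStreamDemo {

    // Reads from a PushbackInputStream after closing it and returns the
    // message of the resulting IOException (null if no exception occurs).
    static String readAfterClose() {
        PushbackInputStream in =
                new PushbackInputStream(new ByteArrayInputStream(new byte[] {1, 2, 3}));
        try {
            in.close();
            in.read();  // any access after close() fails
            return null;
        } catch (IOException e) {
            return e.getMessage();
        }
    }

    // Wrapper that remembers it was closed and reports end-of-data instead of
    // touching the underlying stream again.
    static final class GuardedSource {
        private final PushbackInputStream in;
        private boolean closed;

        GuardedSource(byte[] data) {
            this.in = new PushbackInputStream(new ByteArrayInputStream(data));
        }

        void close() throws IOException {
            closed = true;
            in.close();
        }

        int read() throws IOException {
            return closed ? -1 : in.read();
        }
    }
}
```

In a recursive parser the same idea applies: once the error path has closed the stream, later calls have to check that state before asking the stream for data or its position.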

Tilman

On 02.06.2023 17:08, Tim Allison wrote:

Reports are here:


<https://corpora.tika.apache.org/base/reports/pdfbox-2.0.29-pre-rc1-reports.tgz>


One new exception which is reproducible with pure PDFBox app's
ExtractText.

<https://corpora.tika.apache.org/base/docs/govdocs1/819/819127.pdf>

Exception in thread "main" org.apache.tika.exception.TikaException: Unable to extract PDF content
    at org.apache.tika.parser.pdf.PDF2XHTML.process(PDF2XHTML.java:130)
    at org.apache.tika.parser.pdf.PDFParser.parse(PDFParser.java:212)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:298)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:298)
    at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:199)
    at org.apache.tika.parser.RecursiveParserWrapper.parse(RecursiveParserWrapper.java:164)
    at org.apache.tika.cli.TikaCLI.handleRecursiveJson(TikaCLI.java:518)
    at org.apache.tika.cli.TikaCLI.process(TikaCLI.java:489)
    at org.apache.tika.cli.TikaCLI.main(TikaCLI.java:256)
Caused by: java.io.IOException: Stream closed
    at java.base/java.io.PushbackInputStream.ensureOpen(PushbackInputStream.java:75)
    at java.base/java.io.PushbackInputStream.read(PushbackInputStream.java:132)
    at org.apache.pdfbox.pdfparser.InputStreamSource.read(InputStreamSource.java:47)
    at org.apache.pdfbox.pdfparser.BaseParser.skipSpaces(BaseParser.java:1257)
    at org.apache.pdfbox.pdfparser.PDFStreamParser.parseNextToken(PDFStreamParser.java:138)
    at org.apache.pdfbox.contentstream.PDFStreamEngine.processStreamOperators(PDFStreamEngine.java:548)
    at org.apache.pdfbox.contentstream.PDFStreamEngine.processStream(PDFStreamEngine.java:516)
    at org.apache.pdfbox.contentstream.PDFStreamEngine.processPage(PDFStreamEngine.java:155)
    at org.apache.pdfbox.text.LegacyPDFStreamEngine.processPage(LegacyPDFStreamEngine.java:155)
    at org.apache.pdfbox.text.PDFTextStripper.processPage(PDFTextStripper.java:363)
    at org.apache.tika.parser.pdf.PDF2XHTML.processPage(PDF2XHTML.java:137)
    at org.apache.tika.parser.pdf.AbstractPDF2XHTML.processPages(AbstractPDF2XHTML.java:1370)
    at org.apache.pdfbox.text.PDFTextStripper.writeText(PDFTextStripper.java:238)
    at org.apache.tika.parser.pdf.PDF2XHTML.process(PDF2XHTML.java:108)

On Wed, May 31, 2023 at 1:41 PM Tilman Hausherr <thaush...@t-online.de> wrote:


Yes please

Thanks

Tilman

On 31.05.2023 17:15, Tim Allison wrote:

+1

Let me know when/if I should run the text extraction regression tests.

On Thu, May 25, 2023 at 12:32 PM sahy...@fileaffairs.de <sahy...@fileaffairs.de> wrote:


+1

Maruan

On Wednesday, 24.05.2023 at 07:48 +0200, Andreas Lehmkuehler wrote:

Hi,

I'm inclined to release 2.0.29 soon because of the regression that was fixed
with PDFBOX-5606.

WDYT?

Andreas



-
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org


---

Re: PDFBox 2.0.29 release?

2023-06-25 Thread Andreas Lehmkühler

@Tilman thanks for fixing this

Should we run another test before cutting the release?

Andreas

On 03.06.23 at 05:53, Tilman Hausherr wrote:
Thank you. This is related to PDFBOX-5606. parseNextToken() is closing 
the content stream if an error occurs, but it sometimes calls itself. 
Because of the closed content stream the method returns null, which is 
reported with the position. Trying to get the position on a closed 
stream throws the exception.
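
To make the failure mode concrete, here is a minimal, self-contained sketch. It does not use PDFBox at all; it only shows that reading from a java.io.PushbackInputStream after close() raises the same "Stream closed" IOException that appears at the bottom of the trace. The class and method names (ClosedStreamDemo, readAfterClose) are illustrative assumptions, not PDFBox code.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.PushbackInputStream;

public class ClosedStreamDemo {

    /** Reads once, closes the stream, reads again, and reports what happened. */
    static String readAfterClose() {
        PushbackInputStream in =
                new PushbackInputStream(new ByteArrayInputStream(new byte[] {1, 2, 3}));
        try {
            in.read();   // works: the stream is still open
            in.close();  // stands in for the error path that closes the content stream
            in.read();   // a later caller touches the stream again
            return "no exception";
        } catch (IOException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(readAfterClose()); // prints: Stream closed
    }
}
```

The point of the sketch is only that any access after the close, including a position query that has to touch the stream, needs a guard or must be skipped.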


Tilman

On 02.06.2023 17:08, Tim Allison wrote:

Reports are here:
https://corpora.tika.apache.org/base/reports/pdfbox-2.0.29-pre-rc1-reports.tgz 



One new exception which is reproducible with pure PDFBox app's 
ExtractText.


https://corpora.tika.apache.org/base/docs/govdocs1/819/819127.pdf

Exception in thread "main" org.apache.tika.exception.TikaException: Unable to extract PDF content
at org.apache.tika.parser.pdf.PDF2XHTML.process(PDF2XHTML.java:130)
at org.apache.tika.parser.pdf.PDFParser.parse(PDFParser.java:212)
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:298)
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:298)
at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:199)
at org.apache.tika.parser.RecursiveParserWrapper.parse(RecursiveParserWrapper.java:164)
at org.apache.tika.cli.TikaCLI.handleRecursiveJson(TikaCLI.java:518)
at org.apache.tika.cli.TikaCLI.process(TikaCLI.java:489)
at org.apache.tika.cli.TikaCLI.main(TikaCLI.java:256)
Caused by: java.io.IOException: Stream closed
at java.base/java.io.PushbackInputStream.ensureOpen(PushbackInputStream.java:75)
at java.base/java.io.PushbackInputStream.read(PushbackInputStream.java:132)
at org.apache.pdfbox.pdfparser.InputStreamSource.read(InputStreamSource.java:47)
at org.apache.pdfbox.pdfparser.BaseParser.skipSpaces(BaseParser.java:1257)
at org.apache.pdfbox.pdfparser.PDFStreamParser.parseNextToken(PDFStreamParser.java:138)
at org.apache.pdfbox.contentstream.PDFStreamEngine.processStreamOperators(PDFStreamEngine.java:548)
at org.apache.pdfbox.contentstream.PDFStreamEngine.processStream(PDFStreamEngine.java:516)
at org.apache.pdfbox.contentstream.PDFStreamEngine.processPage(PDFStreamEngine.java:155)
at org.apache.pdfbox.text.LegacyPDFStreamEngine.processPage(LegacyPDFStreamEngine.java:155)
at org.apache.pdfbox.text.PDFTextStripper.processPage(PDFTextStripper.java:363)
at org.apache.tika.parser.pdf.PDF2XHTML.processPage(PDF2XHTML.java:137)
at org.apache.tika.parser.pdf.AbstractPDF2XHTML.processPages(AbstractPDF2XHTML.java:1370)
at org.apache.pdfbox.text.PDFTextStripper.writeText(PDFTextStripper.java:238)
at org.apache.tika.parser.pdf.PDF2XHTML.process(PDF2XHTML.java:108)

On Wed, May 31, 2023 at 1:41 PM Tilman Hausherr 
wrote:


Yes please

Thanks

Tilman

On 31.05.2023 17:15, Tim Allison wrote:

+1

Let me know when/if I should run the text extraction regression tests.

On Thu, May 25, 2023 at 12:32 PM sahy...@fileaffairs.de <
sahy...@fileaffairs.de> wrote:


+1

Maruan

On Wednesday, 24.05.2023 at 07:48 +0200, Andreas Lehmkuehler wrote:

Hi,

I'm inclined to release 2.0.29 soon because of the regression that was fixed
with PDFBOX-5606.

WDYT?

Andreas




Re: 2.0.26 release

2022-04-07 Thread Andreas Lehmkühler
Yes, please

Thanks in advance
Andreas

07.04.2022 11:44:38 Tim Allison :

> Sounds great! Should I rerun the regression tests today?
> 
> On Thu, Apr 7, 2022 at 1:41 AM Andreas Lehmkuehler  wrote:
> 
>> Hi,
>> 
>> sorry for the delay.  I'm planning to cut the 2.0.26 release next
>> Saturday, the
>> day after tomorrow, if nobody objects.
>> 
>> Andreas
>> 
>> P.S.: I'm targeting a new 3.0.0 alpha release once the 2.0.26 release is
>> out
>> 



Re: 2.0.19?

2020-02-11 Thread Andreas Lehmkühler
I'm planning to cut the release next Monday.

@Tim please run the regression tests if possible

Thanks in advance
Andreas

On 7 February 2020 at 01:22:34 CET, Tim Allison wrote:
>If you’re up for it, that’d be great! Let me know when I should run the
>regression tests.
>
>Thank you!
>
>On Thu, Feb 6, 2020 at 1:36 PM Andreas Lehmkuehler 
>wrote:
>
>> On 06.02.20 at 13:14, Tim Allison wrote:
>> > Hi All,
>> >
>> >We're probably 3ish* weeks away from the next release cycle for
>Apache
>> > Tika.  I realize PDFBox 2.0.18 just came out at the end of
>December.  Are
>> > there any plans/desires for a 2.0.19 release that could make it in
>to the
>> > next Tika?
>> I have no plans so far but how about cutting a release in about 10
>days
>> from now?
>>
>> Andreas
>>
>> >
>> >   Cheers,
>> >
>> >Tim
>> >
>> > *3ish weeks -- as measured by Open Source Standard Time :D
>> >
>>
>>


[RESULT][VOTE] Release Apache PDFBox JBIG2 ImageIO 3.0.3

2019-12-17 Thread Andreas Lehmkühler
On 14.12.19 at 15:53, Andreas Lehmkuehler wrote:
> Please vote on releasing this package as Apache PDFBox JBIG2 ImageIO 3.0.3.

   +1 Tilman Hausherr
   +1 Maruan Sahyoun
   +1 Timo Böhme
   +1 Andreas Lehmkühler

Thanks for your support and help!! I'm going to push the release out.

Andreas




Fwd: Apache in 2018 - By The Digits

2019-01-01 Thread Andreas Lehmkühler
Hi, 

Sally prepared some digits for 2018 and I was surprised to see one of our 
fellow PDFBox committers among the Top 5 committers as we are a small community 
compared to other ASF projects.

Thanks Tilman for your ongoing efforts to improve PDFBox in the last year, the 
time before that and hopefully in the future!!!

A happy new year to everyone

Cheers, Andreas 


Original Message
From: Sally Khudairi
Sent: 1 January 2019 08:22:25 CET
To: Apache Announce List
Subject: Apache in 2018 - By The Digits

[this announcement is available online at https://s.apache.org/Apache2018Digits]

It's been a great year for the Apache community at-large. With nearly 200M 
lines of code under the ASF's stewardship, our ongoing success is the result of 
community-led development "The Apache Way", executed through the collaborative 
efforts of more than 300 Apache projects and their communities. Highlights 
include:

Apache Projects —https://projects.apache.org/
- Total number of projects + sub-projects - 328 (not including Apache Labs 
initiatives)
- Top-Level Projects - 198
- Podlings in the Apache Incubator - 51
- Other groups, including operations/support - 62

Community/People —http://home.apache.org/
- Apache Committers - 7,032 (6,693 active)
- ASF Members (individuals) - 730
- New Members elected - 44


Apache Projects/Code —https://projects.apache.org/statistics.html

3,208 Apache Committers changed 78,493,228 lines of code over 201,220 commits.
We also welcomed 4,638 new code contributors and 15,861 new issue/pull request
contributors.

Top 5 Apache Code Committers 
- Andrea Cosentino (2,508 commits; 237,224 lines changed)
- Jean-Baptiste Onofré (2,098 commits; 1,208,851 lines changed)
- Duo Zhang (1,956 commits; 809,085 lines changed)
- Mark Thomas (1,823 commits; 179,883 lines changed)
- Tilman Hausherr (1,736 commits; 81,940 lines changed)

Top 5 Apache Project Repositories by Commits
 - Hadoop
 - HBase
 - Beam
 - Camel
 - Flink

Top 5 Apache Project Repositories by Size (Lines of Code)
 - OpenOffice (7,822,699)
 - NetBeans (7,741,506)
 - Flex (whiteboard: 5,233,722; SDK 3,933,522)
 - Mynewt (documentation: 4,381,072)
 - Hadoop (3,881,797)

"If it didn't happen on-list, it didn't happen." —https://lists.apache.org/

 - Total number of mailing lists 1,131
 - 19,435 authors sent 1,497,005 emails on 505,793 topics

Top 5 most active Apache user@ mailing lists
 - Flink
 - Lucene
 - Ignite
 - Cassandra
 - Kafka

Top 5 most active Apache dev@ mailing lists
 - Beam
 - Ignite
 - Kafka
 - Tomcat
 - James

Contributor License Agreements and Software Grants 
—https://www.apache.org/licenses/

We welcomed an average of 387 new code contributors and 1,250 new people filing 
issues each month. Individuals who are granted write access to the Apache 
repositories must submit an Individual Contributor License Agreement (ICLA). 
Corporations that have assigned employees to work on Apache projects as part of 
an employment agreement may sign a Corporate CLA (CCLA) for contributing 
intellectual property via the corporation. Individuals or corporations donating 
a body of existing software or documentation to one of the Apache projects need 
to execute a formal Software Grant Agreement (SGA) with the ASF. 

 - ICLAs signed - 831
 - CCLAs signed - 35
 - Software Grants submitted - 25

Sponsorship and Individual Support 
—http://apache.org/foundation/contributing.html

Thank you to our hundreds of individual donors and Sponsors whose generous 
support helps offset the ASF's day-to-day operating expenses that include 
Infrastructure, Accounting, Fundraising, Marketing & Publicity, and more.

 - Platinum: Cloudera, Comcast, Facebook, Google, LeaseWeb, Microsoft, Oath, 
Pineapple Fund, and Tencent Cloud.

 - Gold: Anonymous, ARM, Bloomberg, Handshake, Hortonworks, Huawei, IBM, 
Indeed, Pivotal, and Union Investment.

 - Silver: Aetna, Alibaba Cloud Computing, Baidu, Budget Direct, Capital One, 
Cerner, Inspur, ODPi, Private Internet Access, Red Hat, and Target.

 - Bronze: Airport Rentals, Best VPN, The Blog Starter, Bookmakers, Cash Store, 
Casino Bonus, Casino2k, Cloudsoft, Emerio, Footprints Recruiting, 
HostChecka.com, HostingAdvice.com, HostPapa Web Hosting, The Linux Foundation, 
Mobile Slots, Mutuo Kredit AG, Online Holland Casino, RX-M, SCAMS.info, Site 
Builder Report, Talend, The Best VPN, Twitter, and Web Hosting Secret Revealed.

ASF Targeted Sponsors provide the Foundation with contributions for specific 
activities or programs.

 - Targeted Platinum: DLA Piper, Microsoft, Oath, OSU Open Source Labs, and 
Sonatype.

 - Targeted Gold: Atlassian, The CryptoFund, Datadog, PhoenixNAP, and Quenda.

 - Targeted Silver: Amazon Web Services, HotWax Systems, and Rackspace.

 - Targeted Bronze: Bintray, Education Networks of America, Google, Hopsie, 
No-IP, PagerDuty, Peregrine Computer Consultants Corporation, Sonic.net, 
SURFnet, and Virtru.


Together, our Members, Committers, contributors, 

Re: [VOTE] Release Apache PDFBox JBIG2 ImageIO 3.0.0

2018-02-24 Thread Andreas Lehmkühler
The vote passed successfully. I am going to push the release out when I am back 
home on Monday

Andreas

On 21 February 2018 at 22:22:42 CET, Andreas Lehmkuehler wrote:
>Hi,
>
>a candidate for the PDFBox JBIG2 ImageIO 3.0.0 release is available at:
>
> https://dist.apache.org/repos/dist/dev/pdfbox/jbig2-imageio-3.0.0/
>
>The release candidate is a zip archive of the sources in:
>
> https://github.com/apache/pdfbox-jbig2/tree/jbig2-imageio-3.0.0
>
>The SHA1 checksum of the archive is
>978d3a48f615ee8385a8b7969293fbce7a16dfd2.
>
>Please vote on releasing this package as Apache PDFBox JBIG2 ImageIO
>3.0.0.
>The vote is open for the next 72 hours and passes if a majority of at
>least three +1 PDFBox PMC votes are cast.
>
> [ ] +1 Release this package as Apache PDFBox JBIG2 ImageIO 3.0.0
> [ ] -1 Do not release this package because...
>
>
>Here is my +1
>
>Andreas
>
>-
>To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
>For additional commands, e-mail: dev-h...@pdfbox.apache.org
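
For anyone checking a release candidate before voting, the SHA1 comparison mentioned in the vote mail can be done with the JDK alone. The following is a minimal sketch, not part of the PDFBox release tooling; the class name Sha1Check and the command-line arguments (path to the downloaded archive, expected checksum) are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class Sha1Check {

    /** Streams the input through SHA-1 and returns the digest as lowercase hex. */
    static String sha1Hex(InputStream in) throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("SHA-1");
        byte[] buf = new byte[8192];
        int n;
        while ((n = in.read(buf)) != -1) {
            md.update(buf, 0, n);
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws Exception {
        // args[0]: path to the downloaded archive, args[1]: checksum from the vote mail
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            String actual = sha1Hex(in);
            System.out.println(actual.equalsIgnoreCase(args[1])
                    ? "checksum OK"
                    : "checksum MISMATCH: " + actual);
        }
    }
}
```

A matching checksum only shows that the download is intact; the signature and the source contents still need to be reviewed separately.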


Re: JBIG2 ImageIO Plugin Release

2018-01-05 Thread Andreas Lehmkühler
Hi, 

I am planning to do a release this month if my colleagues are ok with that.

Andreas

On 5 January 2018 at 13:47:35 CET, "Petr Slabý" wrote:
>Hi,
>are there any plans to release the JBIG2 ImageIO Plugin (PDFBOX-3906) ?
>
>Personally, I would like to see that happen as soon as possible. We are
>not able to distribute the original Levigo library because of its
>licence, so I am looking forward to distributing the PDFBox JBIG2 plugin
>alongside our software, under the Apache licence.
>
>I hope you will release the library soon, independently from the PDFBox
>3.0 release.
>
>Best regards,
>Petr.


Re: build problems of today

2018-01-04 Thread Andreas Lehmkühler
Hi,
it looks like at least one of the maven plugins does not work with java7. Maybe 
that is related to the recent Jenkins updates? I can not check that as I am 
still on vacation.
Switching to java8 as build environment should be ok.

Andreas

On 4 January 2018 at 21:20:51 CET, Tilman Hausherr wrote:
>2.0 and trunk couldn't be built today (first change after many days).
>
>After trying different things that were unsuccessful, I set the build
>to 
>use 1.8.0_66-unlimited security. Before, it was 1.7.0_79 (unlimited 
>security).
>
> From my understanding it should work, PDFBox 1.8 is built on jdk 1.7 
>although it is targeted to 1.6.
>
>@Andreas - agreed?
>
>Tilman
>
>


Re: 2.0.8?

2017-10-27 Thread Andreas Lehmkühler
Done

On 27 October 2017 at 17:14:04 CEST, Tilman Hausherr wrote:
>On 26.10.2017 at 19:12, Andreas Lehmkuehler wrote:
>> I'm planning to cut my second attempt next Monday, if no one objects.
>
>
>Hi,
>could you please set up a 2.0.9 target in JIRA ?
>Thanks
>Tilman
>
>


Re: 2.0.8?

2017-09-25 Thread Andreas Lehmkühler

> On 13 September 2017 at 20:33, Andreas Lehmkuehler <andr...@lehmi.de> wrote:
> 
> 
> Due to the responses I'm planning to cut the release on Monday the 25th
I'm still working on a solution for PDFBOX-3934 to avoid the regression with 
PDFBOX-3318. Should we postpone the release for a couple of days or a week max? 
Or should I simply revert my changes?

WDYT?

Andreas

> 
> Andreas
> 
> On 12.09.2017 at 06:43, Andreas Lehmkuehler wrote:
> > Good idea, there are already a lot of solved tickets for 2.0.8
> > 
> > @all Is there anything pending which should be included?
> > 
> > How about cutting the release in a week or two from now?
> > 
> > @Tim please run a test 2.0.7 vs. 2.0.8 if possible
> > 
> > Andreas
> > 
> > On 11.09.2017 at 23:24, Allison, Timothy B. wrote:
> >>> I hope there aren't any new regressions.
> >>
> >> Happy to help find them!  :)
> >>
> >> On a related note, do we have a sense of the schedule for PDFBox 2.0.8?  
> >> I'd 
> >> like to include it in Tika's last Java 7 release...end of Sept, middle of 
> >> Oct., or whenever 2.0.8 is out. :)
> >>
> >>
> >> -Original Message-
> >> From: Andreas Lehmkühler (JIRA) [mailto:j...@apache.org]
> >> Sent: Monday, September 11, 2017 4:52 PM
> >> To: dev@pdfbox.apache.org
> >> Subject: [jira] [Comment Edited] (PDFBOX-3928) IllegalArgumentException: 
> >> root 
> >> cannot be null with truncated file
> >>
> >>
> >>  [ 
> >> https://issues.apache.org/jira/browse/PDFBOX-3928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16161965#comment-16161965
> >>  
> >> ]
> >>
> >> Andreas Lehmkühler edited comment on PDFBOX-3928 at 9/11/17 8:51 PM:
> >> -
> >>
> >> Both cases are tricky (PDFBOX-3798 is truncated within an object and the
> >> attached pdf has a truncated xref table), so I had to improve the
> >> brute-force search one more time.
> >> [~tilman] thanks for the finding. I hope there aren't any new regressions.
> >>
> >>
> >> was (Author: lehmi):
> >> Both cases are tricky, so I had to improve the brute-force search one
> >> more time.
> >> [~tilman] thanks for the finding. I hope there aren't any new regressions.
> >>
> >>> IllegalArgumentException: root cannot be null with truncated file
> >>> -
> >>>
> >>>  Key: PDFBOX-3928
> >>>  URL: https://issues.apache.org/jira/browse/PDFBOX-3928
> >>>  Project: PDFBox
> >>>   Issue Type: Bug
> >>>   Components: Parsing
> >>> Affects Versions: 2.0.7
> >>> Reporter: Tilman Hausherr
> >>> Assignee: Andreas Lehmkühler
> >>>   Labels: regression
> >>>  Fix For: 2.0.8, 3.0.0
> >>>
> >>>  Attachments: 023505.pdf
> >>>
> >>>
> >>> {code}
> >>> java.lang.IllegalArgumentException: root cannot be null
> >>>  org.apache.pdfbox.pdmodel.PDPageTree.<init>(PDPageTree.java:75)
> >>>  org.apache.pdfbox.pdmodel.PDDocumentCatalog.getPages(PDDocumentCatalog.java:129)
> >>>  org.apache.pdfbox.pdmodel.PDDocument.getPages(PDDocument.java:1388)
> >>>  org.apache.pdfbox.debugger.ui.DocumentEntry.getPageCount(DocumentEntry.java:42)
> >>>  org.apache.pdfbox.debugger.ui.PDFTreeModel.getChildCount(PDFTreeModel.java:195)
> >>>  java.desktop/java.beans.PropertyChangeSupport.fire(Unknown Source)
> >>>  java.desktop/java.beans.PropertyChangeSupport.firePropertyChange(Unknown Source)
> >>>  java.desktop/java.beans.PropertyChangeSupport.firePropertyChange(Unknown Source)
> >>>  org.apache.pdfbox.debugger.PDFDebugger.initTree(PDFDebugger.java:1288)
> >>>  org.apache.pdfbox.debugger.PDFDebugger.readPDFFile(PDFDebugger.java:1235)
> >>>  org.apache.pdfbox.debugger.PDFDebugger.readPDFFile(PDFDebugger.java:1218)
> >>>  org.apache.pdfbox.debugger.PDFDebugger.main

Re: Contributing the JBig2 ImageIO Plugin to PDFBox

2017-08-30 Thread Andreas Lehmkühler
> On 28 August 2017 at 20:28, Andreas Lehmkuehler wrote:
> 
> 
> On 23.08.2017 at 18:29, Andreas Lehmkuehler wrote:
> > readded dev@pdfbox
> > 
> > On 23.08.2017 at 17:23, Jörg Henne wrote:
> >> On 19.08.2017 at 17:07, Andreas Lehmkuehler wrote:
> >>>
> >>> Please provide the following paperwork:
> >>>
> >>> - software-grant, see [1]
> >>> - an iCLA for all potential committers, which aren't apache committers 
> >>> yet, 
> >>> see  [2]
> >>> - a CCLA if necessary, see [3]
> >>>
> >> CCLA: done. Others: pending.
> > Thanks, once the SGA is on file I'll start filing the ip-clearance
> > template
> I've filed a first version of the ip-clearance form, but it's not yet online [1].
> I guess we have to wait for some automatic website rebuild.
The form is online. I'm planning to proceed this evening.

Andreas
> 
> Andreas
> 
> [1] http://incubator.apache.org/ip-clearance/index.html
> 
> 
> 
> 



Re: Contributing the JBig2 ImageIO Plugin to PDFBox

2017-08-26 Thread Andreas Lehmkühler
That was fast. I had a quick look and the non-commercial restriction seems 
problematic. We have to double check with legal first as these files are just 
testfiles and won't be redistributed. 

Andreas

On 25 August 2017 at 16:09:23 CEST, "Jörg Henne" wrote:
>On 24.08.2017 at 11:25, Jörg Henne wrote:
>>
>> I just learned that those files contain sample bitstreams contained
>as 
>> hex-dumps within the standard. The standards are copyrighted by 
>> ITU/ISO and contain the following notice:
>>
>>     All rights reserved. No part of this publication may be
>reproduced 
>> or utilized in any form or by any means, electronic or
>>     mechanical, including photocopying and microfilm, without 
>> permission in writing from the ITU.
>>
>> Although I don't think that the authors' and publishers' intention was
>> to prevent use of the sample bitstreams for testing purposes, the
>> statement clearly covers them. WDYT?
>> I'm going to e-mail ITU with this question although I am not too
>> optimistic about getting an answer sometime within this decade :-)
>I have obtained written permission from an ITU representative allowing 
>us to use the files as intended. Does ASF have an established method of
>
>documenting this permission?
>
>The ITU requested that a license disclaimer/information be included 
>alongside the sample data. I have added it to the code base: 
>https://github.com/levigo/jbig2-imageio/blob/master/src/test/resources/images/README_SAMPLE_DATA_LICENSING.txt
>Does this disclaimer create a conflict with the ASL?
>
>Jörg Henne
>


Re: Contributing the JBig2 ImageIO Plugin to PDFBox

2017-08-24 Thread Andreas Lehmkühler
I'll check the docs after the weekend when I'm back.
I don't see any issue with the different signers.

Andreas

On 24 August 2017 at 13:53:10 CEST, "Jörg Henne" wrote:
>On 23.08.2017 at 21:35, Andreas Lehmkuehler wrote:
>
>> The CCLA is on file
>The software-grant and another ICLA have been sent.
>Some clarification about the software-grant might be necessary, though.
>
>The software-grant and the CCLA are signed by different legal entities 
>(levigo holding/levigo solutions). This is due to the fact that 
>contributors are employed by levigo solutions whereas the IP to be 
>transferred belongs to the parent holding.
>
>Jörg
>


Re: Contributing the JBig2 ImageIO Plugin to PDFBox

2017-08-24 Thread Andreas Lehmkühler

> On 24 August 2017 at 10:08, Jörg Henne wrote:
> 
> 
> 
> On 23.08.2017 at 18:40, Andreas Lehmkuehler wrote:
> > readded dev@pdfbox
> >
> > On 22.08.2017 at 19:14, Jörg Henne wrote:
> >> On 19.08.2017 at 17:07, Andreas Lehmkuehler wrote:
> >>
> >>> The following files don't have a license header:
> >>>
> >> Good catch. Tracked as https://github.com/levigo/jbig2-imageio/issues/46
> >>
> >>> What about the binary test files in src/test/resources/? I assume 
> >>> their license is cleared as well, isn't it?
> >>>
> >> That's what I assumed as well, but upon re-checking, things no longer 
> >> seem to be so clear. I'm tracking this question as 
> >> https://github.com/levigo/jbig2-imageio/issues/48
> >> Maybe you guys can help me with this problem or let me know how you 
> >> deal with it.
> > Is there any jbig2-viewer available?
> In theory, yes, for example XnView supports JBIG2 via jbig2dec.exe. In 
> reality, support for the various cases covered in the test suite is 
> rather spotty: many of the images cannot be decoded with XnView. So, 
> strange as it might seem, I don't know of any reliable stand-alone JBIG2 
> viewer.
> 
> However, obviously those images can be decoded using the plugin. I've 
> attached PNG versions of them to a comment on the above issue: 
> https://github.com/levigo/jbig2-imageio/issues/48#issuecomment-324556311
Cool, I've already thought about converting them myself, but you were faster. 
Thanks. I'll have a look after the weekend as my time will be limited the next 
few days.
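
As a side note on the PNG conversion mentioned above: with the standard javax.imageio API it takes only a few lines. The sketch below assumes that the JBIG2 ImageIO plugin jar is on the classpath so that ImageIO can discover its reader; without it, the same code still runs but reports that the input could not be decoded. The class name ToPng and the command-line arguments are illustrative assumptions, not code from the plugin.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

public class ToPng {

    /** Returns true if the input could be decoded and was written out as PNG. */
    static boolean convertToPng(File input, File output) throws IOException {
        // ImageIO.read returns null when no registered reader understands the file
        BufferedImage image = ImageIO.read(input);
        if (image == null) {
            return false;
        }
        return ImageIO.write(image, "png", output);
    }

    public static void main(String[] args) throws IOException {
        // args[0]: input image (e.g. a .jb2 file), args[1]: PNG file to create
        boolean ok = convertToPng(new File(args[0]), new File(args[1]));
        System.out.println(ok ? "converted" : "no ImageIO reader could decode the input");
    }
}
```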

> > Are these testfiles somehow special, do they trigger some special 
> > processing within the plugin or are they just a bunch of jbig2 files 
> > and could be replaced by others
> JBIG2 isn't quite as simple as, say, PNG. There are several entropy
> coding options (Arithmetic/MQ, Huffman), several different segment types
> and several ways to maintain, refine and reference shape dictionaries.
> Therefore a large number of code paths need to be covered in the
> tests. Since it is rather hard to generate all those possible
> combinations (no single encoder library will use all of them), the
> reference library provides (provided?) a convenient way of achieving
> decent test coverage.
OK, so we should try to keep as much as possible of those data.

> >> The files seem to fall into three categories:
> >> 1. Files from the original test suite. While the copyright status of 
> >> the file isn't problematic, the status of the content seems to be 
> >> muddy in some cases.
> >>- Files containing representations of public U.S. government 
> >> documents should be in the public domain: 
> >> https://en.wikipedia.org/wiki/Copyright_status_of_work_by_the_U.S._government
> >>- The same applies to representations of U.S. patents: 
> >> https://en.wikipedia.org/wiki/Copyright_on_the_content_of_patents_and_in_the_context_of_patent_prosecution
> >>  
> >>
> >>- 004.jb2 and 005.jb2 seem problematic but may be covered by some 
> >> exemption.
> >>- amb.bmp no idea
> > amb.bmp seems problematic as it looks like a promo photo of Ally
> > McBeal aka Calista Flockhart.
> You seem to be more up to speed regarding TV characters. I certainly 
> didn't recognise the person in the photo :-)
Maybe I'm old enough to know that TV-show from the late 90's

> Losing this image would be bad, though, since it is the only halftone 
> region sample bitstream in there.
Maybe we should think about a README which explains the origin of some/all of
the test files.

> >> 2. Files provided to us with the permission to use them for testing 
> >> purposes
> >>201231100*.jb2 is the only case, seems to be a public U.S. 
> >> document anyway and therefore in the public domain. I have not 
> >> contacted the original provider of the files for the simple reason 
> >> that his or her e-mail address has been lost when the Googlecode site 
> >> went into archived state. >
> >> 3. Files with content so trivial that copyright should not be an 
> >> issue, i.e. fragments of bitstreams, isolated segments, trivial test 
> >> images
> > This isn't a question of copyright but of license and/or privacy.
> The files in this category are sampledata_page(1,2,3).jb2. The content 
> is obviously not a matter of privacy. Regarding the license I am 
> currently asking around whether anyone still knows where this came from 
> (unfortunately we lost some very early RCS history from before we 
> open-sourced the component).
> 
> Jörg

Andreas




Re: Contributing the JBig2 ImageIO Plugin to PDFBox

2017-08-22 Thread Andreas Lehmkühler

> On 22 August 2017 at 13:20, Jörg Henne wrote:
> 
> 
> On 19.08.2017 at 19:59, Tilman Hausherr wrote:
> > On 19.08.2017 at 18:09, Andreas Lehmkuehler wrote:
> >>>
> >> +1, there is one superfluous "pdfbox". Besides some other minor 
> >> things to be adjusted we have to discuss how the plugin shall be 
> >> integrated.
> >>
> >> IMHO, we should keep it independent, so that we could cut independent 
> >> releases of the plugin and pdfbox. Doing so, we have to reorg our svn 
> >> repository. We have to create a pdfbox directory in trunk and move 
> >> everything to that directory. There will be another directoy jbig2 
> >> for the sources of the plugin. 
> >
> > Is there a need to have independent releases? Maybe for existing 
> > levigo clients with support contracts?
> Honoring those should not be a problem one way or the other. We can 
> always cut our own releases under dedicated version numbers, as we 
> provide dedicated Maven repositories to our customers.
Just to avoid misunderstandings: once the code is under the PDFBox umbrella,
only the PDFBox PMC can cut releases. There can't be any outside the project
(with the same maven coords and the same package name). That's the reason why
I'd like to keep it independent, so that we can cut a release whenever it's
necessary.

> Jörg

Andreas




Re: PDFBox and PDF 2.0

2017-08-14 Thread Andreas Lehmkühler

> On 11 August 2017 at 18:36, Maruan Sahyoun wrote:
> 
> 
> 
> > On 11.08.2017 at 18:24, Tilman Hausherr wrote:
> > 
> > On 11.08.2017 at 10:07, Maruan Sahyoun wrote:
> >> Hi,
> >> 
> >> with PDF 2.0 being available, it might be time to start introducing some
> >> of its features in PDFBox (no rush, I think, as it will need some time to be
> >> adopted). One thing I would like to discuss is whether it would be good to
> >> introduce version support, so that one could ask to save as a PDF 2.0 file or
> >> as some other version. E.g. one could now use UTF-8 encoded text strings,
> >> which will cause issues in readers that don't support them, whereas some of
> >> the other changes, like new properties, will simply be ignored.
> > 
We need to support such 2.0 features, as some people will use PDFBox to
render such PDFs. And if we add UTF-8 support for reading, it shouldn't be
that hard to add it for writing as well.
I like Maruan's idea of adding some version support, so that adding certain
features to a PDF could either change the version automatically or trigger an
exception.
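
A rough sketch of what such version support could look like, purely to illustrate the two behaviours (raise the version automatically, or throw). Everything here is hypothetical: VersionGuard, Policy and require are made-up names, and the sketch deliberately does not touch any existing PDFBox class.

```java
public class VersionGuard {

    /** What to do when a feature needs a newer PDF version than the declared one. */
    public enum Policy { RAISE_VERSION, REJECT }

    private float version;
    private final Policy policy;

    public VersionGuard(float declaredVersion, Policy policy) {
        this.version = declaredVersion;
        this.policy = policy;
    }

    /** To be called before a feature that needs at least requiredVersion is written. */
    public void require(float requiredVersion, String feature) {
        if (requiredVersion <= version) {
            return; // the declared version already covers the feature
        }
        if (policy == Policy.REJECT) {
            throw new IllegalStateException(feature + " requires PDF " + requiredVersion
                    + " but the document is declared as " + version);
        }
        version = requiredVersion; // RAISE_VERSION: bump the declared version
    }

    public float getVersion() {
        return version;
    }

    public static void main(String[] args) {
        VersionGuard guard = new VersionGuard(1.4f, Policy.RAISE_VERSION);
        guard.require(2.0f, "UTF-8 text strings");
        System.out.println(guard.getVersion());
    }
}
```

Whether the default should be to raise the version or to reject is exactly the open design question; the sketch only shows that both fit behind one small check.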

Andreas

> > We can indicate the version when saving, and we know the version when 
> > loading... Maybe what you mean is to propagate the version to the COS 
> > classes?
> 
> Yes, as there are some areas in PDF 2.0, such as UTF-8 encoded text strings,
> which we would need to enable, e.g. when generating a document from scratch
> or adding new annotations. Currently, if we pass text as input to some of
> our setters, the file is written out without UTF-8 text strings,
> which is fine as it ensures that older readers are able to read the
> content. But if you'd like to 'enforce' 2.0, there is currently no way of
> doing so.
> 
> For other areas where there is a new 'PDF Object' or property with 2.0, we can
> wait until there is demand for it and let the developer decide whether it shall
> be used (the same way we handle it today, as there is no specific version
> support in PDFBox, i.e. one could declare the file as being 1.4 compliant but
> use 1.7 features without any complaints). For the low-level library that we
> are, I think that's acceptable.
> 
> Maruan
> 
> > 
> > Tilman
> > 



Re: OWASP dependency-check

2017-07-11 Thread Andreas Lehmkühler
> On 8 July 2017 at 15:40, Tilman Hausherr wrote:
> 
> 
> https://github.com/jeremylong/dependency-check-gradle#current-release
> 
> Tim Allison pointed us to this on twitter... Should we use it (maybe 
> just in "pedantic" mode, because it needs 400MB in the repository)?
> 
> Or just recommend our users to use it?
> 
> Or should just tika use it?
> 
> It tells whether any components we're using have security risks. This 
> xml segment is to be put into the pom.xml:
> 
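> <plugin>
>     <groupId>org.owasp</groupId>
>     <artifactId>dependency-check-maven</artifactId>
>     <version>2.0.0</version>
>     <configuration>
>         <failBuildOnAnyVulnerability>true</failBuildOnAnyVulnerability>
>     </configuration>
>     <executions>
>         <execution>
>             <goals>
>                 <goal>check</goal>
>             </goals>
>         </execution>
>     </executions>
> </plugin>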
> 
> I tried it with a project that linked pdfbox 2.0.0 (has XXE 
> vulnerability) and yes, the build stopped.
Let's add this, but just in "pedantic" mode

Andreas

> Tilman
> 
> 
> 



Re: Contributing the JBig2 ImageIO Plugin to PDFBox

2017-06-27 Thread Andreas Lehmkühler
Hi Jörg,

> On 26 June 2017 at 15:36, Jörg Henne wrote:
> 
> 
> Hi all,
> 
> 
> Apache PDFBox currently uses the JBig2 ImageIO-Plugin at 
> https://github.com/levigo/jbig2-imageio as an optional component and 
> recommends the use of it at https://pdfbox.apache.org/2.0/dependencies.html. 
> I am writing this as a representative of the ISV levigo, the owner and 
> publisher of this component. Besides it being an open source component, we use 
> it in our own software suite. We have invested significant time in it and 
> have been maintaining it for many years, so 
> I would consider its code base reasonably mature and stable. However, we 
> continue to address any bugs reported to us and have accepted several 
> community-provided fixes.
> 
> 
> The plugin in question is currently licensed under the GNU General Public 
> License V3 with other licensing options available, including commercial 
> licensing. Having PDFBox under the ASL and the plugin under a different 
> license has long been a nuisance for PDFBox users and has deterred many 
> of them from using it. On the other hand, many users have a strong need for it, 
> as our plugin is (IMHO) still the highest-quality pure-Java open source 
> decoder available.
> 
> We would like to change this situation by licensing the plugin under the ASL. 
> At the same time, however, we think that it would make sense to move the code 
> base over to a new home that makes it independent of a single vendor. That's 
> where the ASF and the PDFBox project come into play :-)
> 
This is good news and highly appreciated!

> We are currently in the very early stages of evaluating such a transition. A 
> few random thoughts:
> 
> - All of those thoughts are subject to the PDFBox community being willing to 
> do this and accepting the contribution, obviously.
> 
I can think of 2 possible new homes within the ASF: Apache PDFBox and Apache 
Commons. The first option might be the easier way when it comes to the 
"paperwork".

> - One of the reasons for us to favor the ASF as a new home is that the ASF 
> has strong provisions in place to ensure that a project can thrive without it 
> being dependent on life-support by a single vendor.
> 
+1

> - We need to do proper IP vetting: while the vast majority has been done by 
> levigo there is one other GitHub committer who has provided bug fixes and 
> whom we need to talk to.
> 
Good catch, these are the important bits which have to be resolved first. After 
that you have to provide a Software Grant Agreement, see [1] for details, so 
that we can start the IP clearance process, see [2] and [3]

> - Package names and maven coordinates will have to be updated to reflect the 
> transition
+1

> - After a transition, colleagues of mine would continue to contribute to the 
> maintenance of the component. The necessary committer rights would need to be 
> bestowed upon them. I myself have been an Apache committer for many years, 
> albeit almost completely inactive.
> 
As an Apache committer you might know that nobody can request committer rights; 
one has to be voted in. But that is maybe just a formality. How many devs 
are we talking about here?

> - It would make sense (and is required by the Apache rules) to have 
> additional know-how about the component outside of levigo. I don't know 
> whether there is enough interest in the PDFBox community to ensure this.
> 
Yes, diversity is an important aspect. I'm pretty sure that the code will 
attract other (pdfbox) developers once it is under the apache umbrella. The 
imaging [4] devs might be interested in the code as well.

> So that's it for now, I guess. Please let me know what you think.
I support your plan to integrate the plugin with PDFBox. We, the PDFBox PMC, 
have to discuss that topic first and hold a vote, but I guess this 
is just a formality.

Feel free to ask if there are any further questions.

> Jörg Henne
> 

Andreas

[1] http://www.apache.org/licenses/
[2] http://incubator.apache.org/ip-clearance/pdfbox-padaf.html
[3] https://issues.apache.org/jira/browse/PDFBOX-1056
[4] http://commons.apache.org/proper/commons-imaging/




RE: 2.0.6 release ?

2017-05-10 Thread Andreas Lehmkühler

> On 10 May 2017 at 11:42, "Allison, Timothy B." wrote:
> 
> 
> Haven't had a chance to look. Reports are here:
> http://162.242.228.174/reports/reports_pdfbox_2_0_6_20170510.tar.gz
Thanks for running the report again.

I had a quick look and there are 2 new exceptions. It seems to be a regression. 
I'm going to dig deeper later when I'm back home.

Here are 2 sample PDFs, one for each exception:
commoncrawl2/YV/YVFDWHF767TEYTT7IVFSLUIJTDF3YP57
commoncrawl2/5W/5WULWDW54DAQ4ORVJSACEE2KCXQ7PQLL

Andreas

> 



2.0.6 release ?

2017-05-02 Thread Andreas Lehmkühler
Hi,

I'm planning to cut a 2.0.6 release in about 1 or 2 weeks from now, any 
objections?

Andreas




Re: Problems updating the website

2017-03-17 Thread Andreas Lehmkühler

> On 17 March 2017 at 12:13, Maruan Sahyoun <sahy...@fileaffairs.de> wrote:
> 
> 
> Hi,
> 
> > On 17.03.2017 at 12:09, Maruan Sahyoun <sahy...@fileaffairs.de> wrote:
> > 
> > Hi,
> > 
> >> On 17.03.2017 at 11:24, Andreas Lehmkühler <andr...@lehmi.de> wrote:
> >> 
> >> 
> >>> On 17 March 2017 at 11:06, Maruan Sahyoun <sahy...@fileaffairs.de> wrote:
> >>> 
> >>> 
> >>> Hi,
> >>> 
> >>>> On 17.03.2017 at 07:59, Andreas Lehmkuehler <andr...@lehmi.de> wrote:
> >>>> 
> >>>> Hi,
> >>>> 
> >>>> I've updated the download section due to the new release. After running 
> >>>> the mvn command to publish the content I saw some unwanted changes. I 
> >>>> can't tell where they came from, e.g. [1]
> >>>> 
> >>> 
> >>> what in particular is unwanted? The complete page, parts of it?
> >> Have a look at the end of that page, the formatting for the formatting 
> >> style example is gone.
> >> 
> > 
> > works fine on my local copy. Even after pulling the latest changes. I'm at 
> > jekyll 3.1.2. I'll do a minor change and push to see if that corrects the 
> > issue.
> 
> that fixed it
Thanks for the fix. I have to investigate on my side to see what went wrong.
> 
> BR
> Maruan
> 
> > 
> > BR
> > Maruan
> > 
> > 
> >>> 
> >>>> @Maruan Any idea what went wrong/I did wrong?
> >>>> 
> >>>> I'm using jekyll 3.0.1 on linux fedora.
> >>>> 
> >>>> BR
> >>>> Andreas
> >>>> 
> >>>> [1] https://pdfbox.apache.org/codingconventions.html
> >>>> 



Re: Problems updating the website

2017-03-17 Thread Andreas Lehmkühler

> On 17 March 2017 at 11:06, Maruan Sahyoun wrote:
> 
> 
> Hi,
> 
> > On 17.03.2017 at 07:59, Andreas Lehmkuehler wrote:
> > 
> > Hi,
> > 
> > I've updated the download section due to the new release. After running the 
> > mvn command to publish the content I saw some unwanted changes. I can't 
> > tell where they came from, e.g. [1]
> > 
> 
> what in particular is unwanted? The complete page, parts of it?
Have a look at the end of that page, the formatting for the formatting style 
example is gone.

> 
> > @Maruan Any idea what went wrong/I did wrong?
> > 
> > I'm using jekyll 3.0.1 on linux fedora.
> > 
> > BR
> > Andreas
> > 
> > [1] https://pdfbox.apache.org/codingconventions.html
> > 


