from:"bodewig"

Hi

I recall the CMS is no more but I haven't followed how to publish the
site now. The docs still talk about the CMS.

I have updated component_releases.properties and the DOAP file for
compress but don't know how to apply the change to the deployed website.

Stefan

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org

[ANN] Apache Commons Compress 1.21 Released

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

The Apache Commons Team is pleased to announce the release of Apache
Commons Compress 1.21.

Apache Commons Compress software defines an API for working with
compression and archive formats.  These include: bzip2, gzip, pack200,
lzma, xz, Snappy, traditional Unix Compress, DEFLATE, DEFLATE64, LZ4,
Brotli, Zstandard and ar, cpio, jar, tar, zip, dump, 7z, arj.

This release is mostly a bugfix release. Some of the changes to the
ZIP, TAR and 7Z packages fix flaws that were exploitable as denial of
service attacks; see the separate announcement mails.

Compress also contains new features. The pack200 code of the retired
Apache Harmony project is now part of Compress and thus pack200 can
even be used on Java versions later than Java 13 again.

A new TarFile class provides random access to tar archives.

Compress 1.21 is the first release to require Java 8 to build and run.

SevenZFileOptions has a new setting that needs to be enabled
explicitly if SevenZFile should try to recover broken archives - a
feature introduced with Commons Compress 1.19. This is a breaking
change if you relied on the recovery attempt. The change was made to
detect broken archives sooner, and to mitigate the OOM exploit.

Several formats now throw IOExceptions when reading broken archives or
streams that would have caused arbitrary RuntimeExceptions in earlier
versions of Compress.

Source and binary distributions are available for download from the
Apache Commons download site:

https://commons.apache.org/proper/commons-compress/download_compress.cgi

When downloading, please verify signatures using the KEYS file available
at the above location.

Changes in this version include:

New features:
o Add writePreamble to ZipArchiveOutputStream. This method can
  write raw data to the zip archive before any entry has been
  written to it.
  Most of the time this is used to create self-extracting zip
  archives.
  Github Pull Request #127.
  Issue: COMPRESS-550.
  Thanks to Scott Frederick.
o Added support for random access to the TAR packages.
  Github Pull Request #113.
  Issue: COMPRESS-540.
  Thanks to Robin Schimpf.
o Added support for BufferPool in ZstdCompressorInputStream.
  Github Pull Request #165.
  Issue: COMPRESS-565.
  Thanks to Michael L Heuer.
o Commons Compress cannot be built with JDK14 due to Pack200 removal.
  Add Pack200 implementation from Apache Harmony.
  Issue: COMPRESS-507.
  Thanks to Gary Gregory, Apache Harmony.
o Add a new AlwaysWithCompatibility in Zip64Mode. This is a
  compromise for some libraries including 7z and the Expand-Archive
  PowerShell utility (and likely Excel).

  We will also encode both the LFH offset and Disk Number Start
  in the ZIP64 Extended Information Extra Field - even if only
  the disk number needs to be encoded.

  Github Pull Request #169.
  Issue: COMPRESS-565.
  Thanks to Evgenii Bovykin.
o gzip deflate buffer size is now configurable.
  Issue: COMPRESS-566.
  Thanks to Brett Okken.

Fixed Bugs:
o Fix bugs in random access of 7z. Problems may occur when
  random access and sequential access to a 7z archive are
  mixed.
  Github Pull Request #95.
  Issue: COMPRESS-505.
o Fix bugs in random access of 7z. Exceptions are thrown
  when reading the first entry multiple times by random
  access.
  Issue: COMPRESS-510.
o Add '/' to directories with long names in tar. This resolves
  the inconsistent behavior of the TarArchiveEntry.getName()
  method between directories with short names and long names.
  Issue: COMPRESS-509.
  Thanks to Petr Vasak.
o Removed the PowerMock dependency.
  Issue: COMPRESS-520.
  Thanks to Robin Schimpf.
o Added improved checks to detect corrupted bzip2 streams and
  throw the expected IOException rather than obscure
  RuntimeExceptions.
  See also COMPRESS-519.
  Issue: COMPRESS-516.
o Improved parsing of X5455_ExtendedTimestamp ZIP extra field.
  Issue: COMPRESS-517.
o ZipArchiveInputStream and ZipFile will now throw an
  IOException rather than a RuntimeException if the zip64 extra
  field of an entry could not be parsed.
  Issue: COMPRESS-518.
o Improved detection of corrupt ZIP archives in ZipArchiveInputStream.
  Issue: COMPRESS-523.
o Added improved checks to detect corrupted deflate64 streams and
  throw the expected IOException rather than obscure
  RuntimeExceptions.
  Issues: COMPRESS-521, COMPRESS-522, COMPRESS-525, COMPRESS-526, and
  COMPRESS-527.
o Add the archive name in the exception in the constructor of
  ZipFile to make it a more specific exception.
  Github Pull Request #102.
  Issue: COMPRESS-515.
  Thanks to ian-lavallee.
o Throw IOException when encountering a non-number while parsing a pax
  header.
  Issue: COMPRESS-530.
o Throw IOException when a tar archive contains a PAX header
  without any normal entry following it.
  Issue: COMPRESS-531.
o Added improved checks to detect corrupted IMPLODED streams and
  throw the expected IOException rather than obscure

[RESULT] Release Compress 1.21 based on RC1

Hi

with +1s by Gary Gregory, Bruno P. Kinoshita, Peter Lee and myself, the
vote has passed.

I'll publish the artifacts and will announce the release once the
mirrors have caught up - which probably means after a night of sleep for
myself :-)

Many thanks to all who have verified the release candidate

 Stefan

Re: [VOTE] Release Compress 1.21 based on RC1

Making my own vote explicit

   [X] +1 Release these artifacts

Stefan

[DISCUSS] Release Compress 1.21 based on RC1

2021-07-10 Thread Stefan Bodewig

On 2021-07-10, Henri Biestro wrote:

> Side note whilst trying to validate RC1:

> On a Mac that used LDAP, user ids and groups are 'long':
> henri.biestro@L-HBIESTRO-1 commons-compress % id
> uid=1447288081(henri.biestro) gid=1024222515

Didn't know that.

> A lot of tar tests will fail in this (probably rare) situation since
> tar entries need bigNumberMode != BIGNUMBER_ERROR to handle such
> uid/gid values correctly.

Are there any tests that actually use the uid/gid of the current user?
Compress will not read them by itself, so the only place things could
fail is if we used native tar to create an archive. Is there such a
test? If so we could try to adapt the test in question.

> Should the bigNumberMode depend on the OS/user-id ?

For tests, maybe. But I wouldn't recommend doing so in general as the
resulting archive may not be readable by certain archivers. IMHO this
should always be an explicit user decision.
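
To make the size problem concrete, here is a small sketch of why such an
id cannot be stored in a classic tar header. It assumes the ustar layout
in which uid and gid are 8-byte fields holding octal digits plus one
terminator byte; the class and method names are made up for this
illustration and are not part of Compress.

```java
final class TarNumericField {

    /**
     * Largest value that fits into a classic tar header field of the given
     * length when stored as octal digits followed by one terminator byte.
     */
    static long maxOctalValue(int fieldLength) {
        int digits = fieldLength - 1;     // one byte is reserved for the terminator
        return (1L << (3 * digits)) - 1;  // each octal digit carries three bits
    }

    /** True if the value can be written without any "big number" extension. */
    static boolean fitsInOctalField(long value, int fieldLength) {
        return value >= 0 && value <= maxOctalValue(fieldLength);
    }

    public static void main(String[] args) {
        final int idFieldLength = 8;      // assumed width of the uid and gid fields
        long ldapUid = 1447288081L;       // the uid quoted in the message above
        System.out.println("largest id in octal field: " + maxOctalValue(idFieldLength));
        System.out.println("LDAP uid fits: " + fitsInOctalField(ldapUid, idFieldLength));
    }
}
```

With these assumptions the limit is 2097151, far below the LDAP uid, so
a writer has to either fail or switch to an extension that not every
archiver understands.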

Stefan

[DISCUSS] Release Compress 1.21 based on RC1

2021-07-10 Thread Stefan Bodewig

On 2021-07-10, Bruno P. Kinoshita wrote:

> The RELEASE-NOTES.txt for 1.21 starts with "Compress 1.20 now at least
> requires Java 8 to build and run." which is a bit confusing, but not a
> major issue. (Maybe it would be better to say "Compress 1.20 and later
> require Java 8..."?)

It is going to be

"Compress 1.21 is the first release to require Java 8 to build and run."

I've already fixed it locally.

> We also have 2 README files, .txt and .md. We probably want a single
> README file.

README.txt has been there since the initial commit in CVS, I'd
guess. When Commons decided to have uniform README.mds the .txt file was
left in. It may actually contain information not present in
README.md.

> The README.md contains the download instructions for 1.21, so I think
> that's the one we see on GitHub.

Yes, github prefers markdown over text.

> But I just looked at the source ZIP in the dist area, and it only
> contains README.txt, no README.md.

This is a different kind of readme; it never occurred to me to copy the
README from the source tree as they have different purposes. The one in
dist describes the downloadable artifacts, not the project. In addition
I don't believe markdown files will render favorably in a directory
listing.

> Reports look good. The changes report is showing the 1.21 release as
> not released yet. Normally I think that that date is set to the RC
> creation date.

Not sure. I've always done it that way and set the release date just
before publishing the site (and when merging the git tag back to
master). The site will need to be updated anyway (javadocs published for
example).

Stefan

Re: [VOTE] Release Compress 1.21 based on RC1

2021-07-09 Thread Stefan Bodewig

On 2021-07-09, Gary Gregory wrote:

> "Details of changes since 1.19 are in the release notes:"

> 1.19 -> 1.20 ;-)

Fortunately only the vote mail is wrong.

It is even true, in a way: the release notes include all changes
since 1.0. :-)

Stefan

[VOTE] Release Compress 1.21 based on RC1

2021-07-09 Thread Stefan Bodewig

It's been way too long since the last release and the number of resolved
issues is huge.

Compress 1.21 RC1 is available for review here:
https://dist.apache.org/repos/dist/dev/commons/compress/
(svn revision 48755)

The tag is here:

https://gitbox.apache.org/repos/asf?p=commons-compress.git;a=tag;h=refs/tags/1.21-RC1
on commit

https://gitbox.apache.org/repos/asf?p=commons-compress.git;a=commit;h=60e3d9f6bef1e431f8738e881c051d706f81e6cf

Maven artifacts are here:

https://repository.apache.org/content/repositories/orgapachecommons-1554/org/apache/commons/commons-compress/1.21/

These are the Maven artifacts and their hashes

c92d9a12547aab475e057955ad815fdfe92ff44c78383fa5af54b089f1bff5525126ef6aef93334f3bfc22e2fef4ad0d969f69384e978a83a55f011a53e7e471
  commons-compress-1.21.jar
2df3e0a78db8a93543f87c94dedeca2f9b007ac9aa65756bd8c68b5342aa2a852e0ee4d01c29723beeaee166b9e2f8aa55ef30401fd8963619fa2946cf85de39
  commons-compress-1.21-javadoc.jar
5d3ae9c9e0500b24feb731adb484964b17d7481d663df8c5483c3b0d870d70555e8d41c3e87e9a6e8ff65873e1593478f4dc8e0918c8991b9b2a99039896d12b
  commons-compress-1.21-sources.jar
c7021651c1311ead8004b7268e4933019450592fa9c184a722c7770c64762c5faf612522c2962d54332410b904de08ea81f4af4211dc0b4f1f2e74cf9c2ab7e8
  commons-compress-1.21-tests.jar
ece1bb7a8d86aee061d288ab0914b571df30ea053c53471159442baf5eb550be27d57608ef8448266d2847ec6d8c44a66130aca4dfe5910389733a38c10f938b
  commons-compress-1.21-test-sources.jar
530a1505ca1e1c4eb9336b7a7cae3116ea9fc81d77d0e2530f1c050a8b5593cd65adc90947f13fb7e10e40db479c54415cc9ad0f58bf5d1f924f3986ed634bfd
  commons-compress-1.21.pom
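
A reviewer could check a downloaded artifact against the list above
along these lines. This is only a sketch: the mail does not name the
hash algorithm, so SHA-512 is an assumption based on the 128 hex digits,
and the class name is invented for the example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class Sha512Check {

    /** Returns the lowercase hex SHA-512 digest of the given bytes. */
    static String sha512Hex(byte[] data) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-512");
            StringBuilder hex = new StringBuilder(128);
            for (byte b : digest.digest(data)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException ex) {
            // not expected with a standard JDK, which ships a SHA-512 provider
            throw new IllegalStateException(ex);
        }
    }

    /** Compares a published hash with the computed one, ignoring case and surrounding whitespace. */
    static boolean matches(byte[] data, String publishedHex) {
        return sha512Hex(data).equalsIgnoreCase(publishedHex.trim());
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path of the downloaded artifact, args[1]: hash copied from the vote mail
        byte[] artifact = Files.readAllBytes(Paths.get(args[0]));
        System.out.println(matches(artifact, args[1]) ? "hash matches" : "HASH MISMATCH");
    }
}
```

A matching hash only shows the download is intact; the signature check
against the KEYS file is still needed to establish who built it.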

I have tested this with JDK 8 using Maven 3.6.3

Details of changes since 1.19 are in the release notes:
https://dist.apache.org/repos/dist/dev/commons/compress/RELEASE-NOTES.txt

https://stefan.samaflost.de/staging/commons-compress-1.21/changes-report.html

Site:
https://stefan.samaflost.de/staging/commons-compress-1.21/
  (note Javadocs of 1.21 have not been created right now and this is -
as usual when I cut a release - not the site I'm going to publish
after the release. I'll create a new site once the release date is known).

Japicmp Report (compared to 1.20):
https://stefan.samaflost.de/staging/commons-compress-1.21/japicmp.html

RAT Report:
https://stefan.samaflost.de/staging/commons-compress-1.21/rat-report.html

KEYS:
  https://www.apache.org/dist/commons/KEYS

You will see a bunch of new static code analysis issues related to the
harmony code we've copied. Right now I felt it was more important to get
the release out than to address these issues and make our code deviate
from Harmony's too far. These can be addressed post release by whoever
likes doing it.

Please review the release candidate and vote.
This vote will close no sooner than 72 hours from now,
i.e. sometime after 17:30 UTC 12-July 2021

  [ ] +1 Release these artifacts
  [ ] +0 OK, but...
  [ ] -0 OK, but really should fix...
  [ ] -1 I oppose this release because...

Thanks!

Re: [compress] poor test coverage of harmony code

On 2021-07-03, Stefan Bodewig wrote:

> I assume the code originates from
> https://svn.apache.org/repos/asf/harmony/enhanced/java/trunk/classlib/modules/pack200/src/main/
> and I'd look into porting the tests from
> https://svn.apache.org/repos/asf/harmony/enhanced/java/trunk/classlib/modules/pack200/src/test/
> unless anybody else has also started to look into it.

Back at 86% coverage.

The tests are JUnit 3.x style and I have left them at that. I basically
only changed the package names and removed a test that asserted a given
class was compiled for Java 1.4.

Stefan

[compress] releasing 1.21 soonish?

Hi all

is there anything you want to work on or can we go ahead with cutting a
new Compress release in about a week?

There are some test coverage and javadoc issues that need to get
resolved but other than that at least I do not intend to work on any
changes or new features.

A current build of the site can be found at
https://stefan.samaflost.de/staging/commons-compress-1.21/ if you want
to look at the big number of changes we've accumulated over the past
sixteen months or review the various reports.

Stefan

[compress] poor test coverage of harmony code

Hi

our current pack200 tests don't seem to cover much of the pack200 code
imported from harmony and the overall test coverage of Compress as a
whole has dropped significantly (from 86% to 61%) as the new package
contains quite a bit of code.

I assume the code originates from
https://svn.apache.org/repos/asf/harmony/enhanced/java/trunk/classlib/modules/pack200/src/main/
and I'd look into porting the tests from
https://svn.apache.org/repos/asf/harmony/enhanced/java/trunk/classlib/modules/pack200/src/test/
unless anybody else has also started to look into it.

Stefan

Re: [Compress] Java 16 and 17-ea

On 2021-07-03, Gary Gregory wrote:

> This is the approach I've taken: I merged the pack200 branch into
> master as is.

Thank you

  Stefan

Re: [Compress] Java 16 and 17-ea

2021-07-02 Thread Stefan Bodewig

On 2021-06-12, Gary Gregory wrote:

> Please have a look at the pack200 branch if you want, there are still
> Javadoc TODOs but it's all there.

Just so we get this into this list's archive properly: I've proposed a
few changes in https://github.com/apache/commons-compress/pull/210 but
completely leave it up to you whether you want to apply them or just
merge your branch as it is.

Thanks

Stefan

Re: [compress] [Poll Non Result] Dealing with uncaught RuntimeExceptions

2021-07-01 Thread Stefan Bodewig

On 2021-07-01, Torsten Curdt wrote:

>> That certainly doesn't prevent anybody else from trying to find a
>> compromise :-)

> It feels like Optionals could be a compromise.

I must admit I've lost track of the later discussion threads. If you
mean that we'd return Optional<> results, this would become an entirely
different API.

I'd very much like us to get to a compromise that helps our users and
doesn't force us to randomly fix RuntimeExceptions that slipped through
overly optimistic parser code.

So if you want to explore the Optionals idea further this would be
great, but I doubt enough people have seen it.

Thanks

Stefan

[compress] [Poll Non Result] Dealing with uncaught RuntimeExceptions

2021-07-01 Thread Stefan Bodewig

Hi all

there isn't a single option that hasn't received at least two -1s, with
eight people indicating their preference. So none of the options seems to
be one that could lead to a compromise.

With this I run out of ideas and will rest my case and not try to find a
generic solution - but rather try to get 1.21 out with no changes in
this area.

That certainly doesn't prevent anybody else from trying to find a
compromise :-)

Stefan

Re: [compress] [Poll] Dealing with uncaught RuntimeExceptions

2021-06-30 Thread Stefan Bodewig

On 2021-06-29, Stefan Bodewig wrote:

> Options raised during the thread:

> (1) catch all RuntimeExceptions, wrap them in an IOException (possibly a
> subclass) and throw the IOException

+1

> (2) catch only a subset of all RuntimeExceptions, wrap them in an
> IOException (possibly a subclass) and throw the IOException - allow
> the remaining RuntimeExceptions to fly through

+0

> (3) catch all RuntimeExceptions, wrap them in a specific unchecked
> exception (which one could be discussed later) and throw this one

-0

> (4) don't catch RuntimeExceptions at all, just document broken archives
> can cause arbitrary RuntimeExceptions and code that tries to read
> archives from untrusted sources is expected to deal with them
> itself.

+0

Stefan

[compress] [Poll] Dealing with uncaught RuntimeExceptions

2021-06-29 Thread Stefan Bodewig

Hi

I'm sorry, but I'm unable to see what would or would not work for the
people who chimed in. Short of calling for a vote, let's try with a poll
that could show whether there is some sort of solution that is
acceptable to everybody.

Please use +1 to mean "I like this option", +0 to mean "the option is
OK, but I'd prefer a different one", -0 for "I don't like the option but
I can live with it" and -1 for "this option is not acceptable to me".

Options raised during the thread:

(1) catch all RuntimeExceptions, wrap them in an IOException (possibly a
subclass) and throw the IOException

(2) catch only a subset of all RuntimeExceptions, wrap them in an
IOException (possibly a subclass) and throw the IOException - allow
the remaining RuntimeExceptions to fly through

(3) catch all RuntimeExceptions, wrap them in a specific unchecked
exception (which one could be discussed later) and throw this one

(4) don't catch RuntimeExceptions at all, just document broken archives
can cause arbitrary RuntimeExceptions and code that tries to read
archives from untrusted sources is expected to deal with them
itself.

"Just harden all parsers" is a variation of (4) in my view as I don't
believe we would manage to cover all cases no matter how hard and long
we try.
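
For readers who want to picture option (1), a minimal sketch follows.
Everything in it is hypothetical: the exception name
InvalidArchiveException, the ArchiveRead interface and the readWrapped
helper are invented for illustration and do not describe the code on
any Compress branch.

```java
import java.io.IOException;

/** Hypothetical exception type; the poll leaves the actual name open. */
final class InvalidArchiveException extends IOException {
    InvalidArchiveException(String message, RuntimeException cause) {
        super(message, cause);
    }
}

/** A single read step that may fail with an IOException. */
interface ArchiveRead<T> {
    T read() throws IOException;
}

final class RuntimeExceptionWrapping {

    /** Option (1): every RuntimeException raised while reading surfaces as an IOException. */
    static <T> T readWrapped(ArchiveRead<T> step) throws IOException {
        try {
            return step.read();
        } catch (RuntimeException ex) {
            // Option (2) would test ex against a fixed set of types here
            // and rethrow everything else unchanged.
            throw new InvalidArchiveException("Archive is likely corrupt: " + ex, ex);
        }
    }

    public static void main(String[] args) {
        try {
            // stands in for a parser that trusts a size field read from the archive
            readWrapped(() -> new byte[Integer.parseInt("-5")]);
        } catch (IOException ex) {
            System.out.println(ex.getMessage());
        }
    }
}
```

Option (3) would differ only in the type thrown from the catch block,
and option (4) would drop the wrapper altogether.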

I hope I didn't overlook any. 

Thanks

Stefan

Re: [compress] Dealing with uncaught RuntimeExceptions (again)

2021-06-29 Thread Stefan Bodewig

On 2021-06-29, Miguel Munoz wrote:

> Catching all RuntimeExceptions and wrapping them in an IOException
> looks like the cleanest solution. RuntimeExceptions usually mean bugs,
> so if the archive code is throwing them due to a corrupted archive, it
> makes sense to wrap it in a checked exception. I would like to suggest
> creating a new class looking something like this:

> public class CorruptedArchiveException extends IOException { }

https://github.com/apache/commons-compress/compare/catch-RuntimeExceptions

:-)

Stefan

Re: [compress] Dealing with uncaught RuntimeExceptions (again)

2021-06-28 Thread Stefan Bodewig

On 2021-06-27, Gilles Sadowski wrote:

> On Sun, 27 Jun 2021 at 21:15, Stefan Bodewig wrote:

>> As I said, we can as well document that each method could throw
>> arbitrary RuntimeExceptions, but I don't believe we can list the kinds
>> of RuntimeExceptions exhaustively

> Why not?
> Listing all runtime exceptions is considered part of good
> documentation.

Because we do not know which RuntimeExceptions may be caused by an
invalid archive.

Most of the RuntimeExceptions happen because our parsers believe in a
world where every archive is valid. For example we may read a few bytes
that are the size of an array and then create the array without checking
whether the size is negative and a NegativeArraySizeException occurs.

As we haven't considered the case of a negative size in code, we also do
not know this exception could be thrown. If we had considered the case
of negative sizes, the parser could have avoided the exception
altogether. This is what I meant with

>> - if we knew which exceptions can be thrown, then we could as well
>> check the error conditions ourselves beforehand.

Our parsers could of course be hardened, but this is pretty difficult to
do years after they've been written and would certainly miss a few
corner cases anyway.
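
The negative-size case described above can be sketched as follows. The
record layout (a four-byte big-endian length followed by the payload) is
an assumption chosen for brevity and does not match any particular
archive format; the class and method names are invented.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

final class CheckedSizeRead {

    /**
     * Reads a four-byte big-endian length followed by that many bytes.
     * The length is validated before any array is allocated.
     */
    static byte[] readLengthPrefixed(DataInputStream in, int maxLength) throws IOException {
        int length = in.readInt();
        if (length < 0) {
            throw new IOException("Corrupt archive: negative length " + length);
        }
        if (length > maxLength) {
            throw new IOException("Corrupt archive: length " + length + " exceeds limit " + maxLength);
        }
        byte[] data = new byte[length];
        in.readFully(data); // a truncated stream raises EOFException, itself an IOException
        return data;
    }

    public static void main(String[] args) {
        byte[] broken = {(byte) 0xFF, (byte) 0xFF, (byte) 0xFF, (byte) 0xFE}; // encodes -2
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(broken))) {
            readLengthPrefixed(in, 1 << 20);
        } catch (IOException ex) {
            System.out.println("rejected: " + ex.getMessage());
        }
    }
}
```

Without the two checks the same input would end in a
NegativeArraySizeException, which is exactly the kind of exception
nobody thought to document.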

And then there is a certain category of exceptions thrown by Java
classlib methods we invoke. We've just added a catch block around
java.util.zip.ZipEntry#setExtra because certain invalid archives cause a
RuntimeException there - and if I remember correctly a RuntimeException
the method's javadoc doesn't list.
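
A guard around a classlib call of this kind might look roughly like the
sketch below. It is not the code in Compress; the helper class
SafeExtraField is invented, and the demo relies only on the documented
IllegalArgumentException that java.util.zip.ZipEntry#setExtra throws for
data longer than 0xFFFF bytes.

```java
import java.util.zip.ZipEntry;
import java.util.zip.ZipException;

final class SafeExtraField {

    /** Applies raw extra-field bytes, converting any RuntimeException into a ZipException. */
    static void setExtra(ZipEntry entry, byte[] extra) throws ZipException {
        try {
            entry.setExtra(extra);
        } catch (RuntimeException ex) {
            ZipException zipException =
                new ZipException("Invalid extra data in entry " + entry.getName());
            zipException.initCause(ex); // ZipException has no cause-taking constructor
            throw zipException;
        }
    }

    public static void main(String[] args) {
        ZipEntry entry = new ZipEntry("example.txt");
        byte[] oversized = new byte[0x10000]; // one byte more than setExtra accepts
        try {
            setExtra(entry, oversized);
        } catch (ZipException ex) {
            System.out.println(ex.getMessage() + " (cause: " + ex.getCause() + ")");
        }
    }
}
```

Catching RuntimeException rather than a specific subtype is deliberate
here, since the point of the message is that the set of possible
exceptions is not known in advance.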

Stefan

Re: [compress] Dealing with uncaught RuntimeExceptions (again)

2021-06-27 Thread Stefan Bodewig

On 2021-06-27, Gilles Sadowski wrote:

> Hi.

>> [...]

>> it seemed Gilles was opposed to this idea

> Rather (IIRC) my last comment was that it was your choice as to
> what the API should look like.

Sorry, I didn't mean to misrepresent your POV.

> My opinion on the matter was along Gary's lines (which is J. Bloch's
> rationale provided in "Effective Java").
> Indeed I personally would *not* pick option 1 because it puts
> the onus on the Commons library whereas input that does not comply
> with preconditions (i.e. a supported format) should unsurprisingly
> throw an IAE.

In which case we need to catch all the other RuntimeExceptions and turn
them into IAEs, right? :-)

Same if we want to throw any other specific RuntimeException following
Matt's suggestion.

We are already throwing checked IOExceptions for invalid archives in
many many cases today. Our users expect us to do so for all invalid
archives - well some of them.

As I said, we can as well document that each method could throw
arbitrary RuntimeExceptions, but I don't believe we can list the kinds
of RuntimeExceptions exhaustively - if we knew which exceptions can be
thrown, then we could as well check the error conditions ourselves
beforehand.

Stefan

Re: [compress] Dealing with uncaught RuntimeExceptions (again)

2021-06-27 Thread Stefan Bodewig

On 2021-06-27, Gary Gregory wrote:

> Catching all unchecked exceptions (UE) and rethrowing as checked exceptions
> (CE) feels like both a horror show and an exercise in futility, especially
> in order to appease some tool that complains today of one thing which may
> complain differently tomorrow, I really don't like that idea on paper.

Independent of the nonsense JFrog does, our users have repeatedly told us
they expect Compress to throw an IOException rather than a
RuntimeException for broken archives.

https://issues.apache.org/jira/browse/COMPRESS-169 fixed in Compress 1.4
ten years ago probably is the first one and it is easy to find many more
of this type (COMPRESS-424, COMPRESS-490, COMPRESS-131, COMPRESS-219
... here I stopped grepping through our changelog).

> Let's keep in mind that a CE means one can catch it and try to do something
> about it which is definitely not the case for a corrupted archive. I mean,
> you can download it again, sure, and end up where you started.

This is not quite true. You can not recover from an EOFException
either. There are lots of cases where we already throw IOExceptions for
irrecoverably corrupt archives today.

> IOW, let's do what we think is best, not what some tool some company wants
> to sell ("our tool is hard core and found 5 gazillion bugs in the open
> source ecosystem")

+1 can't argue with selecting the option we consider best for our users.

Stefan

[compress] Dealing with uncaught RuntimeExceptions (again)

2021-06-27 Thread Stefan Bodewig

I'd like to get closure on which approach we want to take.

When we read a broken archive it may trigger arbitrary RuntimeExceptions
because we are not explicitly checking for each and every situation
where a bounds check could fail or a negative size is sent to a classlib
method that then throws an IllegalArgumentException or whatnot (even a
NullPointerException may escape us every now and then).

Uncaught RuntimeExceptions are considered security issues by some tools
because of a potential DoS attack. Historically we have never agreed
with this point of view and I'm not suggesting to change that.

Even though we may not know what is wrong, when the RuntimeException
occurs, we do know the archive is broken and this is the reason for the
exception.

AFAICS there are two ways we can deal with it:

(1) every method that reads from the archive declares it can throw
arbitrary RuntimeExceptions as well. And we document that broken
archives may cause RuntimeExceptions and that we never consider such
a case a security issue.

(2) we catch RuntimeExceptions at every method that reads from the
archive and wrap them in a custom IOException, making sure such a
case can never escape us.

Personally I prefer (2) but can live with (1) - I've suggested something
along the lines of (2) in [1] and it seemed Gilles was opposed to this
idea (and Matt was torn).

In [2] Bernd seemed to support (2).

Are there any other opinions?

Stefan

[1]
https://lists.apache.org/thread.html/r5d2427566dff4c7d293e8d48f9ac62b7958d19047f730836ce5b3c60%40%3Cdev.commons.apache.org%3E

[2]
https://lists.apache.org/thread.html/r3ce77eb9ab9429097ca57c48cb99b8be497ee5b69d419b52a6722616%40%3Cdev.commons.apache.org%3E

Re: [Compress] Java 16 and 17-ea

2021-06-12 Thread Stefan Bodewig

On 2021-06-12, Stefan Bodewig wrote:

> On 2021-06-12, Gary Gregory wrote:

>> Please note that the Java 16 and 17 builds are now green on GitHub after my
>> changes this morning to update some dependencies.

> They haven't been green before - or for any JDK > 14 - because of
> missing pack200 classes inside of the classlib.

Sorry, obviously I missed Peter's change

https://github.com/apache/commons-compress/commit/92d9df3320d00dee0edef8290ef325bc83b9255a

and somehow never looked at the broken builds as I assumed it must have
been the pack200 thing.

Stefan

Re: [Compress] Java 16 and 17-ea

2021-06-12 Thread Stefan Bodewig

On 2021-06-12, Gary Gregory wrote:

> Please note that the Java 16 and 17 builds are now green on GitHub after my
> changes this morning to update some dependencies.

They haven't been green before - or for any JDK > 14 - because of
missing pack200 classes inside of the classlib.

Stefan

Re: [compress] Dealing with RuntimeExceptions While Parsing Archives

2021-06-06 Thread Stefan Bodewig

On 2021-06-06, Gilles Sadowski wrote:

> On Sun, 6 Jun 2021 at 07:51, Stefan Bodewig wrote:

>> Hi

>> I'm thinking about a specific IOException subclass that is thrown when a
>> RuntimeException "happens" somewhere in the code that parses data in
>> Zip/SevenZ/TarFile, see

> I'm afraid I missed part of the story as to what is the original problem.

Sorry, I should have expanded on that.

When we uncompress a stream / expand an archive our users most of the
time are not responsible for the input. If the data they hand over to
Compress is invalid, they expect the library to throw an IOException -
and in many cases this is true.

But the reality is most of our parsing code is written for the good case
where the archive follows the spec. The code relies on numbers to be
where they should be and not letters, it may fail to check an offset is
inside of the bounds of an array and so on. So for certain types of
broken archives the parsers will throw arbitrary RuntimeExceptions
(NumberFormat, ArrayIndexOutOfBounds, NegativeArraySize and so on).

People do not expect said RuntimeExceptions, so they don't catch them.

In a situation where an attacker controls the input this can be used to
make the application reading it crash. So for certain types of
applications this might be security relevant, it could be a DoS
vector. Of course one can argue the calling code should better protect
itself when it reads untrusted input and catch even undeclared
exceptions, that's why we've never issued CVEs for exceptions where our
parser code has not been strict enough in the past.
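
Seen from the calling side, "protecting itself" could be as simple as
the following sketch. The Parser interface stands in for any library
call that reads untrusted data; it and the other names are invented for
the example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

final class UntrustedUpload {

    /** Stand-in for any library call that parses untrusted data. */
    interface Parser {
        int countEntries(InputStream in) throws IOException;
    }

    /** Returns the entry count, or -1 if the input was rejected for any reason. */
    static int countEntriesSafely(Parser parser, InputStream untrusted) {
        try {
            return parser.countEntries(untrusted);
        } catch (IOException ex) {
            return -1; // declared failure mode: invalid or unreadable input
        } catch (RuntimeException ex) {
            return -1; // undeclared failure mode: parser tripped over malformed data
        }
    }

    public static void main(String[] args) {
        // A deliberately naive parser: the first byte is taken as the entry count.
        Parser fragile = in -> new int[in.read()].length;
        // An empty stream makes read() return -1, so the allocation fails.
        System.out.println(countEntriesSafely(fragile, new ByteArrayInputStream(new byte[0])));
    }
}
```

The second catch block is the part callers tend to leave out, which is
what turns an unexpected exception into a crash.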

After Compress 1.20 we've had lots of reports of RuntimeExceptions being
thrown, many of which have been uncovered by fuzz testing tools (OSS-Fuzz
among others).

What I suggest is a stop-gap. It is not an excuse for not properly
verifying input in our parsing code but rather a way to limit the impact
of such oversights.

Stefan

[compress] Dealing with RuntimeExceptions While Parsing Archives

2021-06-05 Thread Stefan Bodewig

Hi

I'm thinking about a specific IOException subclass that is thrown when a
RuntimeException "happens" somewhere in the code that parses data in
Zip/SevenZ/TarFile, see

https://github.com/apache/commons-compress/compare/catch-RuntimeExceptions

is this a good idea? Should anything be worded/named differently?

If this seems right I'd add similar code to all
Archive/CompressorInputStream classes as well.

Personally I would not do the same for the code that writes
archives/compresses streams as in this case the library user is fully
responsible for the input.

Stefan

[compress] 7z and Recovering Corrupt Archives

2021-06-04 Thread Stefan Bodewig

Hi all

7z archives provide CRCs for the metadata section so you can quickly
identify a wide range of broken archives - which is far better than what
you get for ZIP for example.

It is possible to recover from a certain type of broken archive: one
where the archive has been written almost completely and just the CRC
and the locator of metadata are missing. The docs talk about
disks/drives being removed prematurely.

The basic idea is to search backwards from the end of the file for the
metadata and try to parse it. This is what SevenZFile does and has
always done. This is the root cause of
https://issues.apache.org/jira/browse/COMPRESS-542 - the file ends with
something that looks like metadata of an archive with lots and lots of
files in it and the allocation of arrays leads to an OOM.

Current master will detect corrupt archives more quickly - in particular
without excessive allocations - but still it may take quite some time to
reject thousands of candidates of "this could be the first byte of
proper metadata". We are scanning the last megabyte of the file and
there is ample chance this last megabyte may contain random noise that
looks promising.

Personally I believe that almost nobody actually needs this mode of
recovery.

Therefore I've thought we might want to introduce an option that enables
the recovery mode. If it was disabled and we found the CRC was missing
we'd throw a new specific exception that says "you may want to try with
recovery enabled instead".
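
As a rough illustration of the proposal, the sketch below combines the opt-in switch with the backwards scan. It is not SevenZFile's actual code: RecoveryDisabledException, MetadataLocator, the marker byte and the window size are all assumptions made for the example, and a real implementation would go on to try parsing at each candidate offset.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical exception thrown when the CRC is missing and recovery has not been enabled. */
class RecoveryDisabledException extends IOException {
    private static final long serialVersionUID = 1L;

    RecoveryDisabledException(String message) {
        super(message);
    }
}

/** Hypothetical helper, called only once the metadata CRC has been found missing. */
final class MetadataLocator {
    private MetadataLocator() { }

    /**
     * Collects the offsets of bytes equal to the marker, scanning backwards
     * from the end of the data and looking at no more than the last
     * "window" bytes. Each offset is only a candidate for the start of the
     * metadata, not a confirmed hit.
     */
    static List<Integer> recoveryCandidates(byte[] data, byte marker, int window,
                                            boolean recoveryEnabled) throws IOException {
        if (!recoveryEnabled) {
            throw new RecoveryDisabledException(
                "Metadata CRC is missing; you may want to try with recovery enabled instead");
        }
        List<Integer> offsets = new ArrayList<>();
        int lowest = Math.max(0, data.length - window);
        for (int pos = data.length - 1; pos >= lowest; pos--) {
            if (data[pos] == marker) {
                offsets.add(pos);
            }
        }
        return offsets;
    }
}
```

With recovery disabled the broken archive is rejected immediately, so the cost of walking through all the candidates is only paid by callers who explicitly ask for it.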

Making this new option default to disabling recovery would break
backwards compatibility but it is tempting to think this could be
fine. I'm a bit torn here. What do you think?


Stefan

Re: OSS-Fuzz issues are being reported as vulnerabilities

2021-05-24 Thread Stefan Bodewig

On 2021-05-24, Bernd wrote:

> Am Mo., 24. Mai 2021 um 20:46 Uhr schrieb Matt Sicker :

>> There's also a bit of an issue of fixing these types of
>> vulnerabilities at the library level. The library itself typically
>> won't have much in the way of a security model until you integrate it
>> into an application.

> That is true in general, but we have for the found issues (which are not
> OOM) two possibilities, catch the runtime exceptions and rethrow them as an
> IOException (Maybe a new subtype like MalformedArchiveException extends
> IOException) or document in javadoc that the runtime exceptions dealing
> with Out of Bounds and maybe NullPointers might be thrown. Do you think
> this is a good idea?

So far we've preferred the "throw an IOException" approach in
Compress. In some areas like handling of ZIP extra fields we're simply
catching all RuntimeExceptions and rethrowing them as IOExceptions, in most
other cases we try to identify the cases where a RuntimeException would
occur and throw an IOException prior to that.

> - I can prepare a patch for it, would prefer a new specific sub
> exception.

My changes so far have focussed on avoiding the exception altogether
rather than "catch everything at the API entry point" as I was hoping to
catch errors early and avoid OOMs and infinite loops by more aggressive
bounds checks.

Some of Compress' packages have got a big public API for historic
reasons. The tar package is probably the worst offender with TarUtils
having lots of methods that are public for no good reason, some of which
don't even declare any checked exceptions to be thrown. Here "document
the exception" is the only thing you can do - and catch/rethrow inside
of the other code parts that call them.

> Having said that, it is not uncommon that a size field is used to allocate
> Buffers, in that case an OOM is possible and a Limit Manager would be
> helpful. This does not only help against malicious files. Not sure if the
> fuzzer will find that in the future...

Some of our packages support a custom option that says "don't use more
than X MB", but this is limited to a few algorithms that do provide
estimates like LZMA and XZ. So far OSS Fuzz is not setting this option
for anything (I believe this only applies to the 7z case currently) so
OSS Fuzz could run into OOMs that users could avoid by setting the given
option - something we could change if we wanted to.
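
For the size-field case Bernd describes, the kind of check I have in mind looks roughly like the sketch below. It is only an illustration and not Compress' actual option: SizeLimitExceededException and BoundedAllocation are hypothetical names, and the limit is a plain byte count passed in by the caller.

```java
import java.io.IOException;

/** Hypothetical exception for a declared size that exceeds the configured limit. */
class SizeLimitExceededException extends IOException {
    private static final long serialVersionUID = 1L;

    SizeLimitExceededException(long declaredSize, long maxBytes) {
        super("Declared size of " + declaredSize + " bytes exceeds the limit of "
            + maxBytes + " bytes");
    }
}

/** Hypothetical helper that validates a size read from untrusted input before allocating. */
final class BoundedAllocation {
    private BoundedAllocation() { }

    /**
     * Allocates a buffer for a size field taken from the archive, rejecting
     * negative values and anything above maxBytes before any memory is
     * reserved.
     */
    static byte[] allocate(long declaredSize, long maxBytes) throws IOException {
        if (declaredSize < 0) {
            throw new IOException("Negative size field: " + declaredSize);
        }
        // Java arrays are indexed by int, so larger values can never be honoured.
        if (declaredSize > maxBytes || declaredSize > Integer.MAX_VALUE) {
            throw new SizeLimitExceededException(declaredSize, maxBytes);
        }
        return new byte[(int) declaredSize];
    }
}
```

The point is that the check runs before the allocation, so a corrupt or malicious size field ends up as an IOException rather than an OutOfMemoryError.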

Stefan

Re: OSS-Fuzz issues are being reported as vulnerabilities

2021-05-24 Thread Stefan Bodewig

On 2021-05-24, Tero Saarni wrote:

> We are getting reports from JFrog Xray vulnerability scanner that seem
> to be related to recently fixed OSS-Fuzz issues:

I wasn't aware of this effect. This is very unfortunate.

> * Summary: Apache Commons Compress archivers/zip/ZipFile.java
>   ZipFile::readCentralDirectoryEntry() Function Uncaught Exception DoS
>   Severity: High

> * Summary: Apache Commons Compress archivers/tar/TarArchiveEntry.java
>   TarArchiveEntry::processPaxHeader() Function Uncaught Runtime Exception DoS
>   Severity: High

> In previous thread it was said that none of the fuzzer findings was
> deemed security issues.  Were these incorrectly flagged by the
> vulnerability scanner?

Historically we have never considered uncaught runtime exceptions to be
security issues. We've fixed similar issues in the past and still do.

So when I said nothing had been deemed a security issue I meant
"deemed by us". Unfortunately the OSS Fuzz classification doesn't match
ours.

There are a few more cases around 7z that have not been flagged as
security issues - I have no idea why not.

In all cases corrupt archives may cause RuntimeExceptions
(ArrayIndexOutOfBounds, IllegalArgument, BufferUnderflow, ...) rather
than IOExceptions. If you try to read archives from untrusted sources,
this may lead to unexpected exceptions.

> I'd be curious to know if there is planned date for commons-compress
> 1.21?

There is no planned date I was aware of.

Stefan

Re: [all] OSS-Fuzz Issue Publication

2021-05-09 Thread Stefan Bodewig

Many thanks Fabian

and sorry for the delay - unfortunately I'm not really able to free up
as much time as necessary for any OSS stuff right now

On 2021-05-03, Fabian Meumertzheim wrote:

> The behavior you are observing has only become the standard somewhat
> recently [1], which is also why I had decided to point it out before we
> performed the integration [2].

> [1] https://github.com/google/oss-fuzz/issues/5255

I must have overlooked that back then - or just didn't understand what
it meant. One key is the phrase "after a patch is released" which also
is used in [1] which means a completely different thing to ASF
communities than to the person opening the issue above. Nobody around
here would argue against disclosing details of a vulnerability after a
new release containing the fix is available.

The best we can do probably is pointing out that the new policy is
incompatible with the ASF security policy - point 14 in

https://www.apache.org/security/committers.html#vulnerability-handling

without trying to argue who is right. Going from there we will see
whether there is an option for ASF projects to continue using OSS Fuzz
or not. Unfortunately I believe this discussion must be driven by
somebody with a predictable and sufficiently large slice of time for
this, which I will not be for at least the next week, likely longer.

Unless anybody else jumps in I'll take it on myself once I believe to be
available. Fortunately so far no issues have shown up that would force
our hand - and even if something came up I'm sure we could figure out
some sort of singular exemption.

Stefan

[all] OSS-Fuzz Issue Publication

2021-05-03 Thread Stefan Bodewig

Hi (Fabian)

by now we've resolved the first issues detected by ClusterFuzz (and I
forgot to credit OSS Fuzz in Compress, my bad). What we observed is
that the issues became public automatically once the patch fixing the
issue was merged into master and ClusterFuzz reran the test. In the case
of Compress somewhere around 24 hours after fixing things in master.

So far none of the issues we resolved would be deemed as a security
issue. But now we wonder, what if something indeed was a security issue
that we do not want to become public knowledge before we have cut a
release? Is there a way to prevent a verified and fixed issue from
becoming public automatically?

Here at the ASF we vote on releases, and we vote on the code base in our
default branch (master for most if not all components). Voting takes at
least three days, so the current behavior would mean the issue became
public knowledge a few days before a release fixing it was available.

Can you shed any light on this?

Thanks

Stefan

Re: [all] OSS Fuzz

On 2021-04-19, Stefan Bodewig wrote:

> On 2021-04-18, Fabian Meumertzheim wrote:

>> Stefan, if you agree, I would submit the two PRs tomorrow and ask you
>> to sign them off on GitHub via a comment on the PR and a link to this
>> email thread.

> Fine with me.

I hope my approval has been enough as I'm not a "reviewer with write
access".

Stefan

Re: [all] OSS Fuzz

On 2021-04-18, Fabian Meumertzheim wrote:

> Stefan, if you agree, I would submit the two PRs tomorrow and ask you
> to sign them off on GitHub via a comment on the PR and a link to this
> email thread.

Fine with me.

Thank you

  Stefan

Re: [all] OSS Fuzz

On 2021-04-18, Fabian Meumertzheim wrote:

> On Sun, Apr 18, 2021 at 6:22 PM Stefan Bodewig  wrote:
>> Can probably do, what is the duty of a primary contact? My github
>> username is bodewig.

> The primary contact may be asked to sign off on PRs to that project in
> the OSS-Fuzz repo, in particular if someone needs to be added to the
> "auto_ccs" list.

I see.

Can there be more than one "primary" contact? There is a reason why we
use role based mail aliases and mailing lists, it is pretty likely
people become completely unavailable for a while and I don't want to
block adding people to auto_cc just because I prefer to be offline.

Stefan

Re: [all] OSS Fuzz

On 2021-04-18, Stefan Bodewig wrote:

> I've created https://issues.apache.org/jira/browse/INFRA-21741 if you
> want to lend a hand moderating, you may want to add yourself to the
> ticket before the list is created.

The list has been created, so if you want to receive the fuzz reports
please subscribe to fuzz-testing@commons and one of the initial
moderators will accept the subscription (if we can recognize the email
address :-).

Stefan

Re: [all] OSS Fuzz

On 2021-04-18, Fabian Meumertzheim wrote:

> Anyone who is (or wants to be) a moderator on that list and has a Google
> account, please let me know the primary email address so that I can add it
> to the "auto_ccs" list for oss-fuzz.com access.

> Stefan, would you want to act as the "primary_contact"? That does not
> require a Google account, but a GitHub account would be helpful for
> interactions with the OSS-Fuzz repo.

Can probably do, what is the duty of a primary contact? My github
username is bodewig.

> I have prepared fuzzers for compress and imaging. Would you want me to set
> up both once the list has been properly set up?

Given that Bruno expressed interest for imaging, I'd say yes.

Thanks

Stefan

Re: [all] OSS Fuzz

Hi all

I've created https://issues.apache.org/jira/browse/INFRA-21741 if you
want to lend a hand moderating, you may want to add yourself to the
ticket before the list is created.

Thanks

Stefan

Re: [all] OSS Fuzz

On 2021-04-17, Matt Sicker wrote:

> I have a Google account I can be CC’d on. I do security engineering
> professionally, so I have some experience in the area as well.

Thanks Matt, I'll add you as one of the initial moderators as well.

Stefan

Re: [all] OSS Fuzz

On 2021-04-17, Fabian Meumertzheim wrote:

> Let me describe the restrictions in more detail, including example reports.
> Everyone listed under "primary" or "auto_cc" will receive the bugs created
> in the issue tracker at [1] in email form and can also add comments by
> replying to the email thread, regardless of whether they have a Google
> account or not. These bugs only include some basic information such as a
> truncated stack trace and a suggested severity. See [2] for an example.

> [1] https://bugs.chromium.org/p/oss-fuzz/issues
> [2] https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=32208

> The detailed report, reproducer testcase and crashing revision information
> (everything hosted on oss-fuzz.com) requires authentication and thus a
> Google account. The detailed report AFAIK even requires authentication
> after the bug has become public. I attached an example report as a crudely
> exported PDF, just to give you an idea of what information it contains.

As our mailing list is set up to strip any non-text part, I've made the
PDF you sent available:

https://stefan.samaflost.de/staging/ExampleFuzzer.fuzzerTestOneInput.pdf

> Regarding privacy concerns: You could consider creating a Google account
> specifically for this purpose and only use it to log in to oss-fuzz.com in
> a private browsing context.

It is perfectly fine for me if Matt and/or others have access.

Cheers

Stefan

Re: [all] OSS Fuzz

2021-04-17 Thread Stefan Bodewig

On 2021-04-15, Fabian Meumertzheim wrote:

> Just to keep the following in mind: Full access to bug reports and
> reproducers requires a Google account (which can be associated with
> any existing non-list email address). At least the moderators of the
> list would therefore have to be listed explicitly in the project's
> YAML file [1] in the OSS-Fuzz repo, in addition to the new mailing
> list.

> [1] 
> https://google.github.io/oss-fuzz/getting-started/new-project-guide/#primary

I'm not sure I understand this. AFAIU I could never become a "primary"
or an "auto_cc" as I will not create a Google account. Do we need to
have one? In that case somebody who doesn't share my personal set of
allergic reactions may want to act as primary.

But how does that translate to using a mailing list as recipient of
reports? The section you link says you need a google account to "get
full access" - I don't think the list will need "full access".

Stefan

Re: [all] OSS Fuzz

2021-04-17 Thread Stefan Bodewig

On 2021-04-13, Gary Gregory wrote:

> Please don't use @security for automated emails, that ML IMO should be for
> humans.

> If you want to setup a new ML for bots that's fine, we can direct GitHub's
> Dependabot emails there if GitHub allows for that.

I don't believe dependabot and the results of fuzz testing share an
audience. Dependabot mails are by no means sensitive (the PRs are public
anyway) so there is no need for restricting the subscription to the
messages it creates.

Stefan

Re: [all] OSS Fuzz

2021-04-17 Thread Stefan Bodewig

On 2021-04-13, Mark Thomas wrote:

> On 13/04/2021 17:49, Stefan Bodewig wrote:

> 

>> Fabian has offered to set up OSS Fuzz for Compress. Given that the
>> issues OSS Fuzz detects may or may not be security sensitive, I don't
>> feel it would be a good idea to have the tool send reports to a public
>> mailing list. Therefore I propose to create another subscription
>> moderated list just for these kinds of reports. I'm afraid it could be
>> too noisy for security@commons.

> Following the "split by audience, not by topic" guideline, I'd suggest
> using security@commons.a.o rather than a separate list. Much, much
> bigger projects than Compress use OSS Fuzz and direct traffic to their
> security list where it seems to be manageable.

With more projects jumping in this may become more traffic. Given that
at least one subscriber of security@ (Gary) is strongly against using
that list, I don't want to force it on him.

Stefan

[all] OSS Fuzz

2021-04-13 Thread Stefan Bodewig

Hi all

I want to pick up (and finish) the discussion that started in
Compress[1].

Short Recap:

OSS Fuzz[2] runs fuzz testing for open source projects by invoking
methods of our code with random data looking for unexpected outcomes
(undeclared exceptions or, worse, code that never returns because it is
stuck in an infinite loop for example).

For Compress Fabian (who started [1]) has already identified and
reported several issues, one of which would have become a CVE if the
code in question had been part of any release of Compress. In the past
other people have run different fuzzers and found "interesting" results
in Compress as well.

Compress may be especially vulnerable as it basically tries to make
sense out of a bunch of user supplied bytes - but the same is probably
true for codec or imaging for example.

Fabian has offered to set up OSS Fuzz for Compress. Given that the
issues OSS Fuzz detects may or may not be security sensitive, I don't
feel it would be a good idea to have the tool send reports to a public
mailing list. Therefore I propose to create another subscription
moderated list just for these kinds of reports. I'm afraid it could be
too noisy for security@commons.

Proposal

Unless anybody objects until then I will create such a list (I believe
there is a self-service thingy for that, otherwise I'll ask the infra
folks) on the coming Sunday. I'd add myself as a moderator but we will
need more moderators. Also I'll gladly accept ideas for the name of the
list.

If there are objections against yet another mailing list I'll ask Fabian
to set things up using a private mail alias. If you want to receive the
messages as well, please tell me.

Cheers

Stefan

[1]
https://lists.apache.org/thread.html/rb34ea7d9272b8e600437ea705b13aba1bcc2f23ceb55880bce27e479%40%3Cdev.commons.apache.org%3E

[2] https://google.github.io/oss-fuzz/

Re: [COMPRESS] OSS-Fuzz integration

2021-03-09 Thread Stefan Bodewig

On 2021-03-09, Gary Gregory wrote:

> A reminder that we can break our own builds by configuring maven plugins
> like spotbugs, pmd, and so on. If we need to configure another plugin to
> run in our builds to check for different errors, then let's consider that.

Fuzz testing needs compute power beyond what you want to provide via a
local build.

So I understand you prefer not to join OSS-Fuzz as a project. IMHO
personal emails will not scale. What if I subscribe with my email
address and disappear for six months?

At least for Compress I see value in Fuzz testing.

Any other opinions?

Stefan

Re: [COMPRESS] OSS-Fuzz integration

2021-03-09 Thread Stefan Bodewig

On 2021-03-08, Gary Gregory wrote:

> Note that we already have FIVE mailing lists:

> commits
> dev
> issues
> notifications
> user

which are all public

> PLUS, private and security.

subscribers of which will probably not like to receive automated emails.

> Do we really want a SIXTH? Can't this fit in one of the above?

Which one do you suggest?

Cheers

Stefan

Re: [COMPRESS] OSS-Fuzz integration

2021-03-08 Thread Stefan Bodewig

On 2021-03-08, Gary Gregory wrote:

> Are we talking about a human sending emails to the security list or letting
> the actual tool loose on the list to possibly spam it with false positives?

We are talking about a tool sending mails that (currently) is unable to
identify whether an issue it detects is security critical or not.

I propose a new subscription moderated list so people can decide whether
they want to see the mails - and we don't leak sensitive information by
accident. Human beings subscribed to said list can then escalate to
security@ as necessary.

Stefan

Re: [COMPRESS] OSS-Fuzz integration

2021-03-07 Thread Stefan Bodewig

On 2021-03-07, Gary Gregory wrote:

> This issue has popped up as well WRT GitHub emails from Dependabot.

I don't think this is comparable.

The fuzzer may find issues that can be exploited as DoS attacks, so the
results probably should go to a subscription-moderated list IMHO.

Stefan

> Gary

> On Sun, Mar 7, 2021, 12:45 Matt Sicker  wrote:

>> We could create another private list for static analysis alerts perhaps?

>> On Sun, 7 Mar 2021 at 03:51, Stefan Bodewig  wrote:

>>> On 2021-03-07, Fabian Meumertzheim wrote:

>>>> On Sat, Mar 6, 2021 at 10:08 PM Stefan Bodewig wrote:

>>>>> OTOH I'm not sure I understand the requirements of OSS-Fuzz. I haven't
>>>>> read the docs only looked at the image of the process. Seeing a
>>>>> Sheriffbot tracking deadlines makes me very uncomfortable. I'm a
>>>>> volunteer and so are most others around here.

>>>> The disclosure policy for OSS-Fuzz is detailed here:
>>>> https://google.github.io/oss-fuzz/getting-started/bug-disclosure-guidelines/
>>>> Reports will become public after 90 days (plus a 14 day grace period
>>>> if a patch is close to being released).

>>> Well, 90 days would work for me. Let's hear whether others object.

>>> Extending the deadline if it ends on a weekend is the opposite of what
>>> I'd personally need, though :-)

>>>>>> All I would need from you is a list of emails to which the automated
>>>>>> bug reports should go. The reports are usually directly actionable as
>>>>>> they include stack traces and minimized reproducers.

>>>>> In general I'd think the notifications list of the Commons project would
>>>>> be the best fit. Of course the nature of the issues detected could
>>>>> lead to the fuzzer uncovering security critical bugs that we may not
>>>>> want to become public before a release fixing it has become available.

>>>> I am currently working on improving the automatic security/severity
>>>> analysis of Java findings in OSS-Fuzz, which should help prioritize
>>>> the security-relevant bugs (e.g. OoM, infinite loops) over the less
>>>> important ones (e.g. undeclared exception).

>>>> However, afaik the list of email recipients for a bug currently can't
>>>> depend on the security content of the bug, so it might be better to
>>>> choose a private mailing list here.

>>> I see. But I really wouldn't want to use the security list for
>>> everything. Maybe somebody else got a good idea where to send results?

>>> Stefan

Re: [COMPRESS] OSS-Fuzz integration

2021-03-07 Thread Stefan Bodewig

On 2021-03-07, Fabian Meumertzheim wrote:

> On Sat, Mar 6, 2021 at 10:08 PM Stefan Bodewig  wrote:

>> OTOH I'm not sure I understand the requirements of OSS-Fuzz. I haven't
>> read the docs only looked at the image of the process. Seeing a
>> Sheriffbot tracking deadlines makes me very uncomfortable. I'm a
>> volunteer and so are most others around here.

> The disclosure policy for OSS-Fuzz is detailed here:
> https://google.github.io/oss-fuzz/getting-started/bug-disclosure-guidelines/
> Reports will become public after 90 days (plus a 14 day grace period
> if a patch is close to being released).

Well, 90 days would work for me. Let's hear whether others object.

Extending the deadline if it ends on a weekend is the opposite of what
I'd personally need, though :-)

>>> All I would need from you is a list of emails to which the automated
>>> bug reports should go. The reports are usually directly actionable as
>>> they include stack traces and minimized reproducers.

>> In general I'd think the notifications list of the Commons project would
>> be the best fit. Of course the nature of the issues detected could
>> lead to the fuzzer uncovering security critical bugs that we may not
>> want to become public before a release fixing it has become available.

> I am currently working on improving the automatic security/severity
> analysis of Java findings in OSS-Fuzz, which should help prioritize
> the security-relevant bugs (e.g. OoM, infinite loops) over the less
> important ones (e.g. undeclared exception).

> However, afaik the list of email recipients for a bug currently can't
> depend on the security content of the bug, so it might be better to
> choose a private mailing list here.

I see. But I really wouldn't want to use the security list for
everything. Maybe somebody else got a good idea where to send results?

Stefan

Re: [COMPRESS] OSS-Fuzz integration

2021-03-06 Thread Stefan Bodewig

On 2021-03-05, Fabian Meumertzheim wrote:

> I am one of the maintainers of Jazzer
> (https://github.com/CodeIntelligenceTesting/jazzer), a new open-source
> fuzzer for JVM projects based on libFuzzer.

> I have set up a few Commons projects for local fuzzing with Jazzer,
> which led to quite a few bug reports in Compress and other projects
> (https://issues.apache.org/jira/browse/COMPRESS-569?jql=reporter%20%3D%20Meumertzheim).
> While the majority of the bugs found are undeclared exceptions, this
> approach also caught an infinite loop on a crafted 0.5KB .tar before
> it could make it into a release (see COMPRESS-569).

Yes, many thanks for that.

> Jazzer is in the process of being integrated into OSS-Fuzz
> (https://github.com/google/oss-fuzz) for continuous fuzzing on
> Google-provided infrastructure (ClusterFuzz).

> If you agree this is a good idea, I could set up Compress for fuzzing
> on OSS-Fuzz.

Also I'd like to point out issues detected by Maksim Zuev last spring
and summer who used a different fuzzing tool.

When reading archives or compressed streams, Compress consumes a lot of
input and we are obviously not validating it in all cases as well as we
should. Much of the code assumes it will only ever encounter valid
archives. I believe fuzzing can help us find places where we trust
input too much.

commons-codec or commons-imaging likely are in similar places.

OTOH I'm not sure I understand the requirements of OSS-Fuzz. I haven't
read the docs only looked at the image of the process. Seeing a
Sheriffbot tracking deadlines makes me very uncomfortable. I'm a
volunteer and so are most others around here.

> All I would need from you is a list of emails to which the automated
> bug reports should go. The reports are usually directly actionable as
> they include stack traces and minimized reproducers.

In general I'd think the notifications list of the Commons project would
be the best fit. Of course the nature of the issues detected could
lead to the fuzzer uncovering security critical bugs that we may not
want to become public before a release fixing it has become available.

Stefan

Re: [all] github

2020-07-27 Thread Stefan Bodewig

On 2020-07-26, Melloware wrote:

> I know there seems to be a holy war about the use of GitHub going
> on here

This has never been my intention. Far from it. And if you believe I've
tried to start any kind of war you must have misread my original mail
completely. I'm really sorry about that.

I appreciate the benefits github brings but realize it also causes
problems that I don't know how to solve. I would have liked to talk
about what we can do to build communities around components again rather
than whether creating PRs is better than not doing that.

Maybe I should have written an essay and not bothered the list. All that
happened is that I've caused another stir about topics I didn't even
intend to touch.

I'll rest my case as this is not going to become productive.

Stefan

Re: [all] github (was Re: [VOTE] Create additional mailing lists for automated posts)

On 2020-07-24, Xeno Amess wrote:

>> We respectfully discuss and in the end come to a  compromise or a common
>> ground where we can agree to disagree. I still see this happen here and
>> don't think all of us need to have the same opinion.

> So maybe at the end some of commons repos using as new version of
> dependencies as they can, others using as old version of dependencies as
> they can?

Could be such an outcome, yes, or maybe a stronger recommendation for a
certain approach. We'll see in a different thread :-)

Stefan

Re: [ALL] CI builds

On 2020-07-23, Olivier Lamy wrote:

> In the Maven project we have plenty of maven-* git repo so we have created
> a dedicated Jenkins plugin (which is used by other TLP such netbeans) which
> scan the gitbox server to get repo based on regular expression or name
> content and create the build reusing the same build file.

This sounds very interesting, thank you Olivier. Is this
https://github.com/apache/maven-jenkins-lib ?

I'm not sure there are enough people around here still willing to use
Jenkins to enable something like this for all repos :-)

Stefan

Re: [all] github (was Re: [VOTE] Create additional mailing lists for automated posts)

On 2020-07-24, Xeno Amess wrote:

> As for community building, I agree with you that commons seems not a
> close community, but I doubt it be github's fault.  even there be no
> github, sub-repos in commons are not that close to each other.

Commons is an old project and it started with a striving community. Most
people have left, as is natural for any open source project. I believe
we've done better attracting new community members before, but this is
certainly colored by my mood and preferences.

As for your follow-up response about agreeing on when to upgrade
dependencies. Community doesn't mean uniformity to me. We respectfully
discuss and in the end come to a compromise or a common ground where we
can agree to disagree. I still see this happen here and don't think all
of us need to have the same opinion.

Stefan

Re: [all] github (was Re: [VOTE] Create additional mailing lists for automated posts)

On 2020-07-24, Xeno Amess wrote:

> I will explain why github come to be center, but not apache gitbox.
> 1.1
> I have right to register an account on github.
> 1.2
> I registered an account at github.
> 1.3
> I commit then create pr.
> 1.4
> pr get reviewed then merged.

I am fully aware of how github works, I use PRs myself.

The perceived ease[1] of doing this comes at a price and I'm mourning
the loss of community building.

Stefan

[1] with gitbox you are certainly able to contribute

git clone
git checkout -b my-work
...
git format-patch
attach patch to JIRA

but you know that.

[all] github (was Re: [VOTE] Create additional mailing lists for automated posts)

This is an attempt at answering something raised by Gilles in a
different thread. I'm afraid it is getting longer than I
intended. Something seems to need to get out. Sorry.

On 2020-07-23, Gilles Sadowski wrote:

> I missed the turn where this project's PMC decided that we must
> be present on GH in order to continue what some of us have been
> doing for more than 10 years.

The project decided to set up github mirrors and with the exposure to
github you get the rest of the package.

I don't remember whether we had an explicit vote about enabling github
mirrors, much less whether it has been a component by component
decision.

Putting aside what I think about github as a company one thing that I
have observed with projects migrating to github across the board is it
seems to lower the barrier for new contributors. This results in two
things: (1) you get more contributions and (2) the contributors usually
don't stick around, most contributions are drive-by one-off
contributions.

Gilles, I believe you have seen an uptick in contributions to the
several maths components even if you don't like the way they have
happened.

> There is a trend to make GH central to the development process
> (marginalizing "dev@" and JIRA and colonizing "issues@").

This is troubling me as well. Where "this" is that using github PRs
seems to keep the potential future committers away from the dev list and
they never become part of a component's community.

We do get more contributions and the quality of the codebases likely
improves because of this, but I feel it makes the community weaker.
Discussions only rarely happen "here" nowadays.

There are a few people like Gary, Rob and a few others who manage to
devote time across almost all components and I adore them for that. And
we need them because most of the components wouldn't stand a chance to
get a release vote through without them. Our component communities have
become too small to sustain themselves and at the same time some of them
see more commits than they used to have in "the good old days" with many
people active on this list.

I haven't got an answer to this.

[Sidenote: In the early 90s I contributed to my first open source
project before CVS had a network protocol. I can not understand why
having to checkout a SCM repository, creating a patch and sending it to
somebody else is a burden that keeps people from contributing. I have
learnt to accept this as a fact. There are lots of facts I cannot
explain. And yes, I know I sound old. I am :-)]

Stefan

Re: [all] When to update dependencies?

On 2020-07-24, Bernd Eckenfels wrote:

> When it comes to dependencies we have both problems: if we upgrade
> dependencies too aggressively (or if we don't test with older dependencies)
> then users have the problem that they might not easily be able to upgrade to 
> a new commons version since the required new dependency version might 
> conflict with other (thirdparty) users of that dependency.

> On the other hand if we do not continuously update external dependencies we
> might miss out on their new features, performance and fixes. In addition we
> might fall behind and have to do painful Big Bang Upgrades. Also when
> our transitive dependencies are outdated and contain bugs (or compliance 
> violations due to old code) some customers might not be happy.

I hear you. I'm not opposed to updating a dependency if we want to use
new features. What I'd like to avoid is updating without a reason other
than "there is a new version and it seems to work according to our
tests".

> So there is a middle ground to be found, which unfortunately collides with 
> the current limited effort maintenance of some of the components:

> - we should define a minimum baseline version of dependencies and runtimes 
> and on each release we check if we still meet them. When we raise the 
> baseline we should ship a new minor (or even major) version. Also we might 
> want to ship security fixes only as a micro update (I.e. not requiring major 
> updates besides the affected code)

One problem with patch updates for security releases is the release vote
will reveal a security update is in the making and the diff will be
small enough to give away the details before the vote has passed. I may
be paranoid.

> - we should regularly test against latest dependency versions (at least 
> within the same minor branch).

Apache Gump?

What you describe sounds good, but "unfortunately collides with the
current limited effort maintenance of some of the components".

Stefan

Re: [all] When to update dependencies?

On 2020-07-24, Xeno Amess wrote:

> how about:
> 1. we force versions of dependencies in commons-parent
> 2. we make every commons repo use commons-parent as parent.
> 3. we make sure no other repos forces versions of dependencies; all of the
> versions number shall be inherited from commons-parent
> 4. we upgrade versions in commons-parent every several months.

As you may guess I'd be strongly opposed to this as it would mean we
updated every few months just because we can, not because there was a
good reason. On top of that, no component could upgrade more aggressively
if it really needed a feature of a newer version.

Apart from that I'd expect some component developers to be rather
unhappy if *anything* was forced on them. :-)

Honestly I doubt we could even find a common ground on which version of
Java to support if we tried to have one version for all components.

Stefan

Re: [all] When to update dependencies?

On 2020-07-24, Gary Gregory wrote:

> Now back to our code depending on other dependencies. My view is that there
> are bugs, you just have not hit them. If I find one in a dependency and it
> gets fixed, it's going to mean a new version of that dependency.

This either is "we hit a bug that affects us" - where I already say this
is a reason to upgrade for me - or "the user hits a bug" - in which case
I prefer to let the user update the buggy dependency. I don't want to
force the user to update because they might be affected by bugs.

And of course we both know that newer versions not only fix bugs, they
introduce new ones as well.

> Furthermore, when I look online for Javadoc and examples, if I use a
> current jar version, then my odds are better that I can implement what I
> find online.

https://javadoc.io/ or similar services?

> I view it as simpler and safer to update from a "near" dependency than
> letting a dependency acquire "bit rot" and upgrade later, especially
> if an update means making adjustments. I want to make smaller
> increments of adjustment rather than a larger set. Just like I prefer
> to RERO instead of big bangs for releasing Commons components.

I fully agree with you if nobody else depends on my stuff. I.e. I'm
working on an application rather than a library. When working on a
library I really want the user to be and stay in control.

> Another way to look at this is to look at a large software stack: If every
> library developer never updates dependencies, then your application,
> through transitive dependencies will end up depending (virtually) on many
> versions of each library, which is much more likely to create jar hell
> and other problems than the other edge case where everyone uses the latest
> version (or a fixed version.)

You are just hitting home on Stefan's almost 20 year old impression that
automatic transitive dependency resolution is a BAD IDEA. ;-)

I fully expect you (all of you) to ignore that last remark.

> From a hand-waving-talking-over-beers-more-FOSS-y philosophical-POV, I'd
> like to think that by eating our own current Apache dog food as well as
> other FOSS dog food, we all help each other make our software better.

We might help the developers of our dependencies but I'm afraid we are
hurting our dependees - and I tend to care more for the latter.

Stefan

Re: [all] When to update dependencies?

On 2020-07-24, Torsten Curdt wrote:

> It still needs a person to decide to merge a PR for a new version.
> So this indeed is just about the dependency upgrade policies.

Right.

> But isn't that what the version definition is for?
> I'd argue that 1.12.4 <-> 1.12.6 should be a compatible upgrade AND
> downgrade,
> 1.12.4 -> 1.20.0 not so much.

As Gary pointed out else-thread, most of the time we do not know how
strictly the team developing our dependency adheres to SemVer.
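
Torsten's reading of the version number can be sketched as a small check. This is only an illustration: VersionLineSketch and samePatchLine are hypothetical names, and the check is only meaningful under the assumption that the dependency really follows SemVer, which is exactly what we usually cannot verify.

```java
// Hypothetical helper, not part of any Commons component.
class VersionLineSketch {

    /** Splits "major.minor.patch" into three ints; anything else is rejected. */
    static int[] parse(String version) {
        String[] parts = version.split("\\.");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected major.minor.patch: " + version);
        }
        int[] numbers = new int[3];
        for (int i = 0; i < 3; i++) {
            numbers[i] = Integer.parseInt(parts[i]);
        }
        return numbers;
    }

    /** True if both versions share major and minor, i.e. differ at most in the patch number. */
    static boolean samePatchLine(String a, String b) {
        int[] x = parse(a);
        int[] y = parse(b);
        return x[0] == y[0] && x[1] == y[1];
    }

    public static void main(String[] args) {
        System.out.println(samePatchLine("1.12.4", "1.12.6")); // true
        System.out.println(samePatchLine("1.12.4", "1.20.0")); // false
    }
}
```

Such a check says nothing about whether the newer patch release behaves the same, only what its version number promises.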

Even if it was completely API compatible, we'd replace a version that
worked for our users with a different version that may introduce
problems. No matter how small the risk is, what is the benefit of
upgrading if we don't need the new version ourselves?

> But to avoid all this is why I usually try to inline dependencies for
> libraries as much as possible. Basically pretending to not have any.

Agreed, this is a different strategy that makes the whole question moot.

> Also a point I made many times.
> Just wanted to mention it - again :)

;-)

Stefan

[all] When to update dependencies?

Hi all

here I'd like to explain why I prefer not to update dependencies just
because we can. Maybe you can convince me that I'm wrong. I've tried to
make this point in different threads but either it has been lost or it
just wasn't worth discussing.

First of all let me get a few things out of the way

* I'm not talking about emails, I can deal with them

* I don't care whether a bot or a human asks for a version update

* I'm only talking about dependencies that are visible to our
  users. Test time dependencies or versions of Maven plugins are
  probably not an issue. Although Compress has managed to break its
  OSGi bundle just by upgrading the parent POM in the past.
  https://issues.apache.org/jira/browse/COMPRESS-498

All our components have downstream users. I.e. our dependencies become
somebody else's dependencies as well.

Let's say commons-foo 1.1.0 depends on A 1.12.4 and bumps the dependency
to A 1.12.18 for commons-foo 1.2.0.

When a user of commons-foo upgrades to 1.2.0 and hasn't defined their
dependency on A explicitly they will also upgrade A to 1.12.18. This may
be fine or it may cause problems. The new version of A may have made
incompatible changes that break the user's code or it may just have bugs
that were not present in A 1.12.4 and now raise their head.

Of course the users can explicitly state a dependency on A 1.12.4
themselves. But there is no guarantee commons-foo compiled against A
1.12.18 will still work with A 1.12.4.

About fifteen years ago Ant was bitten by StringBuffer adding a new
method append(StringBuffer) in Java 1.4 (if memory serves me
right). Code that called someStringBuffer.append(anotherStringBuffer)
compiled on Java 1.3 would call append(Object), but compiled on 1.4 it
would call the new version and thus could not run on Java 1.3. This is
the kind of change animal sniffer was invented to detect and the
--release option of javac deals with. There is no such tool helping us
with APIs that are not part of the Java classlib.
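
The effect can be reproduced in miniature. The sketch below uses two hypothetical classes, LibV1 and LibV2, standing in for an old and a new version of a dependency; it does not use the real StringBuffer history. The point it shows is that javac picks the overload at compile time, so the same source line binds to a different method depending on the version it was compiled against.

```java
// LibV1 and LibV2 are made-up stand-ins for two versions of the same dependency class.
class OverloadDemo {

    static class LibV1 {
        String append(Object o) { return "append(Object)"; }
    }

    static class LibV2 {
        String append(Object o) { return "append(Object)"; }
        // Added in the "new version": a more specific overload.
        String append(StringBuilder sb) { return "append(StringBuilder)"; }
    }

    // Same call shape, but the compiler resolves it against the old API.
    static String compiledAgainstV1() {
        return new LibV1().append(new StringBuilder("x"));
    }

    // Same call shape, resolved against the new API: the more specific overload wins.
    static String compiledAgainstV2() {
        return new LibV2().append(new StringBuilder("x"));
    }

    public static void main(String[] args) {
        System.out.println(compiledAgainstV1()); // append(Object)
        System.out.println(compiledAgainstV2()); // append(StringBuilder)
    }
}
```

The class file produced for the second call references append(StringBuilder) by its exact descriptor, so running it against the old version of the dependency would fail with a NoSuchMethodError even though the source compiles against both.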

Therefore I believe updating a dependency is a risk and we should leave
it to the users to decide which version they want to use.

Unless we've got real reasons to update. Real reasons IMHO are security
issues, bugs in dependencies causing bugs in our code or when we really
want to use new features introduced in a new version.

Outside of these good reasons I wouldn't want to ever update a
dependency.

Stefan

Re: [commons-compress] branch master updated: Enable GitHub Dependabot.

On 2020-07-24, Rob Tompkins wrote:

>> On Jul 23, 2020, at 10:16 PM, Matt Sicker  wrote:

>> Also, how different is a bot proposing a dependency update from a human
>> doing the same? The bot includes far more context about the update in the
>> PR comment, too, which is super useful for determining whether or not the
>> dependency is worth updating. You can even configure it to only notify
>> about security updates if it’s too noisy.

> I don’t understand how substantive forward progress on a project can be 
> considered noisy. It’s just audit.

Oh my, please calm down.

Peter just said he hasn't been reading mails for a few days, is
overwhelmed now and will need time to review what has happened. He
didn't complain, he was apologizing for not responding immediately -
which he shouldn't feel was necessary IMHO.

Stefan

Re: [all] Dependabot PRs

On 2020-07-23, Oliver Heger wrote:

> Am 22.07.20 um 18:28 schrieb Stefan Bodewig:
>> On 2020-07-22, Rob Tompkins wrote:

>>> I’m happy to merge them….will get to them by tomorrow morning ok?

>> Personally I don't see any value for our downstream users if we update
>> our dependencies without actually needing an update - with the exception
>> of security updates. I don't like the idea of forcing our users to
>> update a different dependency just because they update our component, it
>> should be their choice when to update what.

> Stefan has a valid point here IMHO.

Thank you ;-)

I intend to raise a separate thread for this later as I'm afraid this
thread has been burnt by a different discussion.

> From our users' POV, our components are in some sense "more
> compatible" if they reference the oldest possible version of a
> dependency rather than the newest one.

+1

Stefan

Re: [ALL] CI builds

2020-07-23 Thread Stefan Bodewig

On 2020-07-23, Stefan Bodewig wrote:

> My preference would be for using less of github rather than more. But
> I'm probably alone with that.

Of course I'm not. Sorry Gilles. :-)

Stefan

Re: [VOTE] Create additional mailing lists for automated posts

2020-07-23 Thread Stefan Bodewig

On 2020-07-23, Gilles Sadowski wrote:

> If I'm not mistaken, the issues@ ML was intended to keep one
> posted of and reactive on a human discussion happening on
> JIRA.  With the advent of JIRA-GitHub integration, the ratio of
> auto-generated messages relayed through that channel has
> exploded, with literally hundreds of redundant messages per
> week (or in a single day, today).

Most of them have been created by humans who open pull requests or
comment on them.

I agree that it is unfortunate we get each message twice for a github PR
that is linked to a JIRA issue and it would be nice if this could be
disabled somehow - but that's probably not possible without volunteering
to improve the tooling around the github/JIRA bridge. This (the
duplicate messages) would not be solved by splitting the MLs - you'd
still see each message twice if you subscribe to both lists.

> Could we have a ML dedicated to bot-generated messages

For things like dependabot I doubt we can really tell PRs opened by a
bot from those opened by a human.

> Specifically, I propose that
> github-iss...@commons.apache.org
> be set up for relaying GitHub generated posts (like comments,
> PRs merging, and so on) and that
> iss...@commons.apache.org
> returns to its original purpose (only).

Personally I don't care as I'd probably subscribe to the new list as
well and route all messages to the same place. If it helps with your
workflow then I'm +0.

For components that have github PRs enabled a PR not linked to a JIRA
issue could be missed by somebody not subscribed to github-issues -
which is not a problem as long as anybody who cares for the component is
subscribed.

Stefan

Re: [ALL] CI builds

2020-07-23 Thread Stefan Bodewig

On 2020-07-22, Gary Gregory wrote:

> My main driver is that we already use GitHub for source mirroring and PRs,
> so it feels better to me to have builds in the same place.

My preference would be for using less of github rather than more. But
I'm probably alone with that.

-0 on defaulting to github actions (my preference would be defaulting to
Apache infrastructure CI systems).

> I propose we default to GitHub while allowing each component to do whatever
> it wants. Specifically, I would like to drop Travis CI and use GitHub where
> both are used by a component.

+1 for "allowing each component to do whatever it wants" including
sticking to Travis and/or setting up Jenkins builds.

Stefan

Re: [commons-compress] branch master updated: Enable GitHub Dependabot.

2020-07-22 Thread Stefan Bodewig

I hope somebody sees this message.

Can we please discuss this per component? I personally do like the idea
of dependabot for applications but feel it is completely wrong for
libraries and would prefer to not use it.

Stefan

Re: [all]should we really allow denpabot upgrade a dependency that changes major version?

2020-07-22 Thread Stefan Bodewig

To answer the question of your subject: my opinion is a very strong NO.

Stefan

Re: [all] Dependabot PRs

2020-07-22 Thread Stefan Bodewig

On 2020-07-22, Rob Tompkins wrote:

> I’m happy to merge them….will get to them by tomorrow morning ok?

TBH I'd prefer to turn them off and reject the PRs.

Personally I don't see any value for our downstream users if we update
our dependencies without actually needing an update - with the exception
of security updates. I don't like the idea of forcing our users to
update a different dependency just because they update our component, it
should be their choice when to update what.

Of course this is just my opinion and I'm not exactly known as somebody
who embraces the idea of automatic resolution of transitive dependencies
in the first place ;-)

Stefan

[fileupload] Re: A release train...

2020-07-19 Thread Stefan Bodewig

On 2020-07-18, Merbin J Anselm wrote:

> Well. Commons Fileupload's last release was in December 2018 and it has
> been released at least once a year before that. My thoughts were on this
> line

Well, the question is whether there have been changes that would warrant
a new release at all. It hasn't seen that many commits in the past eighteen
months.

I just had a very quick look at the commits since the last release and
one of the first things done was bumping the major version. So if
anybody cut a release from the master branch it would not be compatible
with 1.4 at all. Not sure this would help anybody.

As far as changes go I see your (Merbin's) FILEUPLOAD-274, adding
support for Jakarta API coordinates and some performance
improvements. I'm not sure whether this is all the people looking after
fileupload (I'm not one of them) had planned for a new major version.

Stefan

Re: [all] release validation

2020-07-12 Thread Stefan Bodewig

On 2020-07-12, Rob Tompkins wrote:

> given the consistency of the signatures from the plugins…do we need to
> check them for releases anymore?

Yes, please. Not everybody uses the plugins and even if everybody did a
misconfiguration could be pulling in the wrong key or a key not
available from the expected download location.

Stefan

Re: [Compress] COMPRESS-538 : about Zip64

2020-07-03 Thread Stefan Bodewig

On 2020-06-28, Peter Lee wrote:

> Currently we will add a Zip64 extra field for the entries with uncompressed
> size unspecified. And we will update the zip64 extra field in
> ZipArchiveOutputStream.rewriteSizesAndCrc a little bit: if we actually
> don't need a Zip64 extra, we will not remove it. Instead we keep it in the
> Local File Header, and we update the 'Zip Version Needed to Extract' to the
> one without zip64. Then we remove the extra field (after it's already
> written to the zip archive) to avoid the zip64 extra being written to the
> Central Directory.

> Not sure why we are doing it like that.

We only know the sizes once we've written the entry's content to the
archive. The content is behind the extra field. Removing the extra field
from the archive would require us to move all content a few bytes towards
the front of the archive. Potentially a lot of effort which will only
gain us a few bytes.
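
The general pattern can be sketched with a made-up file layout: an 8 byte size field followed directly by the content. This is not the ZIP format and not the code in ZipArchiveOutputStream; BackpatchSketch, writeEntry and readSize are hypothetical names. It only shows why overwriting a field of fixed width is cheap, while removing bytes in front of the content is not.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

// Made-up layout: [8 byte size][content]. Not the ZIP format.
class BackpatchSketch {

    static void writeEntry(File file, byte[] content) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            raf.setLength(0);
            long sizePos = raf.getFilePointer();
            raf.writeLong(0L);              // placeholder, size not known yet
            raf.write(content);             // content sits directly behind the header
            long end = raf.getFilePointer();
            raf.seek(sizePos);
            raf.writeLong(content.length);  // overwrite in place: same width, nothing moves
            raf.seek(end);
            // Dropping the 8 header bytes instead would mean copying all of
            // the content towards the front of the file.
        }
    }

    static long readSize(File file) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            return raf.readLong();
        }
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("backpatch", ".bin");
        f.deleteOnExit();
        writeEntry(f, "hello".getBytes(StandardCharsets.UTF_8));
        System.out.println(readSize(f) + " bytes of content, file length " + f.length());
    }
}
```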

Stefan

Re: some questions about commons projects.

2020-06-12 Thread Stefan Bodewig

On 2020-06-12, Gilles Sadowski wrote:

> 2020-06-12 15:52 UTC+02:00, Xeno Amess :

>> But if Apache Commons is thought to be a whole project, I do think
>> the relationship between each of its components should be enforced.

The Commons project is the legal entity that binds people with similar
interest in creating reusable components.

This group of people involves some who work on lots of components and
may strive for more standardization and others who are mostly interested
in one component and don't see any benefit in changing the placement of
braces in "their" component just because people who never worked on
"their" component liked a different style better.

Realistically there is far less cross-pollination between components than
you may expect. Things like BCEL or Weaver need people who are familiar
with Java byte code. The Math components require a deeper understanding
of certain mathematical concepts than many coders have. Crypto, Compress
and others attract people with certain interests.

> Some regular contributors (or ancient contributors for
> old/mature components) will veto touching the code
> just for the sake of standardization.

That group likely includes me. Well, argue against, not veto, actually.

>> For example, we might start from trying to use a same code style
>> formatter.

If you really want to discuss this we should split out a different
thread rather than polluting this one. It would probably lead to an
exchange of arguments and an agreement to disagree.

Stefan

Re: some questions about commons projects.

2020-06-12 Thread Stefan Bodewig

On 2020-06-12, Xeno Amess wrote:

> Hi.

>>> 2. How are commons projects related?

>> They are not necessarily related.  Usually it is considered
>> a feature if a component has zero dependencies (as it is
>> easier to avoid "JAR hell").
>> However, there are also drawbacks, e.g. duplicating functionality
>> (and work) needed by several components.

> Something was not quite right about this.  For example, in
> commons-vfs, we just use commons-lang3 as a dependency.  But in
> commons-email, we fork some of utility functions in commons-lang3 as a
> java class in commons-email.  Which is the right way, or a more
> commonly accepted way in commons projects?

Neither is right or wrong in general, it all depends on the context.

VFS has a bunch of dependencies anyway, so adding a dependency on
commons-lang3 is no big deal. Email may have decided to copy a few
classes in order to avoid a dependency.

Another example I'm aware of is Compress which has copied code from
commons-io (basically parts of IOUtils) in order to avoid a
dependency. And it has copied classes developed in Compress to Codec
(some of the more exotic hashes/checksums) because they seemed to fit
there - but Compress didn't want to pay for this by adding a dependency.

One thing that may not have become clear from Gilles' great answer: many
decisions are made by the people who work on a concrete code base while
people only active elsewhere get out of the way. There are some common
grounds - rules that are common to all Apache projects mostly - but the
components operate rather autonomously.

Stefan

Re: [commons-compress] branch master updated: minor typos cleanup

2020-06-01 Thread Stefan Bodewig

On 2020-06-02, Peter Lee wrote:

> Oops, I was looking at the commit and found two @throws
> IllegalArgumentException, and I was thinking this is a duplicated throws
> caused by copy-paste. And I was so foolish that I deleted it without
> investigating the code. Really sorry about this.

No problem and thank you for fixing it so quickly.

Stefan

Re: [commons-compress] branch master updated: minor typos cleanup

2020-06-01 Thread Stefan Bodewig

On 2020-06-01,  wrote:

> The following commit(s) were added to refs/heads/master by this push:
>  new 42b6aa4  minor typos cleanup
> - * @throws IllegalArgumentException if the {@link TarArchiveOutputStream#longFileMode} equals
> - *  {@link TarArchiveOutputStream#LONGFILE_ERROR} and the file
> - *  name is too long

why are you removing the javadocs about throwing on long names?

Stefan

Re: [commons-compress] branch master updated: COMPRESS-530 : skip non-number when parsing pax header

2020-05-27 Thread Stefan Bodewig

On 2020-05-27, Peter Lee wrote:

> Did some googles, can't find too much but  this :
> https://www.systutorials.com/docs/linux/man/5-star/
> And it says :
>> Each record starts with a decimal length field. The length includes the
>> total size of a record including the length field itself and the trailing
>> new line.

For some reason I always end up in either GNU tar's info or the FreeBSD
man page :-). In this case

https://www.freebsd.org/cgi/man.cgi?query=tar&sektion=5

see "Pax Interchange Format"

,----
| The extended attributes themselves are stored as a series of text-format
| lines encoded in the portable UTF-8 encoding. Each line consists of a
| decimal number, a space, a key string, an equals sign, a value string,
| and a new line.
`----

> Seems we should throw an exception.

+1 - likely an IOException.
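
A strict reader for a single record of that shape could look roughly like the sketch below. PaxRecordSketch and parseRecord are hypothetical names, this is not the parser in Compress, and it assumes the caller already holds exactly one complete record in a byte array.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;

// Hypothetical strict parser for one "<length> <key>=<value>\n" record.
class PaxRecordSketch {

    static Map.Entry<String, String> parseRecord(byte[] record) throws IOException {
        int space = indexOf(record, (byte) ' ', 0);
        if (space <= 0) {
            throw new IOException("pax record has no length field");
        }
        long declared = 0;
        for (int i = 0; i < space; i++) {
            int ch = record[i];
            if (ch < '0' || ch > '9') {
                throw new IOException("non-digit in pax length field");
            }
            declared = declared * 10 + (ch - '0');
            if (declared > record.length) {
                throw new IOException("declared length exceeds record");
            }
        }
        // The length counts the whole record, including the length field and the newline.
        if (declared != record.length) {
            throw new IOException("declared length " + declared + " but record has " + record.length);
        }
        if (record[record.length - 1] != '\n') {
            throw new IOException("pax record not terminated by a new line");
        }
        int eq = indexOf(record, (byte) '=', space + 1);
        if (eq < 0) {
            throw new IOException("pax record has no equals sign");
        }
        String key = new String(record, space + 1, eq - space - 1, StandardCharsets.UTF_8);
        String value = new String(record, eq + 1, record.length - eq - 2, StandardCharsets.UTF_8);
        return new SimpleEntry<>(key, value);
    }

    private static int indexOf(byte[] data, byte wanted, int from) {
        for (int i = from; i < data.length; i++) {
            if (data[i] == wanted) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) throws IOException {
        byte[] record = "30 mtime=1321711775.972059463\n".getBytes(StandardCharsets.UTF_8);
        Map.Entry<String, String> entry = parseRecord(record);
        System.out.println(entry.getKey() + " -> " + entry.getValue());
    }
}
```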

Stefan

Re: [commons-compress] branch master updated: COMPRESS-529 : throws IOException if non-number exists in pax header

2020-05-27 Thread Stefan Bodewig

On 2020-05-27, Peter Lee wrote:

> Oops, sorry about that.

No big deal. Better we detect that now than at the point when we want to
cut the release.

>  Will undo all the commits.

It may be possible to keep most of your code changes without breaking
the public API. I must admit I haven't looked at the patch in full
detail, yet.

Stefan

Re: [commons-compress] branch master updated: COMPRESS-530 : skip non-number when parsing pax header

2020-05-26 Thread Stefan Bodewig

On 2020-05-26,  wrote:

> +            // COMPRESS-530 : skip non-number chars
> +            if (ch < '0' || ch > '9') {
> +                continue;
> +            }

if this ever happens, doesn't that mean the PAX header is malformed? In
that case might it be better to throw an IOException?
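
What I have in mind is roughly the following. It is a sketch only: PaxLengthSketch and readRecordLength are made-up names and this is not the loop from the commit. It reads the decimal length field up to the separating space and rejects anything that is not a digit instead of skipping it.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical strict reader for the length field of a pax header record.
class PaxLengthSketch {

    static int readRecordLength(InputStream in) throws IOException {
        int len = 0;
        int digits = 0;
        int ch;
        while ((ch = in.read()) != -1) {
            if (ch == ' ') {
                if (digits == 0) {
                    throw new IOException("empty length field in pax header");
                }
                return len;
            }
            if (ch < '0' || ch > '9') {
                // Malformed header: fail loudly instead of skipping the character.
                throw new IOException("non-digit in pax header length field: " + (char) ch);
            }
            if (++digits > 9) {
                // Nine digits always fit into an int; more would risk overflow.
                throw new IOException("pax header length field too long");
            }
            len = len * 10 + (ch - '0');
        }
        throw new IOException("truncated pax header: no space after length field");
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "30 mtime=1321711775.972059463\n".getBytes(StandardCharsets.US_ASCII);
        System.out.println(readRecordLength(new ByteArrayInputStream(data))); // 30
    }
}
```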

Stefan

Re: [commons-compress] branch master updated: COMPRESS-529 : throws IOException if non-number exists in pax header

2020-05-26 Thread Stefan Bodewig

On 2020-05-26,  wrote:

> -    public void addPaxHeader(String name,String value) {
> -         processPaxHeader(name,value);
> +    public void addPaxHeader(String name, String value) throws IOException {
> +        processPaxHeader(name, value);

no, we can't do that. Adding a checked exception to a public method
breaks source compatibility.
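
One way to keep the signature intact is to let only the internal method declare the checked exception and translate it at the public boundary. The sketch below uses a hypothetical class, HeaderStoreSketch, with made-up method names; it is not the TarArchiveEntry code, and the choice of IllegalArgumentException is just one option.

```java
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical class showing how a public signature can stay free of checked exceptions.
class HeaderStoreSketch {

    private final Map<String, String> headers = new LinkedHashMap<>();

    // Public signature unchanged: existing callers compile without a try/catch.
    public void addHeader(String name, String value) {
        try {
            processHeader(name, value);
        } catch (IOException ex) {
            throw new IllegalArgumentException("invalid value for header " + name, ex);
        }
    }

    // The internal method is free to declare the checked exception.
    private void processHeader(String name, String value) throws IOException {
        if ("size".equals(name)) {
            try {
                Long.parseLong(value);
            } catch (NumberFormatException ex) {
                throw new IOException("size is not a number: " + value, ex);
            }
        }
        headers.put(name, value);
    }

    public String getHeader(String name) {
        return headers.get(name);
    }

    public static void main(String[] args) {
        HeaderStoreSketch store = new HeaderStoreSketch();
        store.addHeader("size", "42");
        System.out.println(store.getHeader("size")); // 42
    }
}
```

Callers written against the old signature keep compiling, at the price of the failure surfacing as an unchecked exception.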

Stefan

Re: Jenkins build is back to normal : Commons-Compress-Windows » Apache Commons Compress #731

2020-05-14 Thread Stefan Bodewig

there doesn't seem to be a JDK 7 for Windows in our Jenkins farm anymore
https://cwiki.apache.org/confluence/display/INFRA/JDK+Installation+Matrix

I've made the Windows build use JDK 8 and we now rely on the Linux build
to catch JDK 7 incompatibilities.

Stefan

Re: [COMPRESS] Travis build fail with JDK14

2020-05-14 Thread Stefan Bodewig

On 2020-05-14, Peter Lee wrote:

> Unfortunately it seems commons-compress can not build on Java14. Maybe we
> should provide a statement about this in README or somewhere else?

Done.

Also I added something to the "known limitations" page and will
re-generate the website.

Stefan

Re: [COMPRESS] Travis build fail with JDK14

2020-05-13 Thread Stefan Bodewig

On 2020-05-13, Peter Lee wrote:

>  Hi,all

> The travis build of Compress is failing now because openjdk14 was added
> to travis.yml recently. The reason is the Pack200 was removed from JDK14
> and there was a discussion about it in January. Emmanuel is working on his
> replacement project(https://github.com/pack200/pack200) but not finished
> yet. Seems we have no good replacement for now.

> I'm thinking we should disable openjdk14 in travis until we have found a
> solution for this. WDYT?

I'm fine with disabling the travis build for now.

Had a quick look through JIRA as I was totally sure there must be an
issue tracking this, but it seems I haven't created any. Anyway with the
last release announcement I promised we'd deal with JDK14 for the next
release - one way or the other. :-)

Stefan


Re: what became of beanshell in Apache commons?

2020-04-25 Thread Stefan Bodewig

On 2020-04-24, Peter Kovacs wrote:

> Now I figured that beanshell was included into Apache Commons, which we also
> use.

The move has been proposed but never actually happened IIRC.

https://www.mail-archive.com/dev@taverna.incubator.apache.org/msg00224.html
is the best reference (more than five years ago, mind you) I could find.

Stefan


Re: [Vote] Format of "git" tags

2020-04-01 Thread Stefan Bodewig

On 2020-04-01, Gary Gregory wrote:

> The docs should also make sure that release tags are in the form rel/...
> which makes them read-only.

So far I've created a new tag under rel/ for the RC tag when the vote
has been accepted. So only "real" releases end up there.

If we want to create all our tags there, things may look a bit
confusing. But I could live with that as well.

Stefan


Re: [Vote] Format of "git" tags

2020-04-01 Thread Stefan Bodewig

On 2020-04-01, Gilles Sadowski wrote:

> Alternatives (using the yet-to-be-created tag for the release
> candidate of the first beta version of [Numbers] as an example):

>  [ ] Option 1: NUMBERS_1_0_BETA1_RC1
>  [ ] Option 2: commons-numbers-1.0-beta1-rc1
>  [ ] Option 3: commons-numbers_v1.0-beta1_rc1

+0 to each of them as long as
http://commons.apache.org/releases/prepare.html gets updated with the
outcome.

Stefan


Re: [commons-compress] branch master updated: Update my(Peter Lee) personal information in pom

2020-03-16 Thread Stefan Bodewig

welcome :-)

Stefan


Re: [Compress]Add some easy-to-use APIs for Zip and other archivers

2020-03-09 Thread Stefan Bodewig

On 2020-03-09, Peter Lee wrote:

> I'm thinking about adding some easy-to-use APIs for Zip. Currently I got
> some ideas :
> 1. Add extractAll(String targetPath) in ZipFile : extract all the files to
> the specific directory.
> 2. Add getInputStream(String fileName) in ZipFile : get the input stream
> for a file by name.

> And I believe these could also work in other archivers like tar, 7z and
> some other format.

> Do you think this is a good idea or not?

There is https://issues.apache.org/jira/browse/COMPRESS-118 which at one
point in time was the issue with the most votes IIRC.

Around the time the ZipSlip vulnerability had to be fixed in various
projects we discussed adding such a high-level API to Compress. You can
find some code that I sketched back then inside of the examples package.
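
For readers unfamiliar with ZipSlip, here is a minimal sketch of what an
"extract all" helper has to guard against. It is not the code from the
examples package and it deliberately uses the JDK's java.util.zip.ZipFile
instead of Compress' ZipFile so that it is self-contained; the class and
method names (ZipExtractSketch, extractAll) are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

final class ZipExtractSketch {
    // Extracts every entry of the archive below targetDir and rejects entries
    // whose names would resolve to a location outside of it (ZipSlip).
    static void extractAll(Path archive, Path targetDir) throws IOException {
        Path root = targetDir.toAbsolutePath().normalize();
        try (ZipFile zip = new ZipFile(archive.toFile())) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                // Names like "../../etc/passwd" or absolute paths leave root
                // once resolved and normalized.
                Path dest = root.resolve(entry.getName()).normalize();
                if (!dest.startsWith(root)) {
                    throw new IOException("Entry escapes target directory: "
                            + entry.getName());
                }
                if (entry.isDirectory()) {
                    Files.createDirectories(dest);
                } else {
                    Files.createDirectories(dest.getParent());
                    try (InputStream in = zip.getInputStream(entry)) {
                        Files.copy(in, dest, StandardCopyOption.REPLACE_EXISTING);
                    }
                }
            }
        }
    }
}
```

The check on the normalized path is the essential part; everything else is
plain stream copying. Symbolic links, permissions and size limits are not
handled in this sketch.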

At that time the majority of Compress developers felt a high-level API
was out-of-scope for Compress and many people felt effort should rather
be spent at Commons VFS. See the threads

https://lists.apache.org/thread.html/0b86f62127f771a8ac3b6357a1c1bdb6b4d21bf18bc4a30d0b3650c8%40%3Cdev.commons.apache.org%3E

https://lists.apache.org/thread.html/bb205705291c00ac8d36516b287a14d814dc5f1fe4d422f3f5c0db28%40%3Cdev.commons.apache.org%3E

Stefan


Re: [VOTE] Release Apache Commons Configuration 2.7 based on RC2

2020-03-09 Thread Stefan Bodewig

On 2020-03-09, Rob Tompkins wrote:

> We have fixed quite a few bugs and added some significant enhancements since
> Apache Commons Configuration 2.6 was released, so I would like to release
> Apache Commons Configuration 2.7.

+1

Stefan


Re: [geometry] distribution svn url

2020-03-08 Thread Stefan Bodewig

On 2020-03-08, Matt Juntunen wrote:

> I don't currently have permissions for that. Is someone able to create
> "geometry" and "numbers" directories in there?

Done.

But I suspect you won't have permission to upload a distribution there
either.

Stefan


Re: [geometry] distribution svn url

2020-03-08 Thread Stefan Bodewig

On 2020-03-08, Matt Juntunen wrote:

> I'm creating a dist-archive module for commons-geometry using
> commons-rng as a template [1]. However, when I build the project it
> fails with an error saying that the url
> https://dist.apache.org/repos/dist/dev/commons/geometry does not
> exist, which is indeed the case. Browsing that repo, I can see that
> directories exist for rng, collections, math and other released
> commons projects. How are these directories created?

Most of them are probably older than the commons release plugin and have
been created manually.

You should be able to do

svn mkdir https://dist.apache.org/repos/dist/dev/commons/geometry

yourself.

The release plugin should probably check whether the directory exists
and create it if necessary.

Stefan


Re: [Compress] Zip Files: History, Explanation and Implementation

2020-03-07 Thread Stefan Bodewig

On 2020-03-07, Peter Lee wrote:

> I'm planning to build a pure Java deflater/inflater on my own. Believe this
> may help a lot.

Compress contains a pure Java Deflate64 deflater, which also is a
"normal" deflater by definition. You may want to take a look at it.

When I implemented the LZ4 encoder I leaned on Peter Deutsch's
description of the LZ77 part of the deflate algorithm in RFC1951 but I
believe the original LZ4 code contains a faster matching algorithm than
that - zlib itself probably does so as well by now.
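
For anyone following along, the LZ77 part of RFC 1951 (section 4) finds
back-references by hashing three-byte sequences and walking a chain of
earlier positions with the same hash. Below is a minimal sketch of that
idea. It is neither the Compress LZ4 encoder nor zlib's matcher; the hash
function, the chain limit and all names (Lz77Sketch, Token, MAX_CHAIN) are
assumptions made for this example, and it produces tokens only, no Huffman
coding.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class Lz77Sketch {
    static final int MIN_MATCH = 3, MAX_MATCH = 258, WINDOW = 32 * 1024;
    static final int HASH_SIZE = 1 << 15;
    static final int MAX_CHAIN = 64; // assumed limit on candidates tried per position

    // A literal byte (length == 0) or a back-reference (distance, length).
    static final class Token {
        final int literal, distance, length;
        Token(int literal, int distance, int length) {
            this.literal = literal;
            this.distance = distance;
            this.length = length;
        }
    }

    // Hash of the three bytes starting at index i (an arbitrary, simple choice).
    static int hash(byte[] d, int i) {
        return (((d[i] & 0xff) << 10) ^ ((d[i + 1] & 0xff) << 5) ^ (d[i + 2] & 0xff))
                & (HASH_SIZE - 1);
    }

    static List<Token> compress(byte[] data) {
        int[] head = new int[HASH_SIZE];   // most recent position per hash value
        int[] prev = new int[data.length]; // previous position with the same hash
        Arrays.fill(head, -1);
        List<Token> out = new ArrayList<>();
        int i = 0;
        while (i < data.length) {
            int bestLen = 0, bestDist = 0;
            if (i + MIN_MATCH <= data.length) {
                int limit = Math.min(MAX_MATCH, data.length - i);
                int cand = head[hash(data, i)];
                for (int n = 0; cand >= 0 && i - cand <= WINDOW && n < MAX_CHAIN; n++) {
                    int len = 0;
                    while (len < limit && data[cand + len] == data[i + len]) {
                        len++;
                    }
                    if (len > bestLen) {
                        bestLen = len;
                        bestDist = i - cand;
                    }
                    cand = prev[cand];
                }
            }
            int step;
            if (bestLen >= MIN_MATCH) {
                out.add(new Token(-1, bestDist, bestLen));
                step = bestLen;
            } else {
                out.add(new Token(data[i] & 0xff, 0, 0));
                step = 1;
            }
            // Register every position just consumed so later matches can find it.
            for (int end = i + step; i < end; i++) {
                if (i + MIN_MATCH <= data.length) {
                    int h = hash(data, i);
                    prev[i] = head[h];
                    head[h] = i;
                }
            }
        }
        return out;
    }

    static byte[] decompress(List<Token> tokens) {
        int size = 0;
        for (Token t : tokens) {
            size += t.length == 0 ? 1 : t.length;
        }
        byte[] out = new byte[size];
        int pos = 0;
        for (Token t : tokens) {
            if (t.length == 0) {
                out[pos++] = (byte) t.literal;
            } else {
                // Copy byte by byte: source and destination may overlap.
                for (int k = 0; k < t.length; k++, pos++) {
                    out[pos] = out[pos - t.distance];
                }
            }
        }
        return out;
    }
}
```

This sketch matches greedily and always takes the longest candidate found
within MAX_CHAIN steps, so speed and ratio trade off through that limit.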

Stefan


[compress] javadoc (was Re: [VOTE] Release Compress 1.20 based on RC2)

On 2020-02-08, Gary Gregory wrote:

> On Sat, Feb 8, 2020 at 11:50 AM Stefan Bodewig  wrote:

>> On 2020-02-08, Gary Gregory wrote:

>>> - mvn javadoc:javadoc outputs LOTS of errors.

>> Not a single one for me (building with Java 8).

> Did you run 'mvn javadoc:javadoc'?

Yes

> Please see https://pastebin.com/ZLVdrEhr

The output is talking about modules, so you must be using something
newer than Java 8, which may explain the difference. We may need to
change the configuration to get rid of

,
| javadoc: error - The code being documented uses modules but the packages
| defined in https://docs.oracle.com/javase/9/docs/api/ are in the unnamed module
`

but I have no idea how. The whole bunch of error messages prior to that
seem unrelated to Commons Compress. The rest are HTML5 warnings, I'll
look into them.

Stefan


Re: [VOTE] Release Compress 1.20 based on RC2

On 2020-02-08, Gary Gregory wrote:

> But next time:
> - On the site: it would be nice to keep all "What's new in 1.zzz?" sections
> so users can see what's new based on the version _they_ currently have.

Really, this is going to become a pretty long list. I've resurrected all
sections since we started providing them back with 1.6 so we'll see what
the page is going to look like :-)

> - mvn javadoc:javadoc outputs LOTS of errors.

Not a single one for me (building with Java 8).

Stefan


[RESULT] Release Compress 1.20 based on RC2

With binding +1s by Gary, Rob and myself the vote has passed.

I'll start with publishing the artifacts and will announce the release
once the mirrors have had time to catch up.

Many thanks

 Stefan


Re: [VOTE] Release Compress 1.20 based on RC2