On Sat, 29 Sep 2018, Jörn Franke wrote:
As part of the HadoopOffice library (https://github.com/zuinnote/hadoopoffice/wiki), we provide functionality to read office documents, such as MS Excel, on Big Data platforms such as Hadoop/Hive/Spark/Flink.

We should probably list that on the website! Do you have a blurb of a few paragraphs we can use?

I want to release a new version supporting POI 4.0.0, but I have one
remaining blocking issue: The Big Data platforms use an old version of
commons-compress (between 1.4.x and 1.9.x). This means I am always running
into the exception in ZipArchiveThresholdInputStream "InputStream of class
[..] is not implementing InputStreamStatistics" (
https://svn.apache.org/viewvc/poi/trunk/src/ooxml/java/org/apache/poi/openxml4j/util/ZipArchiveThresholdInputStream.java?view=markup&pathrev=1832789
).
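For anyone trying to diagnose this on a cluster, a quick probe along the following lines may help show which commons-compress the platform actually loads. This is only a sketch: the class name CompressProbe and its helper methods are made up for illustration, and it assumes the interface lives at org.apache.commons.compress.utils.InputStreamStatistics (to my knowledge it was added in commons-compress 1.17).

```java
import java.security.CodeSource;

public class CompressProbe {
    // Assumed fully qualified names from commons-compress; adjust if your version differs.
    static final String STATS_INTERFACE =
            "org.apache.commons.compress.utils.InputStreamStatistics";
    static final String ZIP_STREAM =
            "org.apache.commons.compress.archivers.zip.ZipArchiveInputStream";

    // Returns true if the named class can be loaded, without initializing it.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, CompressProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    // Reports where a class was loaded from (typically the jar URL), if it can be determined.
    static String locationOf(String className) {
        try {
            Class<?> c = Class.forName(className, false, CompressProbe.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            return src == null ? "(no code source, e.g. a JDK class)" : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not found)";
        }
    }

    public static void main(String[] args) {
        System.out.println("InputStreamStatistics present: " + isOnClasspath(STATS_INTERFACE));
        System.out.println("ZipArchiveInputStream loaded from: " + locationOf(ZIP_STREAM));
    }
}
```

Running this inside the same job or container as the failing code should show whether the platform's older jar is shadowing the one POI was built against.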

We need that for security reasons: newer Java versions no longer let us protect against zip bomb attacks, because they inconveniently hide the expansion statistics, so we had to switch to commons-compress to guard against them.
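To illustrate why the expansion stats matter, here is a rough sketch of the kind of ratio check a zip bomb guard performs once it can see compressed and uncompressed byte counts (the sort of counters InputStreamStatistics provides). The class name, method name, and both thresholds are assumptions chosen for the example, not POI's actual code.

```java
public class InflateRatioGuard {
    // Illustrative thresholds (assumptions for this sketch, not values from this thread).
    static final double MIN_INFLATE_RATIO = 0.01;   // compressed / uncompressed
    static final long GRACE_BYTES = 100 * 1024;     // skip the check for small outputs

    // Flags a stream whose output is suspiciously large relative to its compressed input.
    static boolean looksLikeZipBomb(long compressedBytes, long uncompressedBytes) {
        if (uncompressedBytes <= GRACE_BYTES) {
            return false; // tiny entries can compress extremely well legitimately
        }
        double ratio = (double) compressedBytes / uncompressedBytes;
        return ratio < MIN_INFLATE_RATIO;
    }

    public static void main(String[] args) {
        // 50 KB expanding to 200 KB: ratio 0.25, not flagged.
        System.out.println(looksLikeZipBomb(50_000, 200_000));
        // 1 KB expanding to 10 MB: ratio 0.0001, flagged.
        System.out.println(looksLikeZipBomb(1_000, 10_000_000));
    }
}
```

Without access to both counters while the stream is being read, a check like this cannot run until the damage is already done, which is the reason for depending on the newer commons-compress.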

Unfortunately, updating these platforms to the latest commons-compress is
very intrusive and for many organizations not possible.

Wave some CVEs at them and see if you can tempt an upgrade?

If not, you'd probably need to work with the Commons folks to backport the zip statistics support (InputStreamStatistics) to your old version, so you can keep the security checks we need. dev@commons is moderately quiet and fairly friendly :)

Nick