On Sat, 29 Sep 2018, Jörn Franke wrote:
> As part of the HadoopOffice library (
> https://github.com/zuinnote/hadoopoffice/wiki) we provide the
> functionality to read office documents, such as MS Excel, on Big Data
> platforms, such as Hadoop/Hive/Spark/Flink.

We should probably list that on the website! Do you have a few-paragraph
blurb we can use?

> I want to release a new version supporting POI 4.0.0, but I have one
> remaining blocking issue: The Big Data platforms use an old version of
> commons-compress (between 1.4.x and 1.9.x). This means I am always running
> into the exception in ZipArchiveThresholdInputStream "InputStream of class
> [..] is not implementing InputStreamStatistics" (
> https://svn.apache.org/viewvc/poi/trunk/src/ooxml/java/org/apache/poi/openxml4j/util/ZipArchiveThresholdInputStream.java?view=markup&pathrev=1832789
> ).

We need that for security reasons - newer Java versions won't let us
protect against zip bomb attacks, as they inconveniently hide the expansion
stats, so we had to switch to commons-compress to guard against it.

> Unfortunately, updating these platforms to the latest commons-compress is
> very intrusive and for many organizations not possible.

Wave some CVEs at them and see if you can tempt an upgrade?
If not, you'd probably need to work with the commons folks to backport the
zip stats stuff to your old version, so you can keep the security stuff we
need? dev@commons is moderately quiet and fairly friendly :)
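
For anyone following along, here is a rough sketch of the kind of check
those expansion stats make possible. It is not the POI code: the Stats
interface is a hypothetical stand-in that only mirrors the two counters
of commons-compress's InputStreamStatistics (getCompressedCount and
getUncompressedCount), and the grace size and minimum ratio are assumed
values picked for illustration.

```java
import java.io.IOException;

public class InflateRatioCheck {

    /** Hypothetical stand-in mirroring the two counters of InputStreamStatistics. */
    public interface Stats {
        long getCompressedCount();
        long getUncompressedCount();
    }

    // Assumed values for illustration only.
    static final long GRACE_BYTES = 100 * 1024;   // do not judge very small entries
    static final double MIN_INFLATE_RATIO = 0.01; // compressed / uncompressed

    /** Throws if the data read so far expanded suspiciously much. */
    public static void check(Stats stats) throws IOException {
        long raw = stats.getCompressedCount();
        long payload = stats.getUncompressedCount();
        if (payload <= GRACE_BYTES) {
            return; // too little data to decide
        }
        double ratio = (double) raw / payload;
        if (ratio < MIN_INFLATE_RATIO) {
            throw new IOException("Possible zip bomb: ratio " + ratio
                    + " is below the minimum of " + MIN_INFLATE_RATIO);
        }
    }

    /** Helper that builds a fixed Stats snapshot for the demo. */
    public static Stats of(final long compressed, final long uncompressed) {
        return new Stats() {
            public long getCompressedCount() { return compressed; }
            public long getUncompressedCount() { return uncompressed; }
        };
    }

    public static void main(String[] args) throws IOException {
        check(of(50_000, 200_000)); // ratio 0.25, accepted
        try {
            check(of(1_000, 10_000_000)); // ratio 0.0001, rejected
        } catch (IOException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Without both counters there is nothing to compute the ratio from, which
is why a stream that does not expose them gets rejected outright.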
Nick