I feel really bad that I mixed up versions when I asked this. POI 5 can of 
course stay on Java 8, and everybody can keep using POI 5 for as long as they 
want. With Java 11 having reached Premier Support EOL in September last year, 
we should really be having this conversation about Java 17 for POI 6 now.

IMHO anyone still running on Java 8 in 2024 either does not care about running 
the latest version of every library they use, or accepts that sooner rather 
than later their dependencies might no longer provide fixes for bugs and 
security issues.

> * I am not aware of any dependency that we rely on that has fixes that we 
> can't uptake if we stick to Java 8 - ie the projects still publish Java 8 
> friendly releases even if they have higher version releases that don't 
> support Java 8

We are talking about the next major release of POI that will be in production 
through the coming years. Dependencies that come to mind:
- javax.xml.bind is deprecated. The natural replacement would be 
jakarta.xml.bind, which already requires Java 11.
- PDFBox will move to Java 11 in their next major version.
- Log4j 3 is currently in beta and has bumped the required runtime version to 
Java 17 (https://logging.apache.org/log4j/3.x/release-notes.html).

Why can’t we do the same thing as those dependencies you mentioned? Publish a 
Java 8 friendly POI 5 and POI 6 using a newer Java baseline?

> * I am not aware of any major Java runtime features that we need to uptake 
> that are not in Java 8

I am also not aware of any runtime features that POI needs that could not have 
been implemented on Java 4. But what we end up with is code that is slow and 
costly to maintain, that exists only to keep POI compatible with Java 8, and 
that is completely unnecessary on Java 11+.

- Improved I/O in Java 11: 

  Take the IOUtils.copy() methods as an example. They could be replaced by a 
single InputStream.transferTo() call in Java 11, but we still copy every byte 
manually.

  Another example: there are several toByteArray() methods in IOUtils that are 
all implemented by calling this method: 

    private static byte[] toByteArray(InputStream stream, final int length, final int maxLength,
                                      final boolean checkEOFException, final boolean isLengthKnown) throws IOException {
        if (length < 0 || maxLength < 0) {
            throw new RecordFormatException("Can't allocate an array of length < 0");
        }
        final int derivedMaxLength = Math.max(maxLength, BYTE_ARRAY_MAX_OVERRIDE);
        if ((length != Integer.MAX_VALUE) || (derivedMaxLength != Integer.MAX_VALUE)) {
            checkLength(length, derivedMaxLength);
        }

        final int derivedLen = isLengthKnown ? Math.min(length, derivedMaxLength) : derivedMaxLength;
        final int byteArrayInitLen = calculateByteArrayInitLength(isLengthKnown, length, derivedMaxLength);
        final int internalBufferLen = DEFAULT_BUFFER_SIZE;
        try (UnsynchronizedByteArrayOutputStream baos = UnsynchronizedByteArrayOutputStream.builder().setBufferSize(byteArrayInitLen).get()) {
            byte[] buffer = new byte[internalBufferLen];
            int totalBytes = 0, readBytes;
            do {
                readBytes = stream.read(buffer, 0, Math.min(internalBufferLen, derivedLen - totalBytes));
                totalBytes += Math.max(readBytes, 0);
                if (readBytes > 0) {
                    baos.write(buffer, 0, readBytes);
                }
                checkByteSizeLimit(totalBytes);
            } while (totalBytes < derivedLen && readBytes > -1);

            if (BYTE_ARRAY_MAX_OVERRIDE < 0 && readBytes > -1 && !isLengthKnown && stream.read() >= 0) {
                throwRecordTruncationException(derivedMaxLength);
            }

            if (checkEOFException && derivedLen != Integer.MAX_VALUE && totalBytes < derivedLen) {
                throw new EOFException("unexpected EOF - expected len: " + derivedLen + " - actual len: " + totalBytes);
            }

            return baos.toByteArray();
        }
    }

In Java 11, you’d call either stream.readNBytes() or stream.readAllBytes() and 
do away with the IOUtils.toByteArray() implementation.
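
To make the comparison concrete, here is a minimal sketch of the three JDK 
calls mentioned above. The class StreamSketch and its method names are made up 
for illustration and are not part of POI. The sketch also leaves out the length 
checks, the truncation check and the EOFException handling that the current 
IOUtils.toByteArray() performs, so those would still need a (much smaller) 
wrapper.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper class for illustration only; not POI code.
final class StreamSketch {

    // Copies all remaining bytes; InputStream.transferTo exists since Java 9.
    static long copy(InputStream in, OutputStream out) throws IOException {
        return in.transferTo(out);
    }

    // Reads at most maxLength bytes; InputStream.readNBytes(int) exists since Java 11.
    static byte[] readUpTo(InputStream in, int maxLength) throws IOException {
        return in.readNBytes(maxLength);
    }

    // Reads the whole stream; InputStream.readAllBytes exists since Java 9.
    static byte[] readAll(InputStream in) throws IOException {
        return in.readAllBytes();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {1, 2, 3, 4, 5};
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        System.out.println(copy(new ByteArrayInputStream(data), out));          // 5
        System.out.println(readUpTo(new ByteArrayInputStream(data), 3).length); // 3
        System.out.println(readAll(new ByteArrayInputStream(data)).length);     // 5
    }
}
```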

- String improvements: 

  Currently we have to use code like `textContent.trim().split("\n")` to create 
an array of lines and then use a for-each loop to process the entries. Not only 
is the argument treated as a regex, but we also keep a string array in memory 
that takes at least as much space as the input string. In Java 11, we’d work on 
the stream returned by textContent.trim().lines(), which requires neither regex 
handling nor keeping a full copy of the input in memory.

- A cleaner API

  Instead of returning null (from public methods), we could return Optional.
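
  As an illustration of the idea, here is a sketch with a made-up lookup class 
(SheetLookupSketch and findSheetIndex are hypothetical, not POI API). Note that 
java.util.Optional itself is already available in Java 8, so the relevant point 
here is that changing return types is an API break that fits a new major 
version; Map.of in the usage part needs Java 9.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical class for illustration only; not POI API.
final class SheetLookupSketch {
    private final Map<String, Integer> indexByName;

    SheetLookupSketch(Map<String, Integer> indexByName) {
        this.indexByName = indexByName;
    }

    // Returns an empty Optional instead of null when the name is unknown,
    // so callers are forced to handle the "not found" case explicitly.
    Optional<Integer> findSheetIndex(String name) {
        return Optional.ofNullable(indexByName.get(name));
    }

    public static void main(String[] args) {
        SheetLookupSketch lookup = new SheetLookupSketch(Map.of("Summary", 0));
        System.out.println(lookup.findSheetIndex("Summary").orElse(-1)); // 0
        System.out.println(lookup.findSheetIndex("Missing").orElse(-1)); // -1
    }
}
```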

- An API to clean up resources:

  Cleaner introduced in Java 9 can help reduce memory footprint with very low 
overhead by automatically cleaning up unused resources in cases where 
try-with-resources cannot be used. If I remember correctly, we currently have 
some bug reports that might be solved by using a Cleaner, but I wouldn’t know 
how to properly fix those in Java 8.
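
  For reference, the usual pattern with java.lang.ref.Cleaner looks roughly 
like the sketch below. TempResourceSketch and its State class are hypothetical 
and only stand in for something like a temp file or buffer that POI would want 
to release; the AtomicBoolean is there purely so the effect can be observed.

```java
import java.lang.ref.Cleaner;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical resource holder for illustration only; not POI code.
final class TempResourceSketch implements AutoCloseable {
    // One Cleaner (and its daemon thread) can be shared by many objects.
    private static final Cleaner CLEANER = Cleaner.create();

    // The cleanup action must not reference the owning object,
    // otherwise the owner could never become unreachable.
    private static final class State implements Runnable {
        private final AtomicBoolean released;

        State(AtomicBoolean released) {
            this.released = released;
        }

        @Override
        public void run() {
            released.set(true); // stand-in for deleting a temp file, freeing a buffer, ...
        }
    }

    private final Cleaner.Cleanable cleanable;

    TempResourceSketch(AtomicBoolean released) {
        this.cleanable = CLEANER.register(this, new State(released));
    }

    // Explicit close runs the action immediately; if the caller forgets,
    // the Cleaner runs it some time after this object becomes unreachable.
    @Override
    public void close() {
        cleanable.clean(); // the action is invoked at most once
    }

    public static void main(String[] args) {
        AtomicBoolean released = new AtomicBoolean(false);
        try (TempResourceSketch resource = new TempResourceSketch(released)) {
            System.out.println(released.get()); // false
        }
        System.out.println(released.get()); // true
    }
}
```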

With Java 17 we’d get:
- records: these could be used both internally and in the API and reduce 
boilerplate code
- pattern matching for instanceof: This is not something one cannot live 
without, but it can make the code much more concise and easier to maintain.
- the usual additions to the standard library 
(https://javaalmanac.io/jdk/18/apidiff/8/)
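
A short sketch of the first two points, using made-up types (Java17Sketch, 
CellRef and describe are hypothetical and only serve as an illustration, they 
are not POI classes):

```java
// Hypothetical example types for illustration only; not POI code.
final class Java17Sketch {

    // A record generates the constructor, accessors, equals, hashCode and toString.
    record CellRef(int row, int col) {
        // Compact constructor: validation runs before the fields are assigned.
        CellRef {
            if (row < 0 || col < 0) {
                throw new IllegalArgumentException("negative index");
            }
        }
    }

    // Pattern matching for instanceof binds the cast value in the same expression.
    static String describe(Object value) {
        if (value instanceof CellRef ref) {
            return "cell " + ref.row() + "," + ref.col();
        }
        if (value instanceof String s && !s.isEmpty()) {
            return "text " + s;
        }
        return "other";
    }

    public static void main(String[] args) {
        System.out.println(describe(new CellRef(1, 2))); // cell 1,2
        System.out.println(describe("abc"));             // text abc
        System.out.println(describe(42));                // other
    }
}
```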

> * For me, there is a better solution to optimising support for newer Java 
> versions while still supporting older Java versions - Multi Release Jars [1]

That’s like doing a fork in one code base. We currently do that for providing a 
module-info.class file. But do you really want to extend that to utility 
classes? IMHO that would be a maintenance nightmare.

> * We have other Apache projects like Tika, Drill and Linkis that use POI and 
> some of those still apps still use Java 8 builds. We have 1000s of other 
> projects that depend on us - eg [2]


Apache Tika will require Java 11 in its upcoming version. Drill and Linkis can 
stay on POI 5 and we can still provide security and important bug fixes for 
that version.

And those projects won’t suddenly stop working because the next major release 
of POI switches the Java version. They can stay on 5 for the time being. Also, 
I took the time and looked at the most used projects from that list, trying to 
figure out what Java version they require:

- Spring is already on Java 17
- DbUnit is on Java 1.4 - but that version is not even supported by current 
POI, and there has been no update for two years.
- Apache Tika 3 will require Java 11
- PDI engine is Java 11
- EasyExcel is on Java 8
- JasperReports is on 8 but is preparing to switch to 11
- PrimeFaces is on 8, but the next version is already in RC and requires 11
- Drools is on 11
- OpenCms is on 8
- WSO2 Carbon API: I didn’t find information about the Java version
- Silverpeas is on 11
- HAPI FHIR: I didn’t find a reference to the Java version, but they use Spring 
Boot 3 which is on 17
- Jahia is on Java 11

So the majority is already on Java 11+.

> * If you look at Stackoverflow or our mailing lists, there is a large number 
> of users who are using old POI versions and I think we need to avoid making 
> it harder for those users to upgrade. Java 8 still gets regular security 
> patches and depending on what you read, as many as 30% of Java users still 
> use Java 8 (eg [3]).


I think a large number will need to upgrade sooner rather than later, with 
practically all application servers and many other projects either already on 
11 or even 17, or preparing the switch to Java 17: JBoss/WildFly, Spring, 
Quarkus, jOOQ, and many others.

Even Java 6 is supported until at least the end of 2027 (by Azul), while other 
vendors have already dropped Java 8 (Microsoft OpenJDK).

Should we really let those who refuse or cannot update their systems dictate 
the future development of POI?

I really think we should move on to at least 11.

