Hi, I’ve made some major progress on this work in this PR: https://github.com/apache/arrow/pull/38876
* The Maven plugin for compiling module-info.java files using JDK 8 is working correctly.
* arrow-format, arrow-memory-core, arrow-memory-netty, arrow-memory-unsafe, and arrow-vector have been modularized successfully.
* Tests pass locally for all of these modules.
* They fail in CI. This is likely because I haven’t updated a profile somewhere.

Similar to David’s PR below, arrow-memory and its modules needed to be refactored fairly significantly, including a split into two modules: a public-facing JPMS module and a separate module which adds to Netty’s packages (memory-netty-buffer-patch).

What’s more problematic is that, because we are using named modules now, users need to add more arguments to their Java command line to use Arrow. Anyone using arrow-memory-netty would need to add the following:

    --add-opens java.base/jdk.internal.misc=io.netty.common
    --patch-module=io.netty.buffer=${project.basedir}/../memory-netty-buffer-patch/target/arrow-memory-netty-buffer-patch-${project.version}.jar
    --add-opens=java.base/java.nio=org.apache.arrow.memory.core,io.netty.common,ALL-UNNAMED

The command the user needs to supply changes depending on where the memory-netty-buffer-patch JAR is located and which version it is, so this seems like it’d be really inconvenient.

Do we want to proceed with modularizing the existing memory modules? Both netty and unsafe? Or wait until the new memory module from Java 21 is available?

The module-info.java files are written fairly naively. I haven’t inspected them thoroughly to determine which packages users will need.

We can continue modularizing more components in a separate PR. Ideally all the user breakage (class movement, new command-line argument requirements) happens within one major Arrow version.

From: James Duong <james.du...@improving.com.INVALID>
Date: Tuesday, November 21, 2023 at 1:16 PM
To: dev@arrow.apache.org <dev@arrow.apache.org>
Subject: Re: [DISC][Java]: Migrate Arrow Java to JPMS Java Platform Module System

I’m following up on this topic.
David has a PR from last year that’s done much of the heavy lifting for refactoring the codebase to be package-friendly.
https://github.com/apache/arrow/pull/13072

What’s changed since and what’s left:

* New components have been added (Flight SQL for example) that will need to be updated for modules.
* There wasn’t a clear solution on how to do this without breaking JDK 8 support. Compiling module-info.java files requires JDK 9, but using JDK 9 breaks the JDK 8 methods of accessing sun.misc.Unsafe.
* There is a Gradle plugin that can compile module-info.java files purely syntactically, which we can adapt to Maven. It has limitations (the one I see is that it can’t iterate through classloaders to handle annotations), but using it might be a good stopgap until JDK 8 support is deprecated.
* Some plugins need to be updated:
  * maven-dependency-plugin 3.0.1 can’t parse module-info.class files.
  * checkstyle 3.1.0 can’t parse module-info.java files. Our existing checkstyle rules file can’t be loaded with newer versions. We can exclude module-info.java for now and have a separate issue for updating checkstyle itself and the rules file.
* grpc-java could not be modularized when the PR above was written.
  * gRPC 1.57 can now be modularized (grpc/grpc-java#3522 <https://github.com/grpc/grpc-java/issues/3522>).

From: David Dali Susanibar Arce <davi.sar...@gmail.com>
Date: Wednesday, May 25, 2022 at 5:02 AM
To: dev@arrow.apache.org <dev@arrow.apache.org>
Subject: [DISC][Java]: Migrate Arrow Java to JPMS Java Platform Module System

Hi All,

The purpose of this email is to request comments on migrating Arrow Java to JPMS, the Java Platform Module System <https://openjdk.java.net/projects/jigsaw/spec/> (JSE 9+) (1).

Current status:
- Arrow Java uses the JSE 1.8 specification.
- Arrow Java works with JSE 1.8/9/11/17.
- This is possible because Java offers a “legacy mode”.

Proposal:
Migrate to JPMS (Java Platform Module System).
This Draft PR <https://github.com/apache/arrow/pull/13072> (2) contains an initial port of the modules Format / Memory Core / Memory Netty / Memory Unsafe / Vector for evaluation.

Main reasons to migrate:
- JPMS offers strong encapsulation, well-defined interfaces <https://github.com/nipafx/demo-jigsaw-reflection>, and explicit dependencies <https://nipafx.dev/java-modules-reflection-vs-encapsulation/> (3)(4).
- JPMS offers reliable configuration, and security by hiding platform internals.
- JPMS offers a partial solution to the problems of reading (80%) versus writing (20%) code.
- JPMS optimizes for readability, given the read/write ratio (90/10), through module-info.java.
- Consistency logs: JPMS implements consistency logs that can be used to solve the current problem.
- The JRE can be customized through JLink to include only the modules needed (leaving out java.desktop, for example, and others).
- Modules have also been implemented by other languages such as JavaScript (ES2015), C++ (C++20), and .NET (NuGet/.NET Core).
- Consider taking a look at this discussion about pros/cons <https://www.reddit.com/r/java/comments/okt3j3/do_you_use_jigsaw_modules_in_your_java_projects/> (5).
- Eventual migration to JPMS is a practical necessity as more projects migrate.

Effort:
- First of all, we need to decide whether to move from JSE 1.8 to JSE 9+, or to support both by shipping JAR components for JSE 1.8 and JSE 9+.
- Go bottom-up for JPMS.
- Packages need to be unique (e.g. org.apache.arrow.memory / io.netty.buffer). Review the Draft PR with the initial proposal.
- Dependencies also need to be modularized. If any of our current dependencies cannot be used as a module, it will be a blocker for our modules (we could patch that, but this is extra effort).

Killers:
- FIXME! I need your support to identify killer reasons to be able to push this implementation.
Please let us know if migrating Arrow Java to JPMS (Java Platform Module System) is needed and should be implemented.

Please use this file for any comments:
https://docs.google.com/document/d/1qcJ8LPm33UICuGjRnsGBcm8dLI08MyiL8BO5JVzTutA/edit?usp=sharing

Resources used:
(1): https://openjdk.java.net/projects/jigsaw/spec/
(2): https://github.com/apache/arrow/pull/13072
(3): https://nipafx.dev/java-modules-reflection-vs-encapsulation/
(4): https://github.com/nipafx/demo-jigsaw-reflection
(5): https://www.reddit.com/r/java/comments/okt3j3/do_you_use_jigsaw_modules_in_your_java_projects/

Best regards,
--
David
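
As a rough illustration of the concern raised in the most recent reply in this thread (the JVM arguments for arrow-memory-netty depend on where the patch JAR lives and on its version), here is a minimal sketch of a helper that assembles those arguments. The class name, method name, and parameters are hypothetical and not part of Arrow; the three argument strings are the ones quoted in that reply, with the JAR path filled in from the caller's inputs.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Arrow: builds the JVM arguments quoted in the thread.
public class ArrowNettyJvmArgs {

    // patchJarDir and arrowVersion are assumptions supplied by the caller;
    // the thread notes that both vary between installations.
    static List<String> jvmArgs(Path patchJarDir, String arrowVersion) {
        // JAR file name pattern taken from the --patch-module argument in the thread.
        Path patchJar = patchJarDir.resolve(
                "arrow-memory-netty-buffer-patch-" + arrowVersion + ".jar");

        List<String> args = new ArrayList<>();
        args.add("--add-opens");
        args.add("java.base/jdk.internal.misc=io.netty.common");
        args.add("--patch-module=io.netty.buffer=" + patchJar);
        args.add("--add-opens=java.base/java.nio="
                + "org.apache.arrow.memory.core,io.netty.common,ALL-UNNAMED");
        return args;
    }

    public static void main(String[] cli) {
        // Usage: java ArrowNettyJvmArgs <dir-containing-patch-jar> <arrow-version>
        Path dir = Paths.get(cli.length > 0 ? cli[0] : ".");
        String version = cli.length > 1 ? cli[1] : "1.2.3"; // placeholder version
        System.out.println(String.join(" ", jvmArgs(dir, version)));
    }
}
```

Even with a helper like this, every launcher script or build configuration would still need to know the JAR directory and the Arrow version, which is the inconvenience described above.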