Thanks James for delving into this mess. It looks like this change is unavoidable if we want to modularize? I think this is OK. Will the CLI argument change as we continue modularizing, or is this the only change that will be needed?
On Mon, Dec 4, 2023, at 20:07, James Duong wrote: > Hello, > > I did some work to separate the below PR into smaller PRs. > > > * Updating the versions of dependencies and maven plugins is done > and merged into master. > * I separated out the work modularizing arrow-vector, > arrow-memory-core/unsafe, and arrow-memory-netty. > > Modularizing arrow-memory-core requires a smaller change to user > command-line arguments. Instead of: > --add-opens=java.base/java.nio=ALL-UNNAMED > > The user needs to add: > --add-opens=java.base/java.nio=org.apache.arrow.memory.core,ALL-UNNAMED > > I initially tried to modularize arrow-vector separately from > arrow-memory-core but found that any meaningful operation in > arrow-vector would trigger an illegal access in memory-core if it > wasn’t modularized. > > I was able to run the tests for arrow-compression and arrow-tools > successfully after modularizing memory-core, memory-unsafe-, and > arrow-vector. Note that I had more success by making memory-core and > memory-unsafe automatic modules. > > I think we should make a decision here on if we want to bite the bullet > and introduce a breaking user-facing change around command-line > options. The other option is to wait for JDK 21 to modularize. That’s > farther down the line and requires refactoring much of the memory > module code and implementing a module using the foreign memory > interface. > > From: James Duong <james.du...@improving.com.INVALID> > Date: Tuesday, November 28, 2023 at 6:48 PM > To: dev@arrow.apache.org <dev@arrow.apache.org> > Subject: Re: [DISC][Java]: Migrate Arrow Java to JPMS Java Platform > Module System > Hi, > > I’ve made some major progress on this work in this PR: > https://github.com/apache/arrow/pull/38876 > > > * The maven plugin for compiling module-info.java files using JDK 8 > is working correctly. > * arrow-format, arrow-memory-core, arrow-memory-netty, > arrow-memory-unsafe, and arrow-vector have been modularized > successfully. > * Tests pass locally for all of these modules. > * They fail in CI. This is likely from me not updating a profile > somewhere. > > Similar to David’s PR from below, arrow-memory and modules needed to be > refactored fairly significantly and split into two modules: a > public-facing JPMS module and a separate module which adds to Netty’s > packages (memory-netty-buffer-patch). What’s more problematic is that > because we are using named modules now, users need to add more > arguments to their Java command line to use arrow. If one were to use > arrow-memory-netty they would need to add the following: > > --add-opens java.base/jdk.internal.misc=io.netty.common > --patch-module=io.netty.buffer=${project.basedir}/../memory-netty-buffer-patch/target/arrow-memory-netty-buffer-patch-${project.version}.jar > > --add-opens=java.base/java.nio=org.apache.arrow.memory.core,io.netty.common,ALL-UNNAMED > > Depending on where the memory-netty-buffer-patch JAR is located, and > what version, the command the user needs to supply changes, so this > seems like it’d be really inconvenient. > > Do we want to proceed with modularizing existing memory modules? Both > netty and unsafe? Or wait until the new memory module from Java 21 is > available? > > The module-info.java files are written fairly naively. I haven’t > inspected thoroughly to determine what packages users will need. > > We can continue modularizing more components in a separate PR. Ideally > all the user breakage (class movement, new command-line argument > requirements) happens within one major Arrow version. > > From: James Duong <james.du...@improving.com.INVALID> > Date: Tuesday, November 21, 2023 at 1:16 PM > To: dev@arrow.apache.org <dev@arrow.apache.org> > Subject: Re: [DISC][Java]: Migrate Arrow Java to JPMS Java Platform > Module System > I’m following up on this topic. > > David has a PR from last year that’s done much of the heavy lifting for > refactoring the codebase to be package-friendly. > https://github.com/apache/arrow/pull/13072 > > What’s changed since and what’s left: > > * New components have been added (Flight SQL for example) that will > need to be updated for modules. > * There wasn’t a clear solution on how to do this without breaking > JDK 8 support. Compiling module-info.java files require using JDK9, but > using JDK9 breaks using JDK8 methods of accessing sun.misc.Unsafe. > * There is a Gradle plugin that can compile module-info.java > files purely syntactically that we can adapt to maven. It has > limitations (the one I see is that it can’t iterate through > classloaders to handle annotations), but using this might be a good > stopgap until we JDK 8 support is deprecated. > * Some plugins need to be updated: > * maven-dependency-plugin 3.0.1 can’t parse module-info.class > files. > * checkstyle 3.1.0 can’t parse module-info.java files. Our > existing checkstyle rules file can’t be loaded with newer versions. We > can exclude module-info.java for now and have a separate Issue for > updating checkstyle itself and the rules file. > * grpc-java could not be modularized when the PR above was written. > * Grpc 1.57 now can be modularized > (grpc/grpc-java#3522<https://github.com/grpc/grpc-java/issues/3522>) > > From: David Dali Susanibar Arce <davi.sar...@gmail.com> > Date: Wednesday, May 25, 2022 at 5:02 AM > To: dev@arrow.apache.org <dev@arrow.apache.org> > Subject: [DISC][Java]: Migrate Arrow Java to JPMS Java Platform Module System > Hi All, > > This email's purpose is a request for comments to migrate Arrow Java to JPMS > Java Platform Module System <https://openjdk.java.net/projects/jigsaw/spec/> > JSE 9+ (1). > > Current status: > > - Arrow Java use JSE1.8 specification > > - Arrow Java works with JSE1.8/9/11/17 > > - This is possible because Java offers “legacy mode” > > Proposal: > > Migrate to JPMS Java Platform Module System. This Draft PR > <https://github.com/apache/arrow/pull/13072>(2<https://github.com/apache/arrow/pull/13072%3e(2<https://github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2<https://github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2>>>) > > contains an initial port of > the modules: Format / Memory Core / Memory Netty / Memory Unsafe / > Vector > for evaluation. > > Main Reason to migrate: > > - JPMS offer Strong encapsulation, Well-defined interfaces > <https://github.com/nipafx/demo-jigsaw-reflection>, Explicit dependencies. > <https://nipafx.dev/java-modules-reflection-vs-encapsulation/> (3)(4) > > - JPMS offer reliable configuration and security to hide platform internals. > > - JPMS offers a partial solution to solve problems about read (80%) /write > (20%) code. > > - JPMS offer optimization for readability about read/write ratio (90/10) > thru module-info.java. > > - Consistency logs, JPMS implement consistency logs to really use that to > solve the current problem. > > - Be able to customize JRE needed with only modules needed (not > java.desktop for example and others) thru JLink. > > - Modules have also been implemented by other languages such as Javascript > (ES2015), C++(C++20), Net (Nuget/NetCore).. > > - Consider taking a look at this discussion about pros/cons > <https://www.reddit.com/r/java/comments/okt3j3/do_you_use_jigsaw_modules_in_your_java_projects/> > (5). > > - Eventual migration to JPMS is a practical necessity as more projects > migrate. > > Effort: > > - First of all we need to decide to move from JSE1.8 to JSE9+ or be able to > offer support for both jar components JSE1.8 and JSE9+ included. > > - Go bottom up for JPMS. > > - Packages need to be unique (i.e. org.apache.arrow.memory / > io.netty.buffer). Review Draft PR with initial proposal. > > - Dependencies also need to be modularized. If some of our current > dependencies are not able to be used as a module this will be a blocker for > our modules (we could patch that but this is an extra effort). > > Killers: > > - FIXME! I need your support to identify killer reasons to be able to push > this implementation. > > Please let us know if Arrow Java to JPMS Java Platform Module System is > needed and should be implemented. > > Please use this file for any comments > https://docs.google.com/document/d/1qcJ8LPm33UICuGjRnsGBcm8dLI08MyiL8BO5JVzTutA/edit?usp=sharing > > Resources used: > > (1): https://openjdk.java.net/projects/jigsaw/spec/ > > (2): https://github.com/apache/arrow/pull/13072 > > (3): https://nipafx.dev/java-modules-reflection-vs-encapsulation/ > > (4): https://github.com/nipafx/demo-jigsaw-reflection > > (5): > https://www.reddit.com/r/java/comments/okt3j3/do_you_use_jigsaw_modules_in_your_java_projects/ > > Best regards, > > -- > David