For the work on modularization, I’ve merged the flight-grpc and flight-core modules. I haven’t moved any other packages.
I also encountered this problem with flight-core. Flight-core derives from io.netty.buffer (specifically CompositeByteBuf), but we cannot use Netty buffer with JPMS without requiring users to monkey-patch it with patch-module. I have flight-core using Netty buffer on the classpath, but deriving from CompositeByteBuf requires adding --add-reads=org.apache.arrow.flight.core=ALL-UNNAMED to the command line since Netty’s being treated as part of the unnamed module in this context. This isn’t terribly onerous from the user’s perspective (especially compared to monkey-patching), but we start getting into the business of having different command-line arguments for different modules the user might be using. We could tell users to always add –add-reads=org.apache.arrow.flight.core=ALL-UNNAMED, but if they aren’t using Flight, they’ll get a warning about an unrecognized module. Any thoughts about the best way to present these command-line changes? From: Jean-Baptiste Onofré <j...@nanthrax.net> Date: Wednesday, December 13, 2023 at 6:58 AM To: dev@arrow.apache.org <dev@arrow.apache.org> Subject: Re: [DISC][Java]: Migrate Arrow Java to JPMS Java Platform Module System Hi, We have a "split packages" between flight-core and flight-grpc. I don't think anyone is using only flight-core to create a new "transport". I think it's acceptable to combine flight-core and flight-grpc, and maybe reshape a bit (imho some classes from flight-grpc should be actually in flight-core). Users can always reshade (even if it's painful). Regards JB On Tue, Dec 12, 2023 at 8:52 PM David Li <lidav...@apache.org> wrote: > > I would be OK with combining flight-core and flight-grpc. It's not clear to > me anyways that the current split would actually be helpful if we do ever > want to support Flight without gRPC. > > On Tue, Dec 12, 2023, at 14:21, James Duong wrote: > > An update on this work. > > > > I’ve been able to update the following to JPMS modules now: > > > > * arrow-memory-core > > * arrow-memory-unsafe > > * arrow-format > > * arrow-vector > > * arrow-compression > > * arrow-tools > > * arrow-avro > > * arrow-jdbc > > * arrow-algorithm > > > > I don’t think the following should be modularized (but would appreciate > > feedback as well): > > > > * arrow-performance (test-only module) > > * flight-integration-tests (test-only module) > > * flight-jdbc-core > > * flight-jdbc-driver (Need to think about this. MySQL suggests they > > want to modularize their JDBC driver: > > https://bugs.mysql.com/bug.php?id=97683 . How does this interact with > > the ServiceLoader changes in JPMS as well). > > > > I’m starting the flight-related modules now. The first issue I’ve > > noticed is that flight-core and flight-grpc reuse packages. I’d like to > > combine flight-core and flight-grpc because of this (currently we only > > have Flight using grpc in Java). > > > > After Flight I’d take a look at the modules that have native code: > > > > * arrow-c-data > > * arrow-orc > > * arrow-dataset > > * arrow-gandiva > > > > > > From: James Duong <james.du...@improving.com.INVALID> > > Date: Tuesday, December 5, 2023 at 1:39 PM > > To: dev@arrow.apache.org <dev@arrow.apache.org> > > Subject: Re: [DISC][Java]: Migrate Arrow Java to JPMS Java Platform > > Module System > > I expect that we’d still have to change CLI arguments in JDK17. > > > > The need for changing the CLI arguments is due to memory-core now being > > a named module while requiring access to Unsafe and using reflection > > for accessing internals of java.nio.Buffer. > > > > We’re using JDK8 code for doing both currently. It might be worth > > looking into if there’s a JDK9+ way of doing this without needing > > reflection and compiling memory-core as a multi-release JAR. > > > > I don’t expect more CLI argument changes unless we use reflection on > > NIO classes in other modules. > > > > > > From: Adam Lippai <a...@rigo.sk> > > Date: Tuesday, December 5, 2023 at 9:28 AM > > To: dev@arrow.apache.org <dev@arrow.apache.org> > > Subject: Re: [DISC][Java]: Migrate Arrow Java to JPMS Java Platform > > Module System > > I believe Spark 4.0 was mentioned before. It’ll require Java 17 and > > will be > > released in a few months (June?). > > > > Best regards, > > Adam Lippai > > > > On Tue, Dec 5, 2023 at 12:05 David Li <lidav...@apache.org> wrote: > > > >> Thanks James for delving into this mess. > >> > >> It looks like this change is unavoidable if we want to modularize? I think > >> this is OK. Will the CLI argument change as we continue modularizing, or is > >> this the only change that will be needed? > >> > >> On Mon, Dec 4, 2023, at 20:07, James Duong wrote: > >> > Hello, > >> > > >> > I did some work to separate the below PR into smaller PRs. > >> > > >> > > >> > * Updating the versions of dependencies and maven plugins is done > >> > and merged into master. > >> > * I separated out the work modularizing arrow-vector, > >> > arrow-memory-core/unsafe, and arrow-memory-netty. > >> > > >> > Modularizing arrow-memory-core requires a smaller change to user > >> > command-line arguments. Instead of: > >> > --add-opens=java.base/java.nio=ALL-UNNAMED > >> > > >> > The user needs to add: > >> > --add-opens=java.base/java.nio=org.apache.arrow.memory.core,ALL-UNNAMED > >> > > >> > I initially tried to modularize arrow-vector separately from > >> > arrow-memory-core but found that any meaningful operation in > >> > arrow-vector would trigger an illegal access in memory-core if it > >> > wasn’t modularized. > >> > > >> > I was able to run the tests for arrow-compression and arrow-tools > >> > successfully after modularizing memory-core, memory-unsafe-, and > >> > arrow-vector. Note that I had more success by making memory-core and > >> > memory-unsafe automatic modules. > >> > > >> > I think we should make a decision here on if we want to bite the bullet > >> > and introduce a breaking user-facing change around command-line > >> > options. The other option is to wait for JDK 21 to modularize. That’s > >> > farther down the line and requires refactoring much of the memory > >> > module code and implementing a module using the foreign memory > >> > interface. > >> > > >> > From: James Duong <james.du...@improving.com.INVALID> > >> > Date: Tuesday, November 28, 2023 at 6:48 PM > >> > To: dev@arrow.apache.org <dev@arrow.apache.org> > >> > Subject: Re: [DISC][Java]: Migrate Arrow Java to JPMS Java Platform > >> > Module System > >> > Hi, > >> > > >> > I’ve made some major progress on this work in this PR: > >> > https://github.com/apache/arrow/pull/38876 > >> > > >> > > >> > * The maven plugin for compiling module-info.java files using JDK 8 > >> > is working correctly. > >> > * arrow-format, arrow-memory-core, arrow-memory-netty, > >> > arrow-memory-unsafe, and arrow-vector have been modularized > >> > successfully. > >> > * Tests pass locally for all of these modules. > >> > * They fail in CI. This is likely from me not updating a profile > >> > somewhere. > >> > > >> > Similar to David’s PR from below, arrow-memory and modules needed to be > >> > refactored fairly significantly and split into two modules: a > >> > public-facing JPMS module and a separate module which adds to Netty’s > >> > packages (memory-netty-buffer-patch). What’s more problematic is that > >> > because we are using named modules now, users need to add more > >> > arguments to their Java command line to use arrow. If one were to use > >> > arrow-memory-netty they would need to add the following: > >> > > >> > --add-opens java.base/jdk.internal.misc=io.netty.common > >> > > >> --patch-module=io.netty.buffer=${project.basedir}/../memory-netty-buffer-patch/target/arrow-memory-netty-buffer-patch-${project.version}.jar > >> > >> > > >> --add-opens=java.base/java.nio=org.apache.arrow.memory.core,io.netty.common,ALL-UNNAMED > >> > > >> > Depending on where the memory-netty-buffer-patch JAR is located, and > >> > what version, the command the user needs to supply changes, so this > >> > seems like it’d be really inconvenient. > >> > > >> > Do we want to proceed with modularizing existing memory modules? Both > >> > netty and unsafe? Or wait until the new memory module from Java 21 is > >> > available? > >> > > >> > The module-info.java files are written fairly naively. I haven’t > >> > inspected thoroughly to determine what packages users will need. > >> > > >> > We can continue modularizing more components in a separate PR. Ideally > >> > all the user breakage (class movement, new command-line argument > >> > requirements) happens within one major Arrow version. > >> > > >> > From: James Duong <james.du...@improving.com.INVALID> > >> > Date: Tuesday, November 21, 2023 at 1:16 PM > >> > To: dev@arrow.apache.org <dev@arrow.apache.org> > >> > Subject: Re: [DISC][Java]: Migrate Arrow Java to JPMS Java Platform > >> > Module System > >> > I’m following up on this topic. > >> > > >> > David has a PR from last year that’s done much of the heavy lifting for > >> > refactoring the codebase to be package-friendly. > >> > https://github.com/apache/arrow/pull/13072 > >> > > >> > What’s changed since and what’s left: > >> > > >> > * New components have been added (Flight SQL for example) that will > >> > need to be updated for modules. > >> > * There wasn’t a clear solution on how to do this without breaking > >> > JDK 8 support. Compiling module-info.java files require using JDK9, but > >> > using JDK9 breaks using JDK8 methods of accessing sun.misc.Unsafe. > >> > * There is a Gradle plugin that can compile module-info.java > >> > files purely syntactically that we can adapt to maven. It has > >> > limitations (the one I see is that it can’t iterate through > >> > classloaders to handle annotations), but using this might be a good > >> > stopgap until we JDK 8 support is deprecated. > >> > * Some plugins need to be updated: > >> > * maven-dependency-plugin 3.0.1 can’t parse module-info.class > >> > files. > >> > * checkstyle 3.1.0 can’t parse module-info.java files. Our > >> > existing checkstyle rules file can’t be loaded with newer versions. We > >> > can exclude module-info.java for now and have a separate Issue for > >> > updating checkstyle itself and the rules file. > >> > * grpc-java could not be modularized when the PR above was written. > >> > * Grpc 1.57 now can be modularized > >> > (grpc/grpc-java#3522<https://github.com/grpc/grpc-java/issues/3522>) > >> > > >> > From: David Dali Susanibar Arce <davi.sar...@gmail.com> > >> > Date: Wednesday, May 25, 2022 at 5:02 AM > >> > To: dev@arrow.apache.org <dev@arrow.apache.org> > >> > Subject: [DISC][Java]: Migrate Arrow Java to JPMS Java Platform Module > >> System > >> > Hi All, > >> > > >> > This email's purpose is a request for comments to migrate Arrow Java to > >> JPMS > >> > Java Platform Module System < > >> https://openjdk.java.net/projects/jigsaw/spec/><https://openjdk.java.net/projects/jigsaw/spec/%3e><https://openjdk.java.net/projects/jigsaw/spec/%3e%3chttps:/openjdk.java.net/projects/jigsaw/spec/%3e%3e><https://openjdk.java.net/projects/jigsaw/spec/%3e%3chttps:/openjdk.java.net/projects/jigsaw/spec/%3e%3e%3chttps:/openjdk.java.net/projects/jigsaw/spec/%3e%3chttps:/openjdk.java.net/projects/jigsaw/spec/%3e%3e%3e> > >> > JSE 9+ (1). > >> > > >> > Current status: > >> > > >> > - Arrow Java use JSE1.8 specification > >> > > >> > - Arrow Java works with JSE1.8/9/11/17 > >> > > >> > - This is possible because Java offers “legacy mode” > >> > > >> > Proposal: > >> > > >> > Migrate to JPMS Java Platform Module System. This Draft PR > >> > <https://github.com/apache/arrow/pull/13072>(2< > >> https://github.com/apache/arrow/pull/13072%3e(2<<https://github.com/apache/arrow/pull/13072%3e(2%3c><https://github.com/apache/arrow/pull/13072%3e(2%3c%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3c%3e><https://github.com/apache/arrow/pull/13072%3e(2%3c%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3c%3e%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3c%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3c%3e%3e> > >> https://github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2 > >> < > >> https://github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2>>>)<https://github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3e%3e%3e)><https://github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3e%3e%3e)%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3e%3e%3e)%3e><https://github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3e%3e%3e)%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3e%3e%3e)%3e%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3e%3e%3e)%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3e%3e%3e)%3e%3e> > >> > >> > contains an initial port of > >> > the modules: Format / Memory Core / Memory Netty / Memory Unsafe / > >> > Vector > >> > for evaluation. > >> > > >> > Main Reason to migrate: > >> > > >> > - JPMS offer Strong encapsulation, Well-defined interfaces > >> > <https://github.com/nipafx/demo-jigsaw-reflection>, Explicit > >> dependencies. > >> > <https://nipafx.dev/java-modules-reflection-vs-encapsulation/> (3)(4) > >> > > >> > - JPMS offer reliable configuration and security to hide platform > >> internals. > >> > > >> > - JPMS offers a partial solution to solve problems about read (80%) > >> /write > >> > (20%) code. > >> > > >> > - JPMS offer optimization for readability about read/write ratio (90/10) > >> > thru module-info.java. > >> > > >> > - Consistency logs, JPMS implement consistency logs to really use that to > >> > solve the current problem. > >> > > >> > - Be able to customize JRE needed with only modules needed (not > >> > java.desktop for example and others) thru JLink. > >> > > >> > - Modules have also been implemented by other languages such as > >> Javascript > >> > (ES2015), C++(C++20), Net (Nuget/NetCore).. > >> > > >> > - Consider taking a look at this discussion about pros/cons > >> > < > >> https://www.reddit.com/r/java/comments/okt3j3/do_you_use_jigsaw_modules_in_your_java_projects/ > >> > > >> > (5). > >> > > >> > - Eventual migration to JPMS is a practical necessity as more projects > >> > migrate. > >> > > >> > Effort: > >> > > >> > - First of all we need to decide to move from JSE1.8 to JSE9+ or be able > >> to > >> > offer support for both jar components JSE1.8 and JSE9+ included. > >> > > >> > - Go bottom up for JPMS. > >> > > >> > - Packages need to be unique (i.e. org.apache.arrow.memory / > >> > io.netty.buffer). Review Draft PR with initial proposal. > >> > > >> > - Dependencies also need to be modularized. If some of our current > >> > dependencies are not able to be used as a module this will be a blocker > >> for > >> > our modules (we could patch that but this is an extra effort). > >> > > >> > Killers: > >> > > >> > - FIXME! I need your support to identify killer reasons to be able to > >> push > >> > this implementation. > >> > > >> > Please let us know if Arrow Java to JPMS Java Platform Module System is > >> > needed and should be implemented. > >> > > >> > Please use this file for any comments > >> > > >> https://docs.google.com/document/d/1qcJ8LPm33UICuGjRnsGBcm8dLI08MyiL8BO5JVzTutA/edit?usp=sharing > >> > > >> > Resources used: > >> > > >> > (1): https://openjdk.java.net/projects/jigsaw/spec/ > >> > > >> > (2): https://github.com/apache/arrow/pull/13072 > >> > > >> > (3): https://nipafx.dev/java-modules-reflection-vs-encapsulation/ > >> > > >> > (4): https://github.com/nipafx/demo-jigsaw-reflection > >> > > >> > (5): > >> > > >> https://www.reddit.com/r/java/comments/okt3j3/do_you_use_jigsaw_modules_in_your_java_projects/ > >> > > >> > Best regards, > >> > > >> > -- > >> > David > >>