I would be OK with combining flight-core and flight-grpc. It's not clear to me anyways that the current split would actually be helpful if we do ever want to support Flight without gRPC.
On Tue, Dec 12, 2023, at 14:21, James Duong wrote: > An update on this work. > > I’ve been able to update the following to JPMS modules now: > > * arrow-memory-core > * arrow-memory-unsafe > * arrow-format > * arrow-vector > * arrow-compression > * arrow-tools > * arrow-avro > * arrow-jdbc > * arrow-algorithm > > I don’t think the following should be modularized (but would appreciate > feedback as well): > > * arrow-performance (test-only module) > * flight-integration-tests (test-only module) > * flight-jdbc-core > * flight-jdbc-driver (Need to think about this. MySQL suggests they > want to modularize their JDBC driver: > https://bugs.mysql.com/bug.php?id=97683 . How does this interact with > the ServiceLoader changes in JPMS as well). > > I’m starting the flight-related modules now. The first issue I’ve > noticed is that flight-core and flight-grpc reuse packages. I’d like to > combine flight-core and flight-grpc because of this (currently we only > have Flight using grpc in Java). > > After Flight I’d take a look at the modules that have native code: > > * arrow-c-data > * arrow-orc > * arrow-dataset > * arrow-gandiva > > > From: James Duong <james.du...@improving.com.INVALID> > Date: Tuesday, December 5, 2023 at 1:39 PM > To: dev@arrow.apache.org <dev@arrow.apache.org> > Subject: Re: [DISC][Java]: Migrate Arrow Java to JPMS Java Platform > Module System > I expect that we’d still have to change CLI arguments in JDK17. > > The need for changing the CLI arguments is due to memory-core now being > a named module while requiring access to Unsafe and using reflection > for accessing internals of java.nio.Buffer. > > We’re using JDK8 code for doing both currently. It might be worth > looking into if there’s a JDK9+ way of doing this without needing > reflection and compiling memory-core as a multi-release JAR. > > I don’t expect more CLI argument changes unless we use reflection on > NIO classes in other modules. > > > From: Adam Lippai <a...@rigo.sk> > Date: Tuesday, December 5, 2023 at 9:28 AM > To: dev@arrow.apache.org <dev@arrow.apache.org> > Subject: Re: [DISC][Java]: Migrate Arrow Java to JPMS Java Platform > Module System > I believe Spark 4.0 was mentioned before. It’ll require Java 17 and > will be > released in a few months (June?). > > Best regards, > Adam Lippai > > On Tue, Dec 5, 2023 at 12:05 David Li <lidav...@apache.org> wrote: > >> Thanks James for delving into this mess. >> >> It looks like this change is unavoidable if we want to modularize? I think >> this is OK. Will the CLI argument change as we continue modularizing, or is >> this the only change that will be needed? >> >> On Mon, Dec 4, 2023, at 20:07, James Duong wrote: >> > Hello, >> > >> > I did some work to separate the below PR into smaller PRs. >> > >> > >> > * Updating the versions of dependencies and maven plugins is done >> > and merged into master. >> > * I separated out the work modularizing arrow-vector, >> > arrow-memory-core/unsafe, and arrow-memory-netty. >> > >> > Modularizing arrow-memory-core requires a smaller change to user >> > command-line arguments. Instead of: >> > --add-opens=java.base/java.nio=ALL-UNNAMED >> > >> > The user needs to add: >> > --add-opens=java.base/java.nio=org.apache.arrow.memory.core,ALL-UNNAMED >> > >> > I initially tried to modularize arrow-vector separately from >> > arrow-memory-core but found that any meaningful operation in >> > arrow-vector would trigger an illegal access in memory-core if it >> > wasn’t modularized. >> > >> > I was able to run the tests for arrow-compression and arrow-tools >> > successfully after modularizing memory-core, memory-unsafe-, and >> > arrow-vector. Note that I had more success by making memory-core and >> > memory-unsafe automatic modules. >> > >> > I think we should make a decision here on if we want to bite the bullet >> > and introduce a breaking user-facing change around command-line >> > options. The other option is to wait for JDK 21 to modularize. That’s >> > farther down the line and requires refactoring much of the memory >> > module code and implementing a module using the foreign memory >> > interface. >> > >> > From: James Duong <james.du...@improving.com.INVALID> >> > Date: Tuesday, November 28, 2023 at 6:48 PM >> > To: dev@arrow.apache.org <dev@arrow.apache.org> >> > Subject: Re: [DISC][Java]: Migrate Arrow Java to JPMS Java Platform >> > Module System >> > Hi, >> > >> > I’ve made some major progress on this work in this PR: >> > https://github.com/apache/arrow/pull/38876 >> > >> > >> > * The maven plugin for compiling module-info.java files using JDK 8 >> > is working correctly. >> > * arrow-format, arrow-memory-core, arrow-memory-netty, >> > arrow-memory-unsafe, and arrow-vector have been modularized >> > successfully. >> > * Tests pass locally for all of these modules. >> > * They fail in CI. This is likely from me not updating a profile >> > somewhere. >> > >> > Similar to David’s PR from below, arrow-memory and modules needed to be >> > refactored fairly significantly and split into two modules: a >> > public-facing JPMS module and a separate module which adds to Netty’s >> > packages (memory-netty-buffer-patch). What’s more problematic is that >> > because we are using named modules now, users need to add more >> > arguments to their Java command line to use arrow. If one were to use >> > arrow-memory-netty they would need to add the following: >> > >> > --add-opens java.base/jdk.internal.misc=io.netty.common >> > >> --patch-module=io.netty.buffer=${project.basedir}/../memory-netty-buffer-patch/target/arrow-memory-netty-buffer-patch-${project.version}.jar >> >> > >> --add-opens=java.base/java.nio=org.apache.arrow.memory.core,io.netty.common,ALL-UNNAMED >> > >> > Depending on where the memory-netty-buffer-patch JAR is located, and >> > what version, the command the user needs to supply changes, so this >> > seems like it’d be really inconvenient. >> > >> > Do we want to proceed with modularizing existing memory modules? Both >> > netty and unsafe? Or wait until the new memory module from Java 21 is >> > available? >> > >> > The module-info.java files are written fairly naively. I haven’t >> > inspected thoroughly to determine what packages users will need. >> > >> > We can continue modularizing more components in a separate PR. Ideally >> > all the user breakage (class movement, new command-line argument >> > requirements) happens within one major Arrow version. >> > >> > From: James Duong <james.du...@improving.com.INVALID> >> > Date: Tuesday, November 21, 2023 at 1:16 PM >> > To: dev@arrow.apache.org <dev@arrow.apache.org> >> > Subject: Re: [DISC][Java]: Migrate Arrow Java to JPMS Java Platform >> > Module System >> > I’m following up on this topic. >> > >> > David has a PR from last year that’s done much of the heavy lifting for >> > refactoring the codebase to be package-friendly. >> > https://github.com/apache/arrow/pull/13072 >> > >> > What’s changed since and what’s left: >> > >> > * New components have been added (Flight SQL for example) that will >> > need to be updated for modules. >> > * There wasn’t a clear solution on how to do this without breaking >> > JDK 8 support. Compiling module-info.java files require using JDK9, but >> > using JDK9 breaks using JDK8 methods of accessing sun.misc.Unsafe. >> > * There is a Gradle plugin that can compile module-info.java >> > files purely syntactically that we can adapt to maven. It has >> > limitations (the one I see is that it can’t iterate through >> > classloaders to handle annotations), but using this might be a good >> > stopgap until we JDK 8 support is deprecated. >> > * Some plugins need to be updated: >> > * maven-dependency-plugin 3.0.1 can’t parse module-info.class >> > files. >> > * checkstyle 3.1.0 can’t parse module-info.java files. Our >> > existing checkstyle rules file can’t be loaded with newer versions. We >> > can exclude module-info.java for now and have a separate Issue for >> > updating checkstyle itself and the rules file. >> > * grpc-java could not be modularized when the PR above was written. >> > * Grpc 1.57 now can be modularized >> > (grpc/grpc-java#3522<https://github.com/grpc/grpc-java/issues/3522>) >> > >> > From: David Dali Susanibar Arce <davi.sar...@gmail.com> >> > Date: Wednesday, May 25, 2022 at 5:02 AM >> > To: dev@arrow.apache.org <dev@arrow.apache.org> >> > Subject: [DISC][Java]: Migrate Arrow Java to JPMS Java Platform Module >> System >> > Hi All, >> > >> > This email's purpose is a request for comments to migrate Arrow Java to >> JPMS >> > Java Platform Module System < >> https://openjdk.java.net/projects/jigsaw/spec/><https://openjdk.java.net/projects/jigsaw/spec/%3e><https://openjdk.java.net/projects/jigsaw/spec/%3e%3chttps:/openjdk.java.net/projects/jigsaw/spec/%3e%3e> >> > JSE 9+ (1). >> > >> > Current status: >> > >> > - Arrow Java use JSE1.8 specification >> > >> > - Arrow Java works with JSE1.8/9/11/17 >> > >> > - This is possible because Java offers “legacy mode” >> > >> > Proposal: >> > >> > Migrate to JPMS Java Platform Module System. This Draft PR >> > <https://github.com/apache/arrow/pull/13072>(2< >> https://github.com/apache/arrow/pull/13072%3e(2<<https://github.com/apache/arrow/pull/13072%3e(2%3c><https://github.com/apache/arrow/pull/13072%3e(2%3c%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3c%3e> >> https://github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2 >> < >> https://github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2>>>)<https://github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3e%3e%3e)><https://github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3e%3e%3e)%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3e%3e%3e)%3e> >> >> > contains an initial port of >> > the modules: Format / Memory Core / Memory Netty / Memory Unsafe / >> > Vector >> > for evaluation. >> > >> > Main Reason to migrate: >> > >> > - JPMS offer Strong encapsulation, Well-defined interfaces >> > <https://github.com/nipafx/demo-jigsaw-reflection>, Explicit >> dependencies. >> > <https://nipafx.dev/java-modules-reflection-vs-encapsulation/> (3)(4) >> > >> > - JPMS offer reliable configuration and security to hide platform >> internals. >> > >> > - JPMS offers a partial solution to solve problems about read (80%) >> /write >> > (20%) code. >> > >> > - JPMS offer optimization for readability about read/write ratio (90/10) >> > thru module-info.java. >> > >> > - Consistency logs, JPMS implement consistency logs to really use that to >> > solve the current problem. >> > >> > - Be able to customize JRE needed with only modules needed (not >> > java.desktop for example and others) thru JLink. >> > >> > - Modules have also been implemented by other languages such as >> Javascript >> > (ES2015), C++(C++20), Net (Nuget/NetCore).. >> > >> > - Consider taking a look at this discussion about pros/cons >> > < >> https://www.reddit.com/r/java/comments/okt3j3/do_you_use_jigsaw_modules_in_your_java_projects/ >> > >> > (5). >> > >> > - Eventual migration to JPMS is a practical necessity as more projects >> > migrate. >> > >> > Effort: >> > >> > - First of all we need to decide to move from JSE1.8 to JSE9+ or be able >> to >> > offer support for both jar components JSE1.8 and JSE9+ included. >> > >> > - Go bottom up for JPMS. >> > >> > - Packages need to be unique (i.e. org.apache.arrow.memory / >> > io.netty.buffer). Review Draft PR with initial proposal. >> > >> > - Dependencies also need to be modularized. If some of our current >> > dependencies are not able to be used as a module this will be a blocker >> for >> > our modules (we could patch that but this is an extra effort). >> > >> > Killers: >> > >> > - FIXME! I need your support to identify killer reasons to be able to >> push >> > this implementation. >> > >> > Please let us know if Arrow Java to JPMS Java Platform Module System is >> > needed and should be implemented. >> > >> > Please use this file for any comments >> > >> https://docs.google.com/document/d/1qcJ8LPm33UICuGjRnsGBcm8dLI08MyiL8BO5JVzTutA/edit?usp=sharing >> > >> > Resources used: >> > >> > (1): https://openjdk.java.net/projects/jigsaw/spec/ >> > >> > (2): https://github.com/apache/arrow/pull/13072 >> > >> > (3): https://nipafx.dev/java-modules-reflection-vs-encapsulation/ >> > >> > (4): https://github.com/nipafx/demo-jigsaw-reflection >> > >> > (5): >> > >> https://www.reddit.com/r/java/comments/okt3j3/do_you_use_jigsaw_modules_in_your_java_projects/ >> > >> > Best regards, >> > >> > -- >> > David >>