For the work on modularization, I’ve merged the flight-grpc and flight-core 
modules. I haven’t moved any other packages.

I also encountered this problem with flight-core. Flight-core derives from 
io.netty.buffer (specifically CompositeByteBuf), but we cannot use Netty buffer 
with JPMS without requiring users to monkey-patch it with patch-module. I have 
flight-core using Netty buffer on the classpath, but deriving from 
CompositeByteBuf requires adding 
--add-reads=org.apache.arrow.flight.core=ALL-UNNAMED to the command line since 
Netty’s being treated as part of the unnamed module in this context.

This isn’t terribly onerous from the user’s perspective (especially compared to 
monkey-patching), but we start getting into the business of having different 
command-line arguments for different modules the user might be using.

We could tell users to always add 
–add-reads=org.apache.arrow.flight.core=ALL-UNNAMED, but if they aren’t using 
Flight, they’ll get a warning about an unrecognized module. Any thoughts about 
the best way to present these command-line changes?


From: Jean-Baptiste Onofré <j...@nanthrax.net>
Date: Wednesday, December 13, 2023 at 6:58 AM
To: dev@arrow.apache.org <dev@arrow.apache.org>
Subject: Re: [DISC][Java]: Migrate Arrow Java to JPMS Java Platform Module 
System
Hi,

We have a "split packages" between flight-core and flight-grpc. I
don't think anyone is using only flight-core to create a new
"transport".
I think it's acceptable to combine flight-core and flight-grpc, and
maybe reshape a bit (imho some classes from flight-grpc should be
actually in flight-core).
Users can always reshade (even if it's painful).

Regards
JB

On Tue, Dec 12, 2023 at 8:52 PM David Li <lidav...@apache.org> wrote:
>
> I would be OK with combining flight-core and flight-grpc. It's not clear to 
> me anyways that the current split would actually be helpful if we do ever 
> want to support Flight without gRPC.
>
> On Tue, Dec 12, 2023, at 14:21, James Duong wrote:
> > An update on this work.
> >
> > I’ve been able to update the following to JPMS modules now:
> >
> >   *   arrow-memory-core
> >   *   arrow-memory-unsafe
> >   *   arrow-format
> >   *   arrow-vector
> >   *   arrow-compression
> >   *   arrow-tools
> >   *   arrow-avro
> >   *   arrow-jdbc
> >   *   arrow-algorithm
> >
> > I don’t think the following should be modularized (but would appreciate
> > feedback as well):
> >
> >   *   arrow-performance (test-only module)
> >   *   flight-integration-tests (test-only module)
> >   *   flight-jdbc-core
> >   *   flight-jdbc-driver (Need to think about this. MySQL suggests they
> > want to modularize their JDBC driver:
> > https://bugs.mysql.com/bug.php?id=97683 . How does this interact with
> > the ServiceLoader changes in JPMS as well).
> >
> > I’m starting the flight-related modules now. The first issue I’ve
> > noticed is that flight-core and flight-grpc reuse packages. I’d like to
> > combine flight-core and flight-grpc because of this (currently we only
> > have Flight using grpc in Java).
> >
> > After Flight I’d take a look at the modules that have native code:
> >
> >   *   arrow-c-data
> >   *   arrow-orc
> >   *   arrow-dataset
> >   *   arrow-gandiva
> >
> >
> > From: James Duong <james.du...@improving.com.INVALID>
> > Date: Tuesday, December 5, 2023 at 1:39 PM
> > To: dev@arrow.apache.org <dev@arrow.apache.org>
> > Subject: Re: [DISC][Java]: Migrate Arrow Java to JPMS Java Platform
> > Module System
> > I expect that we’d still have to change CLI arguments in JDK17.
> >
> > The need for changing the CLI arguments is due to memory-core now being
> > a named module while requiring access to Unsafe and using reflection
> > for accessing internals of java.nio.Buffer.
> >
> > We’re using JDK8 code for doing both currently. It might be worth
> > looking into if there’s a JDK9+ way of doing this without needing
> > reflection and compiling memory-core as a multi-release JAR.
> >
> > I don’t expect more CLI argument changes unless we use reflection on
> > NIO classes in other modules.
> >
> >
> > From: Adam Lippai <a...@rigo.sk>
> > Date: Tuesday, December 5, 2023 at 9:28 AM
> > To: dev@arrow.apache.org <dev@arrow.apache.org>
> > Subject: Re: [DISC][Java]: Migrate Arrow Java to JPMS Java Platform
> > Module System
> > I believe Spark 4.0 was mentioned before. It’ll require Java 17 and
> > will be
> > released in a few months (June?).
> >
> > Best regards,
> > Adam Lippai
> >
> > On Tue, Dec 5, 2023 at 12:05 David Li <lidav...@apache.org> wrote:
> >
> >> Thanks James for delving into this mess.
> >>
> >> It looks like this change is unavoidable if we want to modularize? I think
> >> this is OK. Will the CLI argument change as we continue modularizing, or is
> >> this the only change that will be needed?
> >>
> >> On Mon, Dec 4, 2023, at 20:07, James Duong wrote:
> >> > Hello,
> >> >
> >> > I did some work to separate the below PR into smaller PRs.
> >> >
> >> >
> >> >   *   Updating the versions of dependencies and maven plugins is done
> >> > and merged into master.
> >> >   *   I separated out the work modularizing arrow-vector,
> >> > arrow-memory-core/unsafe, and arrow-memory-netty.
> >> >
> >> > Modularizing arrow-memory-core requires a smaller change to user
> >> > command-line arguments. Instead of:
> >> > --add-opens=java.base/java.nio=ALL-UNNAMED
> >> >
> >> > The user needs to add:
> >> > --add-opens=java.base/java.nio=org.apache.arrow.memory.core,ALL-UNNAMED
> >> >
> >> > I initially tried to modularize arrow-vector separately from
> >> > arrow-memory-core but found that any meaningful operation in
> >> > arrow-vector would trigger an illegal access in memory-core if it
> >> > wasn’t modularized.
> >> >
> >> > I was able to run the tests for arrow-compression and arrow-tools
> >> > successfully after modularizing memory-core, memory-unsafe-, and
> >> > arrow-vector. Note that I had more success by making memory-core and
> >> > memory-unsafe automatic modules.
> >> >
> >> > I think we should make a decision here on if we want to bite the bullet
> >> > and introduce a breaking user-facing change around command-line
> >> > options. The other option is to wait for JDK 21 to modularize. That’s
> >> > farther down the line and requires refactoring much of the memory
> >> > module code and implementing a module using the foreign memory
> >> > interface.
> >> >
> >> > From: James Duong <james.du...@improving.com.INVALID>
> >> > Date: Tuesday, November 28, 2023 at 6:48 PM
> >> > To: dev@arrow.apache.org <dev@arrow.apache.org>
> >> > Subject: Re: [DISC][Java]: Migrate Arrow Java to JPMS Java Platform
> >> > Module System
> >> > Hi,
> >> >
> >> > I’ve made some major progress on this work in this PR:
> >> > https://github.com/apache/arrow/pull/38876
> >> >
> >> >
> >> >   *   The maven plugin for compiling module-info.java files using JDK 8
> >> > is working correctly.
> >> >   *   arrow-format, arrow-memory-core, arrow-memory-netty,
> >> > arrow-memory-unsafe, and arrow-vector have been modularized
> >> > successfully.
> >> >      *   Tests pass locally for all of these modules.
> >> >      *   They fail in CI. This is likely from me not updating a profile
> >> > somewhere.
> >> >
> >> > Similar to David’s PR from below, arrow-memory and modules needed to be
> >> > refactored fairly significantly and split into two modules: a
> >> > public-facing JPMS module and a separate module which adds to Netty’s
> >> > packages (memory-netty-buffer-patch). What’s more problematic is that
> >> > because we are using named modules now, users need to add more
> >> > arguments to their Java command line to use arrow. If one were to use
> >> > arrow-memory-netty they would need to add the following:
> >> >
> >> > --add-opens java.base/jdk.internal.misc=io.netty.common
> >> >
> >> --patch-module=io.netty.buffer=${project.basedir}/../memory-netty-buffer-patch/target/arrow-memory-netty-buffer-patch-${project.version}.jar
> >>
> >> >
> >> --add-opens=java.base/java.nio=org.apache.arrow.memory.core,io.netty.common,ALL-UNNAMED
> >> >
> >> > Depending on where the memory-netty-buffer-patch JAR is located, and
> >> > what version, the command the user needs to supply changes, so this
> >> > seems like it’d be really inconvenient.
> >> >
> >> > Do we want to proceed with modularizing existing memory modules? Both
> >> > netty and unsafe? Or wait until the new memory module from Java 21 is
> >> > available?
> >> >
> >> > The module-info.java files are written fairly naively. I haven’t
> >> > inspected thoroughly to determine what packages users will need.
> >> >
> >> > We can continue modularizing more components in a separate PR. Ideally
> >> > all the user breakage (class movement, new command-line argument
> >> > requirements) happens within one major Arrow version.
> >> >
> >> > From: James Duong <james.du...@improving.com.INVALID>
> >> > Date: Tuesday, November 21, 2023 at 1:16 PM
> >> > To: dev@arrow.apache.org <dev@arrow.apache.org>
> >> > Subject: Re: [DISC][Java]: Migrate Arrow Java to JPMS Java Platform
> >> > Module System
> >> > I’m following up on this topic.
> >> >
> >> > David has a PR from last year that’s done much of the heavy lifting for
> >> > refactoring the codebase to be package-friendly.
> >> > https://github.com/apache/arrow/pull/13072
> >> >
> >> > What’s changed since and what’s left:
> >> >
> >> >   *   New components have been added (Flight SQL for example) that will
> >> > need to be updated for modules.
> >> >   *   There wasn’t a clear solution on how to do this without breaking
> >> > JDK 8 support. Compiling module-info.java files require using JDK9, but
> >> > using JDK9 breaks using JDK8 methods of accessing sun.misc.Unsafe.
> >> >      *   There is a Gradle plugin that can compile module-info.java
> >> > files purely syntactically that we can adapt to maven. It has
> >> > limitations (the one I see is that it can’t iterate through
> >> > classloaders to handle annotations), but using this might be a good
> >> > stopgap until we JDK 8 support is deprecated.
> >> >   *   Some plugins need to be updated:
> >> >      *   maven-dependency-plugin 3.0.1 can’t parse module-info.class
> >> > files.
> >> >      *   checkstyle 3.1.0 can’t parse module-info.java files. Our
> >> > existing checkstyle rules file can’t be loaded with newer versions. We
> >> > can exclude module-info.java for now and have a separate Issue for
> >> > updating checkstyle itself and the rules file.
> >> >   *   grpc-java could not be modularized when the PR above was written.
> >> >      *   Grpc 1.57 now can be modularized
> >> > (grpc/grpc-java#3522<https://github.com/grpc/grpc-java/issues/3522>)
> >> >
> >> > From: David Dali Susanibar Arce <davi.sar...@gmail.com>
> >> > Date: Wednesday, May 25, 2022 at 5:02 AM
> >> > To: dev@arrow.apache.org <dev@arrow.apache.org>
> >> > Subject: [DISC][Java]: Migrate Arrow Java to JPMS Java Platform Module
> >> System
> >> > Hi All,
> >> >
> >> > This email's purpose is a request for comments to migrate Arrow Java to
> >> JPMS
> >> > Java Platform Module System <
> >> https://openjdk.java.net/projects/jigsaw/spec/><https://openjdk.java.net/projects/jigsaw/spec/%3e><https://openjdk.java.net/projects/jigsaw/spec/%3e%3chttps:/openjdk.java.net/projects/jigsaw/spec/%3e%3e><https://openjdk.java.net/projects/jigsaw/spec/%3e%3chttps:/openjdk.java.net/projects/jigsaw/spec/%3e%3e%3chttps:/openjdk.java.net/projects/jigsaw/spec/%3e%3chttps:/openjdk.java.net/projects/jigsaw/spec/%3e%3e%3e>
> >> > JSE 9+ (1).
> >> >
> >> > Current status:
> >> >
> >> > - Arrow Java use JSE1.8 specification
> >> >
> >> > - Arrow Java works with JSE1.8/9/11/17
> >> >
> >> > - This is possible because Java offers “legacy mode”
> >> >
> >> > Proposal:
> >> >
> >> > Migrate to JPMS Java Platform Module System. This Draft PR
> >> > <https://github.com/apache/arrow/pull/13072>(2<
> >> https://github.com/apache/arrow/pull/13072%3e(2<<https://github.com/apache/arrow/pull/13072%3e(2%3c><https://github.com/apache/arrow/pull/13072%3e(2%3c%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3c%3e><https://github.com/apache/arrow/pull/13072%3e(2%3c%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3c%3e%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3c%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3c%3e%3e>
> >> https://github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2
> >> <
> >> https://github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2>>>)<https://github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3e%3e%3e)><https://github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3e%3e%3e)%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3e%3e%3e)%3e><https://github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3e%3e%3e)%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3e%3e%3e)%3e%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3e%3e%3e)%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3e%3e%3e)%3e%3e>
> >>
> >> > contains an initial port of
> >> > the modules: Format / Memory Core / Memory Netty / Memory Unsafe /
> >> > Vector
> >> > for evaluation.
> >> >
> >> > Main Reason to migrate:
> >> >
> >> > - JPMS offer Strong encapsulation, Well-defined interfaces
> >> > <https://github.com/nipafx/demo-jigsaw-reflection>, Explicit
> >> dependencies.
> >> > <https://nipafx.dev/java-modules-reflection-vs-encapsulation/> (3)(4)
> >> >
> >> > - JPMS offer reliable configuration and security to hide platform
> >> internals.
> >> >
> >> > - JPMS offers a partial solution to solve problems about read (80%)
> >> /write
> >> > (20%) code.
> >> >
> >> > - JPMS offer optimization for readability about read/write ratio (90/10)
> >> > thru module-info.java.
> >> >
> >> > - Consistency logs, JPMS implement consistency logs to really use that to
> >> > solve the current problem.
> >> >
> >> > - Be able to customize JRE needed with only modules needed (not
> >> > java.desktop for example and others) thru JLink.
> >> >
> >> > - Modules have also been implemented by other languages such as
> >> Javascript
> >> > (ES2015), C++(C++20), Net (Nuget/NetCore)..
> >> >
> >> > - Consider taking a look at this discussion about pros/cons
> >> > <
> >> https://www.reddit.com/r/java/comments/okt3j3/do_you_use_jigsaw_modules_in_your_java_projects/
> >> >
> >> > (5).
> >> >
> >> > - Eventual migration to JPMS is a practical necessity as more projects
> >> > migrate.
> >> >
> >> > Effort:
> >> >
> >> > - First of all we need to decide to move from JSE1.8 to JSE9+ or be able
> >> to
> >> > offer support for both jar components JSE1.8 and JSE9+ included.
> >> >
> >> > - Go bottom up for JPMS.
> >> >
> >> > - Packages need to be unique (i.e. org.apache.arrow.memory /
> >> > io.netty.buffer). Review Draft PR with initial proposal.
> >> >
> >> > - Dependencies also need to be modularized. If some of our current
> >> > dependencies are not able to be used as a module this will be a blocker
> >> for
> >> > our modules (we could patch that but this is an extra effort).
> >> >
> >> > Killers:
> >> >
> >> > - FIXME! I need your support to identify killer reasons to be able to
> >> push
> >> > this implementation.
> >> >
> >> > Please let us know if Arrow Java to JPMS Java Platform Module System is
> >> > needed and should be implemented.
> >> >
> >> > Please use this file for any comments
> >> >
> >> https://docs.google.com/document/d/1qcJ8LPm33UICuGjRnsGBcm8dLI08MyiL8BO5JVzTutA/edit?usp=sharing
> >> >
> >> > Resources used:
> >> >
> >> > (1): https://openjdk.java.net/projects/jigsaw/spec/
> >> >
> >> > (2): https://github.com/apache/arrow/pull/13072
> >> >
> >> > (3): https://nipafx.dev/java-modules-reflection-vs-encapsulation/
> >> >
> >> > (4): https://github.com/nipafx/demo-jigsaw-reflection
> >> >
> >> > (5):
> >> >
> >> https://www.reddit.com/r/java/comments/okt3j3/do_you_use_jigsaw_modules_in_your_java_projects/
> >> >
> >> > Best regards,
> >> >
> >> > --
> >> > David
> >>

Reply via email to