I would be OK with combining flight-core and flight-grpc. It's not clear to me 
anyways that the current split would actually be helpful if we do ever want to 
support Flight without gRPC.

On Tue, Dec 12, 2023, at 14:21, James Duong wrote:
> An update on this work.
>
> I’ve been able to update the following to JPMS modules now:
>
>   *   arrow-memory-core
>   *   arrow-memory-unsafe
>   *   arrow-format
>   *   arrow-vector
>   *   arrow-compression
>   *   arrow-tools
>   *   arrow-avro
>   *   arrow-jdbc
>   *   arrow-algorithm
>
> I don’t think the following should be modularized (but would appreciate 
> feedback as well):
>
>   *   arrow-performance (test-only module)
>   *   flight-integration-tests (test-only module)
>   *   flight-jdbc-core
>   *   flight-jdbc-driver (Need to think about this. MySQL suggests they 
> want to modularize their JDBC driver: 
> https://bugs.mysql.com/bug.php?id=97683 . How does this interact with 
> the ServiceLoader changes in JPMS as well).
>
> I’m starting the flight-related modules now. The first issue I’ve 
> noticed is that flight-core and flight-grpc reuse packages. I’d like to 
> combine flight-core and flight-grpc because of this (currently we only 
> have Flight using grpc in Java).
>
> After Flight I’d take a look at the modules that have native code:
>
>   *   arrow-c-data
>   *   arrow-orc
>   *   arrow-dataset
>   *   arrow-gandiva
>
>
> From: James Duong <james.du...@improving.com.INVALID>
> Date: Tuesday, December 5, 2023 at 1:39 PM
> To: dev@arrow.apache.org <dev@arrow.apache.org>
> Subject: Re: [DISC][Java]: Migrate Arrow Java to JPMS Java Platform 
> Module System
> I expect that we’d still have to change CLI arguments in JDK17.
>
> The need for changing the CLI arguments is due to memory-core now being 
> a named module while requiring access to Unsafe and using reflection 
> for accessing internals of java.nio.Buffer.
>
> We’re using JDK8 code for doing both currently. It might be worth 
> looking into if there’s a JDK9+ way of doing this without needing 
> reflection and compiling memory-core as a multi-release JAR.
>
> I don’t expect more CLI argument changes unless we use reflection on 
> NIO classes in other modules.
>
>
> From: Adam Lippai <a...@rigo.sk>
> Date: Tuesday, December 5, 2023 at 9:28 AM
> To: dev@arrow.apache.org <dev@arrow.apache.org>
> Subject: Re: [DISC][Java]: Migrate Arrow Java to JPMS Java Platform 
> Module System
> I believe Spark 4.0 was mentioned before. It’ll require Java 17 and 
> will be
> released in a few months (June?).
>
> Best regards,
> Adam Lippai
>
> On Tue, Dec 5, 2023 at 12:05 David Li <lidav...@apache.org> wrote:
>
>> Thanks James for delving into this mess.
>>
>> It looks like this change is unavoidable if we want to modularize? I think
>> this is OK. Will the CLI argument change as we continue modularizing, or is
>> this the only change that will be needed?
>>
>> On Mon, Dec 4, 2023, at 20:07, James Duong wrote:
>> > Hello,
>> >
>> > I did some work to separate the below PR into smaller PRs.
>> >
>> >
>> >   *   Updating the versions of dependencies and maven plugins is done
>> > and merged into master.
>> >   *   I separated out the work modularizing arrow-vector,
>> > arrow-memory-core/unsafe, and arrow-memory-netty.
>> >
>> > Modularizing arrow-memory-core requires a smaller change to user
>> > command-line arguments. Instead of:
>> > --add-opens=java.base/java.nio=ALL-UNNAMED
>> >
>> > The user needs to add:
>> > --add-opens=java.base/java.nio=org.apache.arrow.memory.core,ALL-UNNAMED
>> >
>> > I initially tried to modularize arrow-vector separately from
>> > arrow-memory-core but found that any meaningful operation in
>> > arrow-vector would trigger an illegal access in memory-core if it
>> > wasn’t modularized.
>> >
>> > I was able to run the tests for arrow-compression and arrow-tools
>> > successfully after modularizing memory-core, memory-unsafe-, and
>> > arrow-vector. Note that I had more success by making memory-core and
>> > memory-unsafe automatic modules.
>> >
>> > I think we should make a decision here on if we want to bite the bullet
>> > and introduce a breaking user-facing change around command-line
>> > options. The other option is to wait for JDK 21 to modularize. That’s
>> > farther down the line and requires refactoring much of the memory
>> > module code and implementing a module using the foreign memory
>> > interface.
>> >
>> > From: James Duong <james.du...@improving.com.INVALID>
>> > Date: Tuesday, November 28, 2023 at 6:48 PM
>> > To: dev@arrow.apache.org <dev@arrow.apache.org>
>> > Subject: Re: [DISC][Java]: Migrate Arrow Java to JPMS Java Platform
>> > Module System
>> > Hi,
>> >
>> > I’ve made some major progress on this work in this PR:
>> > https://github.com/apache/arrow/pull/38876
>> >
>> >
>> >   *   The maven plugin for compiling module-info.java files using JDK 8
>> > is working correctly.
>> >   *   arrow-format, arrow-memory-core, arrow-memory-netty,
>> > arrow-memory-unsafe, and arrow-vector have been modularized
>> > successfully.
>> >      *   Tests pass locally for all of these modules.
>> >      *   They fail in CI. This is likely from me not updating a profile
>> > somewhere.
>> >
>> > Similar to David’s PR from below, arrow-memory and modules needed to be
>> > refactored fairly significantly and split into two modules: a
>> > public-facing JPMS module and a separate module which adds to Netty’s
>> > packages (memory-netty-buffer-patch). What’s more problematic is that
>> > because we are using named modules now, users need to add more
>> > arguments to their Java command line to use arrow. If one were to use
>> > arrow-memory-netty they would need to add the following:
>> >
>> > --add-opens java.base/jdk.internal.misc=io.netty.common
>> >
>> --patch-module=io.netty.buffer=${project.basedir}/../memory-netty-buffer-patch/target/arrow-memory-netty-buffer-patch-${project.version}.jar
>>
>> >
>> --add-opens=java.base/java.nio=org.apache.arrow.memory.core,io.netty.common,ALL-UNNAMED
>> >
>> > Depending on where the memory-netty-buffer-patch JAR is located, and
>> > what version, the command the user needs to supply changes, so this
>> > seems like it’d be really inconvenient.
>> >
>> > Do we want to proceed with modularizing existing memory modules? Both
>> > netty and unsafe? Or wait until the new memory module from Java 21 is
>> > available?
>> >
>> > The module-info.java files are written fairly naively. I haven’t
>> > inspected thoroughly to determine what packages users will need.
>> >
>> > We can continue modularizing more components in a separate PR. Ideally
>> > all the user breakage (class movement, new command-line argument
>> > requirements) happens within one major Arrow version.
>> >
>> > From: James Duong <james.du...@improving.com.INVALID>
>> > Date: Tuesday, November 21, 2023 at 1:16 PM
>> > To: dev@arrow.apache.org <dev@arrow.apache.org>
>> > Subject: Re: [DISC][Java]: Migrate Arrow Java to JPMS Java Platform
>> > Module System
>> > I’m following up on this topic.
>> >
>> > David has a PR from last year that’s done much of the heavy lifting for
>> > refactoring the codebase to be package-friendly.
>> > https://github.com/apache/arrow/pull/13072
>> >
>> > What’s changed since and what’s left:
>> >
>> >   *   New components have been added (Flight SQL for example) that will
>> > need to be updated for modules.
>> >   *   There wasn’t a clear solution on how to do this without breaking
>> > JDK 8 support. Compiling module-info.java files require using JDK9, but
>> > using JDK9 breaks using JDK8 methods of accessing sun.misc.Unsafe.
>> >      *   There is a Gradle plugin that can compile module-info.java
>> > files purely syntactically that we can adapt to maven. It has
>> > limitations (the one I see is that it can’t iterate through
>> > classloaders to handle annotations), but using this might be a good
>> > stopgap until we JDK 8 support is deprecated.
>> >   *   Some plugins need to be updated:
>> >      *   maven-dependency-plugin 3.0.1 can’t parse module-info.class
>> > files.
>> >      *   checkstyle 3.1.0 can’t parse module-info.java files. Our
>> > existing checkstyle rules file can’t be loaded with newer versions. We
>> > can exclude module-info.java for now and have a separate Issue for
>> > updating checkstyle itself and the rules file.
>> >   *   grpc-java could not be modularized when the PR above was written.
>> >      *   Grpc 1.57 now can be modularized
>> > (grpc/grpc-java#3522<https://github.com/grpc/grpc-java/issues/3522>)
>> >
>> > From: David Dali Susanibar Arce <davi.sar...@gmail.com>
>> > Date: Wednesday, May 25, 2022 at 5:02 AM
>> > To: dev@arrow.apache.org <dev@arrow.apache.org>
>> > Subject: [DISC][Java]: Migrate Arrow Java to JPMS Java Platform Module
>> System
>> > Hi All,
>> >
>> > This email's purpose is a request for comments to migrate Arrow Java to
>> JPMS
>> > Java Platform Module System <
>> https://openjdk.java.net/projects/jigsaw/spec/><https://openjdk.java.net/projects/jigsaw/spec/%3e><https://openjdk.java.net/projects/jigsaw/spec/%3e%3chttps:/openjdk.java.net/projects/jigsaw/spec/%3e%3e>
>> > JSE 9+ (1).
>> >
>> > Current status:
>> >
>> > - Arrow Java use JSE1.8 specification
>> >
>> > - Arrow Java works with JSE1.8/9/11/17
>> >
>> > - This is possible because Java offers “legacy mode”
>> >
>> > Proposal:
>> >
>> > Migrate to JPMS Java Platform Module System. This Draft PR
>> > <https://github.com/apache/arrow/pull/13072>(2<
>> https://github.com/apache/arrow/pull/13072%3e(2<<https://github.com/apache/arrow/pull/13072%3e(2%3c><https://github.com/apache/arrow/pull/13072%3e(2%3c%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3c%3e>
>> https://github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2
>> <
>> https://github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2>>>)<https://github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3e%3e%3e)><https://github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3e%3e%3e)%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3chttps:/github.com/apache/arrow/pull/13072%3e(2%3e%3e%3e)%3e>
>>
>> > contains an initial port of
>> > the modules: Format / Memory Core / Memory Netty / Memory Unsafe /
>> > Vector
>> > for evaluation.
>> >
>> > Main Reason to migrate:
>> >
>> > - JPMS offer Strong encapsulation, Well-defined interfaces
>> > <https://github.com/nipafx/demo-jigsaw-reflection>, Explicit
>> dependencies.
>> > <https://nipafx.dev/java-modules-reflection-vs-encapsulation/> (3)(4)
>> >
>> > - JPMS offer reliable configuration and security to hide platform
>> internals.
>> >
>> > - JPMS offers a partial solution to solve problems about read (80%)
>> /write
>> > (20%) code.
>> >
>> > - JPMS offer optimization for readability about read/write ratio (90/10)
>> > thru module-info.java.
>> >
>> > - Consistency logs, JPMS implement consistency logs to really use that to
>> > solve the current problem.
>> >
>> > - Be able to customize JRE needed with only modules needed (not
>> > java.desktop for example and others) thru JLink.
>> >
>> > - Modules have also been implemented by other languages such as
>> Javascript
>> > (ES2015), C++(C++20), Net (Nuget/NetCore)..
>> >
>> > - Consider taking a look at this discussion about pros/cons
>> > <
>> https://www.reddit.com/r/java/comments/okt3j3/do_you_use_jigsaw_modules_in_your_java_projects/
>> >
>> > (5).
>> >
>> > - Eventual migration to JPMS is a practical necessity as more projects
>> > migrate.
>> >
>> > Effort:
>> >
>> > - First of all we need to decide to move from JSE1.8 to JSE9+ or be able
>> to
>> > offer support for both jar components JSE1.8 and JSE9+ included.
>> >
>> > - Go bottom up for JPMS.
>> >
>> > - Packages need to be unique (i.e. org.apache.arrow.memory /
>> > io.netty.buffer). Review Draft PR with initial proposal.
>> >
>> > - Dependencies also need to be modularized. If some of our current
>> > dependencies are not able to be used as a module this will be a blocker
>> for
>> > our modules (we could patch that but this is an extra effort).
>> >
>> > Killers:
>> >
>> > - FIXME! I need your support to identify killer reasons to be able to
>> push
>> > this implementation.
>> >
>> > Please let us know if Arrow Java to JPMS Java Platform Module System is
>> > needed and should be implemented.
>> >
>> > Please use this file for any comments
>> >
>> https://docs.google.com/document/d/1qcJ8LPm33UICuGjRnsGBcm8dLI08MyiL8BO5JVzTutA/edit?usp=sharing
>> >
>> > Resources used:
>> >
>> > (1): https://openjdk.java.net/projects/jigsaw/spec/
>> >
>> > (2): https://github.com/apache/arrow/pull/13072
>> >
>> > (3): https://nipafx.dev/java-modules-reflection-vs-encapsulation/
>> >
>> > (4): https://github.com/nipafx/demo-jigsaw-reflection
>> >
>> > (5):
>> >
>> https://www.reddit.com/r/java/comments/okt3j3/do_you_use_jigsaw_modules_in_your_java_projects/
>> >
>> > Best regards,
>> >
>> > --
>> > David
>>

Reply via email to