This is an automated email from the ASF dual-hosted git repository.
andygrove pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/datafusion-java.git
The following commit(s) were added to refs/heads/main by this push:
new 5292e05 feat(proto): generate datafusion-proto Java classes at build time (#9)
5292e05 is described below
commit 5292e05c924a2b4125f2db66898e236dc969912b
Author: Andy Grove <[email protected]>
AuthorDate: Tue May 12 20:16:52 2026 -0600
feat(proto): generate datafusion-proto Java classes at build time (#9)
---
CONTRIBUTING.md | 92 ++++++++++++++++++++++
pom.xml | 58 ++++++++++++++
.../datafusion/proto/ProtoGenerationTest.java | 41 ++++++++++
3 files changed, 191 insertions(+)
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
new file mode 100644
index 0000000..302b6b3
--- /dev/null
+++ b/CONTRIBUTING.md
@@ -0,0 +1,92 @@
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied. See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Contributing to Apache DataFusion Java
+
+Bug reports, design discussion, and patches are welcome. This project follows
+the Apache DataFusion contribution model.
+
+## Filing issues and discussing changes
+
+- File bugs and feature requests on [GitHub issues](https://github.com/apache/datafusion-java/issues).
+- For larger or design-level discussion, the mailing list is
+ [[email protected]](mailto:[email protected]).
+- Please open an issue before sending a PR for any significant change so the
+ approach can be agreed on first.
+
+## Development workflow
+
+Branch from `main`, write changes with [conventional commit](https://www.conventionalcommits.org/)
+messages in the imperative mood (e.g. `feat: add foo`, `fix(native): handle bar`),
+and open a pull request targeting `main`.
+
+The first build in a fresh checkout reaches out to `raw.githubusercontent.com`
+to fetch the DataFusion protobuf schemas (see *Updating the DataFusion /
+protobuf schema version* below). Subsequent builds are offline — the
+`download-maven-plugin` cache under `~/.m2/repository/.cache/` satisfies them.
+
+## Code style
+
+- Java: run `./mvnw spotless:apply` before committing. CI fails the build if
+ formatting drifts.
+- Rust: run `cargo fmt` and `cargo clippy --all-targets -- -D warnings` inside
+ `native/`.
+- New source files need the Apache 2.0 license header. Apache RAT enforces this
+ during `verify`.
+
+## Updating the DataFusion / protobuf schema version
+
+Three things must move together when bumping DataFusion:
+
+1. `native/Cargo.toml` — the `datafusion` crate dependency.
+2. `pom.xml` — the `<datafusion.version>` Maven property. **Must equal the
+ Cargo version**; a mismatch means JVM-built protobuf plans won't deserialize
+ on the native side.
+3. `pom.xml` — the `<sha512>` checksums on the two `download-maven-plugin`
+ executions. These pin the downloaded `.proto` files; the build fails if
+ upstream silently re-tags them, which is the desired behavior.
+
+Recipe:
+
+```sh
+# 1. Bump the Cargo dep
+$EDITOR native/Cargo.toml # set datafusion = "<new>"
+(cd native && cargo update -p datafusion)
+
+# 2. Bump the Maven property to match
+$EDITOR pom.xml # set <datafusion.version>
+
+# 3. Compute the new SHA-512 hashes for both `.proto` files from the upstream
+# tag you just set in step 2, then paste them into the two <sha512> elements
+# in pom.xml.
+NEW=$(grep -m1 -oE '<datafusion.version>[^<]+' pom.xml | cut -d'>' -f2)
+curl -sL "https://raw.githubusercontent.com/apache/datafusion/$NEW/datafusion/proto-common/proto/datafusion_common.proto" | shasum -a 512 | awk '{print $1}'
+curl -sL "https://raw.githubusercontent.com/apache/datafusion/$NEW/datafusion/proto/proto/datafusion.proto" | shasum -a 512 | awk '{print $1}'
+$EDITOR pom.xml # paste the two hashes into the <sha512> elements
+
+# Drop the local download cache so the next build re-downloads against the new hashes.
+rm -rf ~/.m2/repository/.cache/download-maven-plugin target/proto
+
+# 4. Verify
+make && make test
+```
+
+The protobuf runtime version (`<protobuf.version>` in `pom.xml`) tracks the
+Java ecosystem (security and JDK compatibility), not DataFusion. Bump it
+independently when there is a reason.
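Step 3 of the recipe above computes the SHA-512 digests with `curl` and `shasum`. As a rough JDK-only equivalent, the sketch below hashes local copies of the `.proto` files and prints the hex strings that would go into the `<sha512>` elements. The class name `Sha512Sketch` and its methods are hypothetical and are not part of this commit.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class Sha512Sketch {

    /** Returns the lowercase hex SHA-512 digest of the given bytes. */
    static String sha512Hex(byte[] data) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-512");
            StringBuilder hex = new StringBuilder(128);
            for (byte b : digest.digest(data)) {
                // Two hex characters per byte, zero-padded.
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-512 is a required algorithm on every conforming JDK.
            throw new IllegalStateException("SHA-512 not available", e);
        }
    }

    public static void main(String[] args) throws Exception {
        // Each argument is a path to an already-downloaded file
        // (hypothetical usage: pass the two .proto files).
        for (String path : args) {
            byte[] bytes = Files.readAllBytes(Paths.get(path));
            System.out.println(sha512Hex(bytes) + "  " + path);
        }
    }
}
```

Run it against files fetched from the tag named by `<datafusion.version>`; it prints one digest per file and nothing when called without arguments.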
diff --git a/pom.xml b/pom.xml
index d2b1b91..9039d8b 100644
--- a/pom.xml
+++ b/pom.xml
@@ -33,6 +33,8 @@ under the License.
<maven.compiler.target>17</maven.compiler.target>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<junit.version>5.11.3</junit.version>
+ <datafusion.version>53.1.0</datafusion.version>
+ <protobuf.version>3.25.5</protobuf.version>
</properties>
<dependencies>
@@ -58,9 +60,21 @@ under the License.
<version>19.0.0</version>
<scope>runtime</scope>
</dependency>
+ <dependency>
+ <groupId>com.google.protobuf</groupId>
+ <artifactId>protobuf-java</artifactId>
+ <version>${protobuf.version}</version>
+ </dependency>
</dependencies>
<build>
+ <extensions>
+ <extension>
+ <groupId>kr.motd.maven</groupId>
+ <artifactId>os-maven-plugin</artifactId>
+ <version>1.7.1</version>
+ </extension>
+ </extensions>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
@@ -118,6 +132,7 @@ under the License.
<exclude>NOTICE.txt</exclude>
<!-- Project documentation that does not require headers -->
<exclude>README.md</exclude>
+ <exclude>CONTRIBUTING.md</exclude>
<exclude>docs/**</exclude>
<!-- VCS and editor metadata -->
<exclude>.gitignore</exclude>
@@ -138,6 +153,49 @@ under the License.
</excludes>
</configuration>
</plugin>
+ <plugin>
+ <groupId>com.googlecode.maven-download-plugin</groupId>
+ <artifactId>download-maven-plugin</artifactId>
+ <version>1.9.0</version>
+ <executions>
+ <execution>
+ <id>fetch-datafusion-common-proto</id>
+ <phase>generate-sources</phase>
+ <goals><goal>wget</goal></goals>
+ <configuration>
+ <url>https://raw.githubusercontent.com/apache/datafusion/${datafusion.version}/datafusion/proto-common/proto/datafusion_common.proto</url>
+ <outputDirectory>${project.build.directory}/proto/datafusion/proto-common/proto</outputDirectory>
+ <outputFileName>datafusion_common.proto</outputFileName>
+ <sha512>d6f3368372ea277cc23e26f196994b81616d38599357bb374cbd7eb1760e649a789e4c133d86a395ac701049a500348da2ec039d3f978ac5d8112c2876dded1f</sha512>
+ </configuration>
+ </execution>
+ <execution>
+ <id>fetch-datafusion-proto</id>
+ <phase>generate-sources</phase>
+ <goals><goal>wget</goal></goals>
+ <configuration>
+ <url>https://raw.githubusercontent.com/apache/datafusion/${datafusion.version}/datafusion/proto/proto/datafusion.proto</url>
+ <outputDirectory>${project.build.directory}/proto/datafusion/proto/proto</outputDirectory>
+ <outputFileName>datafusion.proto</outputFileName>
+ <sha512>c3d162b8e2a418e03f74caceaccfd934af89bb95a12ede13d4cc1701d24c734d74b1e96372142b173db05938dab7f965ad60d476363308c441677a63ea5fbcf7</sha512>
+ </configuration>
+ </execution>
+ </executions>
+ </plugin>
+ <plugin>
+ <groupId>org.xolstice.maven.plugins</groupId>
+ <artifactId>protobuf-maven-plugin</artifactId>
+ <version>0.6.1</version>
+ <configuration>
+ <protocArtifact>com.google.protobuf:protoc:${protobuf.version}:exe:${os.detected.classifier}</protocArtifact>
+ <protoSourceRoot>${project.build.directory}/proto</protoSourceRoot>
+ </configuration>
+ <executions>
+ <execution>
+ <goals><goal>compile</goal></goals>
+ </execution>
+ </executions>
+ </plugin>
</plugins>
</build>
</project>
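The two `wget` executions above build their download URLs from the `<datafusion.version>` property. As an illustration only, the sketch below mimics that substitution outside Maven: it pulls the property out of a POM string with a regular expression and assembles the same URL shape. The class `ProtoUrlSketch` and its methods are hypothetical, and a regular expression is a stand-in for real POM parsing and property interpolation.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ProtoUrlSketch {

    // Matches the text content of the <datafusion.version> property.
    private static final Pattern VERSION =
            Pattern.compile("<datafusion\\.version>([^<]+)</datafusion\\.version>");

    /** Extracts the datafusion.version property value from POM text. */
    static String datafusionVersion(String pomXml) {
        Matcher m = VERSION.matcher(pomXml);
        if (!m.find()) {
            throw new IllegalArgumentException("no <datafusion.version> property found");
        }
        return m.group(1).trim();
    }

    /** Builds the raw.githubusercontent.com URL for a file at the given tag. */
    static String protoUrl(String version, String pathInRepo) {
        return "https://raw.githubusercontent.com/apache/datafusion/"
                + version + "/" + pathInRepo;
    }

    public static void main(String[] args) {
        // Inline POM fragment taken from the properties added in this commit.
        String pom = "<properties><datafusion.version>53.1.0</datafusion.version></properties>";
        String version = datafusionVersion(pom);
        System.out.println(protoUrl(version, "datafusion/proto-common/proto/datafusion_common.proto"));
        System.out.println(protoUrl(version, "datafusion/proto/proto/datafusion.proto"));
    }
}
```

Because the tag is part of the URL, changing the property alone points both downloads at a different upstream revision, which is why the checksums have to be updated in the same change.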
diff --git a/src/test/java/org/apache/datafusion/proto/ProtoGenerationTest.java b/src/test/java/org/apache/datafusion/proto/ProtoGenerationTest.java
new file mode 100644
index 0000000..9d99763
--- /dev/null
+++ b/src/test/java/org/apache/datafusion/proto/ProtoGenerationTest.java
@@ -0,0 +1,41 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.datafusion.proto;
+
+import static org.junit.jupiter.api.Assertions.assertNotNull;
+
+import org.apache.datafusion.protobuf.LogicalPlanNode;
+import org.junit.jupiter.api.Test;
+
+/**
+ * Smoke test: confirms the datafusion-proto schema was downloaded, protoc generated Java sources,
+ * those sources landed on the compile classpath, and the {@code protobuf-java} runtime resolves.
+ *
+ * <p>Does not exercise any DataFusion plan semantics — those tests arrive with JVM-side plan
+ * construction.
+ */
+class ProtoGenerationTest {
+
+ @Test
+ void generatedClassIsLoadableAndConstructible() {
+ LogicalPlanNode node = LogicalPlanNode.newBuilder().build();
+ assertNotNull(node);
+ }
+}