This is an automated email from the ASF dual-hosted git repository.
xushiyan pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/hudi.git
The following commit(s) were added to refs/heads/asf-site by this push:
new c7e0bf278d16 docs: Update developer setup guide and refresh IDE setup
screenshots (#14251)
c7e0bf278d16 is described below
commit c7e0bf278d16dfca35210a61908e23b22a6c985a
Author: deepakpanda93 <[email protected]>
AuthorDate: Sat Jan 10 05:48:30 2026 +0530
docs: Update developer setup guide and refresh IDE setup screenshots
(#14251)
---------
Co-authored-by: Shiyan Xu <[email protected]>
---
website/contribute/developer-setup.md | 219 ++++++++++++---------
.../images/contributing/IDE_setup_annotation.png | Bin 318314 -> 349210 bytes
.../images/contributing/IDE_setup_checkstyle_1.png | Bin 262795 -> 235873 bytes
.../images/contributing/IDE_setup_checkstyle_2.png | Bin 304546 -> 307093 bytes
.../images/contributing/IDE_setup_checkstyle_3.png | Bin 668674 -> 708522 bytes
.../images/contributing/IDE_setup_checkstyle_4.png | Bin 292288 -> 322669 bytes
.../IDE_setup_code_style_java_after.png | Bin 368410 -> 376770 bytes
.../IDE_setup_code_style_java_before.png | Bin 420130 -> 445945 bytes
.../images/contributing/IDE_setup_copyright_1.png | Bin 710312 -> 386844 bytes
.../images/contributing/IDE_setup_copyright_2.png | Bin 581227 -> 213911 bytes
.../assets/images/contributing/IDE_setup_java.png | Bin 134352 -> 0 bytes
.../images/contributing/IDE_setup_maven_1.png | Bin 157576 -> 176450 bytes
.../images/contributing/IDE_setup_maven_2.png | Bin 588765 -> 237088 bytes
.../images/contributing/IDE_setup_reload.png | Bin 242712 -> 316371 bytes
14 files changed, 126 insertions(+), 93 deletions(-)
diff --git a/website/contribute/developer-setup.md
b/website/contribute/developer-setup.md
index bee95db66adb..6ce9933021e6 100644
--- a/website/contribute/developer-setup.md
+++ b/website/contribute/developer-setup.md
@@ -11,130 +11,87 @@ last_modified_at: 2024-08-12T10:47:57-07:00
To contribute code, you need
- a GitHub account
- - a Linux (or) macOS development environment with Java JDK 8, Apache Maven
(3.x+) installed
+ - Git installed for version control
+ - a Linux (or) macOS development environment with Java JDK 11, Apache Maven
(3.x+) installed
- [Docker](https://www.docker.com/) installed for running demo, integ tests
or building website
- for large contributions, a signed [Individual Contributor License
Agreement](https://www.apache.org/licenses/icla.pdf) (ICLA) to the Apache
Software Foundation (ASF).
- (Recommended) Join our dev mailing list & slack channel, listed on
[community](/community/get-involved) page.
+## Default Build Profiles
-## IntelliJ Setup
-
-IntelliJ is the recommended IDE for developing Hudi. To contribute, you would
need to do the following
-
-- Fork the Hudi code on Github & then clone your own fork locally. Once
cloned, we recommend building as per instructions on [spark
quickstart](/docs/quick-start-guide) or [flink
quickstart](/docs/flink-quick-start-guide).
-
-- In IntelliJ, select `File` > `New` > `Project from Existing Sources...` and
select the `pom.xml` file under your local Hudi source folder.
-
-- In `Project Structure`, select Java 1.8 as the Project SDK.
-
- 
-
-- Make the following configuration in `Preferences` or `Settings` in newer
IntelliJ so the Hudi code can compile in the IDE:
- * Enable annotation processing in compiler.
-
- 
- * Configure Maven *NOT* to delegate IDE build/run actions to Maven so you
can run tests in IntelliJ directly.
-
- 
- 
-
-- If you switch maven build profile, e.g., from Spark 3.2 to Spark 3.3, you
need to first build Hudi in the command line first and `Reload All Maven
Projects` in IntelliJ like below,
-so that IntelliJ re-indexes the code.
-
- 
-
-- \[Recommended\] We have embraced the code style largely based on [google
format](https://google.github.io/styleguide/javaguide.html). Please set up your
IDE with style files from [\<project
root\>/style/](https://github.com/apache/hudi/tree/master/style). These
instructions have been tested on IntelliJ.
- * Open `Settings` in IntelliJ
- * Install and activate CheckStyle plugin
-
-

- * In `Settings` > `Tools` > `Checkstyle`, use a recent version, e.g.,
10.17.0
+The following table summarizes the default build profiles and versions used by
the Apache Hudi project.
-

- * Click on `+`, add the style/checkstyle.xml file, and name the
configuration as "Hudi Checks"
+| Component | Default Profile / Version | Notes
|
+|-----------|---------------------------|------------------------------------------------|
+| Spark | 3.5 | Default Spark 3 build profile
|
+| Scala | 2.12 | Default Scala version for Spark
builds |
+| Java | 11 | Required Java version for building
the project |
+| Flink | 1.20 | Default Flink streaming profile
|
-

- * Activate the checkstyle configuration by checking `Active`
-

- * Open `Settings` > `Editor` > `Code Style` > `Java`
- * Select "Project" as the "Scheme". Then, go to the settings, open
`Import Scheme` > `CheckStyle Configuration`, select `style/checkstyle.xml` to
load
+## Useful Maven commands for developers
-

- * After loading the configuration, you should see that the `Indent` and
`Continuation indent` become 2 and 4, from 4 and 8, respectively
+Here are some Maven commands that could be useful for developers.
-

- * Apply/Save the changes
-- \[Recommended\] Set up the [Save Action
Plugin](https://plugins.jetbrains.com/plugin/7642-save-actions) to auto format
& organize imports on save. The Maven Compilation life-cycle will fail if there
are checkstyle violations.
-
-- \[Recommended\] As it is required to add [Apache License
header](https://www.apache.org/legal/src-headers#headers) to all source files,
configuring "Copyright" settings as shown below will come in handy.
-
-
-
-
-
-## Useful Maven commands for developers.
-Listing out some of the maven commands that could be useful for developers.
-
-- Compile/build entire project
+- Compile/build entire project
```shell
-mvn clean package -DskipTests
+mvn clean package -DskipTests
```
-Default profile is spark2 and scala2.11
+Default profile is Spark 3.5 and Scala 2.12
-- For continuous development, you may want to build only the modules of
interest. for eg, if you have been working with
-Hudi Streamer, you can build using this command instead of entire project.
Majority of time goes into building all different bundles we have
-like flink bundle, presto bundle, trino bundle etc. But if you are developing
something confined to hudi-utilties, you can achieve faster
+- For continuous development, you may want to build only the modules of
interest. For example, if you have been working with
+Hudi Streamer, you can build using this command instead of the entire project.
Most of the time goes into building the various bundles,
+such as the flink bundle, presto bundle, trino bundle etc. But if you are developing
something confined to hudi-utilities, you can achieve faster
build times.
```shell
mvn package -DskipTests -pl packaging/hudi-utilities-bundle/ -am
```
-To enable multi-threaded building, you can add -T.
+To enable multi-threaded building, you can add -T.
```shell
mvn -T 2C package -DskipTests -pl packaging/hudi-utilities-bundle/ -am
```
-This command will use 2 parallel threads to build.
+This command will use 2 threads per available CPU core to build (`-T 2C`); use `-T 2` for exactly 2 threads.
-You can also confine the build to just one module if need be.
+You can also confine the build to just one module if need be.
```shell
mvn -T 2C package -DskipTests -pl hudi-spark-datasource/hudi-spark -am
```
-Note: "-am" will build all dependent modules as well.
-In local laptop, entire project build can take somewhere close to 7 to 10
mins. While buildig just hudi-spark-datasource/hudi-spark
-with multi-threaded, could get your compilation in 1.5 to 2 mins.
+Note: "-am" will build all dependent modules as well.
+On a local laptop, an entire project build can take around 7 to 10
minutes, while building just hudi-spark-datasource/hudi-spark
+with multiple threads can complete in 1.5 to 2 minutes.
-If you wish to run any single test class in java.
+To run a single test class in Java:
```shell
-mvn test -Punit-tests -pl hudi-spark-datasource/hudi-spark/ -am -B
-DfailIfNoTests=false -Dtest=TestCleaner
+mvn test -Punit-tests -pl hudi-spark-datasource/hudi-spark/ -am -B
-DfailIfNoTests=false -Dtest=TestCleaner -Dspark3.5
```
-If you wish to run a single test method in java.
+To run a single test method in Java:
```shell
-mvn test -Punit-tests -pl hudi-spark-datasource/hudi-spark/ -am -B
-DfailIfNoTests=false -Dtest=TestCleaner#testKeepLatestCommitsMOR
+mvn test -Punit-tests -pl hudi-spark-datasource/hudi-spark/ -am -B
-DfailIfNoTests=false -Dtest=TestCleaner#testKeepLatestCommitsMOR -Dspark3.5
```
To filter particular scala test:
```shell
-mvn -Dsuites="org.apache.spark.sql.hudi.TestSpark3DDL @Test Chinese table "
-Dtest=abc -DfailIfNoTests=false test -pl packaging/hudi-spark-bundle -am
+mvn -Dsuites="org.apache.spark.sql.hudi.ddl.TestSpark3DDL @Test Chinese table
" -Dtest=abc -DfailIfNoTests=false test -pl packaging/hudi-spark-bundle -am
-Dspark3.5
```
-Dtest=abc will assist in skipping all java tests.
--Dsuites="org.apache.spark.sql.hudi.TestSpark3DDL @Test Chinese table "
filters for a single scala test.
+-Dsuites="org.apache.spark.sql.hudi.ddl.TestSpark3DDL @Test Chinese table "
filters for a single scala test.
- Run an Integration Test
```shell
-mvn -T 2C -Pintegration-tests -DfailIfNoTests=false
-Dit.test=ITTestHoodieSanity#testRunHoodieJavaAppOnMultiPartitionKeysMORTable
verify
+mvn -T 2C -Pintegration-tests -DfailIfNoTests=false
-Dit.test=ITTestHoodieSanity#testRunHoodieJavaAppOnMultiPartitionKeysMORTable
verify -Dspark3.5
```
`verify` phase runs the integration test and cleans up the docker cluster
after execution. To retain the docker cluster use
`integration-test` phase instead.
**Note:** If you encounter `unknown shorthand flag: 'H' in -H`, this error
occurs when local environment has docker-compose version >= 2.0.
-The latest docker-compose is accessible using `docker-compose` whereas v1
version is accessible using `docker-compose-v1` locally.<br/>
-You can use `alt def` command to define different docker-compose versions.
Refer https://github.com/dotboris/alt. <br/>
+The latest docker-compose is accessible using `docker-compose` whereas v1
version is accessible using `docker-compose-v1` locally.
+You can use `alt def` command to define different docker-compose versions.
Refer [alt](https://github.com/dotboris/alt).
Use `alt use` to use v1 version of docker-compose while running integration
test locally.
@@ -142,7 +99,7 @@ Use `alt use` to use v1 version of docker-compose while
running integration test
* `docker` : Docker containers used by demo and integration tests. Brings up
a mini data ecosystem locally
* `hudi-cli` : CLI to inspect, manage and administer datasets
- * `hudi-client` : Spark client library to take a bunch of inserts + updates
and apply them to a Hoodie table
+ * `hudi-client` : Spark client library to take a bunch of inserts + updates
and apply them to a Hudi table
* `hudi-common` : Common classes used across modules
* `hudi-hadoop-mr` : InputFormat implementations for ReadOptimized,
Incremental, Realtime views
* `hudi-hive` : Manage hive tables off Hudi datasets and houses the
HiveSyncTool
@@ -152,9 +109,85 @@ Use `alt use` to use v1 version of docker-compose while
running integration test
* `packaging` : Poms for building out bundles for easier drop in to Spark,
Hive, Presto, Utilities
* `style` : Code formatting, checkstyle files
-## Code WalkThrough
+## Code Walkthrough
+
+Watch this [quick video](https://www.youtube.com/watch?v=N2eDfU_rQ_U) for a
code walkthrough to get started.
+
+## IntelliJ Setup
+
+IntelliJ is the recommended IDE for developing Hudi. To contribute, you would
need to do the following
+
+- Fork the Hudi code on Github & then clone your own fork locally. Once
cloned, we recommend building as per instructions on [spark
quickstart](/docs/quick-start-guide) or [flink
quickstart](/docs/flink-quick-start-guide).
+
+- In IntelliJ, select `File` > `New` > `Project from Existing Sources...` and
select the `pom.xml` file under your local Hudi source folder.
+
+- In `Project Structure` > `Project`, select Java 11 as the Project SDK.
+
+<details>
+<summary>Configure IDE Preferences and Settings</summary>
+
+Make the following configuration in `Preferences` or `Settings` in newer
IntelliJ so the Hudi code can compile in the IDE:
+
+- Enable annotation processing in compiler.
+
+ 
+
+- Configure Maven *NOT* to delegate IDE build/run actions to Maven so you can
run tests in IntelliJ directly.
+
+ 
+ 
+
+</details>
+
+<details>
+<summary>Reload Maven Projects After Profile Changes</summary>
+
+If you switch maven build profile, e.g., to a different Spark version, you
need to first build Hudi in the command line and then `Reload All Maven
Projects` in IntelliJ like below,
+so that IntelliJ re-indexes the code.
+
+
+
+</details>
+
+<details>
+<summary>[Recommended] Set Up Code Style and CheckStyle</summary>
+
+We have embraced the code style largely based on [google
format](https://google.github.io/styleguide/javaguide.html). Please set up your
IDE with style files from [\<project
root\>/style/](https://github.com/apache/hudi/tree/master/style). These
instructions have been tested on IntelliJ.
+
+- Open `Settings` in IntelliJ
+- Install and activate CheckStyle plugin
+
+

+
+- In `Settings` > `Tools` > `Checkstyle`, use a recent version, e.g., 12.1.0
+- Click on `+`, add the style/checkstyle.xml file, and name the configuration
as "Hudi Checks"
+
+

+
+- Activate the checkstyle configuration by checking `Active`
+- Open `Settings` > `Editor` > `Code Style` > `Java`
+- Select "Project" as the "Scheme". Then, go to the settings, open `Import
Scheme` > `CheckStyle Configuration`, select `style/checkstyle.xml` to load
+
+

+
+- After loading the configuration, you should see that the `Indent` and
`Continuation indent` become 2 and 4, from 4 and 8, respectively
+- Apply/Save the changes
+
+</details>
+
+<details>
+<summary>[Recommended] Set Up Save Actions and Copyright</summary>
+
+- Set up the [Save Action
Plugin](https://plugins.jetbrains.com/plugin/7642-save-actions) to auto format
& organize imports on save. The Maven Compilation life-cycle will fail if there
are checkstyle violations.
+
+- As it is required to add [Apache License
header](https://www.apache.org/legal/src-headers#headers) to all source files,
configuring "Copyright" settings as shown below will come in handy.
+
+
+
+
+
+</details>
-This Quick Video will give a code walkthrough to start with
[watch](https://www.youtube.com/watch?v=N2eDfU_rQ_U).
## Running unit tests and local debugger via Intellij IDE
@@ -162,23 +195,23 @@ This Quick Video will give a code walkthrough to start
with [watch](https://www.
When submitting a PR please make sure to NOT commit the changes mentioned in
these steps, instead once testing is done make sure to revert the changes and
then submit a pr.
:::
-0. Build the project with the intended profiles via the `mvn` cli, for example
for spark 3.2 use `mvn clean package -Dspark3.2 -Dscala-2.12 -DskipTests`.
+0. Build the project with the intended profiles via the `mvn` cli, for example
for spark 3.5 use `mvn clean package -Dspark3.5 -Dscala-2.12 -DskipTests`.
1. Install the "Maven Helper" plugin from the Intellij IDE.
2. Make sure IDEA uses Maven to build/run tests:
- * You need to select the intended Maven profiles (using Maven tool pane in
IDEA): select profiles you are targeting for example `spark2.4` and
`scala-2.11` or `spark3.2`, `scala-2.12` etc.
- * Add `.mvn/maven.config` file at the root of the repo w/ the the profiles
you selected in the pane: `-Dspark3.2` `-Dscala-2.12`
- * Add `.mvn/` to the `.gitignore` file located in the root of the project.
+ * You need to select the intended Maven profiles (using Maven tool pane in
IDEA): select profiles you are targeting for example `spark3.5`, `scala-2.12`
etc.
+ * Add `.mvn/maven.config` file at the root of the repo w/ the profiles you
selected in the pane: `-Dspark3.5` `-Dscala-2.12`
+ * Add `.mvn/` to the `.gitignore` file located in the root of the project.
3. Make sure you change (temporarily) the `scala.binary.version` in the root
`pom.xml` to the intended scala profile version. For example if running with
spark3 `scala.binary.version` should be `2.12`
4. Finally right click on the unit test's method signature you are trying to
run, there should be an option with a mvn symbol that allows you to `run
<test-name>`, as well as an option to `debug <test-name>`.
* For debugging make sure to first set breakpoints in the src code see
(https://www.jetbrains.com/help/idea/debugging-code.html)
## Docker Setup
-We encourage you to test your code on docker cluster please follow this for
[docker setup](https://hudi.apache.org/docs/docker_demo).
+We encourage you to test your code on the docker cluster. Please follow this
for [docker setup](https://hudi.apache.org/docs/docker_demo).
-## Remote Debugging
+## Remote Debugging
-if your code fails on docker cluster you can remotely debug your code please
follow the below steps.
+If your code fails on the docker cluster, you can debug it remotely.
Please follow the steps below.
Step 1 :- Run your Hudi Streamer Job with --conf as defined below. This ensures the job
waits until you attach IntelliJ with Remote Debugging on port 4044
@@ -195,14 +228,14 @@ spark-submit \
--schemaprovider-class
org.apache.hudi.utilities.schema.FilebasedSchemaProvider
```
-Step 2 :- Attaching Intellij (tested on Intellij Version > 2019. this steps
may change acc. to intellij version)
+Step 2 :- Attaching IntelliJ (tested on IntelliJ versions > 2019; these steps
may change according to the IntelliJ version)
+
+- In IntelliJ, go to Edit Configurations -> Remote -> Add Remote -> enter the
configs below -> Apply & Save -> set a breakpoint -> Start.
+- Name : Hudi Remote
+- Port : 4044
+- Command Line Args for Remote JVM :
-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=4044
+- Use Module ClassPath : select hudi
-- Come to Intellij --> Edit Configurations -> Remote -> Add Remote - > Put
Below Configs -> Apply & Save -> Put Debug Point -> Start. <br/>
-- Name : Hudi Remote <br/>
-- Port : 4044 <br/>
-- Command Line Args for Remote JVM :
-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=4044 <br/>
-- Use Module ClassPath : select hudi <br/>
-
## Website
[Apache Hudi site](https://hudi.apache.org) is hosted on a special `asf-site`
branch. Please follow the `README` file under `docs` on that branch for
diff --git a/website/static/assets/images/contributing/IDE_setup_annotation.png
b/website/static/assets/images/contributing/IDE_setup_annotation.png
index 7646a438cd17..152f8d80079c 100644
Binary files
a/website/static/assets/images/contributing/IDE_setup_annotation.png and
b/website/static/assets/images/contributing/IDE_setup_annotation.png differ
diff --git
a/website/static/assets/images/contributing/IDE_setup_checkstyle_1.png
b/website/static/assets/images/contributing/IDE_setup_checkstyle_1.png
index 7de5ba99a351..594eff516f44 100644
Binary files
a/website/static/assets/images/contributing/IDE_setup_checkstyle_1.png and
b/website/static/assets/images/contributing/IDE_setup_checkstyle_1.png differ
diff --git
a/website/static/assets/images/contributing/IDE_setup_checkstyle_2.png
b/website/static/assets/images/contributing/IDE_setup_checkstyle_2.png
index bf624ce1d4a4..129e472c28f4 100644
Binary files
a/website/static/assets/images/contributing/IDE_setup_checkstyle_2.png and
b/website/static/assets/images/contributing/IDE_setup_checkstyle_2.png differ
diff --git
a/website/static/assets/images/contributing/IDE_setup_checkstyle_3.png
b/website/static/assets/images/contributing/IDE_setup_checkstyle_3.png
index 61adb275e694..399fe93dd6af 100644
Binary files
a/website/static/assets/images/contributing/IDE_setup_checkstyle_3.png and
b/website/static/assets/images/contributing/IDE_setup_checkstyle_3.png differ
diff --git
a/website/static/assets/images/contributing/IDE_setup_checkstyle_4.png
b/website/static/assets/images/contributing/IDE_setup_checkstyle_4.png
index f962c5c0ed05..cb696c6f23e7 100644
Binary files
a/website/static/assets/images/contributing/IDE_setup_checkstyle_4.png and
b/website/static/assets/images/contributing/IDE_setup_checkstyle_4.png differ
diff --git
a/website/static/assets/images/contributing/IDE_setup_code_style_java_after.png
b/website/static/assets/images/contributing/IDE_setup_code_style_java_after.png
index ef3dfbee61f6..abc9e4bb289c 100644
Binary files
a/website/static/assets/images/contributing/IDE_setup_code_style_java_after.png
and
b/website/static/assets/images/contributing/IDE_setup_code_style_java_after.png
differ
diff --git
a/website/static/assets/images/contributing/IDE_setup_code_style_java_before.png
b/website/static/assets/images/contributing/IDE_setup_code_style_java_before.png
index ecdb0fa464af..9e160351fc59 100644
Binary files
a/website/static/assets/images/contributing/IDE_setup_code_style_java_before.png
and
b/website/static/assets/images/contributing/IDE_setup_code_style_java_before.png
differ
diff --git
a/website/static/assets/images/contributing/IDE_setup_copyright_1.png
b/website/static/assets/images/contributing/IDE_setup_copyright_1.png
index 0430f7e893b3..87d96e19c8fd 100644
Binary files
a/website/static/assets/images/contributing/IDE_setup_copyright_1.png and
b/website/static/assets/images/contributing/IDE_setup_copyright_1.png differ
diff --git
a/website/static/assets/images/contributing/IDE_setup_copyright_2.png
b/website/static/assets/images/contributing/IDE_setup_copyright_2.png
index 53a7ac746382..34ceefd63d3c 100644
Binary files
a/website/static/assets/images/contributing/IDE_setup_copyright_2.png and
b/website/static/assets/images/contributing/IDE_setup_copyright_2.png differ
diff --git a/website/static/assets/images/contributing/IDE_setup_java.png
b/website/static/assets/images/contributing/IDE_setup_java.png
deleted file mode 100644
index 9eef94b9148e..000000000000
Binary files a/website/static/assets/images/contributing/IDE_setup_java.png and
/dev/null differ
diff --git a/website/static/assets/images/contributing/IDE_setup_maven_1.png
b/website/static/assets/images/contributing/IDE_setup_maven_1.png
index 428b303bee67..8cc9db9dfabd 100644
Binary files a/website/static/assets/images/contributing/IDE_setup_maven_1.png
and b/website/static/assets/images/contributing/IDE_setup_maven_1.png differ
diff --git a/website/static/assets/images/contributing/IDE_setup_maven_2.png
b/website/static/assets/images/contributing/IDE_setup_maven_2.png
index a6f9950c1d7b..f87d62e4daf0 100644
Binary files a/website/static/assets/images/contributing/IDE_setup_maven_2.png
and b/website/static/assets/images/contributing/IDE_setup_maven_2.png differ
diff --git a/website/static/assets/images/contributing/IDE_setup_reload.png
b/website/static/assets/images/contributing/IDE_setup_reload.png
index 89e73a4e1d2f..d839fff2c512 100644
Binary files a/website/static/assets/images/contributing/IDE_setup_reload.png
and b/website/static/assets/images/contributing/IDE_setup_reload.png differ