This is an automated email from the ASF dual-hosted git repository.

vinoth pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/hudi.git


The following commit(s) were added to refs/heads/asf-site by this push:
     new 98079059a821 docs: Update contribution guide with changes due to JIRA to GH issues migration (#13921)
98079059a821 is described below

commit 98079059a8215e4ded399c6b39443a3ca228aa8e
Author: vinoth chandar <[email protected]>
AuthorDate: Wed Sep 17 17:41:38 2025 -0700

    docs: Update contribution guide with changes due to JIRA to GH issues migration (#13921)
---
 website/community/get-involved.mdx      |  11 +-
 website/contribute/developer-setup.md   | 209 +--------------------
 website/contribute/how-to-contribute.md | 312 +++++++++++++++++++++++++++++---
 website/docs/performance.md             |   2 +-
 4 files changed, 297 insertions(+), 237 deletions(-)

diff --git a/website/community/get-involved.mdx b/website/community/get-involved.mdx
index 3eb1bca997c0..ad051a028c06 100644
--- a/website/community/get-involved.mdx
+++ b/website/community/get-involved.mdx
@@ -13,13 +13,12 @@ There are several ways to get in touch with the Hudi community.
 
 | When?                                                 | Channel to use |
 |-------------------------------------------------------|----------------|
-| For development discussions                           | Dev Mailing list ([Subscribe](mailto:[email protected]), [Unsubscribe](mailto:[email protected]), [Archives](https://lists.apache.org/[email protected])). Empty email works for subscribe/unsubscribe. Please use [gists](https://gist.github.com) to share code/stacktraces on the email. |
-| For any general questions, user support               | Users Mailing list ([Subscribe](mailto:[email protected]), [Unsubscribe](mailto:[email protected]), [Archives](https://lists.apache.org/[email protected])). Empty email works for subscribe/unsubscribe. Please use [gists](https://gist.github.com) to share code/stacktraces on the email. |
-| For reporting bugs or issues or discover known issues | Use [ASF self-service](https://selfserve.apache.org/jira-account.html) to request access to the [Hudi JIRA project](https://issues.apache.org/jira/projects/HUDI/summary). |
-| For quick pings & 1-1 chats                           | Join our <SlackCommunity title="slack group" />. In case the link does not work, please leave a comment on this [github issue](https://github.com/apache/hudi/issues/143) or drop an email to [email protected] |
+| For development discussions                           | [Github Discussions](https://github.com/apache/hudi/discussions) or Dev Mailing list ([Subscribe](mailto:[email protected]), [Unsubscribe](mailto:[email protected]), [Archives](https://lists.apache.org/[email protected])). Empty email works for subscribe/unsubscribe. |
+| For any general questions, user support               | [Github Discussions](https://github.com/apache/hudi/discussions) or Users Mailing list ([Subscribe](mailto:[email protected]), [Unsubscribe](mailto:[email protected]), [Archives](https://lists.apache.org/[email protected])). Empty email works for subscribe/unsubscribe. |
+| For reporting bugs or issues or discover known issues | Use Github [Issues](https://github.com/apache/hudi/issues), please read guidelines [here](/contribute/how-to-contribute#filing-issues) |
+| For quick pings & 1-1 chats                           | Join our <SlackCommunity title="slack group" />. In case the link does not work, please start a GH discussion or file a community support issue or drop an email to [email protected] |
 | For proposing large features, changes                 | Start a RFC. Instructions [here](/contribute/rfc-process). |
-| Join weekly sync-up meeting                           | Follow instructions [here](https://cwiki.apache.org/confluence/display/HUDI/Apache+Hudi+Community+Weekly+Sync). |
-| See [#here](#accounts) for wiki access                | |
+| Join sync-up meetings                                 | [Community sync](/community/syncs) and [Dev Sync](/contribute/developer-sync-call). |
 | For stream of commits, pull requests etc              | Commits Mailing list ([Subscribe](mailto:[email protected]), [Unsubscribe](mailto:[email protected]), [Archives](https://lists.apache.org/[email protected])) |
 
 If you wish to report a security vulnerability, please contact [[email protected]](mailto:[email protected]).
diff --git a/website/contribute/developer-setup.md b/website/contribute/developer-setup.md
index 5065ac75f99a..215ae54aa4d4 100644
--- a/website/contribute/developer-setup.md
+++ b/website/contribute/developer-setup.md
@@ -13,10 +13,7 @@ To contribute code, you need
  - a GitHub account
  - a Linux (or) macOS development environment with Java JDK 8, Apache Maven (3.x+) installed
  - [Docker](https://www.docker.com/) installed for running demo, integ tests or building website
- - for large contributions, a signed [Individual Contributor License
-   Agreement](https://www.apache.org/licenses/icla.pdf) (ICLA) to the Apache
-   Software Foundation (ASF).
- - (Recommended) Create an account on [JIRA](https://issues.apache.org/jira/projects/HUDI/summary) to open issues/find similar issues.
+ - for large contributions, a signed [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf) (ICLA) to the Apache Software Foundation (ASF).
  - (Recommended) Join our dev mailing list & slack channel, listed on [community](/community/get-involved) page.
 
 
@@ -75,193 +72,8 @@ so that IntelliJ re-indexes the code.
 ![IDE setup copyright 1](/assets/images/contributing/IDE_setup_copyright_1.png)
 ![IDE setup copyright 2](/assets/images/contributing/IDE_setup_copyright_2.png)
 
-## Accounts and Permissions
-
- - [Hudi issue tracker 
(JIRA)](https://issues.apache.org/jira/projects/HUDI/issues):
-   Anyone can access it and browse issues. Anyone can register an account and 
login
-   to create issues or add comments. Only contributors can be assigned issues. 
If
-   you want to be assigned issues, a PMC member can add you to the project 
contributor
-   group.  Email the dev mailing list to ask to be added as a contributor, and 
include your ASF Jira username.
-
- - [Hudi Wiki Space](https://cwiki.apache.org/confluence/display/HUDI):
-   Anyone has read access. If you wish to contribute changes, please create an 
account and
-   request edit access on the dev@ mailing list (include your Wiki account 
user ID).
-
- - Pull requests can only be merged by a HUDI committer, listed 
[here](https://incubator.apache.org/projects/hudi)
-
- - [Voting on a release](https://www.apache.org/foundation/voting): Everyone 
can vote.
-   Only Hudi PMC members should mark their votes as binding.
-
-## Life of a Contributor
-
-This document details processes and procedures we follow to make contributions 
to the project and take it forward. 
-If you are looking to ramp up into the project as a contributor, we highly 
encourage you to read this guide in full, familiarize yourself with the 
workflow 
-and more importantly also try to improve the process along the way as well. 
-
-### Filing JIRAs
-
- - Hudi uses JIRA to manage issues. First, familiarize yourself with the 
various [components](https://issues.apache.org/jira/projects/HUDI/components) 
against which issues are filed in Hudi.
- - Make an attempt to find an existing JIRA, that may solve the same issue you 
are reporting. When in doubt, you can always email the mailing list so that the 
community can provide early feedback, 
-   point out any similar JIRAs or RFCs. 
- - Try to gauge whether this JIRA needs an 
[RFC](https://cwiki.apache.org/confluence/display/HUDI/RFC+Process). As always, 
email the mailing list if unsure. If you need an RFC since the change is
-   large in scope, then please follow the wiki instructions to get the process 
rolling along.
- - While raising a new JIRA or updating an existing one, please make sure to 
do the following
-      - The issue `type` and `components` (when resolving the ticket) are set 
correctly
-      - If you intend to target the JIRA for a specific release, please fill 
in the `fix version(s)` field, with the [release 
number](https://issues.apache.org/jira/projects/HUDI/releases).
-      - Summary should be descriptive enough to catch the essence of the 
problem/ feature
-      - Where necessary, capture the version of Hudi/Spark/Hive/Hadoop/Cloud 
environments in the ticket
-      - Whenever possible, provide steps to reproduce via sample code or on 
the [docker setup](https://hudi.apache.org/docker_demo)
- - All newly filed JIRAs are placed in the `NEW` state. If you are sure about 
this JIRA representing valid, scoped piece of work, please click `Accept Issue` 
to move it `OPEN` state
- - If you are not sure, please wait for a PMC/Committer to confirm/triage the 
issue and accept it. This process avoids contributors spending time on JIRAs 
with unclear scope.
- - Whenever possible, break down large JIRAs (e.g JIRAs resulting from an 
[RFC](https://cwiki.apache.org/confluence/display/HUDI/RFC+Process)) into `sub 
tasks` by clicking `More > create sub-task` from the parent JIRA , 
-   so that the community can contribute at large and help implement it much 
quickly. We recommend prefixing such JIRA titles with `[UMBRELLA]`
-
-### Claiming JIRAs
-
- - Finding a JIRA to work on 
-      - If you are new to the project, you can ramp up by picking up any 
issues tagged with the 
[newbie](https://issues.apache.org/jira/issues/?jql=project+%3D+HUDI+AND+component+%3D+newbie)
 component.
-      - If you want to work on some higher priority issue, then scout for Open 
issues against the next release on the JIRA, engage on unassigned/inactive 
JIRAs and offer help.
-      - Issues tagged with `Usability` , `Code Cleanup`, `Testing` components 
often present excellent opportunities to make a great impact.
- - If you don't have perms to self-assign JIRAs, please email the dev mailing 
list with your JIRA id and a small intro for yourself. We'd be happy to add you 
as a contributor.
- - As courtesy, if you are unable to continue working on a JIRA, please move 
it back to "OPEN" state and un-assign yourself.
-      - If a JIRA or its corresponding pull request has been inactive for a 
week, awaiting feedback from you, PMC/Committers could choose to re-assign them 
to another contributor.
-      - Such re-assignment process would be communicated over JIRA/GitHub 
comments, checking with the original contributor on his/her intent to continue 
working on the issue.
-      - You can also contribute by helping others contribute. So, if you don't 
have cycles to work on a JIRA and another contributor offers help, take it!
-
-### Contributing Code
-
- - Once you finalize on a project/task, please open a new JIRA or assign an 
existing one to yourself. 
-      - Almost all PRs should be linked to a JIRA. It's always good to have a 
JIRA upfront to avoid duplicating efforts.
-      - If the changes are minor, then `[MINOR]` prefix can be added to Pull 
Request title without a JIRA. Below are some tips to judge **MINOR** Pull 
Request :
-        - trivial fixes (for example, a typo, a broken link, intellisense or 
an obvious error)
-        - the change does not alter functionality or performance in any way
-        - changed lines less than 100
-        - obviously judge that the PR would pass without waiting for CI / CD 
verification
-      - But, you may be asked to file a JIRA, if reviewer deems it necessary
- - Before you begin work,
-      - Claim the JIRA using the process above and assign the JIRA to yourself.
-      - Click "Start Progress" on the JIRA, which tells everyone that you are 
working on the issue actively.
- - [Optional] Familiarize yourself with internals of Hudi using content on 
this page, as well as [wiki](https://cwiki.apache.org/confluence/display/HUDI)
- - Make your code change
-   - Get existing tests to pass using `mvn clean install -DskipITs`
-   - Add adequate tests for your new functionality
-   - For involved changes, it's best to test the changes in real production 
environments and report the results in the PR. 
-   - For website changes, please build the site locally & test navigation, 
formatting & links thoroughly
-   - If your code change changes some aspect of documentation (e.g new config, 
default value change), 
-     please ensure there is another PR to [update the 
docs](https://github.com/apache/hudi/tree/asf-site/README.md) as well.
- - Sending a Pull Request
-   - Format commit and the pull request title like `[HUDI-XXX] Fixes bug in 
Spark Datasource`, 
-     where you replace `HUDI-XXX` with the appropriate JIRA issue. 
-   - Pull request titles must have either `[HUDI-XXX]` or `[MINOR]` in their 
title. Note the brackets and capitalization.
-   - Please ensure your commit message body is descriptive of the change. 
Bulleted summary would be appreciated.
-   - You must follow the instructions in the template and fill out all fields 
to pass our compliance checks.
-   - Do not remove or modify any headings in the template.
-   - Push your commit to your own fork/branch & create a pull request (PR) 
against the Hudi repo.
-   - If you don't hear back within 3 days on the PR, please send an email to 
the dev @ mailing list.
-   - Address code review comments & keep pushing changes to your fork/branch, 
which automatically updates the PR
-   - Before your change can be merged, it should be squashed into a single 
commit for cleaner commit history.
- - Finally, once your pull request is merged, make sure to `Close` the JIRA.
-
-### Coding guidelines 
-
-Our code can benefit from contributors speaking the same "language" when 
authoring code. After all, it gets read a lot more than it gets
-written. So optimizing for "reads" is a good goal. The list below is a set of 
guidelines, that contributors strive to upkeep and reflective 
-of how we want to evolve our code in the future.
-
-#### Style 
-
- - **Formatting** We should rely on checkstyle and spotless to auto fix 
formatting; automate this completely. Where we cannot,
-    we will err on the side of not taxing contributors with manual effort.
- - **Refactoring**
-   - Refactor with purpose; any refactor suggested should be attributable to 
functionality that now becomes easy to implement.
-   - A class is asking to be refactored, when it has several overloaded 
responsibilities/have sets of fields/methods which are used more cohesively 
than others. 
-   - Try to name tests using the given-when-then model, that cleans separates 
preconditions (given), an action (when), and assertions (then).
- - **Naming things**
-   - Let's name uniformly; using the same word to denote the same concept. 
e.g: bootstrap vs external vs source, when referring to bootstrapped tables. 
-     Maybe they all mean the same, but having one word makes the code lot more 
easily readable. 
-   - Let's name consistently with Hudi terminology. e.g dataset vs table, base 
file vs data file.
-   - Class names preferably are nouns (e.g Runner) which reflect their 
responsibility and methods are verbs (e.g run()).
-   - Avoid filler words, that don't add value e.g xxxInfo, xxxData, etc.
-   - We name classes in code starting with `Hoodie` and not `Hudi` and we want 
to keep it that way for consistency/historical reasons. 
- - **Methods**
-   - Individual methods should short (~20-30 lines) and have a single purpose; 
If you feel like it has a secondary purpose, then maybe it needs
-     to be broken down more.
-   - Lesser the number of arguments, the better; 
-   - Place caller methods on top of callee methods, whenever possible.
-   - Avoid "output" arguments e.g passing in a list and filling its values 
within the method.
-   - Try to limit individual if/else blocks to few lines to aid readability.
-   - Separate logical blocks of code with a newline in between e.g read a file 
into memory, loop over the lines.
- - **Classes**
-   - Like method, each Class should have a single purpose/responsibility.
-   - Try to keep class files to about 200 lines of length, nothing beyond 500.
-   - Avoid stating the obvious in comments; e.g each line does not deserve a 
comment; Document corner-cases/special perf considerations etc clearly.
-   - Try creating factory methods/builders and interfaces wherever you feel a 
specific implementation may be changed down the line.
-
-#### Substance
-
-- Try to avoid large PRs; if unavoidable (many times they are) please separate 
refactoring with the actual implementation of functionality. 
-  e.g renaming/breaking up a file and then changing code changes, makes the 
diff very hard to review.
-- **Licensing**
-    - Every source file needs to include the Apache license header. Every new 
dependency needs to have 
-      an open source license 
[compatible](https://www.apache.org/legal/resolved#criteria) with Apache.
-    - If you are re-using code from another apache/open-source project, 
licensing needs to be compatible and attribution added to `LICENSE` file
-    - Please DO NOT copy paste any code from StackOverflow or other online 
sources, since their license attribution would be unclear. Author them yourself!
-- **Code Organization** 
-    - Anything in `hudi-common` cannot depend on a specific engine runtime 
like Spark. 
-    - Any changes to bundles under `packaging`, will be reviewed with 
additional scrutiny to avoid breakages across versions.
-- **Code reuse**
-  - Whenever you can, please use/enhance use existing utils classes in code 
(`CollectionUtils`, `ParquetUtils`, `HoodieAvroUtils`). Search for classes 
ending in `Utils`.
-  - As a complex project, that must integrate with multiple systems, we tend 
to avoid dependencies like `guava`, `apache commons` for the sake of easy 
integration. 
-     Please start a discussion on the mailing list, before attempting to 
reintroduce them
-  - As a data system, that takes performance seriously, we also write pieces 
of infrastructure (e.g `ExternalSpillableMap`) natively, that are optimized 
specifically for our scenarios.
-     Please start with them first, when solving problems.
- - **Breaking changes**
-   - Any version changes for dependencies, needs to be ideally vetted across 
different user environments in the community, to get enough confidence before 
merging.
-   - Any changes to methods annotated with `PublicAPIMethod` or classes 
annotated with `PublicAPIClass` require upfront discussion and potentially an 
RFC.
-   - Any non-backwards compatible changes similarly need upfront discussion 
and the functionality needs to implement an upgrade-downgrade path.
-
-#### Tests
-
-- **Categories**
-    - unit - testing basic functionality at the class level, potentially using 
mocks. Expected to finish quicker
-    - functional - brings up the services needed and runs test without mocking
-    - integration - runs subset of functional tests, on a full fledged 
enviroment with dockerized services
-- **Prepare Test Data**
-    - Many unit and functional test cases require a Hudi dataset to be 
prepared beforehand. `HoodieTestTable` and `HoodieWriteableTestTable` are 
dedicated test utility classes for this purpose. Use them whenever appropriate, 
and add new APIs to them when needed.
-    - When add new APIs in the test utility classes, overload APIs with 
variety of arguments to do more heavy-liftings for callers.
-    - In most scenarios, you won't need to use `FileCreateUtils` directly.
-    - If test cases require interaction with actual `HoodieRecord`s, use 
`HoodieWriteableTestTable` (and `HoodieTestDataGenerator` probably). Otherwise, 
`HoodieTestTable` that manipulates empty files shall serve the purpose.
-- **Strive for Readability**
-    - Avoid writing flow controls for different assertion cases. Split to a 
new test case when appropriate.
-    - Use plain for-loop to avoid try-catch in lambda block. Declare 
exceptions is okay.
-    - Use static import for constants and static helper methods to avoid 
lengthy code.
-    - Avoid reusing local variable names. Create new variables generously.
-    - Keep helper methods local to the test class until it becomes obviously 
generic and re-useable. When that happens, move the helper method to the right 
utility class. For example, `Assertions` contains common assert helpers, and 
`SchemaTestUtil` is for schema related helpers.
-    - Avoid putting new helpers in `HoodieTestUtils` and 
`HoodieClientTestUtils`, which are named too generic. Eventually, all test 
helpers shall be categorized properly.  
-
-### Reviewing Code/RFCs
-
- - All pull requests would be subject to code reviews, from one or more of the 
PMC/Committers. 
- - Typically, each PR will get an "Assignee" based on their area of expertise, 
who will work with you to land the PR.
- - Code reviews are vital, but also often time-consuming for everyone 
involved. Below are some principles which could help align us better.
-   - Reviewers need to provide actionable, concrete feedback that states what 
needs to be done to get the PR closer to landing.
-   - Reviewers need to make it explicit, which of the requested changes would 
block the PR vs good-to-dos.
-   - Both contributors/reviewers need to keep an open mind and ground 
themselves to making the most technically sound argument.
-   - If progress is hard, please involve another PMC member/Committer to share 
another perspective.
-   - Staying humble and eager to learn, goes a long way in ensuring these 
reviews are smooth.
- - Reviewers are expected to uphold the code quality, standards outlined above.
- - When merging PRs, always make sure you are squashing the commits using the 
"Squash and Merge" feature in Github
- - When necessary/appropriate, reviewers could make changes themselves to PR 
branches, with the intent to get the PR landed sooner. (see 
[how-to](https://cwiki.apache.org/confluence/display/HUDI/Resources#Resources-PushingChangesToPRs))
-   Reviewers should seek explicit approval from author, before making large 
changes to the original PR.
-
-### Suggest Changes
-
-We welcome new ideas and suggestions to improve the project, along any 
dimensions - management, processes, technical vision/direction. To kick start a 
discussion on the mailing thread
-to effect change and source feedback, start a new email thread with the 
`[DISCUSS]` prefix and share your thoughts. If your proposal leads to a larger 
change, then it may be followed up
-by a [vote](https://www.apache.org/foundation/voting) by a PMC member or 
others (depending on the specific scenario). 
-For technical suggestions, you can also leverage [our RFC 
Process](https://cwiki.apache.org/confluence/display/HUDI/RFC+Process) to 
outline your ideas in greater detail.
-
-### Useful Maven commands for developers. 
+
+## Useful Maven commands for developers. 
 Listing out some of the maven commands that could be useful for developers. 
 
 - Compile/build entire project 
@@ -326,17 +138,6 @@ You can use `alt def` command to define different docker-compose versions. Refer
 Use `alt use` to use v1 version of docker-compose while running integration test locally.
 
 
-## Releases
-
- - Apache Hudi community plans to do minor version releases every 6 weeks or 
so.
- - If your contribution merged onto the `master` branch after the last 
release, it will become part of the next release.
- - Website changes are regenerated on-demand basis (until automation in place 
to reflect immediately)
-
-## Communication
-
-All communication is expected to align with the [Code of 
Conduct](https://www.apache.org/foundation/policies/conduct).
-Discussion about contributing code to Hudi happens on the [dev@ mailing 
list](/community/get-involved). Introduce yourself!
-
 ## Code & Project Structure
 
   * `docker` : Docker containers used by demo and integration tests. Brings up 
a mini data ecosystem locally
@@ -357,7 +158,9 @@ This Quick Video will give a code walkthrough to start with [watch](https://www.
 
 ## Running unit tests and local debugger via Intellij IDE
 
-#### IMPORTANT REMINDER FOR BELOW STEPS: When submitting a PR please make sure to NOT commit the changes mentioned in these steps, instead once testing is done make sure to revert the changes and then submit a pr.
+:::note Important reminder
+When submitting a PR please make sure to NOT commit the changes mentioned in these steps, instead once testing is done make sure to revert the changes and then submit a pr.
+:::
 
 0. Build the project with the intended profiles via the `mvn` cli, for example for spark 3.2 use `mvn clean package -Dspark3.2 -Dscala-2.12 -DskipTests`. 
 1. Install the "Maven Helper" plugin from the Intellij IDE.
diff --git a/website/contribute/how-to-contribute.md b/website/contribute/how-to-contribute.md
index a93e48b13d35..50d01b7b280d 100644
--- a/website/contribute/how-to-contribute.md
+++ b/website/contribute/how-to-contribute.md
@@ -7,38 +7,296 @@ last_modified_at: 2020-09-01T15:59:57-04:00
 
 Apache Hudi community welcomes contributions from anyone!
 
-Here are few ways, you can get involved.
+## Ways to become a contributor
 
- - Ask (and/or) answer questions on our support channels listed above.
- - Review code or RFCs
- - Help improve documentation
- - Author blogs on our wiki
- - Testing; Improving out-of-box experience by reporting bugs
- - Share new ideas/directions to pursue or propose a new RFC
- - Contributing code to the project: check out [newbie JIRAs](https://issues.apache.org/jira/issues/?filter=12350891).
+A GitHub account is needed to file issues, start discussions, and send pull requests to Hudi. Here are a few ways you can get involved.
 
-## Become a Committer
+ - Engage with the community on [GitHub Discussions](https://github.com/apache/hudi/discussions) or Slack
+ - Help improve docs and contribute blogs [here](https://github.com/apache/hudi/tree/asf-site) for hudi.apache.org
+ - Share [new feature](https://github.com/apache/hudi/issues/new?template=hudi_feature.yml) requests or propose a [new RFC](/contribute/rfc-process)
+ - Contribute code to the project by raising [pull requests (PR)](https://github.com/apache/hudi/pulls) adhering to the [contribution guide](/contribute/developer-setup). Here are some good [first issues](https://github.com/apache/hudi/issues?q=state%3Aopen%20label%3Agood-first-issues).
+ - Report [bugs](https://github.com/apache/hudi/issues/new?template=hudi_bug.yml) or suggest [improvements](https://github.com/apache/hudi/issues/new?template=hudi_improvement.yml) to the user experience; review code or RFCs on GitHub
+ - Share your success story on [Hudi LinkedIn](https://www.linkedin.com/company/apache-hudi/) Community Syncs.
+ - Pull requests can only be merged by a Hudi committer, listed [here](/community/team), but anyone is free to review.
+ - [Voting on a release](https://www.apache.org/foundation/voting): Everyone can vote on the dev mailing list. Only Hudi PMC members should mark their votes as binding.
 
-We are always looking for strong contributors, who can become [committers](https://www.apache.org/dev/committers) on the project. 
-Committers are chosen by a majority vote of the Apache Hudi [PMC](https://www.apache.org/foundation/how-it-works#pmc-members), after a discussion on their candidacy based on the following criteria (not exclusive/comprehensive).
+All communication is expected to align with the [Code of Conduct](https://www.apache.org/foundation/policies/conduct).
+
+## Contributing on GitHub
+
+:::note Developer setup 
+ If you are planning to contribute code, please refer to [developer setup](/contribute/developer-setup) for instructions and information that will 
+ help you get going.
+:::
+
+This document details the processes and procedures we follow to make contributions to the project.
+If you are looking to ramp up as a contributor to the project, we highly encourage you to read this guide in full and familiarize yourself with the workflow.
+
+## Filing Issues
+
+Hudi manages development tasks and project/release management using GitHub Issues, following the process and protocol below.
+
+There are five types of GitHub Issues.
+
+| Issue type        | Purpose                                                 | Who can file                           | Label                  |
+|:------------------|:--------------------------------------------------------|----------------------------------------|------------------------|
+| Epic              | Roadmap tracking across multiple releases               | Only created by maintainers/committers | type:epic              |
+| Feature           | New feature development stories. Can have sub-issues    | Anyone                                 | type:feature           |
+| Improvements      | Regular dev tasks and improvements. Can have sub-issues | Anyone                                 | type:devtask           |
+| Bug               | For issues that are fixing bugs in Hudi                 | Anyone                                 | type:bug               |
+| Community Support | Problems reported that may still need triaging          | Anyone                                 | type:community-support |
+
+When filing issues, please follow the issue templates tightly to ensure smooth project management for everyone involved. 
+
+Some things to keep in mind and strive for: 
+
+- Make an attempt to find an existing issue that may solve the same issue you are reporting.
+- Carefully gauge whether a new feature needs an [RFC](/contribute/rfc-process).
+- If you intend to target an issue for a specific release, please mark the release using the `milestone` field on the GitHub issue.
+- If you are not sure, please wait for a PMC/Committer to confirm/triage the issue and accept it. This also avoids contributors spending time on issues with unclear scope.
+- Whenever possible, break down large issues into sub-issues such that each sub-issue can be fixed by a PR of reasonable size/complexity.
+- You can also contribute by helping others contribute. So, if you don't have cycles to work on an issue and another contributor offers help, take it!
+
+When in doubt, you can always start a GitHub discussion so that the community can provide early feedback and point out any similar issues, PRs, or RFCs.
+
+## Opening Pull Requests
+
+This project follows [Conventional 
Commits](https://www.conventionalcommits.org/en/v1.0.0/) specification for PR 
titles to ensure consistency and enable automated tooling.
+
+:::important
+All pull requests must either reference a GitHub Issue or describe the issue 
inline clearly within the pull request.
+Refer to an issue using `closes #<issue_number>` if intending to auto-close 
the issue when the pull request is merged,
+or simply `issue: #<issue_number>` if auto-close is not desirable.
+
+For larger features, maintainers may insist on filing GitHub Issues, 
appropriately linked to other GitHub issues.
+:::
+
+### PR Title Format
+
+```
+<type>(optional scope): <description>
+```
+
+For breaking changes that require attention, use
+
+```
+<type>(optional scope)!: <description>
+```
+
+The following types are allowed.
+
+| type     | Purpose                                                                                |
+|:---------|:---------------------------------------------------------------------------------------|
+| feat     | new feature addition                                                                   |
+| fix      | bug fix                                                                                |
+| docs     | doc changes only - code or html or .md files                                           |
+| style    | Code style, formatting or lint-only changes with 0 logic changes                       |
+| refactor | Code changes that neither fix bugs nor add features: cleanup, redoing abstractions     |
+| perf     | Performance improvements or tooling                                                    |
+| test     | Adding, fixing tests and test infrastructure.                                          |
+| chore    | Tooling, build system, CI/CD, or maintenance tasks                                     |
+
+#### Scopes
+Optionally, any of the below can be added as scope to PRs. Scopes provide 
additional context and can be used to specify which part of the codebase is 
affected.
+This helps us track where development activity is directed and whether bugs on 
a component are being resolved in a timely fashion. Tooling should auto-apply 
the 
+right label to pull requests and issues.
+
+| Scope          | Purpose                                                                         | Label               |
+|:---------------|:--------------------------------------------------------------------------------|---------------------|
+| common         | common code or abstractions shared across the entire project                    | area:common         |
+| core           | Changes affecting transaction management, concurrency and core read/write flows | area:core           |
+| api            | Any changes affecting public apis or interfaces                                 | area:api            |
+| config         | Any changes affecting public configs                                            | area:config         |
+| storage-format | Any changes to bits on storage - timeline, index, data and metadata             | area:storage-format |
+| metadata-table | Changes around metadata table                                                   | area:metadata-table |
+| table-services | Cleaning, Clustering, Compaction, Log Compaction, Indexing, TTL, ...            | area:table-services |
+| tools          | Any tools like CLI                                                              | area:tools          |
+| ingest         | Spark and Flink streamer tools, to ELT data into Hudi. Kafka sink               | area:ingest         |
+| spark          | Spark SQL, Streaming, Structured Streaming, Data source                         | engine:spark        |
+| flink          | DataStream writing/reading, SQL, Dynamic Tables                                 | engine:flink        |
+| trino          | Trino Hudi connector maintained in Hudi repo                                    | engine:trino        |
+
+
+For example:
+
+```
+feat(flink): add bucket index implementation on Flink
+fix(index): fix index update logic
+```
 
- - Embodies the ASF model code of 
[conduct](https://www.apache.org/foundation/policies/conduct)
- - Has made significant technical contributions such as submitting PRs, filing 
bugs, testing, benchmarking, authoring RFCs, providing feedback/code reviews (+ 
more).
- - Has helped the community over a few months, by answering questions on 
support channels above and triaging issues/jiras.
- - Demonstrates clear code/design ownership of a component or code area (eg: 
Delta Streamer, Hive/Presto Integration etc).
- - Brought thought leadership and new ideas into the project and evangelized 
them with the community via conference talks, blog posts.
- - Great citizenship in helping with all peripheral (but very critical) work 
like site maintenance, wiki/jira cleanups and so on.
- - Proven commitment to the project by way of upholding all agreed upon 
processes, conventions and principles of the community.
+When broken down adequately, most issues and pull requests should address just 
one primary area or scope respectively.
+But, there may be some special situations.
 
-## Code Contributions
+1. If your PR makes any API, core, or storage-format changes, it absolutely 
must be called out.
+2. If you are unsure about the component to use either because the PR or issue 
goes across them or it falls outside the list above, omit the scope in the PR 
title or issue label.
 
-Useful resources for contributing can be found under the "Quick Links" left 
menu.
-Specifically, please refer to the detailed [contribution 
guide](/contribute/developer-setup).
+If your PR is targeting an old JIRA (before Hudi migrated to GitHub Issues), 
put the JIRA number in the scope.
 
-## Accounts
+```
+feat(HUDI-1234): add a new feature
+```
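+
+The scope-to-label mapping from the Scopes table can be sketched as a plain lookup. This is only an illustration, not the project's actual labeling tooling: `SCOPE_LABELS` and `label_for_scope` are hypothetical names, and returning no label for an omitted scope or a JIRA-style scope such as `HUDI-1234` is an assumption.
+

```python
from typing import Optional

# Scope -> label pairs copied from the Scopes table.
SCOPE_LABELS = {
    "common": "area:common",
    "core": "area:core",
    "api": "area:api",
    "config": "area:config",
    "storage-format": "area:storage-format",
    "metadata-table": "area:metadata-table",
    "table-services": "area:table-services",
    "tools": "area:tools",
    "ingest": "area:ingest",
    "spark": "engine:spark",
    "flink": "engine:flink",
    "trino": "engine:trino",
}


def label_for_scope(scope: Optional[str]) -> Optional[str]:
    """Return the label for a PR-title scope, or None if no label applies.

    Assumption: an omitted scope, or one that is not in the table
    (for example a JIRA number), simply gets no area/engine label.
    """
    if scope is None:
        return None
    return SCOPE_LABELS.get(scope)
```

+
+With this sketch, `label_for_scope("flink")` gives `engine:flink`, while `label_for_scope("HUDI-1234")` gives no label.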
+
+### Examples
+
+#### Good PR Titles ✅
+
+```
+feat: add a new index in the metadata table
+fix: resolve type handling in data processing
+docs: update installation instructions
+style: format code according to style guide
+refactor: extract common utility functions
+perf: optimize Spark query performance
+test: add unit tests for common storage layout
+chore: resolve checkstyle warnings
+improvement: enhance error handling in type cast
+blocker: fix class loading failure on Java 17
+security: update dependencies to latest versions
+```
+
+#### Bad PR Titles ❌
+
+```
+Add new feature                   # Missing type
+FIX: bug in login                 # Type should be lowercase
+feat add authentication           # Missing colon
+feature: new login system         # Invalid type (should be 'feat')
+fix                               # Missing description
+```
+
+#### Breaking Changes
+
+For breaking changes, add an exclamation mark after the type/scope:
+
+```
+feat!: change merger API to account for better delete handling
+feat(index)!: change secondary index layout
+```
+PRs with breaking changes will be subject to broader reviews and opinions 
before they are merged.
+
+#### Validation
+
+PR titles are automatically validated using GitHub Actions for semantic 
validation.
+
+If your PR title doesn't follow these guidelines, the validation check will 
fail and you'll need to update it before merging.
+
+In rare cases, you can skip validation by adding one of these labels to your 
PR:
+- `bot`
+- `ignore-semantic-pull-request`
+
+Or include `[skip ci]` in your PR title for CI-related changes.
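+
+To check a title locally before opening a PR, the title format can be approximated with a small regular expression. This is a minimal sketch and not the GitHub Actions check the project runs: `parse_pr_title`, `ParsedTitle`, and `ALLOWED_TYPES` are hypothetical names, the allowed set is copied from the types table above (the real check may accept additional types), and scopes are deliberately not restricted so that JIRA-style scopes still parse.
+

```python
import re
from typing import NamedTuple, Optional

# Types copied from the "allowed types" table; the real check may accept more.
ALLOWED_TYPES = frozenset(
    {"feat", "fix", "docs", "style", "refactor", "perf", "test", "chore"}
)

# <type>(optional scope)!: <description>
TITLE_PATTERN = re.compile(
    r"^(?P<kind>[a-z]+)"            # lowercase type
    r"(?:\((?P<scope>[^()]+)\))?"   # optional scope in parentheses
    r"(?P<breaking>!)?"             # optional breaking-change marker
    r": (?P<description>\S.*)$"     # colon, space, non-empty description
)


class ParsedTitle(NamedTuple):
    kind: str
    scope: Optional[str]
    breaking: bool
    description: str


def parse_pr_title(title: str) -> Optional[ParsedTitle]:
    """Return the parsed parts of a PR title, or None if it is malformed."""
    match = TITLE_PATTERN.match(title)
    if match is None or match.group("kind") not in ALLOWED_TYPES:
        return None
    return ParsedTitle(
        kind=match.group("kind"),
        scope=match.group("scope"),
        breaking=match.group("breaking") is not None,
        description=match.group("description"),
    )
```

+
+Under these assumptions, `parse_pr_title("feat(index)!: change secondary index layout")` reports a breaking `feat` change with scope `index`, and each entry in the "Bad PR Titles" list yields `None`.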
+
+## Coding guidelines
+
+Our code can benefit from contributors speaking the same "language" when 
authoring code. After all, it gets read a lot more than it gets
+written, so optimizing for "reads" is a good goal. The list below is a set of 
guidelines that contributors strive to uphold and is reflective
+of how we want to evolve our code in the future.
+
+### Style
+
+- **Formatting** We should rely on checkstyle and spotless to auto-fix 
formatting; automate this completely. Where we cannot,
+  we will err on the side of not taxing contributors with manual effort.
+- **Refactoring**
+    - Refactor with purpose; any refactor suggested should be attributable to 
functionality that now becomes easy to implement.
+    - A class is asking to be refactored when it has several overloaded 
responsibilities or has sets of fields/methods that are used more cohesively 
than others.
+    - Try to name tests using the given-when-then model, which cleanly 
separates preconditions (given), an action (when), and assertions (then).
+- **Naming things**
+    - Let's name uniformly by using the same word to denote the same concept. 
For example: bootstrap vs external vs source, when referring to bootstrapped 
tables.
+      Maybe they all mean the same, but having one word makes the code a lot 
more easily readable.
+    - Let's name consistently with Hudi terminology. For example: dataset vs 
table, base file vs data file.
+    - Class names preferably are nouns (e.g., Runner) which reflect their 
responsibility and methods are verbs (e.g., run()).
+    - Avoid filler words that don't add value, for example: xxxInfo, xxxData, 
etc.
+    - We name classes in code starting with `Hoodie` and not `Hudi` and we 
want to keep it that way for consistency/historical reasons.
+- **Methods**
+    - Individual methods should be short (~20-30 lines) and have a single 
purpose. If you feel like it has a secondary purpose, then maybe it needs
+      to be broken down more.
+    - The fewer the number of arguments, the better.
+    - Place caller methods on top of callee methods, whenever possible.
+    - Avoid "output" arguments, for example: passing in a list and filling its 
values within the method.
+    - Try to limit individual if/else blocks to few lines to aid readability.
+    - Separate logical blocks of code with a newline in between, for example: 
read a file into memory, loop over the lines.
+- **Classes**
+    - Like methods, each class should have a single purpose/responsibility.
+    - Try to keep class files to about 200 lines of length, nothing beyond 500.
+    - Avoid stating the obvious in comments; for example, each line does not 
deserve a comment. Document corner cases/special performance considerations, 
etc., clearly.
+    - Try creating factory methods/builders and interfaces wherever you feel a 
specific implementation may be changed down the line.
+
+### Substance
+
+Try to avoid large PRs; if unavoidable (many times they are), please separate 
refactoring from the actual implementation of functionality.
+For example, renaming/breaking up a file and then making code changes makes 
the diff very hard to review.
+- **Licensing**
+    - Every source file needs to include the Apache license header. Every new 
dependency needs to have
+      an open source license 
[compatible](https://www.apache.org/legal/resolved#criteria) with Apache.
+    - If you are reusing code from another Apache/open-source project, 
licensing needs to be compatible and attribution added to the `LICENSE` file
+    - Please DO NOT copy-paste any code from StackOverflow or other online 
sources, since their license attribution would be unclear. Author them yourself!
+- **Code Organization**
+    - Anything in `hudi-common` cannot depend on a specific engine runtime 
like Spark.
+    - Any changes to bundles under `packaging`, will be reviewed with 
additional scrutiny to avoid breakages across versions.
+- **Code reuse**
+    - Whenever you can, please use or enhance existing utils classes in code 
(`CollectionUtils`, `ParquetUtils`, `HoodieAvroUtils`). Search for classes 
ending in `Utils`.
+    - As a complex project that must integrate with multiple systems, we tend 
to avoid dependencies like `guava` and `apache commons` for the sake of easy 
integration.
+      Please start a discussion on the mailing list before attempting to 
reintroduce them.
+    - As a data system that takes performance seriously, we also write pieces 
of infrastructure (e.g., `ExternalSpillableMap`) natively that are optimized 
specifically for our scenarios.
+      Please start with them first when solving problems.
+- **Breaking changes**
+    - Any version changes for dependencies need to be ideally vetted across 
different user environments in the community to get enough confidence before 
merging.
+    - Any changes to methods annotated with `PublicAPIMethod` or classes 
annotated with `PublicAPIClass` require upfront discussion and potentially an 
RFC.
+    - Any non-backwards compatible changes similarly need upfront discussion 
and the functionality needs to implement an upgrade-downgrade path.
+- **Documentation**
+   - Where necessary, please ensure there is another PR to [update the 
docs](https://github.com/apache/hudi/tree/asf-site/README.md) as well.
+   - Keep RFCs up to date as implementation evolves.
+
+### Testing 
+Add adequate tests for your new functionality. For involved changes, it's best 
to test the changes in real production environments and report the results in 
the PR.
+For website changes, please build the site locally & test navigation, 
formatting & links thoroughly
+
+- **Categories**
+    - unit - testing basic functionality at the class level, potentially using 
mocks. Expected to finish quicker
+    - functional - brings up the services needed and runs test without mocking
+    - integration - runs a subset of functional tests on a full-fledged 
environment with dockerized services
+- **Prepare Test Data**
+    - Many unit and functional test cases require a Hudi dataset to be 
prepared beforehand. `HoodieTestTable` and `HoodieWriteableTestTable` are 
dedicated test utility classes for this purpose. Use them whenever appropriate, 
and add new APIs to them when needed.
+    - When adding new APIs in the test utility classes, overload APIs with a 
variety of arguments to do more heavy-lifting for callers.
+    - In most scenarios, you won't need to use `FileCreateUtils` directly.
+    - If test cases require interaction with actual `HoodieRecord`s, use 
`HoodieWriteableTestTable` (and `HoodieTestDataGenerator` probably). Otherwise, 
`HoodieTestTable` that manipulates empty files shall serve the purpose.
+- **Strive for Readability**
+    - Avoid writing flow controls for different assertion cases. Split to a 
new test case when appropriate.
+    - Use a plain for-loop to avoid try-catch in a lambda block. Declaring 
exceptions is okay.
+    - Use static import for constants and static helper methods to avoid 
lengthy code.
+    - Avoid reusing local variable names. Create new variables generously.
+    - Keep helper methods local to the test class until it becomes obviously 
generic and reusable. When that happens, move the helper method to the right 
utility class. For example, `Assertions` contains common assert helpers, and 
`SchemaTestUtil` is for schema-related helpers.
+    - Avoid putting new helpers in `HoodieTestUtils` and 
`HoodieClientTestUtils`, whose names are too generic. Eventually, all test 
helpers shall be categorized properly.
+
+## Reviewing Pull Requests/RFCs
+
+- All pull requests would be subject to code reviews, from one or more of the 
PMC/Committers.
+- Typically, each PR will get an "Assignee" based on their area of expertise, 
who will work with you to land the PR.
+- Code reviews are vital, but also often time-consuming for everyone involved. 
Below are some principles which could help align us better.
+    - Reviewers need to provide actionable, concrete feedback that states what 
needs to be done to get the PR closer to landing.
+    - Reviewers need to make it explicit, which of the requested changes would 
block the PR vs good-to-dos.
+    - Both contributors/reviewers need to keep an open mind and ground 
themselves to making the most technically sound argument.
+    - If progress is hard, please involve another PMC member/Committer to 
share another perspective.
+    - Staying humble and eager to learn, goes a long way in ensuring these 
reviews are smooth.
+- Reviewers are expected to uphold the code quality, standards outlined above.
+- When merging PRs, always make sure you are squashing the commits using the 
"Squash and Merge" feature in GitHub
+- When necessary/appropriate, reviewers could make changes themselves to PR 
branches, with the intent to get the PR landed sooner.
+  Reviewers should seek explicit approval from the author before making large 
changes to the original PR.
+
+### Proposing Changes
+We welcome new ideas and suggestions to improve the project along any 
dimensions - management, processes, technical vision/direction. To kick-start a 
discussion on the mailing list
+to effect change and source feedback, start a new email thread with the 
`[DISCUSS]` prefix and share your thoughts. If your proposal leads to a larger 
change, then it may be followed up
+by a [vote](https://www.apache.org/foundation/voting) by a PMC member or 
others (depending on the specific scenario). For technical suggestions, you can 
also leverage [our RFC Process](/contribute/rfc-process) to outline your ideas 
in greater detail.
+
+## Becoming a Committer
+
+We are always looking for strong contributors, who can become 
[committers](https://www.apache.org/dev/committers) on the project.
+Committers are chosen by a majority vote of the Apache Hudi 
[PMC](https://www.apache.org/foundation/how-it-works#pmc-members), after a 
discussion on their candidacy based on the following criteria (not 
exclusive/comprehensive).
 
-It's useful to obtain few accounts to be able to effectively contribute to 
Hudi.
- 
- - Github account is needed to send pull requests to Hudi
- - Sign-up/in to the Apache [JIRA](https://issues.apache.org/jira). Then 
please email the dev mailing list with your username, asking to be added as a 
contributor to the project. This enables you to assign/be-assigned tickets and 
comment on them. 
- - Sign-up/in to the Apache 
[cWiki](https://cwiki.apache.org/confluence/signup.action), to be able to 
contribute to the wiki pages/RFCs. 
+- Embodies the ASF model code of 
[conduct](https://www.apache.org/foundation/policies/conduct)
+- Has made significant technical contributions such as submitting PRs, filing 
bugs, testing, benchmarking, authoring RFCs, providing feedback/code reviews (+ 
more).
+- Has helped the community over a few months by answering questions on support 
channels above and triaging issues.
+- Demonstrates clear code/design ownership of a component or code area (e.g., 
Delta Streamer, Hive/Presto Integration, etc.).
+- Brought thought leadership and new ideas into the project and evangelized 
them with the community via conference talks, blog posts.
+- Great citizenship in helping with all peripheral (but very critical) work 
like site maintenance, wiki cleanups, and so on.
+- Proven commitment to the project by way of upholding all agreed upon 
processes, conventions and principles of the community.
\ No newline at end of file
diff --git a/website/docs/performance.md b/website/docs/performance.md
index 89bcd3ee75b0..669213137d63 100644
--- a/website/docs/performance.md
+++ b/website/docs/performance.md
@@ -76,7 +76,7 @@ significant savings on the overall compute cost.
 </figure>
 
 Hudi upserts have been stress tested upto 4TB in a single commit across the t1 
table. 
-See [here](https://cwiki.apache.org/confluence/display/HUDI/Tuning+Guide) for 
some tuning tips.
+See [here](/docs/tuning-guide) for some tuning tips.
 
 #### Indexing
 
