wombatu-kun opened a new pull request, #19008:
URL: https://github.com/apache/hudi/pull/19008

   ### Describe the issue this Pull Request addresses
   
   The Azure CI job "UT FT common & other modules" (UT_FT_10) intermittently 
fails its initial full-reactor `mvn clean install` with 
`maven-compiler-plugin:compile ... Bad service configuration file, or exception 
thrown while constructing Processor object: Java heap space`, most recently 
while compiling `hudi-kafka-connect-bundle` (for example the Azure run for PR 
#19004, buildId 14704, where the change itself is unrelated test-only code). 
The job builds the entire reactor with `-T 3` inside a single `-Xmx8g` JVM on a 
memory-constrained Azure agent, so when several heavy module builds align in 
time, demand on the shared heap occasionally exceeds the 8g ceiling and the JVM OOMs. The 
annotation-processor line in the message is only where the allocation tips 
over, not the root cause.
   
   ### Summary and Changelog
   
   This scopes a lower build parallelism to just the job that OOMs, without 
touching any source code or the shared install options used by the other jobs. 
UT_FT_10's `clean install` now prepends `-T 2` before `$(MVN_OPTS_INSTALL)` 
(which contains `-T 3`). Maven uses the first `-T` it sees on the command line 
(verified: `-T 2 -T 3` resolves to a thread count of 2, `-T 3 -T 2` to 3), so 
the effective thread count for this job's install becomes 2 while every other 
job keeps the shared `-T 3`. Lowering the concurrency from 3 to 2 reduces how 
many heavy compiles and shade operations can run at the same time, and 
therefore the peak heap.
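   
   As a rough illustration of the precedence described above, the following Python sketch models a first-occurrence-wins rule for `-T`. It is not Maven's actual argument parser. The function name `effective_threads` is hypothetical, and the string `-T 3` only stands in for the part of `$(MVN_OPTS_INSTALL)` mentioned here. It also assumes the space-separated integer form of `-T` used in this PR.

```python
import shlex


def effective_threads(command_line: str, default: int = 1) -> int:
    """Return the thread count under a first-occurrence-wins rule for -T.

    Hypothetical model of the behavior reported in this PR, not Maven code.
    Only the space-separated integer form ("-T 2") is handled.
    """
    tokens = shlex.split(command_line)
    # Stop one token early so tokens[i + 1] always exists.
    for i, token in enumerate(tokens[:-1]):
        if token == "-T":
            return int(tokens[i + 1])
    return default


# Stand-in for the -T portion of $(MVN_OPTS_INSTALL); other options omitted.
shared_opts = "-T 3"

# UT_FT_10 after this change: the job-specific -T comes first.
print(effective_threads(f"mvn clean install -T 2 {shared_opts}"))

# Reversed order, where the shared value would come first instead.
print(effective_threads(f"mvn clean install {shared_opts} -T 2"))
```

   Under this model the override only works because `-T 2` is placed before the shared options; appending it after them would leave the job at 3 threads.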
   
   The approach was chosen after measuring the heap profile of the full `clean 
install` locally: peak heap is about 2.4 GB at `-T 1`, about 2.4 GB at `-T 2`, 
and about 2.9 GB at `-T 3`, all far below the 8 GB ceiling, which shows the 
failure is a rare concurrency-driven tail spike rather than a systematic 
over-use of memory. `-T 2` keeps essentially the same wall-clock time as `-T 3` 
(about half a minute slower in the local run) while bringing the measured peak 
heap back down to the `-T 1` level. An earlier idea to disable annotation 
processing on the bundle modules was measured and rejected, because it did not 
change the compile's heap requirement at all. No code was copied from 
third-party sources.
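   
   To put the local measurements in proportion, this short Python sketch computes each peak as a share of the 8 GB ceiling. The figures are the approximate local numbers quoted above, and the helper name `heap_share` is hypothetical. It says nothing about the heap profile on the Azure agent itself.

```python
CEILING_GB = 8.0  # from the job's -Xmx8g setting

# Approximate local peak heap per thread count, as quoted in this PR.
peak_heap_gb = {1: 2.4, 2: 2.4, 3: 2.9}


def heap_share(peak_gb: float, ceiling_gb: float = CEILING_GB) -> float:
    """Return the peak heap as a fraction of the configured ceiling."""
    return peak_gb / ceiling_gb


for threads, peak in peak_heap_gb.items():
    print(f"-T {threads}: {peak} GB peak, {heap_share(peak):.0%} of the ceiling")

# Difference in measured peak between the shared setting and the new one.
saved_gb = peak_heap_gb[3] - peak_heap_gb[2]
print(f"-T 3 to -T 2 lowers the measured peak by about {saved_gb:.1f} GB")
```

   Even at `-T 3` the measured peak is well under half the ceiling, which is consistent with reading the CI failure as a rare spike rather than steady memory pressure.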
   
   ### Impact
   
   CI-only change scoped to the UT_FT_10 Azure job. No production code, public 
API, configuration default, or runtime behavior changes. The job's install 
phase becomes slightly slower (about half a minute in the local measurement) in 
exchange for lower peak heap; all other CI jobs are unaffected.
   
   ### Risk Level
   
   none
   
   CI-only change that lowers build parallelism for a single job. It cannot 
affect build output or test results, and the verified `-T` precedence 
guarantees the intended thread count.
   
   ### Documentation Update
   
   none
   
   ### Contributor's checklist
   
   - [x] Read through [contributor's 
guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [x] Enough context is provided in the sections above
   

