suryaprasanna opened a new pull request, #17862:
URL: https://github.com/apache/hudi/pull/17862

   ### Describe the issue this Pull Request addresses
   
   This PR fixes an issue in Hudi CLI where creating a second 
`JavaSparkContext` fails with the error "Only one SparkContext should be 
running in this JVM". The code instantiated a new `JavaSparkContext` directly, 
without checking whether one already existed. Because the method is public, 
custom hudi-cli commands that call it can hit the same error.
   
   The fix uses `SparkSession.builder().getOrCreate()` which properly handles 
existing contexts, and then obtains the `JavaSparkContext` from the session.
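
The failure mode and the fix can be illustrated without Spark. The sketch 
below is a simplified model only: `Context` is a hypothetical stand-in class, 
not a Spark type, and its guard logic is an assumption about how a 
one-per-JVM resource typically behaves rather than a copy of Spark's 
implementation.

```java
public final class SingleContextModel {

    /** Hypothetical stand-in for a resource that may exist only once per JVM. */
    static final class Context {
        private static Context active;

        // Direct construction fails when a context is already active.
        Context() {
            synchronized (Context.class) {
                if (active != null) {
                    throw new IllegalStateException("Only one context may run in this JVM");
                }
                active = this;
            }
        }

        // Get-or-create returns the active context, creating one only if none exists.
        static synchronized Context getOrCreate() {
            return active != null ? active : new Context();
        }
    }

    public static void main(String[] args) {
        Context first = Context.getOrCreate();
        Context second = Context.getOrCreate();
        System.out.println(first == second); // the same instance is reused

        try {
            new Context(); // direct construction while a context is active
        } catch (IllegalStateException e) {
            System.out.println("direct construction failed: " + e.getMessage());
        }
    }
}
```

In this model, any number of callers can go through `getOrCreate()` safely, while a second direct construction always throws.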
   
   ### Summary and Changelog
   
   Users can now run multiple Hudi CLI commands that require Spark without 
encountering SparkContext initialization errors.
   
     **Changes:**
     - Modified `SparkUtil.initJavaSparkContext()` to use 
`SparkSession.builder().getOrCreate()` instead of directly creating 
`JavaSparkContext`
     - Added `SparkSession` import
     - Changed from `new JavaSparkContext(sparkConf)` to 
`JavaSparkContext.fromSparkContext(spark.sparkContext())`
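
A minimal sketch of the pattern described in the changes above, assuming 
Spark is on the classpath. The class name `SparkContextReuseSketch`, the 
method signature, the `.config(sparkConf)` call, and the `local[1]` master 
are assumptions for illustration; the actual `SparkUtil.initJavaSparkContext()` 
in Hudi may differ.

```java
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.SparkSession;

public final class SparkContextReuseSketch {

    // Hypothetical signature; the real Hudi method may take other arguments.
    static JavaSparkContext initJavaSparkContext(SparkConf sparkConf) {
        // Previously: return new JavaSparkContext(sparkConf);
        // getOrCreate() returns the existing session if one is already active.
        SparkSession spark = SparkSession.builder()
            .config(sparkConf) // assumption: the conf is passed to the builder
            .getOrCreate();
        return JavaSparkContext.fromSparkContext(spark.sparkContext());
    }

    public static void main(String[] args) {
        SparkConf conf = new SparkConf()
            .setAppName("hudi-cli-sketch")
            .setMaster("local[1]");

        JavaSparkContext first = initJavaSparkContext(conf);
        JavaSparkContext second = initJavaSparkContext(conf);

        // Both wrappers should share one underlying SparkContext.
        System.out.println(first.sc() == second.sc());

        first.stop();
    }
}
```

Each call returns a new `JavaSparkContext` wrapper, but both wrappers point at the same underlying `SparkContext`, so stopping one stops the context for every holder.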
   
   ### Impact
   
   None - this change only affects context creation logic.
   
   ### Risk Level
   
     **Low** - This change uses the recommended pattern for obtaining a 
SparkContext via SparkSession, which is more robust than direct instantiation. 
The `getOrCreate()` method ensures we reuse existing contexts when available 
and only create new ones when necessary.
   
   ### Documentation Update
   
   none
   
   ### Contributor's checklist
   
   - [x] Read through [contributor's 
guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [x] Enough context is provided in the sections above
   - [x] Adequate tests were added if applicable
   

