ever4Kenny opened a new pull request, #3581:
URL: https://github.com/apache/celeborn/pull/3581

   ### What changes were proposed in this pull request?
   
   This PR converts the static initialization of columnar shuffle class 
constructors 
   and skew shuffle method to lazy initialization using the 
initialization-on-demand 
   holder idiom (static inner class pattern) in SparkUtils.java.
   
   Specifically, the following changes were made:
   
   1. Introduced `ColumnarHashBasedShuffleWriterConstructorHolder` static inner 
class 
      to lazily initialize the constructor for ColumnarHashBasedShuffleWriter
   
   2. Introduced `ColumnarShuffleReaderConstructorHolder` static inner class to 
lazily 
      initialize the constructor for CelebornColumnarShuffleReader
   
   3. Introduced `CelebornSkewShuffleMethodHolder` static inner class to lazily 
      initialize the `isCelebornSkewedShuffle` method reference
   
   4. Modified `createColumnarHashBasedShuffleWriter()`, 
`createColumnarShuffleReader()`, 
      and `isCelebornSkewShuffleOrChildShuffle()` methods to use the holder 
pattern for 
      lazy initialization
   
   5. Added JavaDoc comments explaining the lazy loading mechanism
   
   ### Why are the changes needed?
   
   The current implementation statically initializes columnar shuffle class 
constructors 
   and the skew shuffle method at SparkUtils class loading time, which means 
these 
   classes/methods are loaded regardless of whether they are actually used.
   
   This lazy loading approach ensures that:
   - Columnar shuffle classes are only loaded when actually needed (when 
     `celeborn.columnarShuffle.enabled` is true and the create methods are 
called)
   - CelebornShuffleState class is only loaded when skew shuffle detection is 
needed
   - Reduces unnecessary class loading overhead for users not using these 
features
   - Improves startup performance and memory footprint
   - Aligns with the conditional usage pattern already present in 
SparkShuffleManager
   
   The static holder pattern (initialization-on-demand holder idiom) provides 
several 
   advantages:
   - Thread-safe without explicit synchronization (guaranteed by JVM class 
loading mechanism)
   - No synchronization overhead at runtime (no volatile reads or lock 
acquisition)
   - Simpler and more concise code compared to double-checked locking
   - Recommended by Effective Java (Item 83) for lazy initialization
   
   ### Does this PR resolve a correctness bug?
   
   No, this is a performance optimization.
   
   ### Does this PR introduce any user-facing change?
   
   No. This change only affects when certain classes are loaded internally. 
   The functionality and API remain unchanged.
   
   ### How was this patch tested?
   
   - Code review to verify correct implementation of the 
initialization-on-demand holder pattern
   - Verified that JVM class loading guarantees thread safety (JLS ยง12.4.2)
   - Analyzed existing columnar shuffle and skew shuffle test coverage in the 
codebase
   - The changes are backward compatible and don't alter functionality, only 
initialization timing


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to