liugs0213 opened a new pull request, #3467:
URL: https://github.com/apache/celeborn/pull/3467

   
   ### What changes were proposed in this pull request?
   
   1. **New Configuration Parameter**: Added the `celeborn.client.readLocalShuffleFile.cloudDiskMounted` configuration option.
   2. **Enhanced Local Read Logic**: Modified `CelebornShuffleReader` to support a cloud disk mounted mode in which host address comparison is no longer required.
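
The local-read decision described in item 2 can be sketched roughly as follows. This is a minimal illustration, not the code in `CelebornShuffleReader`: the class name `LocalReadDecision`, the method `canReadLocally`, and its parameters are all hypothetical, and the sketch assumes that the existing behavior compares the worker host against the client host.

```java
// Hypothetical sketch: names and structure are illustrative assumptions,
// not taken from the actual CelebornShuffleReader implementation.
final class LocalReadDecision {
    private LocalReadDecision() {}

    /**
     * Decides whether a partition file may be read directly from the file system.
     *
     * @param readLocalEnabled  value of celeborn.client.readLocalShuffleFile.enabled
     * @param cloudDiskMounted  value of celeborn.client.readLocalShuffleFile.cloudDiskMounted
     * @param workerHost        host of the worker that owns the partition file
     * @param localHost         host the client is running on
     */
    static boolean canReadLocally(
            boolean readLocalEnabled,
            boolean cloudDiskMounted,
            String workerHost,
            String localHost) {
        if (!readLocalEnabled) {
            // Local reads are switched off entirely; always go through the worker.
            return false;
        }
        if (cloudDiskMounted) {
            // Every worker's data is visible through the mounted cloud disk,
            // so the host comparison is skipped.
            return true;
        }
        // Assumed pre-existing behavior: read locally only when the worker is on the same host.
        return workerHost != null && workerHost.equals(localHost);
    }

    public static void main(String[] args) {
        System.out.println(canReadLocally(true, false, "worker-a", "client-b")); // false
        System.out.println(canReadLocally(true, true, "worker-a", "client-b"));  // true
        System.out.println(canReadLocally(false, true, "worker-a", "worker-a")); // false
    }
}
```

The point of the sketch is the ordering of the checks: the new flag short-circuits the host comparison, but only when local reads are already enabled.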
   
   ### Why are the changes needed?
   
   The current Celeborn implementation requires network communication between 
client and worker even when data is stored on cloud disks that are mounted 
locally. This creates unnecessary overhead in cloud environments where:
   
   - All worker data is accessible via mounted cloud disks
   - Network latency between client and worker adds a performance penalty
   - Multi-layer IO (client -> worker -> cloud disk) is less efficient than 
direct file system access
   
   **The primary goal of using cloud disks is to eliminate the dependency on local disks**, which gives a more scalable and cost-effective storage solution. In cloud disk mounted mode, the client bypasses the worker layer and reads shuffle data directly from the file system, so local disk storage is no longer needed.
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   Yes, this PR introduces a new configuration parameter:
   
   - **New config**: `celeborn.client.readLocalShuffleFile.cloudDiskMounted` (default: false)
   - **Usage**: Must be used together with `celeborn.client.readLocalShuffleFile.enabled=true`
   - **Behavior**: When enabled, every worker's data can be accessed locally through the mounted cloud disk, so host address comparison is no longer required
   
   This change is backward compatible and does not affect existing 
functionality when the new config is not enabled.
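
As a rough illustration of how the two settings combine, the sketch below resolves them from a plain key-value map. The helper class `LocalReadConfigCheck` and its method are hypothetical and are not part of Celeborn's configuration API. Two assumptions are made: that `celeborn.client.readLocalShuffleFile.enabled` also defaults to false, and that setting `cloudDiskMounted` without `enabled` simply leaves the new mode inactive (the PR description only states that the two must be used together).

```java
import java.util.Map;

// Hypothetical helper for illustration only; not part of Celeborn's configuration API.
final class LocalReadConfigCheck {
    static final String ENABLED = "celeborn.client.readLocalShuffleFile.enabled";
    static final String CLOUD_DISK_MOUNTED = "celeborn.client.readLocalShuffleFile.cloudDiskMounted";

    private LocalReadConfigCheck() {}

    /** Returns true only when both settings are explicitly set to true. */
    static boolean cloudDiskModeActive(Map<String, String> conf) {
        // Assumption: both keys default to false when absent.
        boolean enabled = Boolean.parseBoolean(conf.getOrDefault(ENABLED, "false"));
        boolean cloudDiskMounted =
                Boolean.parseBoolean(conf.getOrDefault(CLOUD_DISK_MOUNTED, "false"));
        return enabled && cloudDiskMounted;
    }

    public static void main(String[] args) {
        Map<String, String> conf = Map.of(
                ENABLED, "true",
                CLOUD_DISK_MOUNTED, "true");
        System.out.println(cloudDiskModeActive(conf)); // true
    }
}
```

With an empty map the helper returns false, which matches the statement that existing behavior is unchanged when the new config is not set.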
   
   ### How was this patch tested?
   
   1. **Functional Testing**:
      - Tested in a cloud disk mounted environment
   
   2. **Performance Testing**:
      - Compared direct cloud disk mounting against multi-layer IO (client -> worker -> cloud disk)
      - **Results**: 5.5% overall performance improvement with direct cloud disk access

