BiteTheDDDDt opened a new pull request, #57997:
URL: https://github.com/apache/doris/pull/57997

   ### What problem does this PR solve?
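   
   Measured on a run with `LocalSentRows: 720.000376M (720000376)`:
   
   - before: `DistributeRowsIntoChannelsTime: 5sec284ms`
   - after: `DistributeRowsIntoChannelsTime: 4sec559ms`
   
   That is 725 ms less time spent distributing rows, a reduction of about 13.7%.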
   
   Problem Summary:
   
   This pull request refactors the `Writer` class in the shuffle pipeline to 
improve performance and memory management by changing how row indices and 
partition histograms are handled. The main changes are dropping the `const` 
qualifiers that prevented the writer from mutating its own state, reusing 
internal buffers to avoid repeated allocations, and simplifying the logic for 
partitioning rows.
   
   **Refactoring and Performance Improvements:**
   
   * Removed `const` qualifiers from the `write` and `_channel_add_rows` 
methods in both `writer.cpp` and `writer.h`, allowing these methods to modify 
internal state and reuse buffers for better performance. 
   * Introduced internal member buffers (`_row_idx`, 
`_partition_rows_histogram`, `_channel_start_offsets`) in the `Writer` class to 
avoid repeated allocation of temporary arrays during row partitioning.
   * Refactored the logic in `_channel_add_rows` to use these internal buffers, 
resulting in more efficient calculation and assignment of row indices and 
partition sizes for each channel.
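
For readers without the diff at hand, the buffer-reuse pattern described in the bullets above can be sketched roughly as follows. This is an illustrative sketch only, not the code from this PR: the class name `PartitionScratch`, the `partition` method, its signature, and the accessors are hypothetical. Only the member names `_row_idx`, `_partition_rows_histogram`, and `_channel_start_offsets` are taken from the description. It assumes each row already has a target channel id and groups row indices per channel with a histogram and a prefix sum.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the shuffle Writer; NOT the actual Doris class.
class PartitionScratch {
public:
    // Non-const on purpose: each call overwrites the member buffers below
    // instead of allocating fresh temporaries.
    // channel_ids[i] is the target channel of row i, in [0, num_channels).
    void partition(const std::vector<uint32_t>& channel_ids, size_t num_channels) {
        const size_t rows = channel_ids.size();

        // assign()/resize() keep the existing allocation when capacity suffices.
        _partition_rows_histogram.assign(num_channels, 0);
        _channel_start_offsets.assign(num_channels + 1, 0);
        _row_idx.resize(rows);

        // Pass 1: count how many rows go to each channel.
        for (size_t i = 0; i < rows; ++i) {
            assert(channel_ids[i] < num_channels);
            ++_partition_rows_histogram[channel_ids[i]];
        }

        // Prefix sum: offsets[c] is the first slot of channel c in _row_idx,
        // offsets[num_channels] is the total row count.
        for (size_t c = 0; c < num_channels; ++c) {
            _channel_start_offsets[c + 1] =
                    _channel_start_offsets[c] + _partition_rows_histogram[c];
        }

        // Pass 2: scatter row indices into their channel's range. The histogram
        // is reset and used as a per-channel cursor, so it ends up holding the
        // per-channel counts again.
        _partition_rows_histogram.assign(num_channels, 0);
        for (size_t i = 0; i < rows; ++i) {
            const uint32_t c = channel_ids[i];
            const uint32_t pos = _channel_start_offsets[c] + _partition_rows_histogram[c]++;
            _row_idx[pos] = static_cast<uint32_t>(i);
        }
    }

    const std::vector<uint32_t>& row_idx() const { return _row_idx; }
    const std::vector<uint32_t>& histogram() const { return _partition_rows_histogram; }
    const std::vector<uint32_t>& offsets() const { return _channel_start_offsets; }

private:
    // Reused across calls; names follow the PR description.
    std::vector<uint32_t> _row_idx;
    std::vector<uint32_t> _partition_rows_histogram;
    std::vector<uint32_t> _channel_start_offsets;
};
```

In this sketch, the rows for channel `c` are `row_idx()[offsets()[c]]` up to, but not including, `row_idx()[offsets()[c + 1]]`. After the first call with a given batch size, later calls with batches of the same or smaller size need no new allocations, which is the effect the member buffers are meant to achieve.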
   
   **Code Simplification:**
   
   * Simplified the computation of partition histograms and channel offsets, 
making the code easier to read and maintain.
   
   These changes should reduce memory allocation overhead and improve runtime 
efficiency in the shuffle writer logic.
   
   ### Release note
   
   None
   
   ### Check List (For Author)
   
   - Test <!-- At least one of them must be included. -->
       - [ ] Regression test
       - [ ] Unit Test
       - [ ] Manual test (add detailed scripts or steps below)
       - [ ] No need to test or manual test. Explain why:
           - [ ] This is a refactor/code format and no logic has been changed.
           - [ ] Previous test can cover this change.
           - [ ] No code files have been changed.
           - [ ] Other reason <!-- Add your reason?  -->
   
   - Behavior changed:
       - [ ] No.
       - [ ] Yes. <!-- Explain the behavior change -->
   
   - Does this need documentation?
       - [ ] No.
       - [ ] Yes. <!-- Add document PR link here. eg: 
https://github.com/apache/doris-website/pull/1214 -->
   
   ### Check List (For Reviewer who merge this PR)
   
   - [ ] Confirm the release note
   - [ ] Confirm test cases
   - [ ] Confirm document
   - [ ] Add branch pick label <!-- Add branch pick label that this PR should 
merge into -->
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

