This is an automated email from the ASF dual-hosted git repository.

ethanfeng pushed a commit to branch branch-0.5
in repository https://gitbox.apache.org/repos/asf/celeborn.git


The following commit(s) were added to refs/heads/branch-0.5 by this push:
     new b3f170fd6 [CELEBORN-1759] Fix reserve slots might lost partition location between 0.4 client and 0.5 server
b3f170fd6 is described below

commit b3f170fd611643edf9c79943184845788005488a
Author: onebox-li <[email protected]>
AuthorDate: Tue Dec 3 16:57:53 2024 +0800

    [CELEBORN-1759] Fix reserve slots might lost partition location between 0.4 client and 0.5 server
    
    ### What changes were proposed in this pull request?
    Fix the worker's `ReserveSlots` parsing logic for compatibility with 0.4 clients.
    
    ### Why are the changes needed?
    During an upgrade to 0.5, a 0.4 client may reserve slots on a 0.5 worker.
    If the request contains only replica locations, the worker parses it
    incorrectly: the reservation does not actually take place, yet the worker
    still returns success to the client.
    In this case the worker log shows "Reserved 0 primary location and 0 replica location".
    
    ### Does this PR introduce _any_ user-facing change?
    Fixes a potential reserve slots failure when upgrading from 0.4 to 0.5
    (requests that contain only replica locations).
    
    ### How was this patch tested?
    Manual test.
    
    Closes #2968 from onebox-li/fix-reserve-compatibility.
    
    Authored-by: onebox-li <[email protected]>
    Signed-off-by: mingji <[email protected]>
    (cherry picked from commit 7102174edabc2ec2a89942cf4d5070b34a60479f)
    Signed-off-by: mingji <[email protected]>
---
 .../org/apache/celeborn/common/protocol/message/ControlMessages.scala   | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
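
The effect of the one-line condition change can be illustrated with a minimal, self-contained sketch. `ReserveSlotsMsg`, `parseOld`, and `parseFixed` are hypothetical stand-ins, not Celeborn APIs, and the body of the `else` branch (reading the two legacy lists directly) is an assumption, since the hunk below is cut off at that point.

```scala
// Hypothetical stand-in for a parsed PbReserveSlots message; not the Celeborn API.
final case class ReserveSlotsMsg(
    primaryLocations: List[String], // legacy list, assumed to be filled by a 0.4 client
    replicaLocations: List[String], // legacy list, assumed to be filled by a 0.4 client
    packedPair: (List[String], List[String])) // packed form, assumed to be filled by a 0.5 client

object ReserveSlotsCompat {
  // Condition before the fix: only the primary list decides which format is read.
  def parseOld(m: ReserveSlotsMsg): (List[String], List[String]) =
    if (m.primaryLocations.isEmpty) m.packedPair
    else (m.primaryLocations, m.replicaLocations) // assumed else branch

  // Condition after the fix: fall back to the packed form only when both legacy lists are empty.
  def parseFixed(m: ReserveSlotsMsg): (List[String], List[String]) =
    if (m.primaryLocations.isEmpty && m.replicaLocations.isEmpty) m.packedPair
    else (m.primaryLocations, m.replicaLocations) // assumed else branch
}
```

Under these assumptions, a request carrying only a replica location in the legacy list is read as an empty packed pair by `parseOld`, which is consistent with the "Reserved 0 primary location and 0 replica location" log line, while `parseFixed` keeps the replica location.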

diff --git a/common/src/main/scala/org/apache/celeborn/common/protocol/message/ControlMessages.scala b/common/src/main/scala/org/apache/celeborn/common/protocol/message/ControlMessages.scala
index e90f03d10..6870ba1ca 100644
--- a/common/src/main/scala/org/apache/celeborn/common/protocol/message/ControlMessages.scala
+++ b/common/src/main/scala/org/apache/celeborn/common/protocol/message/ControlMessages.scala
@@ -1161,7 +1161,7 @@ object ControlMessages extends Logging {
         val pbReserveSlots = PbReserveSlots.parseFrom(message.getPayload)
         val userIdentifier = PbSerDeUtils.fromPbUserIdentifier(pbReserveSlots.getUserIdentifier)
         val (primaryLocations, replicateLocations) =
-          if (pbReserveSlots.getPrimaryLocationsList.isEmpty) {
+          if (pbReserveSlots.getPrimaryLocationsList.isEmpty && pbReserveSlots.getReplicaLocationsList.isEmpty) {
             PbSerDeUtils.fromPbPackedPartitionLocationsPair(
               pbReserveSlots.getPartitionLocationsPair)
           } else {
