This is an automated email from the ASF dual-hosted git repository.
ethanfeng pushed a commit to branch branch-0.5
in repository https://gitbox.apache.org/repos/asf/celeborn.git
The following commit(s) were added to refs/heads/branch-0.5 by this push:
new b3f170fd6 [CELEBORN-1759] Fix reserve slots might lost partition
location between 0.4 client and 0.5 server
b3f170fd6 is described below
commit b3f170fd611643edf9c79943184845788005488a
Author: onebox-li <[email protected]>
AuthorDate: Tue Dec 3 16:57:53 2024 +0800
[CELEBORN-1759] Fix reserve slots might lost partition location between 0.4
client and 0.5 server
### What changes were proposed in this pull request?
Fix the worker's `ReserveSlots` parsing logic for compatibility with 0.4 clients.
### Why are the changes needed?
During an upgrade to 0.5, a 0.4 client may reserve slots on a 0.5 worker. If
the request contains only replica locations, the worker parses it incorrectly:
the actual reservation fails, but the worker still returns success to the client.
In that case the worker log shows "Reserved 0 primary location and 0 replica location".
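
The failure mode can be reproduced with a minimal sketch. Everything here is a simplified, hypothetical stand-in rather than the Celeborn API: `ReserveSlotsSketch` models `PbReserveSlots` with plain string lists, the `packed*` fields stand in for `getPartitionLocationsPair`, and the `else` branch (returning the plain lists unchanged) is an assumption, since the diff does not show it. The sketch also assumes that a 0.4 client fills only the plain location lists.

```scala
// Hypothetical, simplified stand-in for PbReserveSlots (not the real message type).
final case class ReserveSlotsSketch(
    primaryLocations: List[String], // plain list, assumed to be what a 0.4 client fills
    replicaLocations: List[String], // plain list, assumed to be what a 0.4 client fills
    packedPrimary: List[String],    // stand-in for the packed locations pair
    packedReplica: List[String])

object ReserveSlotsParsing {
  // Condition before the fix: only the primary list is checked.
  def parseBefore(msg: ReserveSlotsSketch): (List[String], List[String]) =
    if (msg.primaryLocations.isEmpty) (msg.packedPrimary, msg.packedReplica)
    else (msg.primaryLocations, msg.replicaLocations)

  // Condition after the fix: fall back to the packed pair only when
  // both plain lists are empty.
  def parseAfter(msg: ReserveSlotsSketch): (List[String], List[String]) =
    if (msg.primaryLocations.isEmpty && msg.replicaLocations.isEmpty)
      (msg.packedPrimary, msg.packedReplica)
    else (msg.primaryLocations, msg.replicaLocations)
}
```

For a request that carries only a replica location and no packed pair, `parseBefore` falls through to the empty packed pair and yields no locations at all, which matches the "Reserved 0 primary location and 0 replica location" log line, while `parseAfter` keeps the replica location.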
### Does this PR introduce _any_ user-facing change?
When upgrading from 0.4 to 0.5, this fixes a potential reserve-slots failure
in the scenario where the request contains only replica locations.
### How was this patch tested?
Manual test.
Closes #2968 from onebox-li/fix-reserve-compatibility.
Authored-by: onebox-li <[email protected]>
Signed-off-by: mingji <[email protected]>
(cherry picked from commit 7102174edabc2ec2a89942cf4d5070b34a60479f)
Signed-off-by: mingji <[email protected]>
---
.../org/apache/celeborn/common/protocol/message/ControlMessages.scala | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/common/src/main/scala/org/apache/celeborn/common/protocol/message/ControlMessages.scala b/common/src/main/scala/org/apache/celeborn/common/protocol/message/ControlMessages.scala
index e90f03d10..6870ba1ca 100644
--- a/common/src/main/scala/org/apache/celeborn/common/protocol/message/ControlMessages.scala
+++ b/common/src/main/scala/org/apache/celeborn/common/protocol/message/ControlMessages.scala
@@ -1161,7 +1161,7 @@ object ControlMessages extends Logging {
val pbReserveSlots = PbReserveSlots.parseFrom(message.getPayload)
val userIdentifier = PbSerDeUtils.fromPbUserIdentifier(pbReserveSlots.getUserIdentifier)
val (primaryLocations, replicateLocations) =
- if (pbReserveSlots.getPrimaryLocationsList.isEmpty) {
+ if (pbReserveSlots.getPrimaryLocationsList.isEmpty && pbReserveSlots.getReplicaLocationsList.isEmpty) {
PbSerDeUtils.fromPbPackedPartitionLocationsPair(
pbReserveSlots.getPartitionLocationsPair)
} else {