J-HowHuang commented on code in PR #16341:
URL: https://github.com/apache/pinot/pull/16341#discussion_r2229707306


##########
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/rebalance/TableRebalancer.java:
##########
@@ -411,6 +416,31 @@ private RebalanceResult doRebalance(TableConfig 
tableConfig, RebalanceConfig reb
               tierToInstancePartitionsMap, rebalanceConfig);
         }
       }
+
+      // If peer-download is enabled, verify that for all segments with 
changes in assignment, it is safe to rebalance
+      // Create the DataLossRiskAssessor which is used to check for data loss 
scenarios if peer-download is enabled
+      // for a table. Skip this step if allowPeerDownloadDataLoss = true
+      if (peerSegmentDownloadScheme != null && !allowPeerDownloadDataLoss) {
+        // Setting minAvailableReplicas to 0 since that is what's equivalent 
to downtime=true
+        DataLossRiskAssessor dataLossRiskAssessor = new 
PeerDownloadTableDataLossRiskAssessor(tableNameWithType,
+            tableConfig, 0, _helixManager, _pinotLLCRealtimeSegmentManager);
+        for (Map.Entry<String, Map<String, String>> segmentToAssignment : 
currentAssignment.entrySet()) {
+          String segmentName = segmentToAssignment.getKey();
+          Map<String, String> assignment = segmentToAssignment.getValue();
+          if (!assignment.equals(targetAssignment.get(segmentName))
+              && dataLossRiskAssessor.hasDataLossRisk(segmentName)) {
+            // Fail the rebalance if a segment with the potential for data 
loss is found
+            String errorMsg = "Moving segment " + segmentName + " as part of 
rebalance is risky for peer-download "
+                + "enabled tables, ensure the deep store has a copy of the 
segment and if upsert / dedup enabled "
+                + "that it is completed and try again. It is recommended to 
forceCommit and pause ingestion prior to "
+                + "rebalancing";
+            onReturnFailure(errorMsg, new IllegalStateException(errorMsg), 
tableRebalanceLogger);
+            return new RebalanceResult(rebalanceJobId, 
RebalanceResult.Status.FAILED, errorMsg, instancePartitionsMap,
+                tierToInstancePartitionsMap, targetAssignment, 
preChecksResult, summaryResult);
+          }
+        }
+      }
+

Review Comment:
   Yes this PR is only focusing on minAvailableReplicas=0 or downtime=true.
   
   >  if this only original instance is in CONSUMING, it might be in a state 
other than COMMITTED or RETAINED, this is the time where data loss can still 
happen because it does not hold the immutable segment
   
   For example, if in the state machine in `realtimeSegmentDataManager` is 
`DISCARDED`, then there will not be any local build of immutable segment on 
this instance
   Say we have
   ```
   IS & EV
   ================
   seg__0__0: {
     server_1: CONSUMING,
     server_2: CONSUMING
   }
   ```
   the commit starts and the committer is server_1
   Then, the rebalance of `minAvailableReplicas=1` `downtime=false` kicks in.
   
   ```
   IS 
   ================
   seg__0__0: {
     server_2: CONSUMING,
     server_3: CONSUMING
   }
   EV
   ================
   seg__0__0: {
     server_1: CONSUMING,
     server_2: CONSUMING
   }
   ```
   And let's say server_2 is in `DISCARDED` state. After server_1 successfully 
build the immutable segment but failed to upload, it will set the IS to online 
and leave the blank download url
   
   ```
   IS 
   ================
   seg__0__0: {
     server_2: ONLINE,
     server_3: ONLINE
   }
   EV
   ================
   seg__0__0: {
     server_1: CONSUMING,
     server_2: CONSUMING
   }
   ```
   Server_1 will drop after that, and now server_2 will fail transition to 
ONLINE because the download url is a peer download url yet no peer available to 
download, and server_3 will fail if it doesn't catch up to the offset in time 
that it will try to download.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org

Reply via email to