Jackie-Jiang commented on code in PR #16904:
URL: https://github.com/apache/pinot/pull/16904#discussion_r2383291190


##########
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/realtime/PinotLLCRealtimeSegmentManager.java:
##########
@@ -2687,17 +2687,12 @@ public void 
repairSegmentsInErrorStateForPauselessConsumption(TableConfig tableC
         segmentsInErrorStateInAtLeastOneReplica.size(), 
segmentsInErrorStateInAtLeastOneReplica,
         segmentsInErrorStateInAllReplicas.size(), 
segmentsInErrorStateInAllReplicas, realtimeTableName);
 
-    if 
(!allowRepairOfErrorSegments(repairErrorSegmentsForPartialUpsertOrDedup, 
tableConfig)) {
-      // We do not run reingestion for dedup and partial upsert tables in 
pauseless as it can
-      // lead to data inconsistencies
-      _controllerMetrics.setOrUpdateTableGauge(realtimeTableName,
-          ControllerGauge.PAUSELESS_SEGMENTS_IN_UNRECOVERABLE_ERROR_COUNT, 
segmentsInErrorStateInAllReplicas.size());
-      return;
-    } else {
-      LOGGER.info("Repairing error segments in table: {}.", realtimeTableName);
-      _controllerMetrics.setOrUpdateTableGauge(realtimeTableName, 
ControllerGauge.PAUSELESS_SEGMENTS_IN_ERROR_COUNT,
-          segmentsInErrorStateInAllReplicas.size());
-    }
+    LOGGER.info("Repairing error segments in table: {}.", realtimeTableName);

Review Comment:
   (minor) This log is not really necessary given we already log an warning 
above



##########
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/realtime/PinotLLCRealtimeSegmentManager.java:
##########
@@ -2728,15 +2733,17 @@ public void 
repairSegmentsInErrorStateForPauselessConsumption(TableConfig tableC
         } catch (Exception e) {
           LOGGER.error("Failed to call reingestSegment for segment: {} on 
server: {}", segmentName, aliveServer, e);
         }
-      } else if (segmentZKMetadata.getStatus() != Status.IN_PROGRESS) {
-        // Trigger reset for segment not in IN_PROGRESS state to download the 
segment from deep store or peer server
+      } else {
         _helixResourceManager.resetSegment(realtimeTableName, segmentName, 
null);
       }
     }
+
+    _controllerMetrics.setOrUpdateTableGauge(realtimeTableName,
+        ControllerGauge.PAUSELESS_SEGMENTS_IN_UNRECOVERABLE_ERROR_COUNT, 
segmentsInUnRecoverableState);
   }
 
   /**
-   * Whether to allow repairing the ERROR segment or not
+   * Whether to allow repairing the ERROR segments which are ONLINE in IS

Review Comment:
   Seems like we use this check to determine whether we allow re-ingestion of 
the segment. Suggest making it more specific. Now it is confusing



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to