xiangfu0 commented on code in PR #18642:
URL: https://github.com/apache/pinot/pull/18642#discussion_r3333999423


##########
pinot-plugins/pinot-minion-tasks/pinot-minion-builtin-tasks/src/main/java/org/apache/pinot/plugin/minion/tasks/materializedview/MaterializedViewTaskExecutor.java:
##########
@@ -449,22 +450,40 @@ public void postProcess(PinotTaskConfig pinotTaskConfig) {
     long windowStartMs = 
Long.parseLong(configs.get(MaterializedViewTask.WINDOW_START_MS_KEY));
     long windowEndMs = 
Long.parseLong(configs.get(MaterializedViewTask.WINDOW_END_MS_KEY));
 
-    updateMaterializedViewRuntime(configs, tableName, taskMode, windowStartMs, 
windowEndMs);
-  }
+    MaterializedViewPartitionManager partitionManager = new 
MaterializedViewPartitionManager(
+        MINION_CONTEXT.getHelixPropertyStore(),
+        MaterializedViewTaskExecutor::readMinionClusterConfig);
 
-  /// Updates [MaterializedViewRuntimeMetadata] in a single CAS write, 
combining:
-  ///
-  ///   - partitions: set VALID with new fingerprint (APPEND/OVERWRITE) or 
remove (DELETE)
-  ///   - watermarkMs: advance on APPEND only (drives both scheduler dispatch 
and the
-  ///       broker's SPLIT_REWRITE boundary)
-  ///
-  // Compile-time default for the CAS retry budget when racing to update 
MaterializedViewRuntimeMetadata.
-  // Up to maxTasksPerBatch executors can contend per batch completion; each 
retry re-fetches the
-  // latest version with jittered backoff (Thread.sleep below). 128 is well 
above any realistic
-  // maxTasksPerBatch and stays low enough that genuinely pathological 
contention still surfaces as
-  // a task failure (caught by Helix and retried at the task level). 
Overridable per cluster via
-  // `MaterializedViewTask.CLUSTER_CONFIG_KEY_MAX_RUNTIME_UPDATE_ATTEMPTS` (no 
minion restart).
-  private static final int DEFAULT_MAX_RUNTIME_UPDATE_ATTEMPTS = 128;
+    if (MaterializedViewTask.TASK_MODE_DELETE.equals(taskMode)) {
+      // DELETE writes `VALID + PartitionFingerprint.EMPTY` rather than 
removing the entry —
+      // see `MaterializedViewPartitionManager#clearValid` for the design 
rationale (no
+      // fingerprint validation needed; the source contained zero overlapping 
segments by
+      // construction at scheduler dispatch time).
+      partitionManager.clearValid(tableName, windowStartMs);

Review Comment:
   This removes the commit-time source check from the DELETE path. If a 
backfill lands after the scheduler emitted DELETE but before `postProcess()` 
runs, the create-segment event is ignored because the partition is already 
STALE; then `clearValid()` flips it to `VALID + EMPTY` and the broker will 
rewrite queries to an empty MV bucket even though source rows now exist. The 
old behavior of removing the entry still fell back to base-table routing, so 
this is a silent wrong-results regression. DELETE needs a commit-time 
fingerprint/emptiness validation (or another way to preserve STALE on 
mid-flight backfills) before writing `VALID + EMPTY`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to