noob-se7en commented on code in PR #14623:
URL: https://github.com/apache/pinot/pull/14623#discussion_r1925918330


##########
pinot-common/src/main/java/org/apache/pinot/common/minion/RealtimeToOfflineSegmentsTaskMetadata.java:
##########
@@ -18,57 +18,183 @@
  */
 package org.apache.pinot.common.minion;
 
+import com.google.common.base.Preconditions;
+import java.util.Arrays;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import org.apache.commons.lang3.StringUtils;
 import org.apache.helix.zookeeper.datamodel.ZNRecord;
 
 
 /**
- * Metadata for the minion task of type <code>RealtimeToOfflineSegmentsTask</code>.
- * The <code>watermarkMs</code> denotes the time (exclusive) upto which tasks have been executed.
- *
+ * Metadata for the minion task of type <code>RealtimeToOfflineSegmentsTask</code>. The <code>_windowStartMs</code>
+ * denotes the time (exclusive) until which it's certain that tasks have been completed successfully. The
+ * <code>_expectedSubtaskResultMap</code> contains the expected RTO tasks result info. This map can contain both
+ * completed and in-completed Tasks expected Results. This map is used by generator to validate whether a potential
+ * segment (for RTO task) has already been successfully processed as a RTO task in the past or not. The
+ * <code>_windowStartMs</code> and <code>_windowEndMs</code> denote the window bucket time of currently not
+ * successfully completed minion task. bucket: [_windowStartMs, _windowEndMs) The window is updated by generator when
+ * it's certain that prev minon task run is successful.
+ * <p>
  * This gets serialized and stored in zookeeper under the path
  * MINION_TASK_METADATA/${tableNameWithType}/RealtimeToOfflineSegmentsTask
- *
- * PinotTaskGenerator:
- * The <code>watermarkMs</code>> is used by the <code>RealtimeToOfflineSegmentsTaskGenerator</code>,
- * to determine the window of execution for the task it is generating.
- * The window of execution will be [watermarkMs, watermarkMs + bucketSize)
- *
- * PinotTaskExecutor:
- * The same watermark is used by the <code>RealtimeToOfflineSegmentsTaskExecutor</code>, to:
- * - Verify that is is running the latest task scheduled by the task generator
- * - Update the watermark as the end of the window that it executed for
+ * <p>
+ * PinotTaskGenerator: The <code>_windowStartMs</code>> is used by the
+ * <code>RealtimeToOfflineSegmentsTaskGenerator</code>, to determine the window of execution of the prev task based on
+ * which it generates new task.
+ * <p>
+ * PinotTaskExecutor: The same windowStartMs is used by the <code>RealtimeToOfflineSegmentsTaskExecutor</code>, to:
+ * - Verify that it's running the latest task scheduled by the task generator.
+ * - The _expectedSubtaskResultMap is updated before the offline segments are uploaded to the table.
  */
 public class RealtimeToOfflineSegmentsTaskMetadata extends BaseTaskMetadata {
 
-  private static final String WATERMARK_KEY = "watermarkMs";
+  private static final String WINDOW_START_KEY = "watermarkMs";
+  private static final String WINDOW_END_KEY = "windowEndMs";
+  private static final String COMMA_SEPARATOR = ",";
+  private static final String SEGMENT_NAME_TO_EXPECTED_SUBTASK_RESULT_ID_KEY = "segmentToExpectedSubtaskResultId";
 
   private final String _tableNameWithType;
-  private final long _watermarkMs;
+  private long _windowStartMs;
+  private long _windowEndMs;
+  private final Map<String, ExpectedSubtaskResult> _expectedSubtaskResultMap;
+  private final Map<String, String> _segmentNameToExpectedSubtaskResultID;

Review Comment:
   Let me see if I can refactor to improve the readability of the code.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

