jihoonson commented on a change in pull request #10676:
URL: https://github.com/apache/druid/pull/10676#discussion_r551662522



##########
File path: 
indexing-service/src/main/java/org/apache/druid/indexing/common/task/AbstractBatchIndexTask.java
##########
@@ -576,6 +584,64 @@ static Granularity 
findGranularityFromSegments(List<DataSegment> segments)
     }
   }
 
+  /**
+   * Wait for segments to become available on the cluster. If waitTimeout is 
reached, giveup on waiting. This is a
+   * QoS method that can be used to make Batch Ingest tasks wait to finish 
until their ingested data is available on
+   * the cluster. Doing so gives an end user assurance that a Successful task 
status means their data is available
+   * for querying.
+   *
+   * @param toolbox {@link TaskToolbox} object with for assisting with task 
work.
+   * @param segmentsToWaitFor {@link List} of segments to wait for 
availability.
+   * @param waitTimeout Millis to wait before giving up
+   * @return True if all segments became available, otherwise False.
+   */
+  protected boolean waitForSegmentAvailability(

Review comment:
       Wondering if you can reuse `StreamAppenderatorDriver.registerHandoff()` 
or `StreamAppenderatorDriver.publishAndRegisterHandoff()` as they seem pretty 
similar to this new method. You would need to move that method out to 
`BaseAppenderatorDriver`.

##########
File path: 
indexing-service/src/main/java/org/apache/druid/indexing/common/IngestionStatsAndErrorsTaskReportData.java
##########
@@ -41,17 +41,22 @@
   @Nullable
   private String errorMsg;
 
+  @JsonProperty
+  private boolean segmentAvailabilityConfirmed;
+
   public IngestionStatsAndErrorsTaskReportData(
       @JsonProperty("ingestionState") IngestionState ingestionState,
       @JsonProperty("unparseableEvents") Map<String, Object> unparseableEvents,
       @JsonProperty("rowStats") Map<String, Object> rowStats,
-      @JsonProperty("errorMsg") @Nullable String errorMsg
+      @JsonProperty("errorMsg") @Nullable String errorMsg,
+      @JsonProperty("segmentAvailabilityConfirmed") boolean 
segmentAvailabilityConfirmed

Review comment:
       Can you give some more details on how this will be used in your 
application? Do you want to track handoff failures of each task? I'm wondering 
if handoff time is also important.

##########
File path: 
indexing-service/src/main/java/org/apache/druid/indexing/seekablestream/SeekableStreamIndexTaskRunner.java
##########
@@ -1058,7 +1058,8 @@ private synchronized void persistSequences() throws 
IOException
                 ingestionState,
                 getTaskCompletionUnparseableEvents(),
                 getTaskCompletionRowStats(),
-                errorMsg
+                errorMsg,
+                false

Review comment:
       Why is this always `false`? Does it make more sense to be always `true` 
because realtime tasks will fail when handoff fails?

##########
File path: 
indexing-service/src/test/java/org/apache/druid/indexing/common/task/TaskReportSerdeTest.java
##########
@@ -55,7 +55,8 @@ public void testSerde() throws Exception
             ImmutableMap.of(
                 "number", 1234
             ),
-            "an error message"
+            "an error message",
+            false

Review comment:
       Testing with `true` would be better because missing booleans are 
defaulted to false by Jackson in Druid.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to