[GitHub] [druid] capistrant commented on a change in pull request #10676: Allow client to configure batch ingestion task to wait to complete until segments are confirmed to be available by other

GitBox Wed, 27 Jan 2021 09:06:14 -0800


capistrant commented on a change in pull request #10676:
URL: https://github.com/apache/druid/pull/10676#discussion_r565480253




##########
File path: indexing-service/src/main/java/org/apache/druid/indexing/common/task/AbstractBatchIndexTask.java
##########
@@ -576,6 +584,64 @@ static Granularity findGranularityFromSegments(List<DataSegment> segments)
     }
   }
 
+  /**
+   * Wait for segments to become available on the cluster. If waitTimeout is reached, give up on waiting. This is a
+   * QoS method that can be used to make Batch Ingest tasks wait to finish until their ingested data is available on
+   * the cluster. Doing so gives an end user assurance that a Successful task status means their data is available
+   * for querying.
+   *
+   * @param toolbox {@link TaskToolbox} object for assisting with task work.
+   * @param segmentsToWaitFor {@link List} of segments to wait for availability.
+   * @param waitTimeout Millis to wait before giving up
+   * @return True if all segments became available, otherwise False.
+   */
+  protected boolean waitForSegmentAvailability(

Review comment:
       > Good point. I agree. I think it would be better to not do such refactoring for parallel or hadoop task in this PR. But it would be still nice to reuse the same logic in both streaming and batch ingestion. Maybe we can extract this logic as a utility method?
   
   Hm, I guess I'm a little confused by this comment. Which logic are you suggesting be shared? Since the code in StreamAppenderatorDriver is so closely coupled with the appenderator concept, I struggle to see what can be extracted. The callback function required for the appenderator is incompatible with batch ingestion as it stands today. Are you suggesting that we use the same method for both, but with different callback implementations based on the ingestion type? If so, I don't see the value there. Otherwise, I may be missing your point entirely and just need a nudge in the right direction.
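
To make the question concrete, here is a rough sketch of the only piece that looks extractable to me: a generic "poll until available or time out" loop that takes the availability check as a parameter. Everything in it is hypothetical and for discussion only: `SegmentAvailabilityWaiter`, `waitUntilAvailable`, and `pollIntervalMillis` are invented names, not existing Druid code, and the `BooleanSupplier` stands in for whatever check each ingestion type would supply.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of Druid: polls an availability check until it
// passes or the timeout elapses.
final class SegmentAvailabilityWaiter
{
  private SegmentAvailabilityWaiter()
  {
  }

  /**
   * @param allSegmentsAvailable check supplied by the caller (assumed cheap enough to poll)
   * @param waitTimeoutMillis    millis to wait before giving up
   * @param pollIntervalMillis   millis to sleep between checks
   * @return true if the check passed before the timeout, otherwise false
   */
  static boolean waitUntilAvailable(
      BooleanSupplier allSegmentsAvailable,
      long waitTimeoutMillis,
      long pollIntervalMillis
  )
  {
    final long deadlineNanos = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(waitTimeoutMillis);
    while (true) {
      if (allSegmentsAvailable.getAsBoolean()) {
        return true;
      }
      final long remainingNanos = deadlineNanos - System.nanoTime();
      if (remainingNanos <= 0) {
        return false;
      }
      // Never sleep past the deadline, and always sleep at least 1 ms to avoid a busy loop.
      final long remainingMillis = Math.max(1L, TimeUnit.NANOSECONDS.toMillis(remainingNanos));
      try {
        Thread.sleep(Math.max(1L, Math.min(pollIntervalMillis, remainingMillis)));
      }
      catch (InterruptedException e) {
        // Restore the interrupt flag and treat interruption as "not available".
        Thread.currentThread().interrupt();
        return false;
      }
    }
  }
}
```

Under this sketch, a batch task would pass a check over its published segments, e.g. `waitUntilAvailable(check, waitTimeout, 1000)`. The streaming path is callback-driven rather than polling, so I am not sure it could use a loop like this without a larger refactor.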
