[GitHub] [druid] jihoonson commented on a change in pull request #10676: Allow client to configure batch ingestion task to wait to complete until segments are confirmed to be available by other

GitBox Tue, 26 Jan 2021 16:56:11 -0800


jihoonson commented on a change in pull request #10676:
URL: https://github.com/apache/druid/pull/10676#discussion_r564942716




##########
File path: indexing-service/src/main/java/org/apache/druid/indexing/common/IngestionStatsAndErrorsTaskReportData.java
##########
@@ -41,17 +41,22 @@
   @Nullable
   private String errorMsg;
 
+  @JsonProperty
+  private boolean segmentAvailabilityConfirmed;
+
   public IngestionStatsAndErrorsTaskReportData(
       @JsonProperty("ingestionState") IngestionState ingestionState,
       @JsonProperty("unparseableEvents") Map<String, Object> unparseableEvents,
       @JsonProperty("rowStats") Map<String, Object> rowStats,
-      @JsonProperty("errorMsg") @Nullable String errorMsg
+      @JsonProperty("errorMsg") @Nullable String errorMsg,
+      @JsonProperty("segmentAvailabilityConfirmed") boolean segmentAvailabilityConfirmed

Review comment:
       > my company deploys a large multi-tenant cluster with a services layer 
for ingestion that our tenants use. these tenants don't just want to know when 
their task succeeds, they also want to know when data from batch ingest is 
available for querying. This solution allows us to prevent the ingestion 
services layer and/or individual tenants from banging on Druid APIs trying to 
see if their data is available after ingestion.
   
   I understand this, but my question is more about what people expect when 
segment handoff fails. In streaming ingestion, a handoff failure causes task 
failure (this behavior is arguable, but that's what it does now), so people 
expect that some data may be dropped after a handoff failure until new tasks 
read the same data and publish the same segments again. However, since there 
is no realtime querying in batch ingestion, I don't think tasks should fail on 
handoff failures (which is what this PR does! :slightly_smiling_face:). But 
then what will people expect? Are they going to be OK with handoff failures 
and just wait indefinitely until historicals load the new segments (the 
current behavior)? Do they want to know why the handoff failed? Do they want 
to know how long it took before the handoff failed? The answers to these 
questions are not clear to me.
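
   To make the last two questions concrete, here is a minimal sketch of one 
possible behavior: wait for availability up to a timeout, never fail the task, 
and record both the outcome and the time spent waiting. Every name here 
(`HandoffWaitSketch`, `Result`, `waitForAvailability`, `waitedMillis`) is 
hypothetical and not taken from this PR; only the 
`segmentAvailabilityConfirmed` flag mirrors the field in the diff above. The 
availability check is passed in as a plain `BooleanSupplier` because how the 
task would actually check availability is left open here.

```java
import java.util.function.BooleanSupplier;

// Hypothetical sketch; none of these class or method names exist in Druid.
final class HandoffWaitSketch
{
  /** Outcome of waiting for newly published segments to become queryable. */
  static final class Result
  {
    // Mirrors the flag added to the task report in this PR.
    final boolean segmentAvailabilityConfirmed;
    // Assumed extra field: how long the task waited before confirming or giving up.
    final long waitedMillis;

    Result(boolean segmentAvailabilityConfirmed, long waitedMillis)
    {
      this.segmentAvailabilityConfirmed = segmentAvailabilityConfirmed;
      this.waitedMillis = waitedMillis;
    }
  }

  /**
   * Polls isAvailable until it returns true or timeoutMillis elapses.
   * A timeout does not throw; it only yields segmentAvailabilityConfirmed = false,
   * so the caller can still report the task as successful.
   */
  static Result waitForAvailability(BooleanSupplier isAvailable, long timeoutMillis, long pollMillis)
  {
    final long start = System.nanoTime();
    final long deadline = start + timeoutMillis * 1_000_000L;
    while (true) {
      if (isAvailable.getAsBoolean()) {
        return new Result(true, elapsedMillis(start));
      }
      // Overflow-safe comparison of nanoTime values.
      if (System.nanoTime() - deadline >= 0) {
        return new Result(false, elapsedMillis(start));
      }
      try {
        Thread.sleep(pollMillis);
      }
      catch (InterruptedException e) {
        // Restore the interrupt flag and report availability as unconfirmed.
        Thread.currentThread().interrupt();
        return new Result(false, elapsedMillis(start));
      }
    }
  }

  private static long elapsedMillis(long startNanos)
  {
    return (System.nanoTime() - startNanos) / 1_000_000L;
  }
}
```

   With something like this, a client reading the task report could 
distinguish "succeeded and queryable" from "succeeded but not yet confirmed 
after N ms". It would still not learn why the handoff did not complete, which 
is the part I am least sure how to surface.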




