kfaraz commented on code in PR #19091:
URL: https://github.com/apache/druid/pull/19091#discussion_r2891128326


##########
server/src/main/java/org/apache/druid/segment/realtime/appenderator/TransactionalSegmentPublisher.java:
##########
@@ -33,8 +33,12 @@
 
 public abstract class TransactionalSegmentPublisher
 {
-  private static final int QUIET_RETRIES = 3;
-  private static final int MAX_RETRIES = 5;
+  private static final int QUIET_RETRIES = 5;
+
+  /**
+   * Approximately 10 minutes of retrying using {@link RetryUtils#nextRetrySleepMillis(int)}.
+   */
+  private static final int MAX_RETRIES = 13;

Review Comment:
   Yeah, there may still be cases that do not succeed even after the 10-minute window.
   Handling that would require some kind of queueing mechanism for publishing 
tasks on the supervisor side.
   
   I intend to explore that angle soon and make the `TaskGroup` more 
auto-scaler friendly as well. The `TaskGroup`/`groupId` concept is currently 
tightly tied to the assumption of a fixed task count.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

