Hangleton commented on code in PR #13558:
URL: https://github.com/apache/kafka/pull/13558#discussion_r1198005710


##########
core/src/main/scala/kafka/zk/KafkaZkClient.scala:
##########
@@ -2113,7 +2113,12 @@ class KafkaZkClient private[zk] (zooKeeperClient: ZooKeeperClient, isSecure: Boo
   }
 
   private class CheckedEphemeral(path: String, data: Array[Byte]) extends Logging {
+    private var attempt = 0
+    private val maxAttempt = 5
+    private val backoffMs = 1000

Review Comment:
   This could indeed be an important element to consider when configuring the retry strategy. We probably don't want to retry past the session timeout, because doing so exposes us to the possibility that the session used during the first call is no longer valid.
   
   The session timeout can also be used to calibrate the response times to expect, since the Zk client in Kafka does not use request timeouts. Another timeout to consider is the read timeout, which is the maximum time a connection may stay without activity before it is closed. The ZooKeeper protocol includes heartbeats (pings) performed at least once every half of the read timeout. The read timeout is set to two thirds of the session timeout divided by the number of hosts registered in the ZooKeeper cluster.
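   For illustration only, the read-timeout and ping-interval relationship described above can be sketched as follows (the object and method names are hypothetical, and the numbers are made up, not taken from any Kafka or ZooKeeper config):

   ```scala
   object ReadTimeoutSketch {
     // Per the description above: read timeout = session timeout * 2/3,
     // divided by the number of hosts in the ZooKeeper cluster.
     def readTimeoutMs(sessionTimeoutMs: Int, numHosts: Int): Int =
       sessionTimeoutMs * 2 / 3 / numHosts

     // Heartbeats (pings) are sent at least once every half read timeout.
     def maxPingIntervalMs(sessionTimeoutMs: Int, numHosts: Int): Int =
       readTimeoutMs(sessionTimeoutMs, numHosts) / 2
   }
   ```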
   
   I haven't looked at Apache Curator to see what approach is used for retries 
there, but that could be an interesting entry point I suppose.
   
   So the retry strategy here requires a bit more thought.
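   To make the idea concrete, here is a minimal sketch of a retry budget bounded by the session timeout, as suggested above. All names (`sessionTimeoutMs`, `backoffMs`, `maxAttempts`, `shouldRetry`) are hypothetical and not the actual `KafkaZkClient` fields:

   ```scala
   object RetryBudgetSketch {
     // Hypothetical parameters; real values would come from the Zk client config.
     val sessionTimeoutMs = 6000
     val backoffMs = 1000
     val maxAttempts = 5

     // Stop retrying once another backoff would carry us past the point where
     // the session used by the first call may no longer be valid.
     def shouldRetry(attempt: Int, nowMs: Long, startMs: Long): Boolean =
       attempt < maxAttempts && nowMs + backoffMs <= startMs + sessionTimeoutMs
   }
   ```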



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
