tillrohrmann commented on a change in pull request #16801:
URL: https://github.com/apache/flink/pull/16801#discussion_r688573729
##########
File path:
docs/layouts/shortcodes/generated/high_availability_configuration.html
##########
@@ -62,6 +62,12 @@
<td>Integer</td>
<td>Defines the session timeout for the ZooKeeper session in
ms.</td>
</tr>
+ <tr>
+
<td><h5>high-availability.zookeeper.client.tolerate-suspended-connections</h5></td>
+ <td style="word-wrap: break-word;">false</td>
+ <td>Boolean</td>
+ <td>Defines whether a suspended ZooKeeper connection will be
treated as an error that causes the leader information to be invalidated or
not. In case you set this option to <code
class="highlighter-rouge">true</code>, Flink will wait until a ZooKeeper
connection is marked as lost before it revokes the leadership of components.
This has the effect that Flink is more resilient against temporary connection
instabilities at the cost of running more likely into timing issues with
ZooKeeper.</td>
Review comment:
I quote from the Curator documentation:
> Curator has a pluggable error policy. The default policy takes the
> conservative approach of treating connection states SUSPENDED and LOST the same
> way. i.e. when a recipe sees the state change to SUSPENDED it will assume that
> the ZooKeeper session is lost and will clean up any watchers, nodes, etc. You
> can choose, however, a more aggressive approach by setting the error policy to
> only treat LOST (i.e. true session loss) as an error state.

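
To make the two policies in the quote concrete, here is a minimal, self-contained sketch. It does not use Curator itself: `ZkState` and `ErrorPolicyModel` are hypothetical stand-ins I made up for illustration, loosely mirroring what I understand Curator's `StandardConnectionStateErrorPolicy` and `SessionConnectionStateErrorPolicy` to do.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical stand-in for the connection states discussed in the quote.
enum ZkState { CONNECTED, SUSPENDED, RECONNECTED, LOST }

// Hypothetical model of a pluggable error policy: each policy names the
// connection states that a recipe should treat as an error.
enum ErrorPolicyModel {
    // Conservative default: SUSPENDED and LOST are handled the same way.
    STANDARD(EnumSet.of(ZkState.SUSPENDED, ZkState.LOST)),
    // More aggressive: only a true session loss is an error.
    SESSION(EnumSet.of(ZkState.LOST));

    private final Set<ZkState> errorStates;

    ErrorPolicyModel(Set<ZkState> errorStates) {
        this.errorStates = errorStates;
    }

    boolean isErrorState(ZkState state) {
        return errorStates.contains(state);
    }
}

class ErrorPolicyDemo {
    public static void main(String[] args) {
        // Print the decision of each policy for each connection state.
        for (ErrorPolicyModel policy : ErrorPolicyModel.values()) {
            for (ZkState state : ZkState.values()) {
                System.out.println(policy + " / " + state
                        + " -> error=" + policy.isErrorState(state));
            }
        }
    }
}
```

Under this model, the new option set to `true` would correspond to the `SESSION` variant: a SUSPENDED connection alone does not cause leadership to be revoked.
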
I guess the risk is that you lose some safety margin in the timing between the
ZK cluster and your client. For example, ephemeral znodes are deleted once a
client session expires, which effectively starts another round of leader
election. If the old leader revokes its leadership only once the connection is
marked as lost, it is more likely to react late and give up leadership only
after another component has already obtained it (disclaimer: I haven't looked
into the ZK code).
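
The timing concern can be sketched as simple arithmetic. This is only an illustration under assumed numbers: the class `LeadershipOverlap`, its method, and every timestamp below are hypothetical and not taken from Flink, Curator, or ZooKeeper.

```java
// Hypothetical model of the race: how long do two components both
// believe they are the leader?
class LeadershipOverlap {

    // Returns the number of milliseconds between the new leader being
    // elected and the old leader revoking its leadership, or 0 if the
    // old leader revoked first.
    static long overlapMillis(long oldLeaderRevokesAt, long newLeaderElectedAt) {
        return Math.max(0L, oldLeaderRevokesAt - newLeaderElectedAt);
    }

    public static void main(String[] args) {
        // All values are made-up timestamps in ms after the connection drops.
        long suspendedSeenAt = 100L;       // client notices SUSPENDED
        long sessionExpiredAt = 60_000L;   // server expires session, znode deleted
        long lostSeenAt = 60_500L;         // client notices LOST a bit late

        // Option = false: revoke as soon as the connection is SUSPENDED.
        System.out.println("revoke on SUSPENDED: overlap="
                + overlapMillis(suspendedSeenAt, sessionExpiredAt) + " ms");

        // Option = true: revoke only once the connection is LOST.
        System.out.println("revoke on LOST: overlap="
                + overlapMillis(lostSeenAt, sessionExpiredAt) + " ms");
    }
}
```

With these assumed numbers the first case has no overlap, while the second leaves a 500 ms window in which both the old and the new leader consider themselves leader. Whether such a window can occur in practice depends on how the client detects the lost session, which I have not verified.
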