virajjasani commented on PR #5396: URL: https://github.com/apache/hadoop/pull/5396#issuecomment-1433513544
How about this? This change works fine on the cluster as is, and it is a real requirement, as I explained in the [above comment](https://github.com/apache/hadoop/pull/5396#issuecomment-1432160563). If we do not want to keep this change as is, i.e. shut down the datanode if it is not connected to the active namenode, how about we provide a pluggable implementation?

Let's say that, by default, if the datanode does not stay connected to the active namenode for 60s, the default implementation (which we can provide with this patch) just logs (or maybe exposes a metric, whatever reviewers feel is feasible) the fact that this datanode is not useful to clients because it has lost its connection to the active namenode for more than the past 60s. This is the default implementation that we can keep.

On the other hand, users are allowed to have their own pluggable implementation. So if someone wants to shut down the datanode after 60s (default) of losing the connection, they will have to use a new implementation with the action "shutdown datanode".

Hence, we have two configs for this change:

1. The time duration for losing the connection (`dfs.datanode.health.activennconnect.timeout`), which we already have, but with a default value of 60s.
2. The action to be performed by the datanode when the above threshold is reached (maybe something like `dfs.datanode.activennconnect.timeout.action.impl`), with a default implementation that just logs or exposes a metric, as per consensus.

Any user can have their own separately maintained implementation, and that implementation can shut down the datanode or run another script that invokes a dfsadmin action. Anything should be fine, but now the code stays with users.

Thoughts?

-- 
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
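To make the pluggable-action proposal in the comment above concrete, here is a minimal Java sketch. Everything in it is an assumption for illustration only: the names `ActiveNNTimeoutAction`, `LogOnlyAction`, `ShutdownAction`, and `ActiveNNConnectionMonitor` are hypothetical and do not come from the patch or from Hadoop, and the real change would presumably read the class name and timeout from the datanode configuration rather than take them as constructor arguments.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical plug-in point: what to do once the timeout is exceeded. */
interface ActiveNNTimeoutAction {
  void onTimeout(long disconnectedMillis);
}

/** Sketch of the proposed default: only record the event (a list stands in for a log or metric). */
class LogOnlyAction implements ActiveNNTimeoutAction {
  final List<String> messages = new ArrayList<>();

  @Override
  public void onTimeout(long disconnectedMillis) {
    messages.add("No connection to active namenode for " + disconnectedMillis + " ms");
  }
}

/** Sketch of a user-maintained alternative that asks for a datanode shutdown. */
class ShutdownAction implements ActiveNNTimeoutAction {
  boolean shutdownRequested = false;

  @Override
  public void onTimeout(long disconnectedMillis) {
    // A real implementation would trigger the shutdown here; this only sets a flag.
    shutdownRequested = true;
  }
}

/** Hypothetical monitor that applies the configured action when the threshold is crossed. */
class ActiveNNConnectionMonitor {
  private final long timeoutMillis;            // e.g. 60s from the timeout config
  private final ActiveNNTimeoutAction action;  // chosen via the "action.impl" config
  private long lastConnectedMillis;

  ActiveNNConnectionMonitor(long timeoutMillis, ActiveNNTimeoutAction action, long nowMillis) {
    this.timeoutMillis = timeoutMillis;
    this.action = action;
    this.lastConnectedMillis = nowMillis;
  }

  /** Call whenever a heartbeat to the active namenode succeeds. */
  void markConnected(long nowMillis) {
    lastConnectedMillis = nowMillis;
  }

  /** Returns true, and runs the action, if the connection has been lost for longer than the timeout. */
  boolean check(long nowMillis) {
    long elapsed = nowMillis - lastConnectedMillis;
    if (elapsed > timeoutMillis) {
      action.onTimeout(elapsed);
      return true;
    }
    return false;
  }

  /** Instantiates an action by class name, as a stand-in for reading the "action.impl" config. */
  static ActiveNNTimeoutAction loadAction(String className) throws ReflectiveOperationException {
    return Class.forName(className)
        .asSubclass(ActiveNNTimeoutAction.class)
        .getDeclaredConstructor()
        .newInstance();
  }
}
```

With this shape, the default build would only log, and the shutdown behaviour would live in user-maintained code that is selected through the second config.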
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org