virajjasani commented on PR #5396:
URL: https://github.com/apache/hadoop/pull/5396#issuecomment-1433513544

   How about this? The change works fine on the cluster as is, and it addresses a 
real requirement, as I explained in the [above 
comment](https://github.com/apache/hadoop/pull/5396#issuecomment-1432160563).
   If we do not want to keep the change as is, i.e. shut down the datanode when it is 
not connected to the active namenode, how about we provide a pluggable implementation?
   Let's say that, by default, if the datanode has not been connected to the active 
namenode for 60s, the default implementation (which we can provide with this 
patch) only logs (or maybe exposes a metric, whatever reviewers find feasible) 
the fact that this datanode is not useful to clients because it has lost its 
connection to the active namenode for more than 60s. That is the default 
implementation we can keep. On the other hand, users can plug in their own 
implementation, so if someone wants to shut down the datanode after 60s (the 
default) of losing the connection, they would have to use a new implementation 
whose action is "shutdown datanode".
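
   As a rough sketch of the pluggable idea (every class and method name below is hypothetical and not part of the patch; the clock is passed in explicitly only to keep the example self-contained):

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical plugin point: called when the active-namenode timeout is exceeded. */
interface ActiveNnTimeoutAction {
  void onTimeout(long disconnectedMillis);
}

/** Default sketch: only records a warning; the datanode keeps running. */
class LogOnlyAction implements ActiveNnTimeoutAction {
  final List<String> warnings = new ArrayList<>();

  @Override
  public void onTimeout(long disconnectedMillis) {
    String msg = "Not connected to active namenode for " + disconnectedMillis + " ms";
    warnings.add(msg);
    System.err.println("WARN " + msg);
  }
}

/** Opt-in sketch: delegates to a caller-supplied shutdown hook. */
class ShutdownAction implements ActiveNnTimeoutAction {
  private final Runnable shutdownHook;

  ShutdownAction(Runnable shutdownHook) {
    this.shutdownHook = shutdownHook;
  }

  @Override
  public void onTimeout(long disconnectedMillis) {
    shutdownHook.run();
  }
}

/** Tracks the last contact with the active namenode and fires the action past the threshold. */
class ActiveNnConnectionMonitor {
  private final long timeoutMillis;
  private final ActiveNnTimeoutAction action;
  private long lastActiveContactMillis;

  ActiveNnConnectionMonitor(long timeoutMillis, ActiveNnTimeoutAction action, long nowMillis) {
    this.timeoutMillis = timeoutMillis;
    this.action = action;
    this.lastActiveContactMillis = nowMillis;
  }

  /** Call whenever a heartbeat to the active namenode succeeds. */
  void recordActiveContact(long nowMillis) {
    lastActiveContactMillis = nowMillis;
  }

  /** Returns true if the threshold was exceeded and the action was invoked. */
  boolean check(long nowMillis) {
    long elapsed = nowMillis - lastActiveContactMillis;
    if (elapsed > timeoutMillis) {
      action.onTimeout(elapsed);
      return true;
    }
    return false;
  }
}
```

   With `LogOnlyAction` the datanode behaves as it does today apart from the extra warning; only a user who explicitly wires in something like `ShutdownAction` would get the shutdown behaviour.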
   
   Hence, we have two configs for this change:
   1. the time duration for losing the connection 
(`dfs.datanode.health.activennconnect.timeout`), which we already have, but with 
a default value of 60s
   2. the action to be performed by the datanode when the above threshold is reached 
(maybe something like `dfs.datanode.activennconnect.timeout.action.impl`), with a 
default implementation that just logs or exposes a metric, as per consensus.
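
   A minimal sketch of how the two settings could be resolved, assuming a plain `Map` stands in for the Hadoop configuration and the timeout is given as a bare number of seconds. The class names are hypothetical, and the second key is only the name proposed above:

```java
import java.util.Map;

/** Hypothetical action interface; name() only exists to make the sketch observable. */
interface TimeoutAction {
  String name();
}

/** Hypothetical default: log-only behaviour. */
class LoggingTimeoutAction implements TimeoutAction {
  @Override
  public String name() {
    return "log";
  }
}

/** Resolves the timeout and the action implementation from key/value settings. */
class TimeoutSettings {
  static final String TIMEOUT_KEY = "dfs.datanode.health.activennconnect.timeout";
  static final String ACTION_KEY = "dfs.datanode.activennconnect.timeout.action.impl";
  static final long DEFAULT_TIMEOUT_SECONDS = 60L;

  final long timeoutSeconds;
  final TimeoutAction action;

  TimeoutSettings(long timeoutSeconds, TimeoutAction action) {
    this.timeoutSeconds = timeoutSeconds;
    this.action = action;
  }

  static TimeoutSettings load(Map<String, String> conf) {
    // Simplification: the value is parsed as whole seconds, without unit suffixes.
    long timeout = Long.parseLong(
        conf.getOrDefault(TIMEOUT_KEY, String.valueOf(DEFAULT_TIMEOUT_SECONDS)));
    // Fall back to the log-only action when no implementation is configured.
    String impl = conf.getOrDefault(ACTION_KEY, LoggingTimeoutAction.class.getName());
    try {
      TimeoutAction action = Class.forName(impl)
          .asSubclass(TimeoutAction.class)
          .getDeclaredConstructor()
          .newInstance();
      return new TimeoutSettings(timeout, action);
    } catch (ReflectiveOperationException e) {
      throw new IllegalArgumentException("Cannot instantiate " + impl, e);
    }
  }
}
```

   In the actual patch this would more likely go through the existing `Configuration.getTimeDuration` and `Configuration.getClass` helpers rather than hand-rolled parsing and reflection.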
   
   Any user can maintain their own implementation separately, and that 
implementation can shut down the datanode or run another script that invokes a 
dfsadmin action. Anything should be fine, but that code then stays with users.
   Thoughts?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

