James Clampffer created HDFS-11014:
--------------------------------------

             Summary: libhdfs++: Make connection to HA clusters faster
                 Key: HDFS-11014
                 URL: https://issues.apache.org/jira/browse/HDFS-11014
             Project: Hadoop HDFS
          Issue Type: Sub-task
            Reporter: James Clampffer
            Assignee: James Clampffer
            Priority: Minor

Right now, when we get a StandbyException from the NN, we inject a 20 second delay before trying the alternate NN, even if it's the first failover. The first failover shouldn't have a delay (the Java client skips the delay on the first failover).

Another minor change I'd like to make is to reduce the default number of failover attempts from 15 (used in the Apache config) to 4. My impression is that higher failover counts are really handy for longer-running batch jobs, but in the libhdfs++ case the client is often an interactive application. There it's generally preferable to fail sooner, so a user doesn't have to wait the ~8 minutes it takes to time out with the default settings.

The choice of 4 failovers is based on the assumption that if we can't connect immediately, there is either a GC pause, which will most likely be finished before the second connection attempt, or a network or config issue that will take some sorting out by an admin. It would still be possible to override these values in the config for more tuning if a specific deployment tends to have more or fewer network issues.
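
To make the proposed behavior concrete, here is a rough sketch of the delay policy described above. It is not the libhdfs++ implementation: `FailoverPolicy`, `DelayBeforeFailoverMs`, and `TotalInjectedDelayMs` are hypothetical names made up for illustration, and the sketch only models the delay injected between failovers, not connection or RPC timeouts.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical policy description; field names are illustrative only
// and do not correspond to actual libhdfs++ options.
struct FailoverPolicy {
  int max_failovers;      // 15 in the config cited above, 4 proposed
  int64_t delay_ms;       // delay injected before a failover (20000 ms)
  bool skip_first_delay;  // true models the proposed first-failover behavior
};

// Delay to wait before failover number `attempt` (1-based).
int64_t DelayBeforeFailoverMs(const FailoverPolicy& policy, int attempt) {
  assert(attempt >= 1);
  if (attempt == 1 && policy.skip_first_delay) {
    return 0;  // try the alternate NN immediately on the first failover
  }
  return policy.delay_ms;
}

// Sum of injected delays if every allowed failover is used.
// Connection and RPC timeouts are deliberately not modeled here.
int64_t TotalInjectedDelayMs(const FailoverPolicy& policy) {
  int64_t total = 0;
  for (int attempt = 1; attempt <= policy.max_failovers; ++attempt) {
    total += DelayBeforeFailoverMs(policy, attempt);
  }
  return total;
}
```

Under these assumptions, a policy of `{15, 20000, false}` accumulates 300 seconds of injected delay, while `{4, 20000, true}` accumulates 60 seconds. These totals cover the injected delay only, so they are not expected to match the ~8 minute figure above, which presumably also includes time spent on the connection attempts themselves.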