[ 
https://issues.apache.org/jira/browse/HDFS-11014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16410451#comment-16410451
 ] 

Hudson commented on HDFS-11014:
-------------------------------

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13869 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/13869/])
HDFS-11014: libhdfs++: Make connection to HA clusters faster.  
(james.clampffer: rev 59a39269463bc6fd76b1e5e30cc8ccde5250e7fb)
* (edit) hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfspp/lib/rpc/rpc_engine.cc
* (edit) hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfspp/lib/common/retry_policy.cc
* (edit) hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfspp/include/hdfspp/options.h


> libhdfs++: Make connection to HA clusters faster
> ------------------------------------------------
>
>                 Key: HDFS-11014
>                 URL: https://issues.apache.org/jira/browse/HDFS-11014
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: hdfs-client
>            Reporter: James Clampffer
>            Assignee: James Clampffer
>            Priority: Minor
>         Attachments: HDFS-11014.HDFS-8707.000.patch
>
>
> Right now, when we get a StandbyException from the NameNode (NN), we inject
> a 20-second delay before trying the alternate NN, even on the first
> failover. The first failover shouldn't have a delay (the Java client skips
> the delay on the first failover).
> Another minor change I'd like to make is to reduce the default number of
> failover attempts from 15 (used in the Apache config) to 4. My impression is
> that higher failover counts are handy for longer-running batch jobs, but in
> the libhdfs++ case the client is often an interactive application. In that
> case it's generally preferable to fail sooner so a user doesn't have to wait
> ~8 minutes for a timeout under the default settings.
> The choice of 4 failovers assumes that if we can't connect immediately,
> either there is a GC pause that will most likely finish before the second
> connection attempt, or there is a network or config issue that will take
> some sorting out by an admin. These values can still be overridden in the
> config for a deployment that tends to have more or fewer network issues.
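
The behavior described above can be sketched roughly as follows. This is
only an illustration, not the code from the patch: FailoverSettings,
Decision, and OnStandbyException are hypothetical names and do not reflect
the actual libhdfs++ retry policy classes or the fields in options.h. The
sketch assumes the delay is skipped only for the first failover and that
the request fails once the failover limit is reached.

```cpp
#include <cstdint>

// Hypothetical settings; the names are illustrative, not libhdfs++ options.
struct FailoverSettings {
  int max_failovers = 4;          // proposed default (15 in the Apache config)
  std::int64_t delay_ms = 20000;  // the 20-second delay from the description
};

enum class Action { kFailover, kFail };

struct Decision {
  Action action;
  std::int64_t delay_ms;  // how long to wait before contacting the other NN
};

// failovers_so_far: number of failovers already performed for this request.
inline Decision OnStandbyException(const FailoverSettings& s,
                                   int failovers_so_far) {
  if (failovers_so_far >= s.max_failovers) {
    return {Action::kFail, 0};      // limit reached: give up and report
  }
  if (failovers_so_far == 0) {
    return {Action::kFailover, 0};  // first failover: no delay
  }
  return {Action::kFailover, s.delay_ms};  // later failovers: fixed delay
}
```

With the assumed defaults, the first StandbyException triggers an immediate
switch to the alternate NN, later ones wait 20 seconds, and the request
fails once 4 failovers have been used.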


