[ https://issues.apache.org/jira/browse/HDFS-6229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13966777#comment-13966777 ]

Hudson commented on HDFS-6229:
------------------------------

SUCCESS: Integrated in Hadoop-trunk-Commit #5503 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5503/])
HDFS-6229. Race condition in failover can cause RetryCache fail to work. Contributed by Jing Zhao. (jing9: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1586714)
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/RetryCache.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java


> Race condition in failover can cause RetryCache fail to work
> ------------------------------------------------------------
>
>                 Key: HDFS-6229
>                 URL: https://issues.apache.org/jira/browse/HDFS-6229
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: ha
>    Affects Versions: 2.1.0-beta
>            Reporter: Jing Zhao
>            Assignee: Jing Zhao
>             Fix For: 2.4.1
>
>         Attachments: HDFS-6229.000.patch, retrycache-race.patch
>
>
> Currently when NN failover happens, the old SBN first sets its state to
> active, then starts the active services (including tailing all the remaining
> editlog and building a complete retry cache based on the editlog). If a retry
> request, which has already succeeded in the old ANN (but the client fails to
> receive the response), comes in between, this retry may still get served by
> the new ANN but miss the retry cache.
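
The window described in the report can be illustrated with a small, single-threaded toy model. This is only a sketch under stated assumptions: ToyNameNode, setActiveState, startActiveServices and handle are hypothetical names invented for the illustration, not Hadoop APIs, and the second ordering shown (populate the cache before accepting requests) is just one way to close the window in the model, not necessarily what the committed patch does.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model of a standby becoming active. All names here are hypothetical. */
public class FailoverRaceSketch {
    enum State { STANDBY, ACTIVE }

    static class ToyNameNode {
        private State state = State.STANDBY;
        private final Set<String> retryCache = new HashSet<>();
        // Call IDs the old active already applied and logged to the edit log.
        private final List<String> pendingEdits;

        ToyNameNode(List<String> pendingEdits) {
            this.pendingEdits = pendingEdits;
        }

        /** Flip the state so that client requests are accepted. */
        void setActiveState() {
            state = State.ACTIVE;
        }

        /** Replay the remaining edits, which rebuilds the retry cache. */
        void startActiveServices() {
            retryCache.addAll(pendingEdits);
        }

        /** Returns REJECTED in standby, CACHED for a known retry, else EXECUTED. */
        String handle(String callId) {
            if (state != State.ACTIVE) {
                return "REJECTED";
            }
            if (retryCache.contains(callId)) {
                return "CACHED";
            }
            retryCache.add(callId);
            return "EXECUTED"; // the operation is applied a second time
        }
    }

    /** Order from the report: state first, then services; retry lands in between. */
    static String retryBetweenSteps() {
        ToyNameNode nn = new ToyNameNode(Arrays.asList("call-1"));
        nn.setActiveState();
        String result = nn.handle("call-1"); // cache is still empty here
        nn.startActiveServices();
        return result;
    }

    /** Illustrative alternative: cache is complete before requests are accepted. */
    static String retryAfterCacheBuilt() {
        ToyNameNode nn = new ToyNameNode(Arrays.asList("call-1"));
        nn.startActiveServices();
        nn.setActiveState();
        return nn.handle("call-1");
    }

    public static void main(String[] args) {
        System.out.println(retryBetweenSteps());    // prints EXECUTED
        System.out.println(retryAfterCacheBuilt()); // prints CACHED
    }
}
```

In the first ordering the retried call misses the cache and is executed again, which is the failure mode the report describes. In the second ordering the same retry is recognized. A real NameNode handles requests concurrently, so the model only shows the ordering problem, not the locking needed to fix it.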