[ https://issues.apache.org/jira/browse/HADOOP-9640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13881747#comment-13881747 ]
Hadoop QA commented on HADOOP-9640: ----------------------------------- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12624909/faircallqueue7_with_runtime_swapping.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 7 new or modified test files. {color:red}-1 javac{color}. The applied patch generated 1547 javac compiler warnings (more than the trunk's current 1546 warnings). {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 5 new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.ipc.TestQueueRuntimeReconfigure org.apache.hadoop.hdfs.server.namenode.TestNameNodeHttpServer {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/3467//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HADOOP-Build/3467//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-common.html Javac warnings: https://builds.apache.org/job/PreCommit-HADOOP-Build/3467//artifact/trunk/patchprocess/diffJavacWarnings.txt Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/3467//console This message is automatically generated. > RPC Congestion Control with FairCallQueue > ----------------------------------------- > > Key: HADOOP-9640 > URL: https://issues.apache.org/jira/browse/HADOOP-9640 > Project: Hadoop Common > Issue Type: Improvement > Affects Versions: 3.0.0, 2.2.0 > Reporter: Xiaobo Peng > Labels: hdfs, qos, rpc > Attachments: MinorityMajorityPerformance.pdf, > NN-denial-of-service-updated-plan.pdf, faircallqueue.patch, > faircallqueue2.patch, faircallqueue3.patch, faircallqueue4.patch, > faircallqueue5.patch, faircallqueue6.patch, > faircallqueue7_with_runtime_swapping.patch, > rpc-congestion-control-draft-plan.pdf > > > Several production Hadoop cluster incidents occurred where the Namenode was > overloaded and failed to respond. > We can improve quality of service for users during namenode peak loads by > replacing the FIFO call queue with a [Fair Call > Queue|https://issues.apache.org/jira/secure/attachment/12616864/NN-denial-of-service-updated-plan.pdf]. > (this plan supersedes rpc-congestion-control-draft-plan). > Excerpted from the communication of one incident, “The map task of a user was > creating huge number of small files in the user directory. Due to the heavy > load on NN, the JT also was unable to communicate with NN...The cluster > became responsive only once the job was killed.” > Excerpted from the communication of another incident, “Namenode was > overloaded by GetBlockLocation requests (Correction: should be getFileInfo > requests. the job had a bug that called getFileInfo for a nonexistent file in > an endless loop). All other requests to namenode were also affected by this > and hence all jobs slowed down. Cluster almost came to a grinding > halt…Eventually killed jobtracker to kill all jobs that are running.” > Excerpted from HDFS-945, “We've seen defective applications cause havoc on > the NameNode, for e.g. by doing 100k+ 'listStatus' on very large directories > (60k files) etc.” -- This message was sent by Atlassian JIRA (v6.1.5#6160)