[ 
https://issues.apache.org/jira/browse/HADOOP-9640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13838464#comment-13838464
 ] 

Andrew Wang commented on HADOOP-9640:
-------------------------------------

Hi Xiaobo and Ming and Chris, thanks for writing this up. It's very interesting 
stuff. I have a few comments/questions:

* Parsing the MapReduce job name out of the DFSClient name is kind of an ugly 
hack. The client name also isn't that reliable since it's formed from the 
client's {{Configuration}}, and more generally anything in the RPC format that 
isn't a Kerberos token can be faked. Are these concerns in scope for your 
proposal?
* Tracking by user is also not going to work so well in a HiveServer2 setup 
where all Hive queries are run as the {{hive}} user. This is a pretty common DB 
security model, since you need this for column/row-level security.
* What's the purpose of separating read and write requests? Write requests take 
the write lock, and are thus more "expensive" in that sense, but your example 
of the listDir of a large directory is a read operation.
* In the "Identify suspects" section, I see that you present three options 
here. Which one do you think is best? Seems like you're leaning toward option 3.
* Does dropping an RPC result in exponential back-off from the client, a la 
TCP? Client backpressure is pretty important to reach a steady state.
* I didn't see any mention of fair share here, are you planning to adjust 
suspect thresholds based on client share?
* Any thoughts on how to automatically determine these thresholds? These seem 
like kind of annoying parameters to tune.
* Maybe admin / superuser commands and service RPCs should be excluded from 
this feature
* Do you have any preliminary benchmarks supporting the design? Performance is 
a pretty important aspect of this design.

> RPC Congestion Control
> ----------------------
>
>                 Key: HADOOP-9640
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9640
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 3.0.0, 2.2.0
>            Reporter: Xiaobo Peng
>              Labels: hdfs, qos, rpc
>         Attachments: NN-denial-of-service-updated-plan.pdf, 
> faircallqueue.patch, rpc-congestion-control-draft-plan.pdf
>
>
> Several production Hadoop cluster incidents occurred where the Namenode was 
> overloaded and failed to be responsive.  This task is to improve the system 
> to detect RPC congestion early, and to provide good diagnostic information 
> for alerts that identify suspicious jobs/users so as to restore services 
> quickly.
> Excerpted from the communication of one incident, “The map task of a user was 
> creating huge number of small files in the user directory. Due to the heavy 
> load on NN, the JT also was unable to communicate with NN...The cluster 
> became responsive only once the job was killed.”
> Excerpted from the communication of another incident, “Namenode was 
> overloaded by GetBlockLocation requests (Correction: should be getFileInfo 
> requests. the job had a bug that called getFileInfo for a nonexistent file in 
> an endless loop). All other requests to namenode were also affected by this 
> and hence all jobs slowed down. Cluster almost came to a grinding 
> halt…Eventually killed jobtracker to kill all jobs that are running.”
> Excerpted from HDFS-945, “We've seen defective applications cause havoc on 
> the NameNode, for e.g. by doing 100k+ 'listStatus' on very large directories 
> (60k files) etc.”



--
This message was sent by Atlassian JIRA
(v6.1#6144)

Reply via email to