Need to instrument Hadoop to get comprehensive network traffic metrics
----------------------------------------------------------------------

                 Key: HADOOP-2830
                 URL: https://issues.apache.org/jira/browse/HADOOP-2830
             Project: Hadoop Core
          Issue Type: Improvement
            Reporter: Runping Qi



One of the most frequently asked questions about Hadoop performance is whether 
a job was CPU bound, disk bound, or network bound.
The first two can be answered from the metric data of individual 
machines, and are thus relatively easy to answer.
The third is much harder, especially for a large cluster. To answer it, 
we need to know the following:

1. The network traffic to and from the nodes in the cluster
2. The network traffic going between node pairs through the switch they share
3. The network traffic going through the back links between the switches

With this data, we can gain better insight into the relationship between 
network bandwidth and Hadoop performance.

We need to instrument the Hadoop code to obtain the above data.
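
As a rough illustration of the three measurements listed above, the sketch below aggregates per-transfer byte counts into per-node, same-switch node-pair, and switch-to-switch totals. It is only a sketch under stated assumptions: the class TrafficAggregator, its methods, and the node-to-switch map are hypothetical names introduced here, not existing Hadoop APIs, and it assumes the instrumented code can report (source node, destination node, bytes) for each transfer and that the cluster topology is known.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical aggregator for illustration only; not part of Hadoop. */
final class TrafficAggregator {
    // Assumed input: which switch each node is attached to.
    private final Map<String, String> switchOfNode;
    private final Map<String, Long> sentByNode = new HashMap<>();
    private final Map<String, Long> receivedByNode = new HashMap<>();
    private final Map<String, Long> intraSwitchByPair = new HashMap<>();
    private final Map<String, Long> interSwitchByLink = new HashMap<>();

    TrafficAggregator(Map<String, String> switchOfNode) {
        this.switchOfNode = new HashMap<>(switchOfNode);
    }

    /** Records one transfer of the given size from src to dst. */
    void record(String src, String dst, long bytes) {
        if (src.equals(dst)) {
            return; // local transfer, no network traffic
        }
        String srcSwitch = switchOfNode.get(src);
        String dstSwitch = switchOfNode.get(dst);
        if (srcSwitch == null || dstSwitch == null) {
            throw new IllegalArgumentException("unknown node: " + src + " or " + dst);
        }
        // 1. Traffic to and from each node.
        sentByNode.merge(src, bytes, Long::sum);
        receivedByNode.merge(dst, bytes, Long::sum);
        if (srcSwitch.equals(dstSwitch)) {
            // 2. Traffic between a node pair through the switch they share.
            intraSwitchByPair.merge(src + "->" + dst, bytes, Long::sum);
        } else {
            // 3. Traffic crossing the link between two switches.
            interSwitchByLink.merge(srcSwitch + "->" + dstSwitch, bytes, Long::sum);
        }
    }

    long bytesSent(String node) {
        return sentByNode.getOrDefault(node, 0L);
    }

    long bytesReceived(String node) {
        return receivedByNode.getOrDefault(node, 0L);
    }

    long intraSwitchBytes(String src, String dst) {
        return intraSwitchByPair.getOrDefault(src + "->" + dst, 0L);
    }

    long interSwitchBytes(String srcSwitch, String dstSwitch) {
        return interSwitchByLink.getOrDefault(srcSwitch + "->" + dstSwitch, 0L);
    }
}
```

Keeping the three totals separate means each of the questions above can be answered from its own counter. How the transfers would actually be captured inside Hadoop is left open here.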



-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
