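[jira] Commented: (HADOOP-3062) Need to capture the metrics for the network ios generate by dfs reads/writes and map/reduce shuffling and break them down by racks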

Chris Douglas (JIRA) Mon, 18 Aug 2008 16:10:37 -0700

    [ 
https://issues.apache.org/jira/browse/HADOOP-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12623482#action_12623482
 ]


Chris Douglas commented on HADOOP-3062:
---------------------------------------

bq.  Should we check whether ClientTraceLog.isInfoEnabled() before logging?

Apart from the string concatenation that builds the actual message, each log 
message is either cheap or infrequent (like the shuffle message). Excluding the 
new read log message, the cost is comparable to the logging that's already 
happening. I'm not certain if the logging this replaces (for client writes) 
should occur when ClientTraceLog.isInfoEnabled() is false, since nothing would 
be logged in that case...
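
For reference, a minimal sketch of the guard being discussed. It is not code 
from the patch: java.util.logging stands in for the commons-logging Log so the 
example is self-contained, and the class name, logger name and message layout 
are all hypothetical. The point is only that the concatenation is skipped when 
INFO is disabled.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Sketch only: java.util.logging stands in for the commons-logging Log.
// Class name, logger name and message layout are hypothetical.
class ClientTraceGuardSketch {
    static final Logger CLIENT_TRACE = Logger.getLogger("sketch.clienttrace");

    // Counts how often the message string was actually built.
    static int messagesBuilt = 0;

    // The string concatenation is the cost the guard is meant to avoid.
    static String buildMessage(String src, String dest, long bytes) {
        messagesBuilt++;
        return "src: " + src + ", dest: " + dest + ", bytes: " + bytes;
    }

    // Build and log the message only when INFO is enabled for this logger.
    static void logRead(String src, String dest, long bytes) {
        if (CLIENT_TRACE.isLoggable(Level.INFO)) {
            CLIENT_TRACE.info(buildMessage(src, dest, bytes));
        }
    }

    public static void main(String[] args) {
        CLIENT_TRACE.setUseParentHandlers(false); // keep the demo quiet

        CLIENT_TRACE.setLevel(Level.WARNING);     // INFO disabled
        logRead("10.0.0.1:50010", "10.0.0.2:41234", 4096L);
        System.out.println("built while disabled: " + messagesBuilt);

        CLIENT_TRACE.setLevel(Level.INFO);        // INFO enabled
        logRead("10.0.0.1:50010", "10.0.0.2:41234", 4096L);
        System.out.println("built after enabling: " + messagesBuilt);
    }
}
```

With a guard like this, a message that used to be logged unconditionally (the 
client write case above) would disappear whenever INFO is turned off, which is 
the behavior change in question.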

bq. Should we define an AUDIT_FORMAT for the log messages, like 
FSNamesystem.AUDIT_FORMAT?

Unlike the FSNamesystem audit format, these are going to require some 
additional processing to be useful (e.g. the id param, optional block id), so 
the key/value pairing doesn't offer the same syntactical guarantees. That said, 
you're probably right, but unless we adopt a packaging like what you suggest in 
your following point, we'd introduce a link between hdfs and mapred. For now, 
with only these few messages, I don't think we gain much by pulling it out.

bq. I think it might worth to create a utility class, say 
org.apache.hadoop.log.AuditLog, so that we could put AUDIT_FORMAT, 
isInfoEnabled(), etc. inside it. Then, both DataNode and FSNamesystem can use 
it.

Agreed: it would be better if there were a more central location for Hadoop 
APIs exported through the logging interfaces, like audit logs and these 
metrics. If nothing else, it would let us know which messages have consumers 
(hence the uncertainty for logging client writes). That's likely part of a 
different patch, though.
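
As a rough illustration of what such a central class might look like, here is 
a sketch under stated assumptions. Nothing in it is from the patch or from 
FSNamesystem: the class name, the field layout in TRACE_FORMAT and the 
"HDFS_READ" op name are hypothetical, and java.util.logging is used in place 
of commons-logging so that it compiles on its own.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical shared helper; not org.apache.hadoop.log.AuditLog itself.
class AuditLogSketch {
    // Hypothetical tab-separated key=value layout, shared by all callers.
    static final String TRACE_FORMAT =
        "src=%s\tdest=%s\tbytes=%d\top=%s\tid=%s";

    private static final Logger LOG = Logger.getLogger("sketch.auditlog");

    // Single place for callers to ask whether messages would be emitted.
    static boolean isInfoEnabled() {
        return LOG.isLoggable(Level.INFO);
    }

    // Formats one event using the shared layout.
    static String format(String src, String dest, long bytes,
                         String op, String id) {
        return String.format(TRACE_FORMAT, src, dest, bytes, op, id);
    }

    // Formats and logs only when INFO is enabled.
    static void logEvent(String src, String dest, long bytes,
                         String op, String id) {
        if (isInfoEnabled()) {
            LOG.info(format(src, dest, bytes, op, id));
        }
    }

    public static void main(String[] args) {
        System.out.println(
            format("10.0.0.1:50010", "10.0.0.2:41234", 4096L,
                   "HDFS_READ", "client_1"));
    }
}
```

Keeping the format and the enabled check in one class would let both hdfs and 
mapred callers share them without depending on each other.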

> Need to capture the metrics for the network ios generate by dfs reads/writes 
> and map/reduce shuffling  and break them down by racks 
> ------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-3062
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3062
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: metrics
>            Reporter: Runping Qi
>            Assignee: Chris Douglas
>             Fix For: 0.19.0
>
>         Attachments: 3062-0.patch, 3062-1.patch
>
>
> In order to better understand the relationship between hadoop performance and 
> the network bandwidth, we need to know 
> what the aggregated traffic data in a cluster and its breakdown by racks. 
> With these data, we can determine whether the network 
> bandwidth is the bottleneck when certain jobs are running on a cluster.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
