[ https://issues.apache.org/jira/browse/HADOOP-1649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12516819 ]
Raghu Angadi commented on HADOOP-1649:
--------------------------------------

TestDFSIO is a simpler test. After analyzing the files written during the DFSIO-write test, it looks like just a handful of slow nodes (disk or network) slow down the overall job. From the namenode logs, the time taken to write a 320 MB file on 500 nodes varies from 26 sec to 380 sec (with an average of 75 sec on one of the runs). I will look at the time taken to write these files during sort.

For writes, Hadoop can work around the slow-node problem by avoiding nodes that have many pending writes inside chooseTarget (see the sketch at the end of this message). Since we don't keep track of reads, adaptively avoiding slow nodes for reads is harder; but the problem is more severe for writes anyway, and once we write less to a node, we will end up reading less from it as well.

> Performance regression with Block CRCs
> --------------------------------------
>
>                 Key: HADOOP-1649
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1649
>             Project: Hadoop
>          Issue Type: Bug
>    Affects Versions: 0.14.0
>            Reporter: Raghu Angadi
>            Assignee: Raghu Angadi
>            Priority: Blocker
>             Fix For: 0.14.0
>
>         Attachments: HADOOP-1649.patch
>
>
> Performance is noticeably affected by the Block Level CRCs patch (HADOOP-1134). This is more noticeable on writes (the randomwriter test etc.).
> With randomwriter, the slowdown is 20-25% on a small cluster (20 nodes) and maybe 10% on a larger cluster.
> There are a few differences in how data is written with HADOOP-1134. As soon as I can reproduce this, I think it will be easier to fix.
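For illustration, a minimal sketch of the chooseTarget idea above: skip candidate datanodes that already have many writes in flight. This is not the actual ReplicationTargetChooser code; the Datanode class, its pendingWrites counter, and the MAX_PENDING_WRITES threshold are hypothetical stand-ins, and rack awareness and fallback handling are omitted.

{code}
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the namenode's per-datanode state; the real
// per-datanode descriptor tracks different fields.
class Datanode {
    final String name;
    int pendingWrites; // assumed counter of blocks currently being written

    Datanode(String name, int pendingWrites) {
        this.name = name;
        this.pendingWrites = pendingWrites;
    }
}

public class ChooseTargetSketch {
    // Assumed fixed threshold; a real implementation would more likely
    // compare each node against the cluster-wide average load.
    static final int MAX_PENDING_WRITES = 4;

    // Pick up to 'replication' targets, skipping nodes with many writes
    // in flight. May return fewer targets than requested if most nodes
    // are busy; real placement code would need a fallback for that case.
    static List<Datanode> chooseTargets(List<Datanode> candidates, int replication) {
        List<Datanode> targets = new ArrayList<Datanode>();
        for (Datanode dn : candidates) {
            if (targets.size() >= replication) {
                break;
            }
            if (dn.pendingWrites < MAX_PENDING_WRITES) {
                targets.add(dn);
                dn.pendingWrites++; // account for the write we just scheduled
            }
        }
        return targets;
    }

    public static void main(String[] args) {
        List<Datanode> nodes = new ArrayList<Datanode>();
        nodes.add(new Datanode("dn1", 0));
        nodes.add(new Datanode("dn2", 7)); // overloaded/slow node, skipped
        nodes.add(new Datanode("dn3", 1));
        for (Datanode dn : chooseTargets(nodes, 2)) {
            System.out.println("target: " + dn.name); // prints dn1, dn3
        }
    }
}
{code}

Comparing against the average pending-write count across the cluster, rather than a constant, would keep a uniformly busy cluster from running out of eligible targets.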