[ 
https://issues.apache.org/jira/browse/HADOOP-2559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12571670#action_12571670
 ] 

lohit vijayarenu commented on HADOOP-2559:
------------------------------------------

I ran the same set of experiments 4 times instead of 2. Here are the results:

{noformat}
Job             Trunk     Trunk+patch1    Trunk+patch2
RandomWriter    1346      923             1607
RandomWriter    743       571             1111
RandomWriter    698       497             1003
RandomWriter    776       508             963
Sort            1535      2027            1802
Sort            1466      1869            1768
Sort            1618      1787            1738
Sort            1699      2044            1515
{noformat}

An interesting note is that the Trunk+patch1 writes show better times than Trunk.

During sort, I see many tasks failing with ChecksumException; they succeed on 
retry on other nodes, which affects the times shown for the sort jobs.

log:
{noformat}
org.apache.hadoop.fs.ChecksumException: Checksum error: /tmps/3/gs203727-22269-2527764705241834/mapred-tt/mapred-local/task_200802222124_0007_m_001749_0/file.out at 17018368
{noformat}

Runping suggested we run wordcount instead; will do that and post the results.

> DFS should place one replica per rack
> -------------------------------------
>
>                 Key: HADOOP-2559
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2559
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: dfs
>            Reporter: Runping Qi
>            Assignee: lohit vijayarenu
>         Attachments: HADOOP-2559-1.patch, HADOOP-2559-2.patch
>
>
> Currently, when writing out a block, dfs places one copy on the local data 
> node, one copy on a rack-local node, and another on a remote node. This 
> leads to a number of undesired properties:
> 1. The block will be rack-local to two racks instead of three, reducing the 
> advantage of rack-locality-based scheduling by 1/3.
> 2. The blocks of a file (especially a large file) are unevenly distributed 
> over the nodes: one third will be on the local node, and two thirds on the 
> nodes in the same rack. This may make some nodes fill up much faster than 
> others, increasing the need for rebalancing. Furthermore, it also makes 
> some nodes become "hot spots" if those big files are popular and accessed 
> by many applications.
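The rack-locality difference between the two policies can be sketched with a toy simulation. This is a minimal illustration only; the rack names, node names, and helper functions are hypothetical and are not HDFS's actual BlockPlacementPolicy code:

{noformat}
# Toy model: 3 racks of 3 nodes each. Hypothetical names for illustration.
RACKS = {
    "rack1": ["n1", "n2", "n3"],
    "rack2": ["n4", "n5", "n6"],
    "rack3": ["n7", "n8", "n9"],
}

def rack_of(node):
    """Return the rack a node belongs to."""
    return next(r for r, nodes in RACKS.items() if node in nodes)

def current_policy(writer):
    """Current behavior: one replica local, one on the same rack,
    one on a remote rack -- replicas span only 2 racks."""
    local_rack = rack_of(writer)
    same_rack = next(n for n in RACKS[local_rack] if n != writer)
    remote_rack = next(r for r in RACKS if r != local_rack)
    return [writer, same_rack, RACKS[remote_rack][0]]

def proposed_policy(writer):
    """Proposed behavior: one replica per rack -- local node first,
    then one node from each other rack -- replicas span 3 racks."""
    targets = [writer]
    for r, nodes in RACKS.items():
        if r != rack_of(writer):
            targets.append(nodes[0])
    return targets

for policy in (current_policy, proposed_policy):
    replicas = policy("n1")
    racks_used = {rack_of(n) for n in replicas}
    print(policy.__name__, sorted(replicas), len(racks_used))
{noformat}

Running it shows the current policy's three replicas touch only 2 racks, while one-replica-per-rack touches 3, which is the locality gain the issue describes.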

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.