[ 
https://issues.apache.org/jira/browse/HADOOP-445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raghu Angadi resolved HADOOP-445.
---------------------------------

    Resolution: Won't Fix

The data is no longer written to local disk.

> Parallel data/socket writing for DFSOutputStream
> ------------------------------------------------
>
>                 Key: HADOOP-445
>                 URL: https://issues.apache.org/jira/browse/HADOOP-445
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: dfs
>    Affects Versions: 0.5.0
>            Reporter: Benjamin Reed
>            Assignee: Sameer Paranjpye
>         Attachments: fastClientWrite.patch
>
>
> Currently, as DFS clients output blocks they write the entire block to disk 
> before starting to transmit to the datanode. By writing to disk the client is 
> able to retry a block write if the datanode fails in the middle of a block 
> transfer. Writing to disk and then to the datanode adds latency. Hopefully, 
> the common case is that block transfers to datanodes are successful. This 
> patch writes to the datanode and the disk in parallel. If the write to the 
> datanode fails, it falls back to current behavior.
> In my tests of transmits of 237M and 946M datasets using -copyFromLocal I'm 
> seeing a 20-25% improvement in throughput.
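
The approach described in the quoted issue can be illustrated with a rough sketch. This is not the code from fastClientWrite.patch and not Hadoop's API. The names write_block and open_stream are hypothetical, and chunks are teed to both destinations one at a time rather than from separate threads. It assumes a datanode connection is an object with write() and close() that raises OSError on failure.

```python
import tempfile


def write_block(chunks, open_stream, max_retries=1):
    """Send one block to a datanode while spooling it to a local file.

    chunks      -- iterable of bytes objects that make up the block
    open_stream -- hypothetical callable that opens a datanode connection
                   and returns an object with write(bytes) and close()
    Returns "streamed" if the first attempt succeeded, or "replayed" if
    the block had to be resent from the local spool file.
    """
    with tempfile.TemporaryFile() as spool:
        stream = None
        streaming_ok = True
        try:
            stream = open_stream()
        except OSError:
            streaming_ok = False

        for chunk in chunks:
            # The local copy is always written so the block can be retried.
            spool.write(chunk)
            if streaming_ok:
                try:
                    stream.write(chunk)
                except OSError:
                    # Datanode failed mid-block: stop sending, keep spooling.
                    streaming_ok = False

        if streaming_ok:
            try:
                stream.close()
                return "streamed"
            except OSError:
                pass  # fall through to the replay path

        # Fallback (the old behavior): resend the whole block from disk.
        for _ in range(max_retries):
            spool.seek(0)
            try:
                retry = open_stream()
                while True:
                    buf = spool.read(64 * 1024)
                    if not buf:
                        break
                    retry.write(buf)
                retry.close()
                return "replayed"
            except OSError:
                continue
        raise OSError("block write failed after fallback retries")
```

In the common case the block reaches the datanode without waiting for the local write to finish first. The spool file is only read back when the first transfer fails.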

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
