[ https://issues.apache.org/jira/browse/HADOOP-1707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-1707:
-------------------------------------

    Attachment: clientDiskBuffer8.patch

Fixed two bugs that were exposed while running random writer on a 100 node 
cluster.

1. The code waited for the ResponseThread to exit while still holding the lock 
on dataQueue. This caused a deadlock, because the ResponseThread needs that 
same lock before it can exit.
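A minimal sketch of the fixed shutdown shape (class and method names here are illustrative, not the actual DFSClient code): the shutdown flag is flipped under the dataQueue lock, but join() is called only after the lock is released. Joining inside synchronized(dataQueue) deadlocks whenever the response thread is blocked waiting for that same lock.

```java
import java.util.ArrayDeque;
import java.util.Queue;

public class StreamerSketch {
    private final Queue<byte[]> dataQueue = new ArrayDeque<>();
    private volatile boolean running = true;

    // The response thread periodically takes the dataQueue lock, just as the
    // real ResponseThread does when it removes acknowledged packets.
    private final Thread responseThread = new Thread(() -> {
        while (running) {
            synchronized (dataQueue) {
                dataQueue.poll();
            }
            try { Thread.sleep(1); } catch (InterruptedException e) { return; }
        }
    });

    void start() { responseThread.start(); }

    // Fixed shape: signal shutdown under the lock, then join with the lock
    // released. The buggy version called join() inside the synchronized
    // block, so the response thread could never acquire the lock and exit.
    void close() throws InterruptedException {
        synchronized (dataQueue) {
            running = false;
        }
        responseThread.join();
    }

    boolean stopped() { return !responseThread.isAlive(); }
}
```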

2. The DFSClient was sending the packet to the first datanode before it 
inserted the packet into the ackQueue. If the datanode's response arrived 
before the DFSClient could enqueue the packet into the ackQueue, an error was 
triggered. The DFSClient now inserts the packet into the ackQueue before 
sending it to the datanode, which avoids this race.
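The ordering fix can be sketched as follows (names are illustrative; the synchronous "datanode" below stands in for the worst case, an ack that arrives immediately):

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class AckOrderingSketch {
    private final BlockingQueue<Integer> ackQueue = new LinkedBlockingQueue<>();
    int acked = 0;

    // Fixed ordering: enqueue the packet on ackQueue first, then send it.
    // In the buggy ordering these two steps were swapped, so an ack that
    // arrived immediately found no matching entry on the queue.
    void writePacket(int seqno) {
        ackQueue.add(seqno);
        sendToDatanode(seqno);
    }

    // Stand-in for the network write; this "datanode" acks synchronously,
    // i.e. the fastest possible response.
    private void sendToDatanode(int seqno) {
        onAck(seqno);
    }

    void onAck(int seqno) {
        Integer expected = ackQueue.poll();
        if (expected == null || expected != seqno) {
            throw new IllegalStateException("ack " + seqno + " has no matching packet");
        }
        acked++;
    }
}
```

With the two statements in writePacket swapped, the synchronous ack would poll an empty queue and throw, which is the error the patch avoids.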

> Remove the DFS Client disk-based cache
> --------------------------------------
>
>                 Key: HADOOP-1707
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1707
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: dfs
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>             Fix For: 0.16.0
>
>         Attachments: clientDiskBuffer.patch, clientDiskBuffer2.patch, 
> clientDiskBuffer6.patch, clientDiskBuffer7.patch, clientDiskBuffer8.patch, 
> DataTransferProtocol.doc, DataTransferProtocol.html
>
>
> The DFS client currently uses a staging file on local disk to cache all 
> user-writes to a file. When the staging file accumulates 1 block worth of 
> data, its contents are flushed to an HDFS datanode. These operations occur 
> sequentially.
> A simple optimization of allowing the user to write to another staging file 
> while simultaneously uploading the contents of the first staging file to HDFS 
> will improve file-upload performance.
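The overlap proposed in the description can be sketched with two staging buffers and a background upload thread (a hypothetical illustration, not the patch itself; class names, the tiny block size, and the counter are all invented for the sketch):

```java
import java.io.ByteArrayOutputStream;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

public class DoubleBufferSketch {
    private static final int BLOCK_SIZE = 4;   // tiny "block" for the sketch
    private final ExecutorService uploader = Executors.newSingleThreadExecutor();
    private ByteArrayOutputStream staging = new ByteArrayOutputStream();
    private Future<?> inFlight;
    final AtomicInteger blocksUploaded = new AtomicInteger();

    void write(byte b) throws Exception {
        staging.write(b);
        if (staging.size() == BLOCK_SIZE) {
            // Allow at most one block-upload behind the writer.
            if (inFlight != null) inFlight.get();
            byte[] block = staging.toByteArray();
            staging = new ByteArrayOutputStream();          // writes continue here
            inFlight = uploader.submit(() -> upload(block)); // previous block uploads
        }
    }

    // Stand-in for streaming a full block to a datanode.
    private void upload(byte[] block) {
        blocksUploaded.incrementAndGet();
    }

    void close() throws Exception {
        if (inFlight != null) inFlight.get();
        uploader.shutdown();
    }
}
```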

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
