[ https://issues.apache.org/jira/browse/HDFS-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12875276#action_12875276 ]
Hairong Kuang commented on HDFS-1057:
-------------------------------------

Thanks, Sam, for working on the patch for trunk. Here are my comments:
# BlockSender.java:
#* the condition replica.getBytesOnDisk() < replicaVisibleLength should be replica.getBytesOnDisk() < startOffset + length. This guarantees that the bytes to be read have already been flushed to disk.
#* When the while loop exits and the bytes still have not been flushed to disk, BlockSender should throw an IOException.
#* It seems to me that we should remove the use of replicaVisibleLength from BlockSender.
#* the way to calculate endOffset should be:
{code}
if (startOffset + length falls into the same chunk where chunkChecksum.getDataLength() is located) {
  endOffset = chunkChecksum.getDataLength(); // case 1
} else {
  endOffset = the chunk boundary where (startOffset + length) is located;
}
{code}
#* In case 1, the last chunk's checksum does not need to be read from disk.
# ReplicaInPipeline, ReplicaInPipelineInterface, and ReplicaBeingWritten:
#* I do not think we need to make any change to ReplicaInPipeline or ReplicaInPipelineInterface.
#* We just need to add the attribute lastChecksum and two synchronized methods to ReplicaBeingWritten. Would it be more readable if we named the methods getLastChecksumAndDataLen and setLastChecksumAndDataLen?

> Concurrent readers hit ChecksumExceptions if following a writer to very end of file
> -----------------------------------------------------------------------------------
>
>                 Key: HDFS-1057
>                 URL: https://issues.apache.org/jira/browse/HDFS-1057
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: data-node
>    Affects Versions: 0.21.0, 0.22.0, 0.20-append
>            Reporter: Todd Lipcon
>            Assignee: sam rash
>            Priority: Blocker
>         Attachments: conurrent-reader-patch-1.txt, conurrent-reader-patch-2.txt, conurrent-reader-patch-3.txt, hdfs-1057-trunk-1.txt
>
>
> In BlockReceiver.receivePacket, it calls replicaInfo.setBytesOnDisk before calling flush().
> Therefore, if there is a concurrent reader, it's possible to race here: the reader will see the new length while those bytes are still in the buffers of BlockReceiver. Thus the client will potentially see checksum errors or EOFs. Additionally, the last checksum chunk of the file is made accessible to readers even though it is not stable.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
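The endOffset rule suggested in the comment above can be sketched as a standalone helper. This is only an illustration of the arithmetic, not the actual BlockSender code; the method name, parameters, and the assumption that requestedEnd = startOffset + length and lastFlushedLength = chunkChecksum.getDataLength() are mine.

```java
// Hypothetical sketch of the proposed endOffset calculation; names and
// signature are illustrative, not taken from BlockSender.
public class EndOffsetSketch {
    /**
     * @param bytesPerChunk     checksum chunk size (512 bytes by default in HDFS)
     * @param requestedEnd      startOffset + length requested by the reader
     * @param lastFlushedLength chunkChecksum.getDataLength(): bytes on disk
     *                          covered by the in-memory last-chunk checksum
     */
    static long endOffset(long bytesPerChunk, long requestedEnd, long lastFlushedLength) {
        // index of the chunk holding the last requested byte
        long requestedChunk = (requestedEnd - 1) / bytesPerChunk;
        // index of the chunk holding the last flushed byte
        long flushedChunk = (lastFlushedLength - 1) / bytesPerChunk;
        if (requestedChunk == flushedChunk) {
            // case 1: stop exactly at the partially written chunk; its
            // checksum comes from memory, not from the meta file on disk
            return lastFlushedLength;
        }
        // otherwise stop at the boundary of the chunk containing requestedEnd
        return (requestedChunk + 1) * bytesPerChunk;
    }
}
```

With 512-byte chunks, a read ending at byte 1000 of a replica flushed to byte 900 falls in the same chunk as the flushed tail, so endOffset is 900 (case 1); a read ending at byte 400 stops at the 512-byte chunk boundary instead.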
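The ReplicaBeingWritten suggestion in the comment above, a lastChecksum field plus two synchronized accessors so a reader always sees a consistent (data length, checksum) pair, might look roughly like this. This is a minimal sketch, not the committed patch; the class body and return type are assumptions.

```java
// Hypothetical sketch of the suggested ReplicaBeingWritten addition; the
// method names follow the comment, everything else is illustrative.
public class ReplicaBeingWrittenSketch {
    private byte[] lastChecksum; // checksum of the last, partial chunk
    private long bytesOnDisk;    // data length that lastChecksum covers

    // writer updates both fields atomically after each flush
    public synchronized void setLastChecksumAndDataLen(long dataLength, byte[] checksum) {
        this.bytesOnDisk = dataLength;
        this.lastChecksum = checksum;
    }

    // reader obtains a consistent (length, checksum) pair under the same lock
    public synchronized Object[] getLastChecksumAndDataLen() {
        return new Object[] { bytesOnDisk, lastChecksum };
    }
}
```

Making both methods synchronized is what prevents the race in the description: a reader can never observe the new length paired with a stale last-chunk checksum.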