[ 
https://issues.apache.org/jira/browse/HDFS-8233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-8233:
----------------------------
    Attachment: hdfs8233-HDFS-7285.000.patch

When we call {{getCurrentBlockGroupBytes}}, the block object in each streamer 
may not reflect the real size, because we cannot guarantee that all packets 
have been sent and their acks received. In addition, the {{bytesCurBlock}} 
field can be set to 0 when an internal block is full. It is therefore hard to 
compute an accurate block size at this point. However, 
{{writeParityCellsForLastStripe}} only needs the parity cell size, which can 
be computed from {{bytesCurBlock}}.
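
To make the reasoning above concrete, here is a minimal standalone sketch. It 
is not the code in the patch. The class {{ParityCellSizeSketch}} and both 
method names are hypothetical, and the rule "a zero remainder means a full 
cell" is my assumption. It also assumes the caller has already decided that 
the last stripe needs parity cells.

```java
// Illustrative sketch only; names are hypothetical, not taken from the patch.
final class ParityCellSizeSketch {

    // Naive block group size: sum of bytesCurBlock over the data streamers.
    // It under-counts whenever a streamer's counter has been reset to 0
    // after its internal block became full.
    static long naiveGroupBytes(long[] bytesCurBlockPerStreamer) {
        long sum = 0;
        for (long b : bytesCurBlockPerStreamer) {
            sum += b;
        }
        return sum;
    }

    // Parity cell size for the last stripe, derived from the first data
    // streamer's bytesCurBlock. Assumption: a zero remainder (including a
    // counter that was reset to 0) means the first cell of the stripe is full.
    static int parityCellSize(long firstStreamerBytesCurBlock, int cellSize) {
        int partial = (int) (firstStreamerBytesCurBlock % cellSize);
        return partial == 0 ? cellSize : partial;
    }
}
```

With a 64 KB cell, {{parityCellSize(196708L, 65536)}} returns 100, while a 
counter that was reset to 0 yields a full 65536-byte cell. For the same reset 
state, {{naiveGroupBytes}} over six zeroed counters returns 0, which is the 
under-count described above.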

The 000 patch fixes the issue based on the above reasoning. It also adds a new 
unit test case that fails with the original code.

> Fix DFSStripedOutputStream#getCurrentBlockGroupBytes when the last stripe is 
> at the block group boundary
> --------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-8233
>                 URL: https://issues.apache.org/jira/browse/HDFS-8233
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Jing Zhao
>            Assignee: Jing Zhao
>         Attachments: hdfs8233-HDFS-7285.000.patch
>
>
> Currently {{DFSStripedOutputStream#getCurrentBlockGroupBytes}} simply uses 
> {{getBytesCurBlock}} of each streamer to calculate the block group size. This 
> is wrong when the last stripe is at the block group boundary, since 
> {{bytesCurBlock}} is set to 0 once an internal block is finished.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
