[
https://issues.apache.org/jira/browse/HDFS-8704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Li Bo updated HDFS-8704:
------------------------
Attachment: HDFS-8704-HDFS-7285-006.patch
Updated the patch based on the current code of the HDFS-7285 branch.
> Erasure Coding: client fails to write large file when one datanode fails
> ------------------------------------------------------------------------
>
> Key: HDFS-8704
> URL: https://issues.apache.org/jira/browse/HDFS-8704
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Li Bo
> Assignee: Li Bo
> Attachments: HDFS-8704-000.patch, HDFS-8704-HDFS-7285-002.patch,
> HDFS-8704-HDFS-7285-003.patch, HDFS-8704-HDFS-7285-004.patch,
> HDFS-8704-HDFS-7285-005.patch, HDFS-8704-HDFS-7285-006.patch
>
>
> I tested the current code on a 5-node cluster using RS(3,2). When a datanode
> is corrupted, the client succeeds in writing a file smaller than a block group
> but fails to write a larger one. {{TestDFSStripeOutputStreamWithFailure}} only
> tests files smaller than a block group; this jira will add more test
> situations.
> A streamer may encounter bad datanodes when writing the blocks allocated to
> it. When it fails to connect to a datanode or to send a packet, the streamer
> needs to prepare for the next block. First it removes the packets of the
> current block from its data queue. If the first packet of the next block is
> already in the data queue, the streamer resets its state and starts waiting
> for the next block allocated to it; otherwise it just waits for the first
> packet of the next block. While waiting, the streamer periodically checks
> whether it has been asked to terminate.
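The recovery flow in the last paragraph can be sketched as follows. This is a
minimal illustration only: the names here ({{Packet}}, {{StreamerSketch}},
{{onStreamingFailure}}) are hypothetical and do not correspond to the actual
{{StripedDataStreamer}} code in the patch.
{code:java}
import java.util.ArrayDeque;
import java.util.Deque;

/**
 * A minimal sketch of the failure-handling flow described above. All
 * names are hypothetical, not the real StripedDataStreamer API.
 */
public class StreamerSketch {
  static class Packet {
    final long blockId;
    Packet(long blockId) { this.blockId = blockId; }
  }

  private final Deque<Packet> dataQueue = new ArrayDeque<>();
  private volatile boolean askedToTerminate = false;
  private long currentBlockId;

  /** A writer thread queues packets for this streamer. */
  synchronized void enqueue(Packet p) {
    dataQueue.add(p);
    notifyAll();
  }

  /** Invoked when connecting to a datanode or sending a packet fails. */
  synchronized void onStreamingFailure() throws InterruptedException {
    // Step 1: drop the remaining packets of the failed block; they can
    // no longer be delivered through the bad pipeline.
    dataQueue.removeIf(p -> p.blockId == currentBlockId);

    // Step 2: if the first packet of the next block is not queued yet,
    // wait for it, checking periodically whether we were asked to stop.
    while (dataQueue.isEmpty()) {
      if (askedToTerminate) {
        return;           // periodic termination check while waiting
      }
      wait(1000);         // re-check roughly once per second
    }

    // Step 3: reset to the next block and resume streaming from there.
    currentBlockId = dataQueue.peek().blockId;
  }
}
{code}
The timed {{wait(1000)}} stands in for the periodic termination check the
description calls for; the real patch may use a different wait/notify scheme.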
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)