[ https://issues.apache.org/jira/browse/HDFS-11755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15997442#comment-15997442 ]
Ravi Prakash edited comment on HDFS-11755 at 5/4/17 9:20 PM:
-------------------------------------------------------------
Hi Nathan! Thank you for reporting the bug.
Could you please specify the guarantees for data resiliency we can expect from
HDFS? The way I see it, we have several options. To keep the discussion simple,
I'll focus only on replication, although similar arguments apply to
Erasure-Coding. Two of them are:
1. Data is guaranteed to be resilient only once a client has successfully
closed the file. Data loss can occur only if all 3 replicas fail at the same
time.
2. When a client gets an ack for a packet from the datanode pipeline, the data
is guaranteed to be persistent. In that case, all three replicas under
construction would need to fail at the same time for data to be lost.
Which of the two makes more sense to you? My vote is for the latter. HDFS
clients that write small amounts of data to a file over a long period shouldn't
have to close and re-open the file just to get the resiliency guarantees of
3-way replication (see the sketch below).
Please let me know if my question doesn't make sense.
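As an aside (not part of the ticket), here is a minimal sketch of the kind of long-lived, small-write client the second option is meant to serve. It uses the standard Hadoop FileSystem API; the path, loop count, and sleep interval are made up for illustration. hflush() waits for the pipeline ack (the guarantee in option 2), hsync() additionally asks the datanodes to sync to disk, and neither requires closing the file.

{code:java}
// Minimal sketch of a long-lived writer that relies on hflush()/hsync()
// rather than close() for durability. Path and timing are hypothetical.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class SlowAppender {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    Path log = new Path("/tmp/slow-writer.log"); // hypothetical path

    try (FSDataOutputStream out = fs.create(log, true)) {
      for (int i = 0; i < 1000; i++) {
        out.writeBytes("event " + i + "\n");
        // hflush(): the packet is acked by the datanode pipeline and the data
        // is visible to new readers, but may still sit in DN/OS buffers.
        out.hflush();
        // hsync(): additionally asks each DN to sync the data to disk.
        // Neither call finalizes the block the way close() does.
        out.hsync();
        Thread.sleep(60_000); // small writes spread over a long time
      }
    }
  }
}
{code}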
> Underconstruction blocks can be considered missing
> --------------------------------------------------
>
> Key: HDFS-11755
> URL: https://issues.apache.org/jira/browse/HDFS-11755
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 3.0.0-alpha2, 2.8.1
> Reporter: Nathan Roberts
> Assignee: Nathan Roberts
>
> The following sequence of events can lead to an under-construction block
> being considered missing.
> - pipeline of 3 DNs, DN1->DN2->DN3
> - DN3 has a failing disk so some updates take a long time
> - Client writes entire block and is waiting for final ack
> - DN1, DN2 and DN3 have all received the block
> - DN1 is waiting for ACK from DN2 who is waiting for ACK from DN3
> - DN3 is having trouble finalizing the block due to the failing drive. It
> does eventually succeed but it is VERY slow at doing so.
> - DN2 times out waiting for DN3 and tears down its pieces of the pipeline, so
> DN1 notices and does the same. Neither DN1 nor DN2 finalized the block.
> - DN3 finally sends an IBR to the NN indicating the block has been received.
> - Drive containing the block on DN3 fails enough that the DN takes it offline
> and notifies NN of failed volume
> - NN removes DN3's replica from the triplets and then declares the block
> missing because there are no other replicas
> It seems like we shouldn't consider uncompleted blocks for replication.
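To make that last suggestion concrete, below is a self-contained illustration of the guard it implies. The enum, class, and method names are hypothetical stand-ins, not the actual NameNode/BlockManager code: the idea is that only a COMPLETE block with no live replicas should be reported missing, while a block still under construction is left to the client/pipeline recovery path.

{code:java}
// Illustrative only: a self-contained model of "don't count under-construction
// blocks as missing". Names here are hypothetical, not HDFS internals.
import java.util.Collections;
import java.util.Set;

public class MissingBlockCheck {

  // Stand-in for the block construction states the NameNode tracks.
  enum BlockState { UNDER_CONSTRUCTION, COMMITTED, COMPLETE }

  static class BlockInfo {
    final BlockState state;
    final Set<String> liveReplicaNodes;
    BlockInfo(BlockState state, Set<String> liveReplicaNodes) {
      this.state = state;
      this.liveReplicaNodes = liveReplicaNodes;
    }
  }

  // Only a COMPLETE block with zero live replicas is treated as missing;
  // an under-construction block with no replicas is not.
  static boolean isMissing(BlockInfo block) {
    return block.state == BlockState.COMPLETE
        && block.liveReplicaNodes.isEmpty();
  }

  public static void main(String[] args) {
    BlockInfo uc = new BlockInfo(BlockState.UNDER_CONSTRUCTION, Collections.emptySet());
    BlockInfo complete = new BlockInfo(BlockState.COMPLETE, Collections.emptySet());
    System.out.println(isMissing(uc));       // false: not counted as missing
    System.out.println(isMissing(complete)); // true: genuinely missing
  }
}
{code}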