[
https://issues.apache.org/jira/browse/HDFS-13788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16578667#comment-16578667
]
Xiao Chen commented on HDFS-13788:
----------------------------------
Thanks [~knanasi] for the patch and [~zvenczel] for the review.
bq. For rack fault-tolerance, it is also important to have at least as many
racks as the configured number of EC parity cells.
This is technically not correct. In the case of RS(3,2), having 2 racks is not
safe: one of the 2 racks must hold at least 3 of the 5 blocks, and losing that
rack loses more blocks than the 2 parity blocks can cover.
I suggest we word it something like: ... to have enough racks, so that on
average, each rack holds no more blocks than the number of EC parity blocks. A
formula to calculate this would be (data blocks + parity blocks) / parity
blocks, rounding up.
Then, in the RS(6,3) example, we add the example calculation: ... minimally 3
racks (calculated by (6 + 3) / 3 = 3) ...
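To make the formula concrete, here is a minimal sketch of the calculation. The function names are hypothetical and are not part of any Hadoop API, and the check assumes blocks are spread across racks as evenly as possible, which the real block placement policy may not always achieve.

```python
def min_racks(data_blocks: int, parity_blocks: int) -> int:
    """Return ceil((data + parity) / parity) using integer arithmetic."""
    total = data_blocks + parity_blocks
    return (total + parity_blocks - 1) // parity_blocks


def max_blocks_on_one_rack(data_blocks: int, parity_blocks: int, racks: int) -> int:
    """Blocks on the fullest rack, assuming an even spread across racks."""
    total = data_blocks + parity_blocks
    return (total + racks - 1) // racks


def survives_rack_loss(data_blocks: int, parity_blocks: int, racks: int) -> bool:
    """True if losing the fullest rack loses no more blocks than parity covers."""
    return max_blocks_on_one_rack(data_blocks, parity_blocks, racks) <= parity_blocks


if __name__ == "__main__":
    for data, parity in [(6, 3), (3, 2)]:
        print(f"RS({data},{parity}): minimum racks = {min_racks(data, parity)}")
    # RS(3,2) on 2 racks: one rack holds 3 blocks, more than the 2 parity blocks.
    print("RS(3,2) on 2 racks safe?", survives_rack_loss(3, 2, 2))
```

Under these assumptions, RS(6,3) needs 3 racks and RS(3,2) also needs 3, which matches the point that 2 racks are not enough for RS(3,2).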
It'd be great if we could add a note at the end as well, after:
bq. ...will still attempt to spread a striped file across multiple nodes to
preserve node-level fault-tolerance.
For this reason, it is recommended to set up racks with a similar number of
DataNodes.
> Update EC documentation about rack fault tolerance
> --------------------------------------------------
>
> Key: HDFS-13788
> URL: https://issues.apache.org/jira/browse/HDFS-13788
> Project: Hadoop HDFS
> Issue Type: Task
> Components: documentation, erasure-coding
> Affects Versions: 3.0.0
> Reporter: Xiao Chen
> Assignee: Kitti Nanasi
> Priority: Major
> Attachments: HDFS-13788.001.patch
>
>
> From
> http://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-hdfs/HDFSErasureCoding.html:
> {quote}
> For rack fault-tolerance, it is also important to have at least as many racks
> as the configured EC stripe width. For EC policy RS (6,3), this means
> minimally 9 racks, and ideally 10 or 11 to handle planned and unplanned
> outages. For clusters with fewer racks than the stripe width, HDFS cannot
> maintain rack fault-tolerance, but will still attempt to spread a striped
> file across multiple nodes to preserve node-level fault-tolerance.
> {quote}
> The theoretical minimum is 3 racks, and ideally 9 or more, so the document
> should be updated.
> (I didn't check timestamps, but this is probably because
> {{BlockPlacementPolicyRackFaultTolerant}} wasn't completely done when
> HDFS-9088 introduced this doc. Later, examples were also added in
> {{TestErasureCodingMultipleRacks}} to test this explicitly.)