[
https://issues.apache.org/jira/browse/HDFS-17002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17721176#comment-17721176
]
Ayush Saxena commented on HDFS-17002:
-------------------------------------
When reading an EC file, if all the data blocks are there, as far as I know
the client won’t even bother with the parity blocks: it can get all the data
from the data blocks alone, so it never needs to go to the parity blocks. As
for the case you describe, where a parity block on a physical datanode gets
corrupted, the DirectoryScanner would most probably detect that and take care
of it.
On a quick read, this doesn’t sound like a bug to me.
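
To make the read-path point concrete, here is a minimal sketch (plain Java,
self-contained; the class and method names are invented for illustration,
this is not HDFS internals): parity access is triggered only when a data
block in the stripe is unavailable, so an all-data-healthy read never
observes parity corruption.
{code:java}
// Sketch of the read-path decision for an RS-6-3 block group: a striped
// read only falls back to parity-based reconstruction when at least one
// data block is unavailable.
public class StripedReadSketch {
    static boolean readNeedsParity(boolean[] dataBlockHealthy) {
        for (boolean healthy : dataBlockHealthy) {
            if (!healthy) {
                return true; // must decode from parity to fill the gap
            }
        }
        // All 6 data blocks readable: the client never touches parity,
        // so corrupt or missing parity blocks go unnoticed on this path.
        return false;
    }

    public static void main(String[] args) {
        boolean[] allHealthy = {true, true, true, true, true, true};
        System.out.println(readNeedsParity(allHealthy)); // false
    }
}
{code}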
> Erasure coding: Generate parity blocks in time to prevent file corruption
> ------------------------------------------------------------------------
>
> Key: HDFS-17002
> URL: https://issues.apache.org/jira/browse/HDFS-17002
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: erasure-coding
> Affects Versions: 3.4.0
> Reporter: farmmamba
> Priority: Major
>
> In the current EC implementation, a corrupted parity block is not
> regenerated in time.
> Consider the following scenario under the RS-6-3-1024k EC policy:
> if the three parity blocks p1, p2, and p3 are all corrupted or deleted, we
> are not aware of it.
> If, during that window, a data block is also corrupted, the file becomes
> unrecoverable: it can no longer be read by decoding.
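>
> As a minimal sketch of the failure arithmetic (plain Java, self-contained;
> the class name is invented for illustration, this is not HDFS code):
> RS(6,3) can decode a block group from any 6 of its 9 blocks, so three
> silent parity losses leave zero margin, and one further data-block
> corruption makes the group unrecoverable.
> {code:java}
> // RS(6,3) availability arithmetic: any 6 of the 9 blocks in a group
> // are enough to decode; fewer than 6 means the group is lost.
> public class RsGroupSketch {
>     static final int DATA = 6;   // data blocks per RS-6-3 group
>     static final int PARITY = 3; // parity blocks per RS-6-3 group
>
>     // True if the group can still be decoded after losing lostBlocks.
>     static boolean decodable(int lostBlocks) {
>         return (DATA + PARITY - lostBlocks) >= DATA;
>     }
>
>     public static void main(String[] args) {
>         System.out.println(decodable(3)); // true: p1, p2, p3 lost, still OK
>         System.out.println(decodable(4)); // false: one data block lost too
>     }
> }
> {code}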
>
> So we should always regenerate a parity block promptly once it becomes
> unhealthy.
>