[
https://issues.apache.org/jira/browse/HDFS-16575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
qinyuren updated HDFS-16575:
----------------------------
Description:
The SPS (Storage Policy Satisfier) may misjudge a block in the following scenario:
# Create a file with one block that has 3 replicas, all of
{color:#de350b}DISK{color} type: [DISK, DISK, DISK].
# Set the {color:#de350b}ALL_SSD{color} storage policy on this file.
# During {color:#de350b}decommission{color}, the replicas of this file may
become [DISK, DISK, {color:#de350b}SSD{color}, DISK].
# Set the {color:#de350b}HOT{color} storage policy on this file and satisfy
the storage policy on it.
# After the decommissioned node goes offline, the replicas end up as
[DISK, DISK, SSD] instead of [DISK, DISK, DISK].
The reason is that SPS takes the block's replica count from
FileStatus.getReplication(), which is the file's configured replication factor
and not the actual number of replicas of the block.
!image-2022-05-10-11-21-13-627.png|width=432,height=76!
So this block is ignored, because it already has 3 replicas of DISK type
(one of them on a decommissioning node).
!image-2022-05-10-11-21-31-987.png|width=334,height=179!
I think we can use blockInfo.getLocations().length to count the replicas of the
block instead of FileStatus.getReplication().
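
The difference between the two counts can be illustrated with a small,
self-contained sketch. It is not the SPS code: the class, enum, and method
names are hypothetical, the HOT policy is simplified to "every replica on
DISK", and the overlap check is only an assumed model of how a block comes to
be treated as already satisfied.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

// Hypothetical, simplified model of the check described above (not Hadoop code).
class SpsReplicaCountSketch {
    enum StorageType { DISK, SSD }

    // Assumption: HOT is modelled as "all replicas on DISK", and a block is
    // treated as satisfied when either list is empty after matching replicas
    // are cancelled out against the expected storage types.
    static boolean isSatisfied(List<StorageType> existing, int replicaCount) {
        List<StorageType> expected =
            new ArrayList<>(Collections.nCopies(replicaCount, StorageType.DISK));
        List<StorageType> unmatched = new ArrayList<>(existing);
        for (Iterator<StorageType> it = unmatched.iterator(); it.hasNext();) {
            // Cancel out each existing replica that matches an expected type.
            if (expected.remove(it.next())) {
                it.remove();
            }
        }
        return expected.isEmpty() || unmatched.isEmpty();
    }

    public static void main(String[] args) {
        // Replica locations from step 3, one DISK replica is on the
        // decommissioning node.
        List<StorageType> locations = Arrays.asList(
            StorageType.DISK, StorageType.DISK, StorageType.SSD, StorageType.DISK);

        int configuredReplication = 3; // stands in for FileStatus.getReplication()
        int actualReplicas = locations.size(); // stands in for getLocations().length

        // Counting 3 replicas: the SSD replica is never noticed.
        System.out.println(isSatisfied(locations, configuredReplication)); // true
        // Counting 4 replicas: one DISK is still expected, so SSD must move.
        System.out.println(isSatisfied(locations, actualReplicas)); // false
    }
}
```

With the configured replication factor (3) the three DISK replicas cover every
expected type, so the leftover SSD replica is never scheduled for movement.
With the actual location count (4) one expected DISK stays unmatched, so the
block is no longer skipped.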
> [SPS]: Should use real replication num instead getReplication from namenode
> ---------------------------------------------------------------------------
>
> Key: HDFS-16575
> URL: https://issues.apache.org/jira/browse/HDFS-16575
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: qinyuren
> Priority: Major
> Attachments: image-2022-05-10-11-21-13-627.png,
> image-2022-05-10-11-21-31-987.png