[
https://issues.apache.org/jira/browse/HDFS-14568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jinglun updated HDFS-14568:
---------------------------
Description:
The quota and consumption of the file's ancestors are not handled when the
storage policy of the file is changed. For example:
1. Set a StorageType.SSD quota of fileSpace-1 on the parent dir;
2. Create a file of size fileSpace with storage policy \{DISK,DISK,DISK} under
it;
3. Change the storage policy of the file to ALLSSD_STORAGE_POLICY_NAME and
expect a QuotaByStorageTypeExceededException.
Because the quota and consumption are not handled, the expected exception is
not thrown.
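
To make the scenario concrete, here is a minimal repro sketch. It assumes a
MiniDFSCluster whose DataNodes expose both DISK and SSD storages; FILE_SPACE,
the paths and the class name are illustrative and are not taken from the
attached patches.

{code:java}
// Hedged repro sketch; FILE_SPACE, /parent and the class name are
// illustrative and not taken from the attached patches.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.StorageType;
import org.apache.hadoop.hdfs.DFSTestUtil;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.MiniDFSCluster;
import org.apache.hadoop.hdfs.protocol.HdfsConstants;
import org.apache.hadoop.hdfs.protocol.QuotaByStorageTypeExceededException;

public class SetStoragePolicyQuotaRepro {
  public static void main(String[] args) throws Exception {
    final long FILE_SPACE = 1024;
    Configuration conf = new Configuration();
    MiniDFSCluster cluster = new MiniDFSCluster.Builder(conf)
        .numDataNodes(3)
        .storageTypes(new StorageType[]{StorageType.DISK, StorageType.SSD})
        .build();
    try {
      cluster.waitActive();
      DistributedFileSystem dfs = cluster.getFileSystem();
      Path parent = new Path("/parent");
      Path file = new Path(parent, "file");
      dfs.mkdirs(parent);
      // 1. SSD quota on the parent dir is one byte less than the file size.
      dfs.setQuotaByStorageType(parent, StorageType.SSD, FILE_SPACE - 1);
      // 2. The file is created under the default HOT policy, i.e. {DISK,DISK,DISK}.
      DFSTestUtil.createFile(dfs, file, FILE_SPACE, (short) 3, 0L);
      // 3. Switching to ALL_SSD should charge FILE_SPACE * 3 to the SSD quota
      //    and fail, but without the fix it succeeds silently.
      try {
        dfs.setStoragePolicy(file, HdfsConstants.ALLSSD_STORAGE_POLICY_NAME);
        System.out.println("No QuotaByStorageTypeExceededException was thrown");
      } catch (QuotaByStorageTypeExceededException expected) {
        System.out.println("Expected quota violation: " + expected);
      }
    } finally {
      cluster.shutdown();
    }
  }
}
{code}

With the current code the setStoragePolicy call in step 3 succeeds silently;
with the fix it should throw, since charging FILE_SPACE * 3 to SSD exceeds the
quota of FILE_SPACE - 1.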
There are 3 reasons why we should handle the consumption and the quota.
1. Replication uses the new storage policy. Consider a file with BlockType
CONTIGUOUS. Its replication factor is 3 and its storage policy is "HOT". Now
we change the policy to "ONE_SSD". If a DN goes down and the file needs
re-replication, the NN will choose storages according to policy "ONE_SSD" and
replicate the block to an SSD storage.
2. We actually have a cluster storing both HOT and COLD data. We have a
background process that scans all the files to find those that have not been
accessed for a period of time. We then set them to COLD and start a mover to
move the replicas. After moving, all the replicas are consistent with the
storage policy.
3. The NameNode manages the global state of the cluster. If there is any
inconsistency, such as the replicas not matching the storage policy of the
file, we should take the NameNode as the standard and make the cluster match
it. Block replication is a good example of this rule: when we count the
consumption of a CONTIGUOUS file, we multiply the replication factor by the
file's length, no matter whether the file is under-replicated or
over-replicated. The storage type quota and consumption should work the same
way.
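
To illustrate that accounting rule, here is a sketch (of the rule only, not
the actual quota-update code in the NameNode) that derives per-storage-type
consumption of a CONTIGUOUS file from the storage types its policy chooses for
the full replication factor:

{code:java}
// Illustrative sketch only: per-storage-type consumption of a CONTIGUOUS file
// follows from the policy and the full replication factor, regardless of how
// many replicas currently exist on the DataNodes.
import java.util.EnumMap;
import java.util.List;
import org.apache.hadoop.fs.StorageType;
import org.apache.hadoop.hdfs.protocol.BlockStoragePolicy;
import org.apache.hadoop.hdfs.server.blockmanagement.BlockStoragePolicySuite;

public class TypeConsumptionExample {
  // consumption(type) = fileLength * (replicas the policy places on that type)
  static EnumMap<StorageType, Long> typeConsumption(
      BlockStoragePolicy policy, short replication, long fileLength) {
    EnumMap<StorageType, Long> consumed = new EnumMap<>(StorageType.class);
    List<StorageType> chosen = policy.chooseStorageTypes(replication);
    for (StorageType t : chosen) {
      consumed.merge(t, fileLength, Long::sum);
    }
    return consumed;
  }

  public static void main(String[] args) {
    BlockStoragePolicySuite suite = BlockStoragePolicySuite.createDefaultSuite();
    long len = 1024;
    short repl = 3;
    System.out.println(typeConsumption(suite.getPolicy("HOT"), repl, len));     // HOT: 3*len on DISK
    System.out.println(typeConsumption(suite.getPolicy("ONE_SSD"), repl, len)); // ONE_SSD: len on SSD, 2*len on DISK
    System.out.println(typeConsumption(suite.getPolicy("ALL_SSD"), repl, len)); // ALL_SSD: 3*len on SSD
  }
}
{code}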
was:
The quota and consumption of the file's ancestors are not handled when the
storage policy of the file is changed. For example:
1. Set a StorageType.SSD quota of fileSpace-1 on the parent dir;
2. Create a file of size fileSpace with storage policy \{DISK,DISK,DISK} under
it;
3. Change the storage policy of the file to ALLSSD_STORAGE_POLICY_NAME and
expect a QuotaByStorageTypeExceededException.
Because the quota and consumption are not handled, the expected exception is
not thrown.
> The quota and consumption of the file's ancestors are not handled when the
> storage policy of the file is changed.
> -------------------------------------------------------------------------------------------------------------
>
> Key: HDFS-14568
> URL: https://issues.apache.org/jira/browse/HDFS-14568
> Project: Hadoop HDFS
> Issue Type: Improvement
> Affects Versions: 3.1.0
> Reporter: Jinglun
> Assignee: Jinglun
> Priority: Major
> Attachments: HDFS-14568-001.patch, HDFS-14568-unit-test.patch
>
>
> The quota and consumption of the file's ancestors are not handled when the
> storage policy of the file is changed. For example:
> 1. Set a StorageType.SSD quota of fileSpace-1 on the parent dir;
> 2. Create a file of size fileSpace with storage policy \{DISK,DISK,DISK}
> under it;
> 3. Change the storage policy of the file to ALLSSD_STORAGE_POLICY_NAME and
> expect a QuotaByStorageTypeExceededException.
> Because the quota and consumption are not handled, the expected exception is
> not thrown.
>
> There are 3 reasons why we should handle the consumption and the quota.
> 1. Replication uses the new storage policy. Consider a file with BlockType
> CONTIGUOUS. Its replication factor is 3 and its storage policy is "HOT".
> Now we change the policy to "ONE_SSD". If a DN goes down and the file needs
> re-replication, the NN will choose storages according to policy "ONE_SSD"
> and replicate the block to an SSD storage.
> 2. We actually have a cluster storing both HOT and COLD data. We have a
> background process that scans all the files to find those that have not been
> accessed for a period of time. We then set them to COLD and start a mover to
> move the replicas. After moving, all the replicas are consistent with the
> storage policy.
> 3. The NameNode manages the global state of the cluster. If there is any
> inconsistency, such as the replicas not matching the storage policy of the
> file, we should take the NameNode as the standard and make the cluster match
> it. Block replication is a good example of this rule: when we count the
> consumption of a CONTIGUOUS file, we multiply the replication factor by the
> file's length, no matter whether the file is under-replicated or
> over-replicated. The storage type quota and consumption should work the same
> way.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]