[
https://issues.apache.org/jira/browse/HDFS-14568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jinglun updated HDFS-14568:
---------------------------
Description:
At present, when the storage policy of a directory or a file is changed, we
simply change the recorded policy. But changing the storage policy also changes
the storage type consumption, which can violate the quota and leaves the
consumption recorded in DirectoryWithQuotaFeature out of date.
We should check the quota and update the consumption in setStoragePolicy(), at
the time the RPC happens:
# Compute the new consumption and check the quota on all ancestors. If a quota
would be exceeded, throw a QuotaExceededException. Otherwise go to step 2.
# Update the consumption on all ancestors that have a quota.
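The two steps above can be sketched in plain Java as follows. This is only an illustration, not the patch: the Dir class, the local QuotaExceededException, and checkAndUpdate() are hypothetical stand-ins, and the real NameNode code works on INodes and DirectoryWithQuotaFeature instead.

```java
import java.util.EnumMap;
import java.util.Map;

class StoragePolicyQuotaSketch {
  enum StorageType { DISK, SSD, ARCHIVE }

  /** Local stand-in for the quota exception (not the Hadoop class). */
  static class QuotaExceededException extends Exception {
    QuotaExceededException(String message) { super(message); }
  }

  /** Hypothetical directory with per-storage-type quota and consumption. */
  static class Dir {
    final String name;
    final Dir parent;
    final Map<StorageType, Long> quota = new EnumMap<>(StorageType.class);
    final Map<StorageType, Long> consumed = new EnumMap<>(StorageType.class);

    Dir(String name, Dir parent) {
      this.name = name;
      this.parent = parent;
    }
  }

  /**
   * Applies a per-storage-type consumption delta to the ancestor chain that
   * starts at 'parent'. Nothing is modified if any ancestor would exceed
   * its quota.
   */
  static void checkAndUpdate(Dir parent, Map<StorageType, Long> delta)
      throws QuotaExceededException {
    // Step 1: verify every ancestor before changing anything.
    for (Dir d = parent; d != null; d = d.parent) {
      for (Map.Entry<StorageType, Long> e : delta.entrySet()) {
        Long limit = d.quota.get(e.getKey());
        if (limit == null || e.getValue() <= 0) {
          continue; // no quota on this type, or usage does not grow
        }
        long after = d.consumed.getOrDefault(e.getKey(), 0L) + e.getValue();
        if (after > limit) {
          throw new QuotaExceededException("Quota by storage type "
              + e.getKey() + " exceeded on " + d.name
              + ": quota=" + limit + ", would consume=" + after);
        }
      }
    }
    // Step 2: all checks passed, so update the ancestors that have a quota.
    for (Dir d = parent; d != null; d = d.parent) {
      if (d.quota.isEmpty()) {
        continue;
      }
      for (Map.Entry<StorageType, Long> e : delta.entrySet()) {
        d.consumed.merge(e.getKey(), e.getValue(), Long::sum);
      }
    }
  }
}
```

Checking the whole chain before updating keeps the operation all-or-nothing: a failed setStoragePolicy() leaves every ancestor's recorded consumption unchanged.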
Related to HDFS-14633; see HDFS-14633 for more discussion.
was:
The quota and consumption of the file's ancestors are not handled when the
storage policy of the file is changed. For example:
1. Set a StorageType.SSD quota of fileSpace-1 on the parent dir;
2. Create a file of size fileSpace with storage policy {DISK,DISK,DISK} under
it;
3. Change the storage policy of the file to ALLSSD_STORAGE_POLICY_NAME and
expect a QuotaByStorageTypeExceededException.
Because the quota and consumption are not handled, the expected exception is
not thrown.
There are 3 reasons why we should handle the consumption and the quota.
1. Replication uses the new storage policy. Consider a file with BlockType
CONTIGUOUS. Its replication factor is 3 and its storage policy is "HOT". Now
we change the policy to "ONE_SSD". If a DN goes down and the file needs
re-replication, the NN will choose storages according to policy "ONE_SSD" and
replicate the block to an SSD storage.
2. We actually have a cluster storing both HOT and COLD data. We have a
background process scanning all the files to find those that have not been
accessed for a period of time. We then set them to COLD and start a mover to
move the replicas. After the move, all the replicas are consistent with the
storage policy.
3. The NameNode manages the global state of the cluster. If there is any
inconsistency, such as replicas that don't match the storage policy of the
file, we should take the NameNode as the standard and make the cluster match
the NameNode. Block replication is a good example of this rule. When we count
the consumption of a file (CONTIGUOUS), we multiply the replication factor by
the file's length, no matter whether the file is under-replicated or
over-replicated. The same applies to the storage type quota and consumption.
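The accounting rule in reason 3 can be illustrated with a small Java sketch. It charges each replica's full length to the storage type the policy assigns to it, regardless of where the replicas currently live. The class, the method names, and the policy arrays are simplified stand-ins for illustration (the last listed type is assumed to repeat for the remaining replicas); this is not the NameNode's BlockStoragePolicy code.

```java
import java.util.EnumMap;
import java.util.Map;

class StorageTypeConsumeSketch {
  enum StorageType { DISK, SSD, ARCHIVE }

  // Simplified policies: the last listed type repeats for the remaining replicas.
  static final StorageType[] HOT = { StorageType.DISK };
  static final StorageType[] ONE_SSD = { StorageType.SSD, StorageType.DISK };
  static final StorageType[] ALL_SSD = { StorageType.SSD };
  static final StorageType[] COLD = { StorageType.ARCHIVE };

  /** Consumption per storage type: one file length per replica, by policy. */
  static Map<StorageType, Long> consumed(StorageType[] policy, int replication,
      long fileLength) {
    Map<StorageType, Long> usage = new EnumMap<>(StorageType.class);
    for (int i = 0; i < replication; i++) {
      StorageType type = policy[Math.min(i, policy.length - 1)];
      usage.merge(type, fileLength, Long::sum);
    }
    return usage;
  }

  /** Per-type change in consumption when moving from 'before' to 'after'. */
  static Map<StorageType, Long> delta(Map<StorageType, Long> before,
      Map<StorageType, Long> after) {
    Map<StorageType, Long> diff = new EnumMap<>(StorageType.class);
    for (StorageType type : StorageType.values()) {
      long change = after.getOrDefault(type, 0L) - before.getOrDefault(type, 0L);
      if (change != 0) {
        diff.put(type, change); // only record types whose usage changes
      }
    }
    return diff;
  }
}
```

For a file of length 100 with replication factor 3, going from HOT to ONE_SSD moves 100 from DISK to SSD (DISK 300 becomes DISK 200 plus SSD 100), so delta() yields DISK -100 and SSD +100. This delta is what the ancestors' quota would have to be checked against.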
> setStoragePolicy should check quota and update consume on storage type quota.
> -----------------------------------------------------------------------------
>
> Key: HDFS-14568
> URL: https://issues.apache.org/jira/browse/HDFS-14568
> Project: Hadoop HDFS
> Issue Type: Improvement
> Affects Versions: 3.1.0
> Reporter: Jinglun
> Assignee: Jinglun
> Priority: Major
> Labels: imcompatible
> Attachments: HDFS-14568-001.patch, HDFS-14568-unit-test.patch,
> HDFS-14568.002.patch, HDFS-14568.003.patch, HDFS-14568.004.patch
>