[ 
https://issues.apache.org/jira/browse/HDFS-14568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jinglun updated HDFS-14568:
---------------------------
    Description: 
At present, when the storage policy of a directory or a file is changed, we 
simply change the recorded policy. But changing the storage policy also changes 
the storage type consumption, which can violate the quota and leave the 
consumption recorded in DirectoryWithQuotaFeature out of date. 

We should check the quota and update the consumption in setStoragePolicy(), at 
the time the RPC happens:
 # Compute the new consumption and check the quota on all ancestors. If a quota 
is exceeded, throw a QuotaExceededException. Otherwise go to step 2.
 # Update the consumption on all ancestors that have a quota.
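
The two steps above can be sketched as a check-then-update walk over the 
ancestors. This is only a minimal model, not the patch itself: StorageKind, 
QuotaDirSketch, QuotaExceededSketch and applyPolicyChange are hypothetical 
names, and for simplicity the sketch updates every ancestor rather than only 
those carrying a quota.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical stand-ins for illustration; these are NOT the HDFS classes.
enum StorageKind { DISK, SSD, ARCHIVE }

class QuotaExceededSketch extends RuntimeException {
    QuotaExceededSketch(String msg) { super(msg); }
}

class QuotaDirSketch {
    final String name;
    final QuotaDirSketch parent; // null for the root
    // A missing key in quota means "no quota set for that storage type".
    final Map<StorageKind, Long> quota = new EnumMap<>(StorageKind.class);
    final Map<StorageKind, Long> consumed = new EnumMap<>(StorageKind.class);

    QuotaDirSketch(String name, QuotaDirSketch parent) {
        this.name = name;
        this.parent = parent;
    }

    long consumedOf(StorageKind kind) {
        return consumed.getOrDefault(kind, 0L);
    }

    // delta holds the per-type change in consumption caused by the policy change.
    static void applyPolicyChange(QuotaDirSketch dir, Map<StorageKind, Long> delta) {
        // Step 1: verify every ancestor before mutating anything.
        for (QuotaDirSketch d = dir; d != null; d = d.parent) {
            for (Map.Entry<StorageKind, Long> e : delta.entrySet()) {
                Long limit = d.quota.get(e.getKey());
                long after = d.consumedOf(e.getKey()) + e.getValue();
                if (limit != null && e.getValue() > 0 && after > limit) {
                    throw new QuotaExceededSketch(
                        e.getKey() + " quota exceeded on " + d.name);
                }
            }
        }
        // Step 2: all checks passed, so apply the delta up the tree.
        for (QuotaDirSketch d = dir; d != null; d = d.parent) {
            for (Map.Entry<StorageKind, Long> e : delta.entrySet()) {
                d.consumed.merge(e.getKey(), e.getValue(), Long::sum);
            }
        }
    }
}
```

Checking all ancestors before updating any of them means a rejected call 
leaves the recorded consumption untouched.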

 

Related to HDFS-14633; see HDFS-14633 for more discussion.

 

  was:
The quota and consumption of the file's ancestors are not handled when the 
storage policy of the file is changed. For example:
 1. Set a StorageType.SSD quota of fileSpace-1 on the parent dir;
 2. Create a file of size fileSpace with storage policy {DISK,DISK,DISK} under 
it;
 3. Change the storage policy of the file to ALLSSD_STORAGE_POLICY_NAME and 
expect a QuotaByStorageTypeExceededException.

Because the quota and consumption are not handled, the expected exception is 
not thrown.

 

There are 3 reasons why we should handle the consumption and the quota.
1. Replication uses the new storage policy. Consider a file with BlockType 
CONTIGUOUS. Its replication factor is 3 and its storage policy is "HOT". Now 
we change the policy to "ONE_SSD". If a DN goes down and the file needs 
re-replication, the NN will choose storages according to policy "ONE_SSD" and 
replicate the block to an SSD storage.
2. We actually have a cluster storing both HOT and COLD data. We have a 
background process scanning all the files to find those that have not been 
accessed for a period of time. Then we set them to COLD and start a mover to 
move the replicas. After moving, all the replicas are consistent with the 
storage policy.
3. The NameNode manages the global state of the cluster. If there is any 
inconsistency, such as replicas that don't match the storage policy of the 
file, we should take the NameNode as the standard and make the cluster match 
the NameNode. Block replication is a good example of this rule. When we count 
the consumption of a file (CONTIGUOUS), we multiply the replication factor by 
the file's length, no matter whether the file is under-replicated or 
over-replicated. The same applies to the storage type quota and consumption.
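
To make the third point concrete, here is a small sketch that derives the 
per-storage-type consumption purely from the policy, the replication factor 
and the file length, ignoring where the replicas currently sit. MediaType, 
PolicySketch and PolicyConsumeSketch are hypothetical names and not HDFS 
classes; the replica layout per policy (HOT all DISK, ONE_SSD one SSD and the 
rest DISK, ALL_SSD all SSD, COLD all ARCHIVE) is assumed, and policy fallback 
rules are left out.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical stand-ins for illustration; these are NOT the HDFS classes.
enum MediaType { DISK, SSD, ARCHIVE }

enum PolicySketch { HOT, ONE_SSD, ALL_SSD, COLD }

final class PolicyConsumeSketch {
    // Storage type the policy assigns to the replica at the given index.
    static MediaType typeOfReplica(PolicySketch policy, int index) {
        switch (policy) {
            case ONE_SSD: return index == 0 ? MediaType.SSD : MediaType.DISK;
            case ALL_SSD: return MediaType.SSD;
            case COLD:    return MediaType.ARCHIVE;
            default:      return MediaType.DISK; // HOT
        }
    }

    // Consumption per type = file length * number of replicas of that type.
    static Map<MediaType, Long> consume(PolicySketch policy, int replication, long length) {
        Map<MediaType, Long> out = new EnumMap<>(MediaType.class);
        for (int i = 0; i < replication; i++) {
            out.merge(typeOfReplica(policy, i), length, Long::sum);
        }
        return out;
    }

    // Change in consumption per type when the policy moves from one to another.
    static Map<MediaType, Long> delta(PolicySketch from, PolicySketch to,
                                      int replication, long length) {
        Map<MediaType, Long> d = new EnumMap<>(MediaType.class);
        consume(to, replication, length).forEach((k, v) -> d.merge(k, v, Long::sum));
        consume(from, replication, length).forEach((k, v) -> d.merge(k, -v, Long::sum));
        return d;
    }
}
```

For a file of length 100 with replication factor 3, moving from HOT to 
ONE_SSD in this model adds 100 to the SSD consumption and removes 100 from the 
DISK consumption.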


> setStoragePolicy should check quota and update consume on storage type quota.
> -----------------------------------------------------------------------------
>
>                 Key: HDFS-14568
>                 URL: https://issues.apache.org/jira/browse/HDFS-14568
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>    Affects Versions: 3.1.0
>            Reporter: Jinglun
>            Assignee: Jinglun
>            Priority: Major
>              Labels: imcompatible
>         Attachments: HDFS-14568-001.patch, HDFS-14568-unit-test.patch, 
> HDFS-14568.002.patch, HDFS-14568.003.patch, HDFS-14568.004.patch
>
>


