[ 
https://issues.apache.org/jira/browse/HDFS-14568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jinglun updated HDFS-14568:
---------------------------
    Description: 
The quota and the consumed space of the file's ancestors are not updated when the 
storage policy of the file is changed. For example:
 1. Set a StorageType.SSD quota of fileSpace-1 on the parent dir;
 2. Create a file of size fileSpace with storage policy \{DISK,DISK,DISK} under 
it;
 3. Change the storage policy of the file to ALLSSD_STORAGE_POLICY_NAME and 
expect a QuotaByStorageTypeExceededException.

Because the quota and the consumed space are not updated, the expected exception 
is not thrown.
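The steps above can be modeled with a small, self-contained sketch. Note that QuotaModel, setQuota, and changePolicy below are hypothetical stand-ins for the NameNode's quota bookkeeping, not actual HDFS classes; the point is only to show why re-charging the file's space against the new per-type layout should trip the SSD quota:

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical stand-ins for the HDFS types, used only to model the scenario.
enum StorageType { DISK, SSD }

class QuotaByStorageTypeExceededException extends RuntimeException {
    QuotaByStorageTypeExceededException(String msg) { super(msg); }
}

public class QuotaModel {
    // Per-type quota set on the parent directory.
    final Map<StorageType, Long> quota = new EnumMap<>(StorageType.class);
    // Per-type space already charged under the directory.
    final Map<StorageType, Long> consumed = new EnumMap<>(StorageType.class);

    void setQuota(StorageType t, long limit) { quota.put(t, limit); }

    // Re-charge a file's space when its storage policy changes:
    // remove the old per-type charge, add the new one, then verify quotas.
    void changePolicy(long fileSpace, Map<StorageType, Integer> oldLayout,
                      Map<StorageType, Integer> newLayout) {
        oldLayout.forEach((t, reps) ->
            consumed.merge(t, -reps * fileSpace, Long::sum));
        newLayout.forEach((t, reps) ->
            consumed.merge(t, reps * fileSpace, Long::sum));
        for (Map.Entry<StorageType, Long> e : quota.entrySet()) {
            if (consumed.getOrDefault(e.getKey(), 0L) > e.getValue()) {
                throw new QuotaByStorageTypeExceededException(
                    e.getKey() + " quota exceeded");
            }
        }
    }

    public static void main(String[] args) {
        long fileSpace = 1024;
        QuotaModel dir = new QuotaModel();
        // 1. SSD quota of fileSpace - 1 on the parent dir.
        dir.setQuota(StorageType.SSD, fileSpace - 1);
        // 2. File created as {DISK, DISK, DISK}: 3 * fileSpace charged to DISK.
        dir.consumed.put(StorageType.DISK, 3 * fileSpace);
        // 3. Change to {SSD, SSD, SSD}: the new SSD charge of 3 * fileSpace
        //    exceeds the quota, so the exception must be thrown.
        try {
            dir.changePolicy(fileSpace,
                Map.of(StorageType.DISK, 3), Map.of(StorageType.SSD, 3));
            throw new AssertionError("expected quota exception");
        } catch (QuotaByStorageTypeExceededException expected) {
            System.out.println("caught: " + expected.getMessage());
        }
    }
}
```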

 

There are three reasons why we should update the consumed space and the quota.
1. Replication uses the new storage policy. Consider a file with BlockType 
CONTIGUOUS, replication factor 3, and storage policy "HOT". Now we change the 
policy to "ONE_SSD". If a DN goes down and the file needs re-replication, the 
NN will choose storages according to "ONE_SSD" and replicate the block to an 
SSD storage.
2. We actually run a cluster storing both HOT and COLD data. A background 
process scans all files to find those not accessed for a period of time, sets 
them to COLD, and starts a mover to migrate the replicas. After the move, all 
the replicas are consistent with the storage policy.
3. The NameNode manages the global state of the cluster. If there is any 
inconsistency, such as replicas that don't match the file's storage policy, we 
should take the NameNode as the source of truth and make the cluster match it. 
Block replication is a good example of this rule: when we count the consumed 
space of a CONTIGUOUS file, we multiply the replication factor by the file's 
length, no matter whether the file is under- or over-replicated. The storage 
type quota and consumed space should follow the same rule.
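The counting rule in reason 3 amounts to simple arithmetic: the charge per storage type is the number of replicas the policy places on that type times the file's length, independent of the replicas that currently exist. The `charge` helper below is illustrative, not an HDFS method, and the per-policy replica counts are the standard layouts (HOT = 3 on DISK, ONE_SSD = 1 on SSD + 2 on DISK):

```java
public class ConsumeByPolicy {
    // For a CONTIGUOUS file, the charged space on a storage type is
    // (replicas the policy places on that type) * (file length),
    // regardless of how many replicas actually exist right now.
    static long charge(int replicasOnType, long fileLength) {
        return (long) replicasOnType * fileLength;
    }

    public static void main(String[] args) {
        long len = 512;  // file length in bytes
        // HOT: {DISK, DISK, DISK} -> all 3 replicas charged to DISK.
        System.out.println("HOT     DISK charge: " + charge(3, len));
        // ONE_SSD: {SSD, DISK, DISK} -> 1 replica on SSD, 2 on DISK.
        System.out.println("ONE_SSD SSD  charge: " + charge(1, len));
        System.out.println("ONE_SSD DISK charge: " + charge(2, len));
        // The charges stay the same even while the file is under-replicated:
        // the NameNode's expected state, not the current replicas, is counted.
    }
}
```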

  was:
The quota and the consumed space of the file's ancestors are not updated when the 
storage policy of the file is changed. For example:
 1. Set a StorageType.SSD quota of fileSpace-1 on the parent dir;
 2. Create a file of size fileSpace with storage policy \{DISK,DISK,DISK} under 
it;
 3. Change the storage policy of the file to ALLSSD_STORAGE_POLICY_NAME and 
expect a QuotaByStorageTypeExceededException.

Because the quota and the consumed space are not updated, the expected exception 
is not thrown.


> The quota and consume of the file's ancestors are not handled when the 
> storage policy of the file is changed.
> -------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-14568
>                 URL: https://issues.apache.org/jira/browse/HDFS-14568
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>    Affects Versions: 3.1.0
>            Reporter: Jinglun
>            Assignee: Jinglun
>            Priority: Major
>         Attachments: HDFS-14568-001.patch, HDFS-14568-unit-test.patch
>
>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
