[ https://issues.apache.org/jira/browse/HDFS-17219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ke Han updated HDFS-17219:
--------------------------
    Description: 
When upgrading an HDFS cluster from 2.10.2 to 3.3.6, the results returned by the *dfs -count* command are inconsistent before and after the upgrade.
h1. Reproduce

Start up a 2.10.2 HDFS cluster (1 NN, 2 DN, 1 SNN) and execute the following commands:
{code:java}
dfs -mkdir /GscWZRxS
dfs -put -f -d /tmp/hpLjvJVW/cl /GscWZRxS/
dfs -put -f -d /tmp/hpLjvJVW/Zjpk /GscWZRxS/cl/lBsmFBlyBd/pozIeNFjzd/PsLbgpR
dfsadmin -clrQuota /GscWZRxS/cl
dfsadmin -refreshSuperUserGroupsConfiguration
dfs -mkdir /GscWZRxS/cl/lBsmFBlyBd/pozIeNFjzd/PsLbgpR/Zjpk/Cf/mGpVA
dfsadmin -refreshCallQueue
dfsadmin -clrQuota /GscWZRxS/cl/lBsmFBlyBd/pozIeNFjzd
dfsadmin -setSpaceQuota 2 -storageType DISK /GscWZRxS/cl/lBsmFBlyBd/pozIeNFjzd/PsLbgpR/Zjpk/Cf
dfsadmin -refreshNodes
dfsadmin -setSpaceQuota 2 -storageType DISK /GscWZRxS/cl/lBsmFBlyBd/pozIeNFjzd
dfsadmin -clrSpaceQuota -storageType ARCHIVE /GscWZRxS/cl
dfsadmin -restoreFailedStorage true{code}
Before the upgrade, check the quota results:
{code:java}
bin/hdfs dfs -count -q -h -u /GscWZRxS/cl/lBsmFBlyBd/pozIeNFjzd/PsLbgpR/Zjpk/Cf
        none             inf            none             inf /GscWZRxS/cl/lBsmFBlyBd/pozIeNFjzd/PsLbgpR/Zjpk/Cf
{code}
Then prepare the upgrade: enter safemode, {*}create image{*}, shut down the cluster, and start up the new cluster. Running the same check on the upgraded cluster gives:
{code:java}
bin/hdfs dfs -count -q -h -u /GscWZRxS/cl/lBsmFBlyBd/pozIeNFjzd/PsLbgpR/Zjpk/Cf
       8.0 E           8.0 E            none             inf /GscWZRxS/cl/lBsmFBlyBd/pozIeNFjzd/PsLbgpR/Zjpk/Cf
{code}
The values in the first two columns are inconsistent with the quota I set before the upgrade.

I have attached the file used by the commands.

I am digging into the root cause and will try to submit a patch once I can fix it. Any help is appreciated!
h1. Root Cause

The issue occurs when persisting data to the FSImage.

The quota values stored in the edit logs are correct. However, once HDFS creates an FSImage, the edit logs are discarded, so the quota information is lost.
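One detail that may help narrow this down: the "8.0 E" shown after the upgrade is what a human-readable formatter would print for 2^63 - 1, i.e. Java's Long.MAX_VALUE. The sketch below only illustrates that arithmetic. The `humanize` helper is hypothetical and merely approximates the `-h` formatting; it is not Hadoop's implementation, and the idea that the quota field ends up holding Long.MAX_VALUE after the image is loaded is an inference from the output, not something confirmed in this report.

```python
# Hypothetical helper: approximates the binary-prefix formatting applied by
# `hdfs dfs -count -h`. This is an illustration, not Hadoop's own code.

LONG_MAX = 2**63 - 1  # Java's Long.MAX_VALUE


def humanize(n: int) -> str:
    """Format n with binary prefixes (K, M, G, T, P, E) and one decimal place."""
    units = ["", "K", "M", "G", "T", "P", "E"]
    value = float(n)
    idx = 0
    # Divide by 1024 until the value fits under the current prefix.
    while value >= 1024 and idx < len(units) - 1:
        value /= 1024
        idx += 1
    if idx == 0:
        return str(n)  # small values are printed without a prefix
    return f"{value:.1f} {units[idx]}"


if __name__ == "__main__":
    print(humanize(LONG_MAX))  # the value seen in the first two columns
    print(humanize(2))         # the space quota set in the reproduction steps
```

If that inference holds, the post-upgrade output would point to a quota field being read back as the maximum long value rather than as "unset", which could be a useful thing to check when comparing the FSImage contents written by 2.10.2 with what 3.3.6 loads.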
> Inconsistent count results when upgrading hdfs cluster from 2.10.2 to 3.3.6
> ---------------------------------------------------------------------------
>
>                 Key: HDFS-17219
>                 URL: https://issues.apache.org/jira/browse/HDFS-17219
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 2.10.2, 3.3.6
>            Reporter: Ke Han
>            Priority: Major
>         Attachments: hpLjvJVW.tar.gz
>

--
This message was sent by Atlassian Jira
(v8.20.10#820010)