[ https://issues.apache.org/jira/browse/HDFS-14547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16861740#comment-16861740 ]
Jinglun edited comment on HDFS-14547 at 6/12/19 6:17 AM:
---------------------------------------------------------

EnumCounters is widely used in DirectoryWithQuotaFeature and in quota checks, so saving storage without introducing additional CPU overhead will benefit a lot.

In production clusters, the quota is used more frequently than the storage type quota. An unset quota / storage type quota costs an array filled with -1. Since all unset quotas / storage type quotas are the same, a natural idea is to make all of them point to one pre-defined quota object. When a quota is changed from unset to a new value, we then construct a new quota object for it. This is very much like String.class.

The idea to save storage is to make EnumCounters.class work like String.class. We define the 4 most commonly used EnumCounters in QuotaCounts. For each DirectoryWithQuotaFeature, if the quota / storage type quota is one of the pre-defined values, it will point to the pre-defined one rather than creating new EnumCounters objects.

{code:java}
public final static EnumCounters<Quota> QUOTA_RESET;
public final static EnumCounters<Quota> QUOTA_DEFAULT;
public final static EnumCounters<StorageType> STORAGE_TYPE_RESET;
public final static EnumCounters<StorageType> STORAGE_TYPE_DEFAULT;

static {
  QUOTA_DEFAULT = new ConstEnumCounters(Quota.class, 0);
  QUOTA_RESET = new ConstEnumCounters(Quota.class, HdfsConstants.QUOTA_RESET);
  STORAGE_TYPE_DEFAULT = new ConstEnumCounters(StorageType.class, 0);
  STORAGE_TYPE_RESET = new ConstEnumCounters(StorageType.class, HdfsConstants.QUOTA_RESET);
}
{code}

The patch introduces a new class ConstEnumCounters.class, which extends EnumCounters.class, for the 4 pre-defined EnumCounters. ConstEnumCounters.class works just like EnumCounters except that any modification of a ConstEnumCounters will end with a ConstEnumException.

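A minimal sketch of the sharing idea may help here. It is not the patch code: the classes Counters, ConstCounters, ConstModifiedException and Holder are hypothetical, simplified stand-ins for EnumCounters, ConstEnumCounters, ConstEnumException and DirectoryWithQuotaFeature, and disabling the writable stack trace on the shared exception is an assumption of this sketch.

```java
import java.util.Arrays;

/** Hypothetical, simplified illustration; none of these classes exist in HDFS. */
class CopyOnWriteSketch {

  /** Signals that a shared constant was modified. */
  static final class ConstModifiedException extends RuntimeException {
    ConstModifiedException() {
      // enableSuppression=false, writableStackTrace=false:
      // no stack trace is ever filled in for this exception.
      super("modification of a constant counter", null, false, false);
    }
  }

  /** Mutable counters backed by a long[]. */
  static class Counters {
    protected final long[] values;

    Counters(int size, long initial) {
      values = new long[size];
      Arrays.fill(values, initial);
    }

    /** Copy constructor: always yields a mutable copy. */
    Counters(Counters other) {
      values = other.values.clone();
    }

    long get(int i) { return values[i]; }

    void set(int i, long v) { values[i] = v; }
  }

  /** Immutable variant: the mutator throws one pre-allocated exception. */
  static final class ConstCounters extends Counters {
    private static final ConstModifiedException CEE = new ConstModifiedException();

    ConstCounters(int size, long initial) { super(size, initial); }

    @Override
    void set(int i, long v) { throw CEE; }
  }

  /** One shared "unset" object (all -1) for every holder without a quota. */
  static final Counters UNSET = new ConstCounters(5, -1L);

  /** Stand-in for a directory feature that owns a quota. */
  static final class Holder {
    Counters counters = UNSET;

    void set(int i, long v) {
      try {
        counters.set(i, v);                // fast path, no identity check
      } catch (ConstModifiedException e) {
        counters = new Counters(counters); // first write: take a private copy
        counters.set(i, v);
      }
    }
  }

  public static void main(String[] args) {
    Holder a = new Holder();
    Holder b = new Holder();
    System.out.println(a.counters == b.counters); // true: both share UNSET
    a.set(0, 100L);
    System.out.println(a.counters.get(0) + " " + b.counters.get(0));
  }
}
```

In this sketch only a holder that is actually written to pays for its own array, while all untouched holders keep pointing at the single shared constant.
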
Suppose the 4 pre-defined EnumCounters were of type EnumCounters.class: then each time we modify the value of an EnumCounters object, we would need to compare the object's address with the 4 pre-defined EnumCounters. If the address matches one of the pre-defined EnumCounters, we must clone the object first and then modify the value. The address check would introduce additional overhead. With ConstEnumCounters.class, we try to modify the value first, and if a ConstEnumException occurs, we clone the object and then modify the value. This saves the additional check.

In ConstEnumCounters we pre-define a static final ConstEnumException named cee. Each time a ConstEnumCounters object is modified, cee is thrown rather than constructing and throwing a new ConstEnumException object, so we avoid the overhead of constructing a stack trace.

Another benefit of this patch concerns HDFS-14542. I changed the methods QuotaCounts.anyNsSsCountGreaterOrEqual() and QuotaCounts.anyTypeSpaceCountGreaterOrEqual() so that the additional check is resolved.

was (Author: lijinglun):

EnumCounters is widely used in DirectoryWithQuotaFeature and in quota checks, so saving memory without introducing additional CPU overhead will benefit a lot.

In production clusters, the quota is used more frequently than the storage type quota. An unset quota / storage type quota costs an array filled with -1. Since all unset quotas / storage type quotas are the same, a natural idea is to make all of them point to one pre-defined quota object. When a quota is changed from unset to a new value, we then construct a new quota object for it. This is very much like String.class.

The idea to save memory is to make EnumCounters.class work like String.class. We define the 4 most commonly used EnumCounters in QuotaCounts. For each DirectoryWithQuotaFeature, if the quota / storage type quota is one of the pre-defined values, it will point to the pre-defined one rather than creating new EnumCounters objects.

{code:java}
public final static EnumCounters<Quota> QUOTA_RESET;
public final static EnumCounters<Quota> QUOTA_DEFAULT;
public final static EnumCounters<StorageType> STORAGE_TYPE_RESET;
public final static EnumCounters<StorageType> STORAGE_TYPE_DEFAULT;

static {
  QUOTA_DEFAULT = new ConstEnumCounters(Quota.class, 0);
  QUOTA_RESET = new ConstEnumCounters(Quota.class, HdfsConstants.QUOTA_RESET);
  STORAGE_TYPE_DEFAULT = new ConstEnumCounters(StorageType.class, 0);
  STORAGE_TYPE_RESET = new ConstEnumCounters(StorageType.class, HdfsConstants.QUOTA_RESET);
}
{code}

The patch introduces a new class ConstEnumCounters.class, which extends EnumCounters.class, for the 4 pre-defined EnumCounters. ConstEnumCounters.class works just like EnumCounters except that any modification of a ConstEnumCounters will end with a ConstEnumException.

Suppose the 4 pre-defined EnumCounters were of type EnumCounters.class: then each time we modify the value of an EnumCounters object, we would need to compare the object's address with the 4 pre-defined EnumCounters. If the address matches one of the pre-defined EnumCounters, we must clone the object first and then modify the value. The address check would introduce additional overhead. With ConstEnumCounters.class, we try to modify the value first, and if a ConstEnumException occurs, we clone the object and then modify the value. This saves the additional check.

In ConstEnumCounters we pre-define a static final ConstEnumException named cee. Each time a ConstEnumCounters object is modified, cee is thrown rather than constructing and throwing a new ConstEnumException object, so we avoid the overhead of constructing a stack trace.

Another benefit of this patch concerns HDFS-14542. I changed the methods QuotaCounts.anyNsSsCountGreaterOrEqual() and QuotaCounts.anyTypeSpaceCountGreaterOrEqual() so that the additional check is resolved.

> DirectoryWithQuotaFeature.quota costs additional memory even the storage type
> quota is not set.
> -----------------------------------------------------------------------------------------------
>
>                 Key: HDFS-14547
>                 URL: https://issues.apache.org/jira/browse/HDFS-14547
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>    Affects Versions: 3.1.0
>            Reporter: Jinglun
>            Assignee: Jinglun
>            Priority: Major
>         Attachments: HDFS-14547-design, HDFS-14547.001.patch
>
>
> Our XiaoMi HDFS is considering upgrading from 2.6 to 3.1. We noticed that the
> storage type quota 'tsCounts' is instantiated as
> EnumCounters<StorageType>(StorageType.class), so it costs a long[5] even
> if we don't have any storage type quota on this inode (only a space quota or
> name quota).
> In our cluster we have many dirs with quotas and the NameNode's memory is
> under pressure, so the additional cost will be a problem.
> See DirectoryWithQuotaFeature.Builder().
>
> {code:java}
> class DirectoryWithQuotaFeature$Builder {
>   public Builder() {
>     this.quota = new QuotaCounts.Builder().nameSpace(DEFAULT_NAMESPACE_QUOTA).
>         storageSpace(DEFAULT_STORAGE_SPACE_QUOTA).
>         typeSpaces(DEFAULT_STORAGE_SPACE_QUOTA).build(); // set default value -1.
>     this.usage = new QuotaCounts.Builder().nameSpace(1).build();
>   }
>   public Builder typeSpaces(long val) { // set default value.
>     this.tsCounts.reset(val);
>     return this;
>   }
> }
> class QuotaCounts$Builder {
>   public Builder() {
>     this.nsSsCounts = new EnumCounters<Quota>(Quota.class);
>     this.tsCounts = new EnumCounters<StorageType>(StorageType.class);
>   }
> }
> class EnumCounters {
>   public EnumCounters(final Class<E> enumClass) {
>     final E[] enumConstants = enumClass.getEnumConstants();
>     Preconditions.checkNotNull(enumConstants);
>     this.enumClass = enumClass;
>     this.counters = new long[enumConstants.length]; // allocates a new long array here.
>   }
> }
> {code}
> Related to HDFS-14542.