[ https://issues.apache.org/jira/browse/HBASE-25510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17266600#comment-17266600 ]
Viraj Jasani commented on HBASE-25510:
--------------------------------------

Thanks for filing this Jira [~zhengzhuobinzzb]. Since you have raised a PR and are working on this, I have given you Jira contributor access and assigned this issue to you. Going forward, you will be able to assign Jira issues to yourself.

> Optimize TableName.valueOf from O(n) to O(1). We can get benefits when the number of tables in the cluster is greater than dozens
> ----------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-25510
>                 URL: https://issues.apache.org/jira/browse/HBASE-25510
>             Project: HBase
>          Issue Type: Improvement
>          Components: master, Replication
>    Affects Versions: 1.2.12, 1.4.13, 2.4.1
>            Reporter: zhuobin zheng
>            Assignee: zhuobin zheng
>            Priority: Major
>         Attachments: optimiz_benchmark, origin_benchmark, stucks-profile-info
>
>
> Currently, TableName.valueOf searches the cache for a matching TableName object linearly (see the code below), so it is too slow when the cluster has thousands of tables.
> {code:java}
> // Current lookup: linear scan over every cached TableName
> for (TableName tn : tableCache) {
>   if (Bytes.equals(tn.getQualifier(), qns) && Bytes.equals(tn.getNamespace(), bns)) {
>     return tn;
>   }
> }{code}
> I propose storing the objects in a hash table so they can be looked up more quickly, like this:
> {code:java}
> // Proposed lookup: a single hash table get keyed by the table name string
> TableName oldTable = tableCache.get(nameAsStr);{code}
>
> Our cluster has tens of thousands of tables (most of them are KYLIN tables).
> We found that in the following two cases, TableName.valueOf severely limits our performance.
>
> Common premise: tens of thousands of tables in the cluster.
> Cause: TableName.valueOf performs poorly because it has to traverse the whole cache linearly.
>
> Case 1. Replication
> Premise 1: one table is written with high QPS, small values, and non-batch requests, which produces too many WAL entries.
> Premise 2: deserializing a WAL entry includes a call to TableName.valueOf.
> Result: replication gets stuck and a lot of WAL files pile up.
>
> Case 2. Active master startup
> NamespaceStateManager initialization initializes every RegionInfo, and each RegionInfo initialization calls TableName.valueOf. This takes extra time if TableName.valueOf is slow.
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
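
For readers following the issue, the idea in the description (replacing the linear cache scan with a hash lookup) can be sketched as a standalone Java class. This is only an illustration under stated assumptions, not the patch from the PR: `TableNameCacheSketch`, `Name`, `cacheSize`, and the `namespace:qualifier` key format are hypothetical stand-ins, not HBase classes or APIs.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical stand-in for the TableName cache; not the actual HBase code.
final class TableNameCacheSketch {

    // Simplified immutable name object (the real TableName carries more state).
    static final class Name {
        final String namespace;
        final String qualifier;

        Name(String namespace, String qualifier) {
            this.namespace = namespace;
            this.qualifier = qualifier;
        }
    }

    // Keyed by "namespace:qualifier" so a lookup is one hash probe instead of a scan.
    // Assumption: neither part contains ':', so the key is unambiguous.
    private static final ConcurrentMap<String, Name> CACHE = new ConcurrentHashMap<>();

    // Returns the cached instance for this name, creating it on first use.
    public static Name valueOf(String namespace, String qualifier) {
        String key = namespace + ":" + qualifier;
        Name cached = CACHE.get(key);
        if (cached != null) {
            return cached;
        }
        Name created = new Name(namespace, qualifier);
        // putIfAbsent keeps exactly one instance per key if two threads race here.
        Name raced = CACHE.putIfAbsent(key, created);
        return raced != null ? raced : created;
    }

    // Number of distinct names cached so far.
    public static int cacheSize() {
        return CACHE.size();
    }
}
```

With this shape, the cost of `valueOf` no longer grows with the number of cached tables, which is the property both cases in the description depend on.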