[
https://issues.apache.org/jira/browse/HIVE-29165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Zhihua Deng resolved HIVE-29165.
--------------------------------
Fix Version/s: 4.1.0
Resolution: Fixed
> PartColNameInfo could introduce high hash collision due to the wide table
> -------------------------------------------------------------------------
>
> Key: HIVE-29165
> URL: https://issues.apache.org/jira/browse/HIVE-29165
> Project: Hive
> Issue Type: Improvement
> Components: Standalone Metastore
> Reporter: Zhihua Deng
> Assignee: Zhihua Deng
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.1.0
>
>
> In {{DirectSqlUpdatePart}}, {{PartColNameInfo}} is used as a map key that
> identifies the column statistics to be updated or inserted. If the table
> has many columns, say 1000, then each {{PartColNameInfo}} is expected to
> collide in the same map bucket more than 1000 times. If the table also
> has thousands of partitions, locating or inserting a {{PartColNameInfo}}
> could be very slow.
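
For illustration only, here is a minimal Java sketch of the kind of collision described above. It is not the Hive code: the class names WeakKey and StrongKey, their fields, and the assumption that the problematic hash depends only on a partition id are all hypothetical. Mixing the column name into the hash is shown as one possible remedy, not necessarily what the actual patch does.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

final class PartColKeySketch {

    /** Hypothetical key whose hash ignores the column name (an assumption, not Hive's code). */
    static final class WeakKey {
        final long partitionId;
        final String colName;

        WeakKey(long partitionId, String colName) {
            this.partitionId = partitionId;
            this.colName = colName;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof WeakKey)) return false;
            WeakKey other = (WeakKey) o;
            return partitionId == other.partitionId && colName.equals(other.colName);
        }

        @Override
        public int hashCode() {
            // Every column of a partition gets the same hash value.
            return Long.hashCode(partitionId);
        }
    }

    /** Hypothetical key that also mixes the column name into the hash. */
    static final class StrongKey {
        final long partitionId;
        final String colName;

        StrongKey(long partitionId, String colName) {
            this.partitionId = partitionId;
            this.colName = colName;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof StrongKey)) return false;
            StrongKey other = (StrongKey) o;
            return partitionId == other.partitionId && colName.equals(other.colName);
        }

        @Override
        public int hashCode() {
            return Objects.hash(partitionId, colName);
        }
    }

    /** Counts distinct hash values over partitions x columns keys. */
    static int distinctHashes(boolean useColName, int partitions, int columns) {
        Set<Integer> hashes = new HashSet<>();
        for (long p = 0; p < partitions; p++) {
            for (int c = 0; c < columns; c++) {
                String col = "col" + c;
                Object key = useColName ? new StrongKey(p, col) : new WeakKey(p, col);
                hashes.add(key.hashCode());
            }
        }
        return hashes.size();
    }

    public static void main(String[] args) {
        System.out.println("weak:   " + distinctHashes(false, 3, 1000));
        System.out.println("strong: " + distinctHashes(true, 3, 1000));
    }
}
```

With the weak key, 3 partitions of 1000 columns each yield only 3 distinct hash values, so every lookup has to compare against the other keys that share its hash. The stronger hash spreads the same keys over far more values, although Objects.hash is a simple polynomial hash and can still collide occasionally.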
--
This message was sent by Atlassian Jira
(v8.20.10#820010)