Quanlong Huang created IMPALA-10727:
---------------------------------------
Summary: Share identical CachedHmsPartitionDescriptor across HdfsPartitions
Key: IMPALA-10727
URL: https://issues.apache.org/jira/browse/IMPALA-10727
Project: IMPALA
Issue Type: Improvement
Components: Catalog
Reporter: Quanlong Huang
In catalogd, we keep one CachedHmsPartitionDescriptor for each HdfsPartition.
Many of its fields are often identical across partitions, e.g. sdBucketCols and
sdSortCols. We can keep the distinct {{CachedHmsPartitionDescriptor}} instances
in HdfsTable instead and share them among the HdfsPartitions. Fields that
differ across partitions, e.g. msCreateTime and msLastAccessTime, can be moved
to HdfsPartition.
https://github.com/apache/impala/blob/1a84a1420c5d517f43e4c7e90ee204db30f27d57/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java#L543
{code:java}
// TODO: Cache this descriptor in HdfsTable so that identical descriptors are shared
// between HdfsPartition instances.
// TODO: sdInputFormat and sdOutputFormat can be mutated by Impala when the file format
// of a partition changes; move these fields to HdfsPartition.
private static class CachedHmsPartitionDescriptor {
{code}
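The sharing proposed above could be sketched roughly as follows. This is only an illustration of the interning idea, not Impala code: SharedDescriptor, PartitionSketch and TableSketch are hypothetical stand-ins for CachedHmsPartitionDescriptor, HdfsPartition and HdfsTable. Both column lists are simplified to lists of strings, and the int type of the two time fields is an assumption.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical stand-in for the shareable part of CachedHmsPartitionDescriptor.
// It is immutable and value-comparable, so equal descriptors can be interned.
final class SharedDescriptor {
  final List<String> sdBucketCols;
  final List<String> sdSortCols;  // simplified to column names for this sketch

  SharedDescriptor(List<String> sdBucketCols, List<String> sdSortCols) {
    this.sdBucketCols = Collections.unmodifiableList(new ArrayList<>(sdBucketCols));
    this.sdSortCols = Collections.unmodifiableList(new ArrayList<>(sdSortCols));
  }

  @Override
  public boolean equals(Object o) {
    if (this == o) return true;
    if (!(o instanceof SharedDescriptor)) return false;
    SharedDescriptor that = (SharedDescriptor) o;
    return sdBucketCols.equals(that.sdBucketCols) && sdSortCols.equals(that.sdSortCols);
  }

  @Override
  public int hashCode() {
    return Objects.hash(sdBucketCols, sdSortCols);
  }
}

// Hypothetical stand-in for HdfsPartition: per-partition fields live here,
// everything else is reached through the shared descriptor reference.
final class PartitionSketch {
  final SharedDescriptor descriptor;
  final int msCreateTime;      // assumed int for this sketch
  final int msLastAccessTime;  // assumed int for this sketch

  PartitionSketch(SharedDescriptor descriptor, int msCreateTime, int msLastAccessTime) {
    this.descriptor = descriptor;
    this.msCreateTime = msCreateTime;
    this.msLastAccessTime = msLastAccessTime;
  }
}

// Hypothetical stand-in for HdfsTable: keeps one canonical instance per
// distinct descriptor value and hands it out to every matching partition.
final class TableSketch {
  private final Map<SharedDescriptor, SharedDescriptor> descriptors = new HashMap<>();

  PartitionSketch addPartition(SharedDescriptor candidate, int createTime, int lastAccessTime) {
    // Reuse the existing equal descriptor if there is one, else register this one.
    SharedDescriptor canonical = descriptors.computeIfAbsent(candidate, k -> k);
    return new PartitionSketch(canonical, createTime, lastAccessTime);
  }

  int numDistinctDescriptors() {
    return descriptors.size();
  }
}
```

With this shape, two partitions whose descriptors compare equal end up pointing at the same object, so the table holds one descriptor per distinct value rather than one per partition. The sketch never evicts descriptors that are no longer referenced. It also leaves out sdInputFormat and sdOutputFormat, which the TODO above suggests moving to HdfsPartition.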