Quanlong Huang created IMPALA-10727:
---------------------------------------

             Summary: Share identical CachedHmsPartitionDescriptor across HdfsPartitions
                 Key: IMPALA-10727
                 URL: https://issues.apache.org/jira/browse/IMPALA-10727
             Project: IMPALA
          Issue Type: Improvement
          Components: Catalog
            Reporter: Quanlong Huang


In catalogd, we keep one CachedHmsPartitionDescriptor for each HdfsPartition. 
Many of its fields are often identical across partitions, e.g. sdBucketCols 
and sdSortCols. We can instead keep the distinct 
{{CachedHmsPartitionDescriptor}} instances in HdfsTable and share them among 
its HdfsPartitions. Fields that differ across partitions, e.g. msCreateTime 
and msLastAccessTime, can be moved to HdfsPartition.

https://github.com/apache/impala/blob/1a84a1420c5d517f43e4c7e90ee204db30f27d57/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java#L543
{code:java}
  // TODO: Cache this descriptor in HdfsTable so that identical descriptors are shared
  // between HdfsPartition instances.
  // TODO: sdInputFormat and sdOutputFormat can be mutated by Impala when the file format
  // of a partition changes; move these fields to HdfsPartition.
  private static class CachedHmsPartitionDescriptor {
{code}
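
The sharing proposed above could be sketched roughly as follows. This is a minimal illustration, not Impala's actual code: SharedDescriptor, DescriptorPool and PartitionSketch are hypothetical stand-ins for CachedHmsPartitionDescriptor, a pool owned by HdfsTable, and HdfsPartition, and only two of the descriptor fields are modelled. It assumes the shared descriptor is immutable and has value-based equals/hashCode.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical stand-in for CachedHmsPartitionDescriptor. It must be immutable
// and compare by value so that equal descriptors can be safely shared.
final class SharedDescriptor {
  final List<String> sdBucketCols;
  final List<String> sdSortCols;

  SharedDescriptor(List<String> sdBucketCols, List<String> sdSortCols) {
    this.sdBucketCols = Collections.unmodifiableList(new ArrayList<>(sdBucketCols));
    this.sdSortCols = Collections.unmodifiableList(new ArrayList<>(sdSortCols));
  }

  @Override
  public boolean equals(Object o) {
    if (this == o) return true;
    if (!(o instanceof SharedDescriptor)) return false;
    SharedDescriptor other = (SharedDescriptor) o;
    return sdBucketCols.equals(other.sdBucketCols)
        && sdSortCols.equals(other.sdSortCols);
  }

  @Override
  public int hashCode() {
    return Objects.hash(sdBucketCols, sdSortCols);
  }
}

// Hypothetical table-level pool: returns one canonical instance per distinct
// descriptor value, so partitions with equal descriptors share one object.
final class DescriptorPool {
  private final Map<SharedDescriptor, SharedDescriptor> pool = new HashMap<>();

  SharedDescriptor intern(SharedDescriptor d) {
    SharedDescriptor existing = pool.putIfAbsent(d, d);
    return existing != null ? existing : d;
  }

  int size() {
    return pool.size();
  }
}

// Hypothetical partition: per-partition fields live here, while the rest is
// referenced through the shared descriptor.
final class PartitionSketch {
  final SharedDescriptor descriptor;
  final long msCreateTime;
  final long msLastAccessTime;

  PartitionSketch(DescriptorPool pool, SharedDescriptor d,
      long msCreateTime, long msLastAccessTime) {
    this.descriptor = pool.intern(d);
    this.msCreateTime = msCreateTime;
    this.msLastAccessTime = msLastAccessTime;
  }
}
```

In this sketch the pool holds strong references and never evicts, so a real implementation would also need to release descriptors when partitions are dropped (for example with reference counts or a weak interner).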



