[jira] [Created] (IMPALA-13453) REFRESH PARTITION always updates the partition

Tue, 15 Oct 2024 19:45:26 -0700

Quanlong Huang created IMPALA-13453:
---------------------------------------

             Summary: REFRESH <table> PARTITION <partition> always updates the 
partition
                 Key: IMPALA-13453
                 URL: https://issues.apache.org/jira/browse/IMPALA-13453
             Project: IMPALA
          Issue Type: Bug
          Components: Catalog
            Reporter: Quanlong Huang
            Assignee: Quanlong Huang


In a table-level REFRESH, we check whether each partition has actually changed and 
skip updating unchanged partitions in the catalog:
[https://github.com/apache/impala/blob/42fda24364786cc1a457890bd212bb3922479e95/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java#L1098-L1101]
{code:java}
  public void updatePartition(HdfsPartition.Builder partBuilder)
      throws CatalogException {
    HdfsPartition oldPartition = partBuilder.getOldInstance();
    ...
    boolean partitionNotChanged = partBuilder.equalsToOriginal(oldPartition);
    LOG.trace("Partition {} {}", oldPartition.getName(),
        partitionNotChanged ? "unchanged" : "changed");
    if (partitionNotChanged) return;
    HdfsPartition newPartition = partBuilder.build();
    // Partition is reloaded and hence cache directives are not dropped.
    dropPartition(oldPartition, false);
    addPartition(newPartition);
  }{code}
However, in a partition-level REFRESH, we always drop and re-add the partition:
[https://github.com/apache/impala/blob/42fda24364786cc1a457890bd212bb3922479e95/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java#L3093-L3096]
{code:java}
    for (Map.Entry<HdfsPartition.Builder, HdfsPartition> entry :
        partBuilderToPartitions.entrySet()) {
      if (entry.getValue() != null) {
        dropPartition(entry.getValue(), false);
      }
      addPartition(entry.getKey().build());
    }{code}
We should add the same check to avoid updating unchanged partitions.
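
A minimal sketch of what the added check could look like, not the actual patch. The classes below are simplified, hypothetical stand-ins for HdfsTable, HdfsPartition and HdfsPartition.Builder (the {{fileStamp}} field and the drop/add counters are assumptions made for illustration). Only the shape of the loop and the use of equalsToOriginal() follow the snippets above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical, simplified stand-ins for Impala's catalog classes; not the real API.
class PartitionRefreshSketch {

  /** Immutable stand-in for HdfsPartition: a name plus a stamp summarizing its state. */
  static final class Partition {
    final String name;
    final long fileStamp;

    Partition(String name, long fileStamp) {
      this.name = name;
      this.fileStamp = fileStamp;
    }
  }

  /** Stand-in for HdfsPartition.Builder, holding the freshly reloaded state. */
  static final class Builder {
    final String name;
    final long fileStamp;

    Builder(String name, long fileStamp) {
      this.name = name;
      this.fileStamp = fileStamp;
    }

    // Plays the role of equalsToOriginal() in updatePartition().
    boolean equalsToOriginal(Partition old) {
      return old != null && name.equals(old.name) && fileStamp == old.fileStamp;
    }

    Partition build() {
      return new Partition(name, fileStamp);
    }
  }

  final Map<String, Partition> partitions = new LinkedHashMap<>();
  int dropCount = 0;  // counters exist only to make the effect observable
  int addCount = 0;

  void dropPartition(Partition p) {
    partitions.remove(p.name);
    dropCount++;
  }

  void addPartition(Partition p) {
    partitions.put(p.name, p);
    addCount++;
  }

  /** Partition-level refresh loop with the unchanged-partition check added. */
  void refreshPartitions(Map<Builder, Partition> partBuilderToPartitions) {
    for (Map.Entry<Builder, Partition> entry : partBuilderToPartitions.entrySet()) {
      Builder builder = entry.getKey();
      Partition oldPartition = entry.getValue();
      // Skip the drop/add cycle when the reloaded state matches the cached one.
      if (oldPartition != null && builder.equalsToOriginal(oldPartition)) continue;
      if (oldPartition != null) dropPartition(oldPartition);
      addPartition(builder.build());
    }
  }
}
```

With this check, an existing partition whose reloaded state equals the cached one keeps its old instance. Only changed or newly discovered partitions go through dropPartition()/addPartition().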

CC [~csringhofer], [~hemanth619] 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
