Attila Jeges has posted comments on this change.

Change subject: IMPALA-1670: Support multiple partitions in ALTER TABLE ADD 

Patch Set 8:


PS8, Line 20: import java.util.List;
            : import java.util.Set;
> nit: let's try to keep the import sorted if possible (move them below L 33)

PS8, Line 59: if (getDb() != null) {
            :       sb.append(getDb() + ".");
            :     }
> single line
File fe/src/main/java/com/cloudera/impala/analysis/

PS8, Line 3: import java.util.List;
           : import java.util.Set;
           : import java.util.ArrayList;
           : import java.util.Collections;
           : import java.util.Comparator;
> nit: unordered imports

PS8, Line 189: represantation
> typo: representation
File fe/src/main/java/com/cloudera/impala/service/

PS8, Line 375: // If 'IfNotExists' is false and there is a conflict with the 
partitions that
             :           // already exist in Hive, then:
             :           // - leave existing partitions intact
             :           // - don't add any new partitions to the table
             :           // - throw ImpalaRuntimeException.
             :           // If 'IfNotExists' is true, conflicts are hadled as 
             :           // - If some (but not all) partitions already exist in 
Hive, then ignore
             :           // the partitions that already exist and add the rest.
             :           // - If all of the partitions already exist, ignore 
them and return null.
> I believe this comment section is almost duplicate to the one in alterTable

PS8, Line 1642: alterTableCachePartitions
> can you swap this function with alterTableAddPartitions? (top down reading 

PS8, Line 1686: hmsPartitions
> Will this try to alter all the partitions even though only one has a cache 
Changed the code to call alter_partitions() only with those partitions that 
should be cached.

I've added an E2E test (test_add_overlapping_partitions()) to to test the scenario you described:

After ALTER TABLE Impala knows only about the newly added partitions. 
Partitions that have already existed in HMS are silently ignored, they are not 
added to catalog cache and their caching directives are ignored as well.

I thought that this is the expected behavior. The previous implementation of 
ALTER TABLE ADD PARTITION behaved in a similar fashion. Another alter statement 
(ALTER TABLE name PARTITION(partition_spec) SET) is used to change caching for 
an existing partition.

PS8, Line 1696: Hive
> nit: replace Hive with HMS

PS8, Line 1718: hmsPartitionValues2CachingOp
> An alternative for hmsPartitionValues2CachingOp would be partitionCachingOp

PS8, Line 1718: ImmutableList
> Why using ImmutableLists here? These variables are never exposed outside th
I guess, I wanted to communicate very directly and explicitly to the reader 
that the lists won't change once they are in the map.

Got rid of ImmutableList.

PS8, Line 1720:  
> nit: extra space

Line 1727:       } else {
> You may add a "continue;" at the end of the if block and skip the "else"

PS8, Line 1733: ImmutableList.copyOf(hmsPartition.getValues()
> Partitions have unique ids, you don't need to use strings constructed from 
A unique id would work, but I couldn't find one in the documentation of 

Maybe I am missing something?

PS8, Line 1745: catch (AlreadyExistsException e) {
              :         throw new ImpalaRuntimeException(
              :             String.format(HMS_RPC_ERROR_FORMAT_STR, 
"add_partitions"), e);
              :       } catch (TException e) {
              :         throw new ImpalaRuntimeException(
              :             String.format(HMS_RPC_ERROR_FORMAT_STR, 
"add_partitions"), e);
              :       }
> You don't really treat these two exceptions differently. Since Java 7, you 
According to the documentation add_partitions thows the following exceptions:

The first three are all inherited from TException.

Changed the code to catch TException only.

PS8, Line 1766: for (Partition partition: hmsPartitions) {
              :         // Create and add the HdfsPartition. Return the table 
object with an updated
              :         // catalog version.
              :         addHdfsPartition(tbl, partition);
              :       }
> I think there is a scenario that may result in an inconsistent state. Say y
Correct, this is exactly what happens. I've added an E2E test 
(test_add_overlapping_partitions) to to verify this.

Do you think this behavior is acceptable or we should do something about it?

To view, visit
To unsubscribe, visit

Gerrit-MessageType: comment
Gerrit-Change-Id: Iddbc951f2931f488f7048c9780260f6b49100750
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Attila Jeges <>
Gerrit-Reviewer: Attila Jeges <>
Gerrit-Reviewer: Bharath Vissapragada <>
Gerrit-Reviewer: Dimitris Tsirogiannis <>
Gerrit-Reviewer: Lars Volker <>
Gerrit-Reviewer: Michael Ho <>
Gerrit-HasComments: Yes

Reply via email to