Attila Jeges has posted comments on this change.

Change subject: IMPALA-1670: Support multiple partitions in ALTER TABLE ADD 

Patch Set 10:

File fe/src/main/java/com/cloudera/impala/analysis/

PS10, Line 75: params.setLocation
> Why are you setting the location if it's null? isn't this an optional field

PS10, Line 90: table_ instanceof HdfsTable)
> Is PartitionParams class applicable to other table types? If not, set a Pre
After going through the code once again, I realized that during the analysis, 
the Table object is available through the 'analyzer' object, therefore 
PartitionParams.setTable() is not needed at all.

The partitionSpec_.analyze() call ensures that the target table is an HdfsTable 
instance, so it is safe to remove the checks from PartitionParams.analyze().
File fe/src/main/java/com/cloudera/impala/analysis/

PS10, Line 210: new ArrayList<PartitionKeyValue>(partitionSpec_);
> = Lists.newArrayList(partitionSpec_)?
File fe/src/main/java/com/cloudera/impala/service/

PS10, Line 1640: for partitions not yet in catalog cache, add cache directives
> I don't think the comment is accurate. We shouldn't by default add cache di
Correct, the comment was fixed.

PS10, Line 1644: TAlterTableAddPartitionParams addPartParams)
               :       throws ImpalaException {
> nit: can these two be in the same line?

PS10, Line 1651: hmsPartitions
> Maybe rename to "hmsPartitionsToAdd"?

PS10, Line 1653: new HashMap<List<String>, THdfsCachingOp>();
> Maps.newHashMap()?

PS10, Line 1657: partitionSpec)) {
> formatting nit: 2 more spaces are needed since this is part of a statement

PS10, Line 1673: partitionCachingOpMap.put(hmsPartition.getValues(), cacheOp);
> Why do you populate this map with all the new partitions? Not all the parti
Correct, if cacheOp is null it doesn't have to be added to the map.

I'm not sure I understood the 2nd part of your comment correctly, please 
clarify it for me if I didn't:
If I create a list of Partitions that have a non-null cacheOp and pass that 
instead of 'hmsPartitions' to alterTableCachePartitions(), then I still need to 
map partitions to their corresponding cacheOps somehow. Otherwise I couldn't 
tell what cache pool name/replication to use when caching a partition. Also 
note that if the table is cached, its partitions should be cached as well when 
their cacheOp is null, so it is not safe to exclude them.

Here's what I did to simplify the code and get rid off partitionCachingOpMap:
Create a list of cacheOps for all the new partitions and pass that instead of 
'partitionCachingOpMap' to 'alterTableCachePartitions()' along with the list of 
new partitions. The list of cacheOps and the list of new partitions are of 
equal size and their elements can be paired up to get the partitions with their 
corresponding cacheOps.

PS10, Line 1689: hmsPartitions = getPartitionsFromHMS(msClient, tableName, 
               :         addedHmsPartitions);
> Do we always need to call this function? What if the size of addedHmsPartit
Correct, no need to make the call in that case.

To view, visit
To unsubscribe, visit

Gerrit-MessageType: comment
Gerrit-Change-Id: Iddbc951f2931f488f7048c9780260f6b49100750
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Attila Jeges <>
Gerrit-Reviewer: Attila Jeges <>
Gerrit-Reviewer: Bharath Vissapragada <>
Gerrit-Reviewer: Dimitris Tsirogiannis <>
Gerrit-Reviewer: Lars Volker <>
Gerrit-Reviewer: Michael Ho <>
Gerrit-HasComments: Yes

Reply via email to