Dimitris Tsirogiannis has posted comments on this change.

Change subject: IMPALA-1670: Support multiple partitions in ALTER TABLE ADD 
PARTITION
......................................................................


Patch Set 7:

(7 comments)

As I suggested in the comments below, I think there is an addPartitions() 
function in HMS. We should be using that instead of adding one partition at a 
time.

http://gerrit.cloudera.org:8080/#/c/4144/7/fe/src/main/cup/sql-parser.cup
File fe/src/main/cup/sql-parser.cup:

PS7, Line 879: ArrayList
If possible, can you switch to using the interface (List) instead?


http://gerrit.cloudera.org:8080/#/c/4144/7/fe/src/main/java/com/cloudera/impala/analysis/AlterTableAddPartitionStmt.java
File 
fe/src/main/java/com/cloudera/impala/analysis/AlterTableAddPartitionStmt.java:

PS7, Line 43: @param tableName Name of the table to add partitions to
            :    * @param ifNotExists If true, no error is raised if a 
partition with the same spec
            :    * already exists. If multiple partitions are specified, the 
statement will ignore
            :    * those that exist and add the rest.
            :    * @param partitions The list of partitions to add to the table.
As you've probably realized, we don't use Javadoc. Can you plz remove this? 
Also it is pretty obvious that the constructor constructs the 
AlterTableAddPartitionsStmt obj. So you don't need the comment.


PS7, Line 87: Set<Set<String>> partitionSpecs = Sets.newHashSet();
I wonder if this is the most efficient way to ensure unique partition specs . 
Couldn't you just create a function in the PartitionSpec class to return a 
string of key=value pairs orderer by key and then insert that in a Set<String>? 
The analysis of PartitionSpec already ensures that there are no duplicate keys.


http://gerrit.cloudera.org:8080/#/c/4144/7/fe/src/main/java/com/cloudera/impala/analysis/PartitionParams.java
File fe/src/main/java/com/cloudera/impala/analysis/PartitionParams.java:

PS7, Line 40: // Set during analysis
I don't see this being set in the analyze() fn. Actually, in that function you 
expect that table_ has already been set. Can you update the comment?


PS7, Line 52: public void setTable(Table table) {
            :     table_ = table;
            :   }
single line


http://gerrit.cloudera.org:8080/#/c/4144/7/fe/src/main/java/com/cloudera/impala/service/CatalogOpExecutor.java
File fe/src/main/java/com/cloudera/impala/service/CatalogOpExecutor.java:

PS7, Line 358: Table refreshedTable = null;
Sorry didn't realize it was used in L400. You can leave it here.


Line 375:           // already exist in Hive, then throw ImpalaRuntimeException.
You may want to comment on the consistency guarantees (if any) here. For 
example, say 'if not exists' is false, you manage to insert some partitions and 
then you get a conflict on a partition. What happens then? What state is the 
table going to be in HMS and catalog cache?

Also, I think the HiveMetaStoreClient API has a function to add multiple 
partitions. Your changes seem to add one partition at a time which may result 
in inconsistent states as I described above. Can you please check the HMS API 
and use the bulk partition insert call, if any?


-- 
To view, visit http://gerrit.cloudera.org:8080/4144
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iddbc951f2931f488f7048c9780260f6b49100750
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Attila Jeges <[email protected]>
Gerrit-Reviewer: Attila Jeges <[email protected]>
Gerrit-Reviewer: Bharath Vissapragada <[email protected]>
Gerrit-Reviewer: Dimitris Tsirogiannis <[email protected]>
Gerrit-HasComments: Yes

Reply via email to