Dimitris Tsirogiannis has posted comments on this change. Change subject: IMPALA-1670: Support multiple partitions in ALTER TABLE ADD PARTITION ......................................................................
Patch Set 7: (7 comments) As I suggested in the comments below, I think there is an addPartitions() function in HMS. We should be using that instead of adding one partition at a time. http://gerrit.cloudera.org:8080/#/c/4144/7/fe/src/main/cup/sql-parser.cup File fe/src/main/cup/sql-parser.cup: PS7, Line 879: ArrayList If possible, can you switch to using the interface (List) instead? http://gerrit.cloudera.org:8080/#/c/4144/7/fe/src/main/java/com/cloudera/impala/analysis/AlterTableAddPartitionStmt.java File fe/src/main/java/com/cloudera/impala/analysis/AlterTableAddPartitionStmt.java: PS7, Line 43: @param tableName Name of the table to add partitions to : * @param ifNotExists If true, no error is raised if a partition with the same spec : * already exists. If multiple partitions are specified, the statement will ignore : * those that exist and add the rest. : * @param partitions The list of partitions to add to the table. As you've probably realized, we don't use Javadoc. Can you plz remove this? Also it is pretty obvious that the constructor constructs the AlterTableAddPartitionsStmt obj. So you don't need the comment. PS7, Line 87: Set<Set<String>> partitionSpecs = Sets.newHashSet(); I wonder if this is the most efficient way to ensure unique partition specs . Couldn't you just create a function in the PartitionSpec class to return a string of key=value pairs orderer by key and then insert that in a Set<String>? The analysis of PartitionSpec already ensures that there are no duplicate keys. http://gerrit.cloudera.org:8080/#/c/4144/7/fe/src/main/java/com/cloudera/impala/analysis/PartitionParams.java File fe/src/main/java/com/cloudera/impala/analysis/PartitionParams.java: PS7, Line 40: // Set during analysis I don't see this being set in the analyze() fn. Actually, in that function you expect that table_ has already been set. Can you update the comment? PS7, Line 52: public void setTable(Table table) { : table_ = table; : } single line http://gerrit.cloudera.org:8080/#/c/4144/7/fe/src/main/java/com/cloudera/impala/service/CatalogOpExecutor.java File fe/src/main/java/com/cloudera/impala/service/CatalogOpExecutor.java: PS7, Line 358: Table refreshedTable = null; Sorry didn't realize it was used in L400. You can leave it here. Line 375: // already exist in Hive, then throw ImpalaRuntimeException. You may want to comment on the consistency guarantees (if any) here. For example, say 'if not exists' is false, you manage to insert some partitions and then you get a conflict on a partition. What happens then? What state is the table going to be in HMS and catalog cache? Also, I think the HiveMetaStoreClient API has a function to add multiple partitions. Your changes seem to add one partition at a time which may result in inconsistent states as I described above. Can you please check the HMS API and use the bulk partition insert call, if any? -- To view, visit http://gerrit.cloudera.org:8080/4144 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iddbc951f2931f488f7048c9780260f6b49100750 Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Attila Jeges <[email protected]> Gerrit-Reviewer: Attila Jeges <[email protected]> Gerrit-Reviewer: Bharath Vissapragada <[email protected]> Gerrit-Reviewer: Dimitris Tsirogiannis <[email protected]> Gerrit-HasComments: Yes
