Henry Robinson has posted comments on this change. Change subject: IMPALA-1683: Allow REFRESH on a single partition ......................................................................
Patch Set 9: (17 comments) Overall, this looks pretty high quality. Well done. http://gerrit.cloudera.org:8080/#/c/3385/9/fe/src/main/java/com/cloudera/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/com/cloudera/impala/catalog/CatalogServiceCatalog.java: PS9, Line 1182: ((HdfsTable) tbl); nit: extraneous parentheses Line 1185: // Retrieve partition name from existing partition or construct it from does it make sense to do some basic sanity testing here? like checking the number of columns is equal to the table's number of clustering columns. The concern I have is that all the code below gets executed even with a malformed partition spec. I'd rather detect that case and early-exit from this method. You could add HdfsTable.isCompatiblePartitionSpec() or something to do this checking for you. PS9, Line 1204: if (hdfspartition != null) { : hdfsTable.dropPartition(partitionSpec); : tbl.setCatalogVersion(newCatalogVersion); : } is this path covered by your tests at all? http://gerrit.cloudera.org:8080/#/c/3385/9/fe/src/main/java/com/cloudera/impala/catalog/HdfsTable.java File fe/src/main/java/com/cloudera/impala/catalog/HdfsTable.java: PS9, Line 1929: public String can this be static? PS9, Line 1931: for (int i = 0; i < getNumClusteringCols(); ++i) { : partitionCols.add(getColumns().get(i).getName()); : } seems like a good idea to add Table.getClusteringCols() for this purpose. PS9, Line 1936: kv : tiny nit: we don't put a space before a colon in for expressions like this. PS9, Line 1952: Partition hmsPartition) can this parameter fit on the previous line? Line 1953: dropPartition(oldPartition); can you check (via precondition) that oldPartition and hmsPartition refer to the same partition? http://gerrit.cloudera.org:8080/#/c/3385/9/tests/metadata/test_refresh_partition.py File tests/metadata/test_refresh_partition.py: PS9, Line 21: nsion, \ long line PS9, Line 27: class ImpalaTableWrapper(object): this seems useful! I wonder if there's a better place to put it so that others can take advantage. PS9, Line 41: (self.table_name, self.table_spec)) combine with previous line? PS9, Line 109: cant can't PS9, Line 168: s exist PS9, Line 172: db_name + '.' + self.unique_string() since you do this on every (?) construction of an ImpalaTableWrapper, why not pass the db and table name as separate parameters, and ITW do the concatenation? PS9, Line 266: \ nit: remove PS9, Line 281: dst_path = "%s/year=2010/month=1/%s" % (table_location, file_name) : check_call(["hadoop", "fs", "-cp", "-f", src_file, dst_path], shell=False) : dst_path = "%s/year=2010/month=2/%s" % (table_location, file_name) : check_call(["hadoop", "fs", "-cp", "-f", src_file, dst_path], shell=False) nit: it's maybe worth, in this case, replacing this with a two or three liner to make the code less dense: dst_path = table_location + "/year=2010/month=%s" + file_name for month in [1, 2]: check_call["hadoop", "fs", "-cp", "-f", src_file, dst_path % month], shell=False) PS9, Line 295: % long line -- To view, visit http://gerrit.cloudera.org:8080/3385 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I4bf5dd630f2e175be5800187bbf9337c91f82bd8 Gerrit-PatchSet: 9 Gerrit-Project: Impala Gerrit-Branch: cdh5-trunk Gerrit-Owner: Bikramjeet Vig <[email protected]> Gerrit-Reviewer: Bikramjeet Vig <[email protected]> Gerrit-Reviewer: David Knupp <[email protected]> Gerrit-Reviewer: Dimitris Tsirogiannis <[email protected]> Gerrit-Reviewer: Henry Robinson <[email protected]> Gerrit-HasComments: Yes
