Henry Robinson has posted comments on this change.

Change subject: IMPALA-1683: Allow REFRESH on a single partition
......................................................................


Patch Set 9:

(17 comments)

Overall, this looks pretty high quality. Well done.

http://gerrit.cloudera.org:8080/#/c/3385/9/fe/src/main/java/com/cloudera/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/com/cloudera/impala/catalog/CatalogServiceCatalog.java:

PS9, Line 1182: ((HdfsTable) tbl);
nit: extraneous parentheses


Line 1185:       // Retrieve partition name from existing partition or 
construct it from
does it make sense to do some basic sanity testing here? like checking the 
number of columns is equal to the table's number of clustering columns. 

The concern I have is that all the code below gets executed even with a 
malformed partition spec. I'd rather detect that case and early-exit from this 
method. You could add HdfsTable.isCompatiblePartitionSpec() or something to do 
this checking for you.


PS9, Line 1204: if (hdfspartition != null) {
              :             hdfsTable.dropPartition(partitionSpec);
              :             tbl.setCatalogVersion(newCatalogVersion);
              :           }
is this path covered by your tests at all?


http://gerrit.cloudera.org:8080/#/c/3385/9/fe/src/main/java/com/cloudera/impala/catalog/HdfsTable.java
File fe/src/main/java/com/cloudera/impala/catalog/HdfsTable.java:

PS9, Line 1929: public String
can this be static?


PS9, Line 1931: for (int i = 0; i < getNumClusteringCols(); ++i) {
              :       partitionCols.add(getColumns().get(i).getName());
              :     }
seems like a good idea to add Table.getClusteringCols() for this purpose.


PS9, Line 1936: kv :
tiny nit: we don't put a space before a colon in for expressions like this.


PS9, Line 1952: Partition hmsPartition)
can this parameter fit on the previous line?


Line 1953:     dropPartition(oldPartition);
can you check (via precondition) that oldPartition and hmsPartition refer to 
the same partition?


http://gerrit.cloudera.org:8080/#/c/3385/9/tests/metadata/test_refresh_partition.py
File tests/metadata/test_refresh_partition.py:

PS9, Line 21: nsion, \
long line


PS9, Line 27: class ImpalaTableWrapper(object):
this seems useful! I wonder if there's a better place to put it so that others 
can take advantage.


PS9, Line 41: (self.table_name, self.table_spec))
combine with previous line?


PS9, Line 109: cant
can't


PS9, Line 168: s
exist


PS9, Line 172: db_name + '.' + self.unique_string()
since you do this on every (?) construction of an ImpalaTableWrapper, why not 
pass the db and table name as separate parameters, and ITW do the concatenation?


PS9, Line 266: \
nit: remove


PS9, Line 281: dst_path = "%s/year=2010/month=1/%s" % (table_location, 
file_name)
             :     check_call(["hadoop", "fs", "-cp", "-f", src_file, 
dst_path], shell=False)
             :     dst_path = "%s/year=2010/month=2/%s" % (table_location, 
file_name)
             :     check_call(["hadoop", "fs", "-cp", "-f", src_file, 
dst_path], shell=False)
nit: it's maybe worth, in this case, replacing this with a two or three liner 
to make the code less dense:

  dst_path = table_location + "/year=2010/month=%s" + file_name
  for month in [1, 2]:
    check_call["hadoop", "fs", "-cp", "-f", src_file, dst_path % month], 
shell=False)


PS9, Line 295: %
long line


-- 
To view, visit http://gerrit.cloudera.org:8080/3385
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4bf5dd630f2e175be5800187bbf9337c91f82bd8
Gerrit-PatchSet: 9
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Bikramjeet Vig <[email protected]>
Gerrit-Reviewer: Bikramjeet Vig <[email protected]>
Gerrit-Reviewer: David Knupp <[email protected]>
Gerrit-Reviewer: Dimitris Tsirogiannis <[email protected]>
Gerrit-Reviewer: Henry Robinson <[email protected]>
Gerrit-HasComments: Yes

Reply via email to