Dimitris Tsirogiannis has posted comments on this change.

Change subject: IMPALA-2840: Don't store table location in partition location
......................................................................

Patch Set 4:

(26 comments)

http://gerrit.cloudera.org:8080/#/c/2355/4//COMMIT_MSG
Commit Message:

Line 11: we can compress it down to one
       : bit in the common case.
Is this still relevant?

Line 14: TODO: Since each partition stores the literal values for the
       : partitioning columns, we could also elide the column names.
If we don't already have one, can you file a JIRA for that task?

http://gerrit.cloudera.org:8080/#/c/2355/4/be/src/runtime/descriptors.cc
File be/src/runtime/descriptors.cc:

Line 41: `thrift_table
nit: missing "'"

http://gerrit.cloudera.org:8080/#/c/2355/4/common/thrift/CatalogObjects.thrift
File common/thrift/CatalogObjects.thrift:

Line 225: prefix
Maybe rename to prefix_ndx. Every time I see "prefix" I expect a string.

http://gerrit.cloudera.org:8080/#/c/2355/4/fe/src/main/java/com/cloudera/impala/catalog/HdfsTable.java
File fe/src/main/java/com/cloudera/impala/catalog/HdfsTable.java:

Line 180: locationPrefixDecompressor_
I think the name should indicate what this data structure stores (e.g. locationPrefixes_), not how it is used (to decompress locations).

Line 181: locationPrefixCompressor_
Same comment as above. It should be clear what the Strings and Integers represent. Also, please update the comment above.

Line 184: decompressLocationPrefix
How about getLocationPrefix()? Also add a comment on the expected behavior for invalid values of "i". For example: returns an empty string if "-1" is specified and throws a runtime exception when an invalid index is specified. Is the caller expected to handle this?

Line 193: compressLocationPrefix
Maybe storeLocationPrefix()?

Line 197: int index = locationPrefixDecompressor_.size() - 1;
        : locationPrefixCompressor_.put(s, index);
        : return index;
You can simply do:
return locationPrefixCompressor_.put(s, locationPrefixDecompressor_.size() - 1);

Line 202: , which represents a partition's location
        : // relative to its parent table.
I think this is no longer necessarily the case, no?

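To make the Line 184, 193 and 197 comments above concrete, here is a minimal sketch of a prefix table with the suggested get/store behavior. It is not the patch's code: the class name LocationPrefixTable and its field and method names are hypothetical, and it assumes the compressor field is a java.util.Map from String to Integer. Under that assumption, note that Map.put returns the previous value for the key (null for a new key), not the value just stored, so the one-line form suggested for Line 197 would not return the new index.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the two HdfsTable fields discussed in the review.
final class LocationPrefixTable {
  // Maps a prefix index to the prefix string.
  private final List<String> prefixes = new ArrayList<>();
  // Maps a prefix string back to its index in 'prefixes'.
  private final Map<String, Integer> prefixToIndex = new HashMap<>();

  // Returns the index of 'prefix', adding it first if it is not stored yet.
  int store(String prefix) {
    Integer existing = prefixToIndex.get(prefix);
    if (existing != null) return existing;
    prefixes.add(prefix);
    int index = prefixes.size() - 1;
    // Map.put returns the PREVIOUS mapping (null for a new key), so the
    // index is returned separately instead of returning put()'s result.
    prefixToIndex.put(prefix, index);
    return index;
  }

  // Returns "" for -1 (no prefix). Any other out-of-range index throws
  // IndexOutOfBoundsException, which the caller is expected to treat as a bug.
  String get(int index) {
    if (index == -1) return "";
    return prefixes.get(index);
  }
}
```

Storing the same prefix twice yields the same index, and documenting the -1 and out-of-range cases on get() answers the question of what the caller has to handle.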
Line 204: public class Location
Why is this class here instead of HdfsPartition?

Line 205: `prefix_`
nit: we typically use regular ticks for quoting variable names

Line 207: this.locationPrefixDecompressor.
locationPrefixDecompressor_

Line 208: // partition location.
You also need to comment on what happens if the partition location does not follow the expected directory structure (key=value, etc.).

Line 213: prefix_
prefix_ndx_

Line 223: thrift
Add a Preconditions.checkNotNull(thrift).

Line 231: public
@Override

Line 237: // Combine with some random constants (from uuidgen -r) to make Locations with
        : // identical 'suffix_'s but different 'prefix_'s hash to different values.
        : final long m = 0xc6bfaf3a929b49e1L;
        : final long a = 0xa9591152f59b46d7L;
        : long long_prefix = prefix_;
        : long_prefix = (long_prefix * m + a) >>> 32;
        : int prefix_hash = (int)long_prefix;
        : return suffix_.hashCode() ^ prefix_hash;
Can't you just use toString().hashCode()?

Line 247: equals() and compareTo()
Why don't you use toString().equals()|compareTo() directly?

Line 252: if (obj == null) return false;
This is handled by instanceof; you can remove it.

Line 268: String[] of size 2
Alternatively, you can return a Pair<String, String>.

Line 269: If the input does not have N '/
Why does the input need to have N '/'? test-warehouse/functional.db/alltypestiny/year=2010/month=10/ has more than 2 '/', where 2 is the number of partition columns. Unless I am misreading the comment...

Line 275: What is left is the prefix.
So, for test-warehouse/functional.db/dimitris/is/great/partition_dir_for_alltypestiny/, if the table has 2 partition columns, the prefix is test-warehouse/functional.db/dimitris/is/?
Doesn't look right.

Line 595: public int getNumPartitionKeys() { return nullPartitionIds_.size(); }
Table.getNumClusteringCols()

Line 1565: new ArrayList<String>
Lists.newArrayList(hdfsTable.getPartition_prefixes());

http://gerrit.cloudera.org:8080/#/c/2355/4/testdata/workloads/functional-query/queries/QueryTest/alter-table.test
File testdata/workloads/functional-query/queries/QueryTest/alter-table.test:

Line 845: ====
Can we create/expand the test where we:
1. create a partitioned table with some partitions under the table dir and some under a different path
2. add some data
3. ensure they are accessible by querying the table
4. show partitions
5. do a refresh <table>
6. query table
7. show partitions
8. alter partition/table location
9. query table

--
To view, visit http://gerrit.cloudera.org:8080/2355
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8c67b6ce0f83de2f5277a528a9ce67e47d638adb
Gerrit-PatchSet: 4
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Jim Apple <[email protected]>
Gerrit-Reviewer: Dimitris Tsirogiannis <[email protected]>
Gerrit-Reviewer: Jim Apple <[email protected]>
Gerrit-Reviewer: Marcel Kornacker <[email protected]>
Gerrit-Reviewer: Sailesh Mukil <[email protected]>
Gerrit-HasComments: Yes
