Jim Apple has posted comments on this change.

Change subject: IMPALA-2840: Don't store table location in partition location
......................................................................

Patch Set 4:

(25 comments)

http://gerrit.cloudera.org:8080/#/c/2355/4//COMMIT_MSG
Commit Message:

Line 11: we can compress it down to one
:     bit in the common case.
> Is this still relevant?
Expanded and fixed.

Line 14: TODO: Since each partition stores the literal values for the
:     partitioning columns, we could also elide the column names.
> If we don't already have one, can you file a JIRA for that task?
IMPALA-3198

http://gerrit.cloudera.org:8080/#/c/2355/4/be/src/runtime/descriptors.cc
File be/src/runtime/descriptors.cc:

Line 41: `thrift_table
> nit: missing "'"
Done

http://gerrit.cloudera.org:8080/#/c/2355/4/common/thrift/CatalogObjects.thrift
File common/thrift/CatalogObjects.thrift:

Line 225: prefix
> maybe rename to prefix_ndx. Every time I see "prefix" I expect a string.
Added "_index".

http://gerrit.cloudera.org:8080/#/c/2355/4/fe/src/main/java/com/cloudera/impala/catalog/HdfsTable.java
File fe/src/main/java/com/cloudera/impala/catalog/HdfsTable.java:

Line 180: locationPrefixDecompressor_
> I think the name should indicate what this data structure stores (e.g. loca
I have changed the names to indexToPrefix_ and prefixToIndex_.

Line 181: ocationPrefixCompressor_
> Same comment as above. It should be clear what String and Integers represen
See above.

Line 184: decompressLocationPrefix
> how about getLocationPrefix()? Also add a comment on what is the expected b
Changed the name; added comment.

Line 193: compressLocationPrefix
> Maybe storeLocationPrefix()?
Changed to represent the map-like nature.

Line 197: int index = locationPrefixDecompressor_.size() - 1;
:     locationPrefixCompressor_.put(s, index);
:     return index;
> you can simply do:
That actually returns the old value. (This was the bug I talked to you about in person.)

Line 202: , which represents a partition's location
:     // relative to its parent table.
> I think this is no longer necessarily the case, no?
Clarified.

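On the Line 197 exchange above: java.util.Map.put() returns the value previously associated with the key, not the value just stored, so returning its result directly would hand back the old mapping (or null for a new key). Below is a minimal sketch of the idea. The class PrefixIndex and its method names are hypothetical stand-ins for the indexToPrefix_ / prefixToIndex_ pair mentioned above, not the code in the patch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration only; not the HdfsTable implementation. */
final class PrefixIndex {
  // Position in the list is the index assigned to each prefix.
  private final List<String> indexToPrefix = new ArrayList<>();
  private final Map<String, Integer> prefixToIndex = new HashMap<>();

  /** Returns the index for 'prefix', assigning the next free index if it is new. */
  int getIndex(String prefix) {
    Integer existing = prefixToIndex.get(prefix);
    if (existing != null) return existing;
    indexToPrefix.add(prefix);
    int index = indexToPrefix.size() - 1;
    // Map.put() returns the value previously mapped to the key (null here),
    // so "return prefixToIndex.put(prefix, index);" would not return 'index'.
    prefixToIndex.put(prefix, index);
    return index;
  }

  /** Returns the prefix stored at 'index'. */
  String getPrefix(int index) {
    return indexToPrefix.get(index);
  }
}
```

Keeping the put() and the return as separate statements costs one extra line and avoids the old-value pitfall.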
Line 204: public class Location
> An HdfsPartition still has access to the parent table. So instead of access
I made it a util class.

Line 205: `prefix_`
> nit: we typically use regular ticks for quoting variable names
Done

Line 207: this.locationPrefixDecompressor.
> locationPrefixDecompresson_
Done

Line 213: prefix_
> prefix_ndx_
Used "index" for consistency with the rest of the code.

Line 223: thrift
> Add a Preconditions.checkNotNull(thrift).
Done

Line 231: public
> @Override
Done

Line 237: // Combine with some random constants (from uuidgen -r) to make Locations with
:     // identical 'suffix_'s but different 'prefix_'s hash to different values.
:     final long m = 0xc6bfaf3a929b49e1L;
:     final long a = 0xa9591152f59b46d7L;
:     long long_prefix = prefix_;
:     long_prefix = (long_prefix * m + a) >>> 32;
:     int prefix_hash = (int)long_prefix;
:     return suffix_.hashCode() ^ prefix_hash;
> I would say yes because it simplifies the code. I don't believe the overhea
Done

Line 247: equals() and compareTo()
> Why don't you use the toString().equals()|compareTo() directly?
Done

Line 252: if (obj == null) return false;
> this is handled by instanceof, you can remove it
Done

Line 268: String[] of size 2
> alternatively you can return a Pair<String, String>
Done

Line 269: If the input does not have N '/
> Why does the input need to have N '/'? test-warehouse/functional.db/alltype
Clarified.

Line 275: What is left is the prefix.
> Sorry if it wasn't clear. What I had in mind was detecting N key=value pair
We talked about this out-of-band and didn't really reach consensus. I prefer it this way because it requires less code to achieve better compression (assuming that we may have locations like /foo/bar=1/baz=2 and /foo/noncanonical1 and /foo/noncanonical2). Dimitris finds it hacky.

Line 595: public int getNumPartitionKeys() { return nullPartitionIds_.size(); }
> Table.getNumClusteringCols()
Now use numClusteringCols_ directly.

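For the Line 269 and Line 275 discussion above, here is a rough sketch of splitting a location into a prefix and a suffix by counting '/' from the end, without checking for key=value components. LocationSplitter and split() are hypothetical names, the String[] return is used only to keep the example self-contained, and the fallback for inputs with fewer than n '/' (empty prefix, whole string as suffix) is an assumption; the patch may handle that case differently.

```java
/** Hypothetical illustration only; not the code under review. */
final class LocationSplitter {
  /**
   * Splits 'location' just before the n-th '/' counted from the end.
   * Returns {prefix, suffix} with prefix + suffix equal to 'location'.
   * Assumption: with fewer than n '/', the prefix is empty.
   */
  static String[] split(String location, int n) {
    int pos = location.length();
    for (int i = 0; i < n; ++i) {
      // Search backwards, starting just before the previously found '/'.
      pos = location.lastIndexOf('/', pos - 1);
      if (pos < 0) return new String[] {"", location};
    }
    return new String[] {location.substring(0, pos), location.substring(pos)};
  }
}
```

With this sketch, split("/foo/bar=1/baz=2", 2) yields the prefix "/foo" and the suffix "/bar=1/baz=2".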
Line 1565: new ArrayList<String>
> Lists.newArrayList(hdfsTable.getPartition_prefixes());
Removed.

http://gerrit.cloudera.org:8080/#/c/2355/4/testdata/workloads/functional-query/queries/QueryTest/alter-table.test
File testdata/workloads/functional-query/queries/QueryTest/alter-table.test:

Line 845: ====
> Can we create/expand the test where:
Done

--
To view, visit http://gerrit.cloudera.org:8080/2355
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8c67b6ce0f83de2f5277a528a9ce67e47d638adb
Gerrit-PatchSet: 4
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Jim Apple <[email protected]>
Gerrit-Reviewer: Dimitris Tsirogiannis <[email protected]>
Gerrit-Reviewer: Jim Apple <[email protected]>
Gerrit-Reviewer: Marcel Kornacker <[email protected]>
Gerrit-Reviewer: Sailesh Mukil <[email protected]>
Gerrit-HasComments: Yes
