Jim Apple has posted comments on this change.

Change subject: IMPALA-2840: Don't store table location in partition location
......................................................................


Patch Set 4:

(25 comments)

http://gerrit.cloudera.org:8080/#/c/2355/4//COMMIT_MSG
Commit Message:

Line 11: we can compress it down to one
       : bit in the common case.
> Is this still relevant?
Expanded and fixed.


Line 14: TODO: Since each partition stores the literal values for the
       : partitioning columns, we could also elide the column names.
> If we don't already have one, can you file a JIRA for that task?
IMPALA-3198


http://gerrit.cloudera.org:8080/#/c/2355/4/be/src/runtime/descriptors.cc
File be/src/runtime/descriptors.cc:

Line 41: `thrift_table
> nit: missing "'"
Done


http://gerrit.cloudera.org:8080/#/c/2355/4/common/thrift/CatalogObjects.thrift
File common/thrift/CatalogObjects.thrift:

Line 225: prefix
> maybe rename to prefix_ndx. Every time I see "prefix" I expect a string.
Added "_index".


http://gerrit.cloudera.org:8080/#/c/2355/4/fe/src/main/java/com/cloudera/impala/catalog/HdfsTable.java
File fe/src/main/java/com/cloudera/impala/catalog/HdfsTable.java:

Line 180: locationPrefixDecompressor_
> I think the name should indicate what this data structure stores (e.g. loca
I have changed the names to indexToPrefix_ and prefixToIndex_.


Line 181: ocationPrefixCompressor_ 
> Same comment as above. It should be clear what String and Integers represen
See above.


Line 184: decompressLocationPrefix
> how about getLocationPrefix()? Also add a comment on what is the expected b
Changed the name; added comment.


Line 193: compressLocationPrefix
> Maybe storeLocationPrefix()?
Changed to represent the map-like nature.


Line 197: int index = locationPrefixDecompressor_.size() - 1;
        :     locationPrefixCompressor_.put(s, index);
        :     return index;
> you can simply do:
That actually returns the old value. (This was the bug I talked to you about in person.)
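
For illustration, a minimal sketch of the pitfall, not the code in the patch: java.util.Map.put() returns the value previously associated with the key (null if there was none), so returning its result for a new prefix would not yield the freshly assigned index. The class and method names below (PrefixIndexSketch, getOrAddIndex, getPrefix) are hypothetical stand-ins for the two structures discussed above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PrefixIndexSketch {
  // Hypothetical stand-ins for the index-to-prefix list and prefix-to-index map.
  private final List<String> indexToPrefix = new ArrayList<>();
  private final Map<String, Integer> prefixToIndex = new HashMap<>();

  // Returns the index for 'prefix', assigning a new one if it has not been seen.
  public int getOrAddIndex(String prefix) {
    Integer existing = prefixToIndex.get(prefix);
    if (existing != null) return existing;
    indexToPrefix.add(prefix);
    int index = indexToPrefix.size() - 1;
    // put() hands back the PREVIOUS mapping (null for a new key), not 'index',
    // so "return prefixToIndex.put(prefix, index);" would be wrong here.
    prefixToIndex.put(prefix, index);
    return index;
  }

  // Inverse lookup: the prefix stored under 'index'.
  public String getPrefix(int index) {
    return indexToPrefix.get(index);
  }
}
```

Returning the result of put() directly would unbox null and throw a NullPointerException the first time a prefix is stored.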


Line 202: , which represents a partition's location
        :   // relative to its parent table.
> I think this is no longer necessarily the case, no?
Clarified.


Line 204: public class Location
> An HdfsPartition still has access to the parent table. So instead of access
I made it a util class.


Line 205: `prefix_`
> nit: we typically use regular ticks for quoting variable names
Done


Line 207: this.locationPrefixDecompressor.
> locationPrefixDecompresson_
Done


Line 213: prefix_
> prefix_ndx_
Used "index" for consistency with the rest of the code.


Line 223: thrift
> Add a Preconditions.checkNotNull(thrift).
Done


Line 231: public
> @Override
Done


Line 237: // Combine with some random constants (from uuidgen -r) to make Locations with
        :       // identical 'suffix_'s but different 'prefix_'s hash to different values.
        :       final long m = 0xc6bfaf3a929b49e1L;
        :       final long a = 0xa9591152f59b46d7L;
        :       long long_prefix = prefix_;
        :       long_prefix = (long_prefix * m + a) >>> 32;
        :       int prefix_hash = (int)long_prefix;
        :       return suffix_.hashCode() ^ prefix_hash;
> I would say yes because it simplifies the code. I don't believe the overhea
Done


Line 247: equals() and compareTo()
> Why don't you use the toString().equals()|compareTo() directly?
Done


Line 252: if (obj == null) return false;
> this is handled by instanceof, you can remove it
Done
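
As a small illustration of the point above (a sketch with a hypothetical class, not the Location class from the patch): in Java, "null instanceof T" evaluates to false, so an equals() that starts with an instanceof test needs no separate null check.

```java
public class InstanceofNullSketch {
  private final String value;

  public InstanceofNullSketch(String value) {
    this.value = value;
  }

  @Override
  public boolean equals(Object obj) {
    // 'null instanceof InstanceofNullSketch' is false, so null is rejected here.
    if (!(obj instanceof InstanceofNullSketch)) return false;
    return value.equals(((InstanceofNullSketch) obj).value);
  }

  @Override
  public int hashCode() {
    // Kept consistent with equals(): equal objects have equal hash codes.
    return value.hashCode();
  }
}
```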


Line 268: String[] of size 2 
> alternatively you can return a Pair<String, String>
Done


Line 269: If the input does not have N '/
> Why does the input need to have N '/'? test-warehouse/functional.db/alltype
Clarified.


Line 275: What is left is the prefix.
> Sorry if it wasn't clear. What I had in mind was detecting N key=value pair
We talked about this out-of-band, and didn't really reach consensus. I prefer 
it this way because it requires less code to achieve better compression 
(assuming that we may have locations like /foo/bar=1/baz=2 and 
/foo/noncanonical1 and /foo/noncanonical2). Dimitris finds it hacky.
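
To make the trade-off concrete, here is a rough sketch of splitting purely by position, without inspecting key=value pairs. It is not the code under review: the class and method names are hypothetical, and the handling of inputs with fewer than n '/' characters (the whole input becomes the suffix) is an assumption made only for this example.

```java
public class LocationSplitSketch {
  // Splits 'location' so that the suffix holds its last 'n' path components.
  // Returns {prefix, suffix}; prefix + suffix always equals the input.
  // Assumption for this sketch: if the input has fewer than 'n' slashes,
  // the prefix is empty and the whole input is the suffix.
  public static String[] split(String location, int n) {
    int cut = location.length();
    for (int i = 0; i < n; ++i) {
      // Search backwards for the next '/' strictly before the current cut.
      cut = location.lastIndexOf('/', cut - 1);
      if (cut < 0) return new String[] {"", location};
    }
    return new String[] {location.substring(0, cut), location.substring(cut)};
  }
}
```

With n = 2, "/foo/bar=1/baz=2" splits into the prefix "/foo" and the suffix "/bar=1/baz=2"; the split never looks at whether the components are canonical key=value pairs.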


Line 595: public int getNumPartitionKeys() { return nullPartitionIds_.size(); }
> Table.getNumClusteringCols()
Now use numClusteringCols_ directly.


Line 1565: new ArrayList<String>
> Lists.newArrayList(hdfsTable.getPartition_prefixes());
Removed.


http://gerrit.cloudera.org:8080/#/c/2355/4/testdata/workloads/functional-query/queries/QueryTest/alter-table.test
File testdata/workloads/functional-query/queries/QueryTest/alter-table.test:

Line 845: ====
> Can we create/expand the test where:
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/2355
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8c67b6ce0f83de2f5277a528a9ce67e47d638adb
Gerrit-PatchSet: 4
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Jim Apple <[email protected]>
Gerrit-Reviewer: Dimitris Tsirogiannis <[email protected]>
Gerrit-Reviewer: Jim Apple <[email protected]>
Gerrit-Reviewer: Marcel Kornacker <[email protected]>
Gerrit-Reviewer: Sailesh Mukil <[email protected]>
Gerrit-HasComments: Yes
