Dimitris Tsirogiannis has posted comments on this change.

Change subject: IMPALA-2840: Don't store table location in partition location
......................................................................


Patch Set 4:

(26 comments)

http://gerrit.cloudera.org:8080/#/c/2355/4//COMMIT_MSG
Commit Message:

Line 11: we can compress it down to one
       : bit in the common case.
Is this still relevant?


Line 14: TODO: Since each partition stores the literal values for the
       : partitioning columns, we could also elide the column names.
If we don't already have one, can you file a JIRA for that task?


http://gerrit.cloudera.org:8080/#/c/2355/4/be/src/runtime/descriptors.cc
File be/src/runtime/descriptors.cc:

Line 41: `thrift_table
nit: missing "'"


http://gerrit.cloudera.org:8080/#/c/2355/4/common/thrift/CatalogObjects.thrift
File common/thrift/CatalogObjects.thrift:

Line 225: prefix
maybe rename to prefix_ndx. Every time I see "prefix" I expect a string.


http://gerrit.cloudera.org:8080/#/c/2355/4/fe/src/main/java/com/cloudera/impala/catalog/HdfsTable.java
File fe/src/main/java/com/cloudera/impala/catalog/HdfsTable.java:

Line 180: locationPrefixDecompressor_
I think the name should indicate what this data structure stores (e.g. 
locationPrefixes_) not how it is used (to decompress locations).


Line 181: ocationPrefixCompressor_ 
Same comment as above. It should be clear what String and Integers represent. 
Also, please update the comment above.


Line 184: decompressLocationPrefix
How about getLocationPrefix()? Also add a comment describing the expected
behavior for invalid values of "i". For example: returns an empty string if
"-1" is specified and throws a runtime exception when any other invalid index
is specified. Is the caller expected to handle this?
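
A minimal sketch of the contract described above, in case it helps pin down the
comment. The class name LocationPrefixes, the NO_PREFIX sentinel of -1, and the
choice of IllegalArgumentException are assumptions for illustration, not what
the patch does:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical holder for a table's distinct location prefixes.
final class LocationPrefixes {
  // Assumed sentinel: a partition with no shared prefix stores -1.
  static final int NO_PREFIX = -1;

  private final List<String> prefixes_ = new ArrayList<>();

  // Appends 'prefix' and returns the index it was stored at.
  int add(String prefix) {
    prefixes_.add(prefix);
    return prefixes_.size() - 1;
  }

  /**
   * Returns the prefix stored at index 'i'. Returns the empty string if 'i' is
   * NO_PREFIX. Throws IllegalArgumentException for any other out-of-range
   * index; callers are not expected to recover from that.
   */
  String getLocationPrefix(int i) {
    if (i == NO_PREFIX) return "";
    if (i < 0 || i >= prefixes_.size()) {
      throw new IllegalArgumentException("Invalid location prefix index: " + i);
    }
    return prefixes_.get(i);
  }
}
```

Whatever behavior is chosen, stating it in the method comment lets callers know
whether they need to guard the call.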


Line 193: compressLocationPrefix
Maybe storeLocationPrefix()?


Line 197: int index = locationPrefixDecompressor_.size() - 1;
        :     locationPrefixCompressor_.put(s, index);
        :     return index;
you can simply do:
return locationPrefixCompressor_.put(s, locationPrefixDecompressor_.size() - 1);
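
One thing to double-check before collapsing these lines, assuming
locationPrefixCompressor_ is a java.util.Map<String, Integer> (its declaration
is not quoted here): Map.put() returns the value previously mapped to the key,
or null if there was none, not the value just stored. A small standalone
demonstration with hypothetical names:

```java
import java.util.HashMap;
import java.util.Map;

final class PutReturnValue {
  // Returns whatever Map.put() hands back when 'key' is stored in 'map'.
  static Integer putAndReport(Map<String, Integer> map, String key, int index) {
    return map.put(key, index);
  }

  public static void main(String[] args) {
    Map<String, Integer> prefixToIndex = new HashMap<>();
    // First insertion: there is no previous mapping, so put() returns null.
    System.out.println(putAndReport(prefixToIndex, "/warehouse/t/", 0));
    // Same key again: put() returns the old value (0), not the new one (7).
    System.out.println(putAndReport(prefixToIndex, "/warehouse/t/", 7));
  }
}
```

If the enclosing method returns a primitive int, unboxing the null from a first
insertion would throw a NullPointerException, so the three-line form may need
to stay as it is.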


Line 202: , which represents a partition's location
        :   // relative to its parent table.
I think this is no longer necessarily the case, no?


Line 204: public class Location
Why is this class here instead of HdfsPartition?


Line 205: `prefix_`
nit: we typically use regular ticks for quoting variable names


Line 207: this.locationPrefixDecompressor.
locationPrefixDecompressor_


Line 208:     // partition location.
You also need to comment on what happens if the partition location does not
follow the expected directory structure (key=value, etc.)


Line 213: prefix_
prefix_ndx_


Line 223: thrift
Add a Preconditions.checkNotNull(thrift).


Line 231: public
@Override


Line 237: // Combine with some random constants (from uuidgen -r) to make Locations with
        :       // identical 'suffix_'s but different 'prefix_'s hash to different values.
        :       final long m = 0xc6bfaf3a929b49e1L;
        :       final long a = 0xa9591152f59b46d7L;
        :       long long_prefix = prefix_;
        :       long_prefix = (long_prefix * m + a) >>> 32;
        :       int prefix_hash = (int)long_prefix;
        :       return suffix_.hashCode() ^ prefix_hash;
can't you just use toString().hashCode()?


Line 247: equals() and compareTo()
Why don't you use the toString().equals()|compareTo() directly?


Line 252: if (obj == null) return false;
this is handled by instanceof, you can remove it
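
To make the last three comments concrete, here is a rough sketch of what
delegating to toString() could look like. LocationSketch is a hypothetical
stand-in for the patch's Location class; it holds the resolved prefix string
directly rather than an index, which is an assumption made to keep the example
standalone:

```java
// Hypothetical stand-in for Location; field names mirror the quoted code.
final class LocationSketch implements Comparable<LocationSketch> {
  private final String prefix_;
  private final String suffix_;

  LocationSketch(String prefix, String suffix) {
    prefix_ = prefix;
    suffix_ = suffix;
  }

  // The full location is the prefix followed by the suffix.
  @Override
  public String toString() { return prefix_ + suffix_; }

  @Override
  public int hashCode() { return toString().hashCode(); }

  @Override
  public boolean equals(Object obj) {
    // 'null instanceof T' evaluates to false, so no separate null check.
    if (!(obj instanceof LocationSketch)) return false;
    return toString().equals(obj.toString());
  }

  @Override
  public int compareTo(LocationSketch other) {
    return toString().compareTo(other.toString());
  }
}
```

The trade-off is that every call builds the full string, which may or may not
matter depending on how often these methods run.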


Line 268: String[] of size 2 
alternatively you can return a Pair<String, String>


Line 269: If the input does not have N '/
Why does the input need to have N '/'? 
test-warehouse/functional.db/alltypestiny/year=2010/month=10/ has more than 2 
'/', where 2 is the number of partition columns. Unless I am misreading the 
comment...


Line 275: What is left is the prefix.
So, for 
test-warehouse/functional.db/dimitris/is/great/partition_dir_for_alltypestiny/, 
if the table has 2 partition columns, prefix is 
test-warehouse/functional.db/dimitris/is/? Doesn't look right
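
To illustrate the concern, here is a sketch of one reading of the rule in the
comment at Line 275: strip the last N path components, where N is the number of
partition columns, and treat what is left as the prefix. This is my
interpretation of the comment, not the patch's code; Pair and splitLocation are
hypothetical names:

```java
// Minimal pair type, standing in for whatever Pair class the codebase uses.
final class Pair<A, B> {
  final A first;
  final B second;

  Pair(A first, B second) {
    this.first = first;
    this.second = second;
  }
}

final class LocationSplitter {
  // Splits 'location' into (prefix, suffix), where the suffix is the last
  // 'numPartitionCols' path components. A single trailing '/' is ignored when
  // counting components. If there are too few '/' characters, the prefix is
  // empty and the whole location becomes the suffix.
  static Pair<String, String> splitLocation(String location, int numPartitionCols) {
    int pos = location.endsWith("/") ? location.length() - 1 : location.length();
    for (int i = 0; i < numPartitionCols; i++) {
      pos = location.lastIndexOf('/', pos - 1);
      if (pos < 0) return new Pair<>("", location);
    }
    return new Pair<>(location.substring(0, pos + 1), location.substring(pos + 1));
  }
}
```

Under this reading, a partition in the default layout splits cleanly at the
table directory, while the path above, which has no key=value directories,
splits at an arbitrary ancestor, which is the case I am asking about.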


Line 595: public int getNumPartitionKeys() { return nullPartitionIds_.size(); }
Table.getNumClusteringCols()


Line 1565: new ArrayList<String>
Lists.newArrayList(hdfsTable.getPartition_prefixes());


http://gerrit.cloudera.org:8080/#/c/2355/4/testdata/workloads/functional-query/queries/QueryTest/alter-table.test
File testdata/workloads/functional-query/queries/QueryTest/alter-table.test:

Line 845: ====
Can we create/expand the test where we:
1. create a partitioned table with some partitions under the table dir and
some under a different path
2. add some data
3. ensure they are accessible by querying the table
4. show partitions
5. do a refresh <table>
6. query table
7. show partitions
8. alter partition/table location
9. query table


-- 
To view, visit http://gerrit.cloudera.org:8080/2355
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8c67b6ce0f83de2f5277a528a9ce67e47d638adb
Gerrit-PatchSet: 4
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Jim Apple <[email protected]>
Gerrit-Reviewer: Dimitris Tsirogiannis <[email protected]>
Gerrit-Reviewer: Jim Apple <[email protected]>
Gerrit-Reviewer: Marcel Kornacker <[email protected]>
Gerrit-Reviewer: Sailesh Mukil <[email protected]>
Gerrit-HasComments: Yes
