Dimitris Tsirogiannis has posted comments on this change.

Change subject: IMPALA-2840: Don't store table location in partition location
......................................................................


Patch Set 5:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/2355/5/be/src/runtime/descriptors.cc
File be/src/runtime/descriptors.cc:

Line 38: namespace
Out of curiosity, why don't you put this function under impala namespace?


Line 41: // format that references the 'partition_prefixes' of 'thrift_table'.
Can you also comment on the return value when location isn't set?


http://gerrit.cloudera.org:8080/#/c/2355/5/common/thrift/CatalogObjects.thrift
File common/thrift/CatalogObjects.thrift:

Line 223: HDFS partition's location
There is no reason why we can't use this to compress other locations as well. 
For example: hdfs://host_name/my_database/table1 and 
hdfs://host_name/my_database/tabl2 share a pretty long prefix. Maybe leave a 
TODO somewhere to generalize this functionality.


http://gerrit.cloudera.org:8080/#/c/2355/5/fe/src/main/java/com/cloudera/impala/catalog/HdfsPartitionLocationCompressor.java
File 
fe/src/main/java/com/cloudera/impala/catalog/HdfsPartitionLocationCompressor.java:

Line 28: This class is not thread-safe. Clients of this class need to protect 
against
       :  * concurrent updates using external locking.
       :  *
       :  * Owned by HdfsTable instance.
Since this is owned by HdfsTable and the catalog uses table-level locking, we 
don't need to synchronize accesses to instances of this class. Can you update 
the comment?


Line 35: numClusteringColumns_
This doesn't seem to be used anywhere besides getter and setter. Why storing it 
in this class?


Line 39: indexToPrefix
This is a bit misleading as it sort of indicates that this is a map. Maybe call 
it prefixes_ instead?


Line 88: public class Location
How about the following idea for organizing this code:
1. HdfsPartitionLocationCompressor has a compress method that takes a string 
(partition location) and returns a Pair<Integer, String> representing a 
prefix-suffix pair. 
2. It also has a decompress method that converts a Pair<Integer, String> into a 
partition location string. 
3. Location class is deleted. Each partition can simply store the prefix-suffix 
pair. HdfsPartition has two util methods compress/decompress that simply 
delegate the work to the parent table object.


-- 
To view, visit http://gerrit.cloudera.org:8080/2355
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8c67b6ce0f83de2f5277a528a9ce67e47d638adb
Gerrit-PatchSet: 5
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Jim Apple <[email protected]>
Gerrit-Reviewer: Dimitris Tsirogiannis <[email protected]>
Gerrit-Reviewer: Jim Apple <[email protected]>
Gerrit-Reviewer: Marcel Kornacker <[email protected]>
Gerrit-Reviewer: Sailesh Mukil <[email protected]>
Gerrit-HasComments: Yes

Reply via email to