Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-2840: Don't store table location in partition location
......................................................................

IMPALA-2840: Don't store table location in partition location

For a table with location "ABC", most partitions will have locations
like "ABC/DEF=2". The "ABC" part of the location does not need to be
stored in Catalog for each partition; we can compress it down to one
int in the common case.

This is done by stripping from each partition location the last N
directories (where N is the number of clustering columns) and storing
the resulting string in a cache of partition location prefixes. In the
cache, this location prefix string is mapped to an int. Partition
locations are then stored as a tuple consisting of that int and a
suffix string; the partition location can be reconstructed as the
concatenation of the prefix string (from the cache) and the suffix.

Though this scheme was designed in the expectation that most
partitions will be stored in directories like
"/part_col_1=1.23/part_col_2=234/", it works even when that is not the
case.

TODO: Since each partition stores the literal values for the
partitioning columns, we could also elide the column names and values
when partitions are placed in directories like
"/part_col_1=1.23/part_col_2=234/"

Change-Id: I8c67b6ce0f83de2f5277a528a9ce67e47d638adb
Reviewed-on: http://gerrit.cloudera.org:8080/2355
Reviewed-by: Jim Apple <[email protected]>
Tested-by: Internal Jenkins
---
M be/src/runtime/descriptors.cc
M be/src/runtime/descriptors.h
M common/thrift/CatalogObjects.thrift
M fe/src/main/java/com/cloudera/impala/analysis/LoadDataStmt.java
M fe/src/main/java/com/cloudera/impala/catalog/HdfsPartition.java
A fe/src/main/java/com/cloudera/impala/catalog/HdfsPartitionLocationCompressor.java
M fe/src/main/java/com/cloudera/impala/catalog/HdfsTable.java
M fe/src/main/java/com/cloudera/impala/util/ListMap.java
M fe/src/test/java/com/cloudera/impala/planner/PlannerTestBase.java
M testdata/workloads/functional-query/queries/QueryTest/alter-table.test
M tests/metadata/test_ddl.py
M tests/metadata/test_hdfs_encryption.py
12 files changed, 424 insertions(+), 54 deletions(-)

Approvals:
  Jim Apple: Looks good to me, approved
  Internal Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/2355
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I8c67b6ce0f83de2f5277a528a9ce67e47d638adb
Gerrit-PatchSet: 11
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Jim Apple <[email protected]>
Gerrit-Reviewer: Dimitris Tsirogiannis <[email protected]>
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Jim Apple <[email protected]>
Gerrit-Reviewer: Marcel Kornacker <[email protected]>
Gerrit-Reviewer: Mostafa Mokhtar <[email protected]>
Gerrit-Reviewer: Sailesh Mukil <[email protected]>
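
For illustration only: the prefix/suffix scheme described in the commit
message can be sketched roughly as below. This is not the code from
HdfsPartitionLocationCompressor.java; the class name LocationPrefixSketch,
the nested Compressed type, and the compress/expand methods are all
hypothetical, and the sketch assumes locations are plain strings using '/'
as the directory separator.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the scheme in the commit message; not Impala's class.
class LocationPrefixSketch {
    // Cache of location prefixes: index -> prefix and prefix -> index.
    private final List<String> prefixes = new ArrayList<>();
    private final Map<String, Integer> indexOf = new HashMap<>();
    private final int numClusteringCols;

    LocationPrefixSketch(int numClusteringCols) {
        this.numClusteringCols = numClusteringCols;
    }

    // A partition location stored as (int for the prefix, suffix string).
    static final class Compressed {
        final int prefixIndex;
        final String suffix;

        Compressed(int prefixIndex, String suffix) {
            this.prefixIndex = prefixIndex;
            this.suffix = suffix;
        }
    }

    // Strip the last N directories (N = number of clustering columns),
    // intern the remaining prefix in the cache, and keep the rest as suffix.
    Compressed compress(String location) {
        int cut = location.length();
        for (int i = 0; i < numClusteringCols && cut > 0; i++) {
            // Falls back to 0 (empty prefix) if there are fewer than N slashes.
            cut = Math.max(location.lastIndexOf('/', cut - 1), 0);
        }
        String prefix = location.substring(0, cut);
        String suffix = location.substring(cut);
        Integer index = indexOf.get(prefix);
        if (index == null) {
            index = prefixes.size();
            prefixes.add(prefix);
            indexOf.put(prefix, index);
        }
        return new Compressed(index, suffix);
    }

    // Reconstruct the location as cached prefix + suffix.
    String expand(Compressed c) {
        return prefixes.get(c.prefixIndex) + c.suffix;
    }

    int cachedPrefixCount() {
        return prefixes.size();
    }

    public static void main(String[] args) {
        LocationPrefixSketch cache = new LocationPrefixSketch(1);
        Compressed c = cache.compress("ABC/DEF=2");
        System.out.println(c.prefixIndex + " + " + c.suffix);
        System.out.println(cache.expand(c));
    }
}
```

Because the split is purely positional, a location that does not follow
the "col=value" directory layout still round-trips; it simply may not share
a prefix with its sibling partitions, so it saves less.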
