Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/10713 )
Change subject: IMPALA-7140 (part 3): load partitions for FS tables ...................................................................... Patch Set 2: (16 comments) http://gerrit.cloudera.org:8080/#/c/10713/2/fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java File fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java: http://gerrit.cloudera.org:8080/#/c/10713/2/fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java@171 PS2, Line 171: for (String partitionKey: hmsPartitionValues) > this'll allow for a subset of partitioning columns to be returned for table OK, I'll add a Preconditions check that the length of this list matches the number of clustering columns for the table. http://gerrit.cloudera.org:8080/#/c/10713/2/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java: http://gerrit.cloudera.org:8080/#/c/10713/2/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@546 PS2, Line 546: // only to end up throwing away the result. > yikes... yes, just ran across this today as well. yea, luckily this only seems to be called from ComputeStatsStmt analysis, and computing stats is already slow, so the overhead probably isn't disastrous http://gerrit.cloudera.org:8080/#/c/10713/2/fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java File fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java: http://gerrit.cloudera.org:8080/#/c/10713/2/fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java@132 PS2, Line 132: Warehouse.makeSpecFromName > something seems off here-- we're going back to hms for partitioning informa Good point. I think the most straight-forward would be to just make this API in metaprovider take a List<String> partitionColumnNames. It's a little odd feeling from a clean API perspective, but maybe better than backing the partition column names out from the first partition name? I think that's preferable to extracting out SchemaInfo as more than an implementation detail of the FsTable class hierarchy, since keeping the API arguments here minimal makes it easier to test, and unit-testing at the level of MetaProvider will probably make sense once we add more complex implementations (eg a caching one). That said, I could be convinced otherwise. http://gerrit.cloudera.org:8080/#/c/10713/2/fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java@143 PS2, Line 143: > nit: consistent spacing Done http://gerrit.cloudera.org:8080/#/c/10713/2/fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java@148 PS2, Line 148: name > how does name correspond to the partition name that's passed in? I assume i Yea, it should. Do you think it's worth adding an assertion here that ret.keySet() is a subset of ImmutableSet.of(partitionNames)? http://gerrit.cloudera.org:8080/#/c/10713/2/fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java File fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java: http://gerrit.cloudera.org:8080/#/c/10713/2/fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java@122 PS2, Line 122: // TODO Auto-generated catch block > remove Done http://gerrit.cloudera.org:8080/#/c/10713/2/fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java@144 PS2, Line 144: // TODO(todd): copy-paste from HdfsPartition > mark it as expensive (like in the other place) Done http://gerrit.cloudera.org:8080/#/c/10713/2/fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java File fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java: http://gerrit.cloudera.org:8080/#/c/10713/2/fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java@63 PS2, Line 63: /** > nit: newline Done http://gerrit.cloudera.org:8080/#/c/10713/2/fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java@183 PS2, Line 183: List<String> names = Lists.newArrayList(); > is loadPartitionSpecs() missing? ah, I'll add a Preconditions.checkState here that the partition specs are loaded. Given that ids are identifiers local to a given FsTable instance, it wouldn't really make sense to call this without having already retrieved partition specs to get their IDs. http://gerrit.cloudera.org:8080/#/c/10713/2/fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java@198 PS2, Line 198: > nit: remove empty line Done http://gerrit.cloudera.org:8080/#/c/10713/2/fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java@214 PS2, Line 214: if (spec.getName().isEmpty()) continue; > when does this happen? ah, I think never :) removed. http://gerrit.cloudera.org:8080/#/c/10713/2/fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java@279 PS2, Line 279: if (getNumClusteringCols() == 0) { > perhaps this is modeled better as two classes, unpartitioned and partitione I can see the appeal but seems like almost all the rest of the code is the same, and I'm hesitant to introduce an inheritance relationship like 'UnpartitionedFsTable <- PartitionedFsTable' or vice versa. I could keep these methods abstract and then have 'UnpartitionedFsTable' and 'PartitionedFsTable' both inherit from 'AbstractFsTable', but my personal taste is that deep implementation inheritance hierarchies become hard to understand (see Fragile Base Class Problem for example). If you feel strongly about it I'll give it a go, but otherwise I'm inclined to keep as is http://gerrit.cloudera.org:8080/#/c/10713/2/fe/src/main/java/org/apache/impala/catalog/local/LocalPartitionSpec.java File fe/src/main/java/org/apache/impala/catalog/local/LocalPartitionSpec.java: http://gerrit.cloudera.org:8080/#/c/10713/2/fe/src/main/java/org/apache/impala/catalog/local/LocalPartitionSpec.java@43 PS2, Line 43: private final ImmutableList<LiteralExpr> partitionValues; > nit: _ suffix Done http://gerrit.cloudera.org:8080/#/c/10713/2/fe/src/main/java/org/apache/impala/catalog/local/LocalPartitionSpec.java@51 PS2, Line 51: getPartValuesFromPartName > I see a similar method in CatalogOpExecutor (getPartValsFromName). Perhaps Will check this out. http://gerrit.cloudera.org:8080/#/c/10713/2/fe/src/main/java/org/apache/impala/catalog/local/MetaProvider.java File fe/src/main/java/org/apache/impala/catalog/local/MetaProvider.java: http://gerrit.cloudera.org:8080/#/c/10713/2/fe/src/main/java/org/apache/impala/catalog/local/MetaProvider.java@60 PS2, Line 60: List<String> partitionNames) throws MetaException, TException; > add a comment about what happens when not all partitions are loaded (could Done http://gerrit.cloudera.org:8080/#/c/10713/2/fe/src/test/java/org/apache/impala/catalog/local/LocalCatalogTest.java File fe/src/test/java/org/apache/impala/catalog/local/LocalCatalogTest.java: http://gerrit.cloudera.org:8080/#/c/10713/2/fe/src/test/java/org/apache/impala/catalog/local/LocalCatalogTest.java@111 PS2, Line 111: : /*checkFileDescriptors=*/false); > nit: one line Done -- To view, visit http://gerrit.cloudera.org:8080/10713 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iddf2edbd6bdc0684560b2ecca9c4c6b6819ef1d3 Gerrit-Change-Number: 10713 Gerrit-PatchSet: 2 Gerrit-Owner: Todd Lipcon <[email protected]> Gerrit-Reviewer: Todd Lipcon <[email protected]> Gerrit-Reviewer: Vuk Ercegovac <[email protected]> Gerrit-Comment-Date: Fri, 15 Jun 2018 17:49:23 +0000 Gerrit-HasComments: Yes
