Thomas Tauber-Marshall has posted comments on this change.

Change subject: IMPALA-3742: Partitions and sort INSERTs for Kudu tables
......................................................................
Patch Set 5:

(24 comments)

http://gerrit.cloudera.org:8080/#/c/6559/4/common/thrift/DataSinks.thrift
File common/thrift/DataSinks.thrift:

Line 59: // Creates a new Hdfs files according to the evaluation of the partitionKeyExprs,
> how is this related to TDataPartition.partition_exprs?
As discussed, we're going to move this to TDataPartition.partition_exprs.

http://gerrit.cloudera.org:8080/#/c/6559/4/common/thrift/Exprs.thrift
File common/thrift/Exprs.thrift:

Line 128: struct TKuduPartitionExpr {
> this feels redundant
As discussed, we're keeping it but eliminating some unnecessary members.

Line 129: // The Kudu table to use the partitioning scheme from.
> how does TTableSink.target_table_id not capture that?
The TTableSink is not accessible in the expr.

Line 134: 2: required list<i32> referenced_columns
> shouldn't this be part of the descriptor of the target table? ie, shouldn't
Done

Line 138: struct TExprNode {
> what about TKuduTableSink.referenced_columns?
The TKuduTableSink is not accessible from the expr. We could potentially use the KuduTableDescriptor, but it's not currently possible to eliminate this without other changes: the KuduTableDescriptor exposes the primary key columns but not the partition columns. The partition columns are contained in the TKuduTable that's used to construct the KuduTableDescriptor, but not in a particularly convenient form for extracting the info we need here, due to the indirection of TKuduPartitionParam.

I'm still happy to make this change if you feel it's an improvement (and in the long run it would be nice to be able to get info about the partition columns from the KuduTableDescriptor), but I wanted to check before writing the code, especially since there are some design decisions to make. For example, we could provide KuduTableDescriptor->partition_columns() returning a list of column names, the way KuduTableDescriptor->key_columns() does, which would be sufficient for our needs here; but then we wouldn't know whether they are hash or range partitions, what the range is, etc., so maybe we should instead provide a list of TKuduPartitionParams or something similar.

http://gerrit.cloudera.org:8080/#/c/6559/4/common/thrift/Partitions.thrift
File common/thrift/Partitions.thrift:

Line 40: // and then this can be removed. (IMPALA-5255)
> could you file a jira for this? the reason this may not just be a cosmetic
I filed IMPALA-5255.

http://gerrit.cloudera.org:8080/#/c/6559/4/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
File fe/src/main/java/org/apache/impala/analysis/InsertStmt.java:

Line 109: // Set in analyze(). Exprs correspond to the partitionKeyValues, if specified, or to
> correspond
Done

Line 114: // table, eg. partitionKeyExprs_[i] corresponds to table_.columns(partitionKeyIdx_[i]).
> the equality is wrong. partitionkeyexprs_ are the exprs from the select lis
Done

Line 115: private List<Integer> partitionKeyIdxs_ = Lists.newArrayList();
> why wouldn't this just be 1, 2, 3, ...? partitionkeyexprs is already in tab
The primary keys are always the first n columns, but the partition keys can be any subset of the primary keys, so the values in this list will be in ascending order but may skip some values.

Line 699: // Make sure we have stats for partitionKeyExprs
> odd indentation
Done

http://gerrit.cloudera.org:8080/#/c/6559/4/fe/src/main/java/org/apache/impala/analysis/KuduPartitionExpr.java
File fe/src/main/java/org/apache/impala/analysis/KuduPartitionExpr.java:

Line 36: * Internal expr that calls into the Kudu client to determine the partition index for
> does this really need to exist in the fe, or is this an execution-/be-only
As discussed, this is necessary, for example because we need an Expr that we can pass to the sorter.
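To make the partitionKeyIdxs_ point above concrete, here is a hypothetical, self-contained sketch (not the actual InsertStmt code; all names are illustrative): because the partition columns are a subset of the primary key columns, which are the first n table columns, the resulting index list is ascending but may skip positions.

```java
import java.util.*;

public class PartitionKeyIdxs {
    // Given the table's columns (primary keys first) and the set of
    // partition column names, return the table-column indexes of the
    // partition keys, in ascending table-column order.
    static List<Integer> partitionKeyIdxs(
            List<String> tableColumns, Set<String> partitionColumnNames) {
        List<Integer> idxs = new ArrayList<>();
        for (int i = 0; i < tableColumns.size(); ++i) {
            if (partitionColumnNames.contains(tableColumns.get(i))) idxs.add(i);
        }
        return idxs;
    }

    public static void main(String[] args) {
        // Primary keys: a, b, c (columns 0-2); table is partitioned on a and c
        // only, so the idxs are ascending but skip column 1.
        List<String> cols = Arrays.asList("a", "b", "c", "d");
        Set<String> partCols = new HashSet<>(Arrays.asList("a", "c"));
        System.out.println(partitionKeyIdxs(cols, partCols));  // prints [0, 2]
    }
}
```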
Line 90: protected String toSqlImpl() { return "KuduPartition()"; }
> this feels wrong. will this ever get called?
Yes, it is included in the explain output; see the planner test files for examples. That does bring up, though, that we may want to include more info here, such as the table name, a list of the partition columns, or the toSql of the partition exprs.

http://gerrit.cloudera.org:8080/#/c/6559/4/fe/src/main/java/org/apache/impala/catalog/KuduTable.java
File fe/src/main/java/org/apache/impala/catalog/KuduTable.java:

Line 154: public Set<String> getPartitionColumnNames() {
> call these partition column names? we also use 'partition exprs' and not 'p
Done

http://gerrit.cloudera.org:8080/#/c/6559/4/fe/src/main/java/org/apache/impala/planner/DataPartition.java
File fe/src/main/java/org/apache/impala/planner/DataPartition.java:

Line 45: // for hash partition: exprs used to compute hash value
> pointless changes, there's no style requirement for this.
Done

http://gerrit.cloudera.org:8080/#/c/6559/4/fe/src/main/java/org/apache/impala/planner/DataStreamSink.java
File fe/src/main/java/org/apache/impala/planner/DataStreamSink.java:

Line 35: Preconditions.checkNotNull(partition);
> same naming comment as for TDataStreamSink
Done

http://gerrit.cloudera.org:8080/#/c/6559/4/fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java
File fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java:

Line 247: exchNode.init(analyzer);
> partitionexpr and partitionexprs sounds too similar, and they are distinct
Done

http://gerrit.cloudera.org:8080/#/c/6559/4/fe/src/main/java/org/apache/impala/planner/PlanFragment.java
File fe/src/main/java/org/apache/impala/planner/PlanFragment.java:

Line 76: public class PlanFragment extends TreeNode<PlanFragment> {
> use blank lines where appropriate
Done

Line 99: // it's sent to its destination);
> this seems like an operational parameter of the partition (=dataPartition_)
Done

Line 107: fragmentId_ = id;
> same here
Done
http://gerrit.cloudera.org:8080/#/c/6559/4/fe/src/main/java/org/apache/impala/planner/Planner.java
File fe/src/main/java/org/apache/impala/planner/Planner.java:

Line 541: if (inputFragment.getDataPartition().getType() == TPartitionType.KUDU) {
> so If
We're not doing the repartitioning step, which happens if this is a single node exec.

Line 543: inputFragment.getDataPartition().getPartitionExprs().size() == 1);
> why clone?
Done

http://gerrit.cloudera.org:8080/#/c/6559/4/testdata/workloads/functional-planner/queries/PlannerTest/kudu-upsert.test
File testdata/workloads/functional-planner/queries/PlannerTest/kudu-upsert.test:

Line 8: 00:SCAN HDFS [functional.alltypes]
> why does sorting not make sense here?
This is a single node exec, and we felt it was best to do neither the partitioning nor the sorting for single node exec, to avoid adding unnecessary overhead to small inserts. We're planning a more careful analysis of when partitioning/sorting is beneficial for things like small inserts as follow-up work.

Line 80: # upsert with inline view
> an interesting new test case would be an upsert/insert with data supplied b
Done

Line 131: 01:EXCHANGE [KUDU(KuduPartition())]
> that's unfortunate. can't you very easily detect compatible partitions? if
We've left this as future work. There's a TODO in DistributedPlanner and I filed IMPALA-5254.
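The compatible-partition detection deferred to IMPALA-5254 could look roughly like the sketch below: if the input fragment is already KUDU-partitioned by the target table's scheme, the planner could elide the extra exchange in front of the Kudu sink. This is a hypothetical illustration; none of these classes or methods are actual Impala planner APIs.

```java
public class KuduExchangeElision {
    // Minimal stand-in for a fragment's output partition: a type plus the id
    // of the table whose partitioning scheme a KuduPartition() expr references.
    enum PartitionType { UNPARTITIONED, HASH, KUDU }

    static final class DataPartition {
        final PartitionType type;
        final int kuduTableId;  // only meaningful when type == KUDU
        DataPartition(PartitionType type, int kuduTableId) {
            this.type = type;
            this.kuduTableId = kuduTableId;
        }
    }

    // The repartitioning exchange is only needed when the input is not
    // already partitioned by the same target table's Kudu scheme.
    static boolean needsExchange(DataPartition input, int targetTableId) {
        return input.type != PartitionType.KUDU
                || input.kuduTableId != targetTableId;
    }

    public static void main(String[] args) {
        DataPartition hash = new DataPartition(PartitionType.HASH, -1);
        DataPartition sameKudu = new DataPartition(PartitionType.KUDU, 7);
        System.out.println(needsExchange(hash, 7));      // prints true
        System.out.println(needsExchange(sameKudu, 7));  // prints false
    }
}
```

In practice the check would also have to compare the actual partition exprs, not just the table id, since the insert may not feed the partition columns straight through.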
--
To view, visit http://gerrit.cloudera.org:8080/6559
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I84ce0032a1b10958fdf31faef225372c5c38fdc4
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall <[email protected]>
Gerrit-Reviewer: Alex Behm <[email protected]>
Gerrit-Reviewer: Dimitris Tsirogiannis <[email protected]>
Gerrit-Reviewer: Marcel Kornacker <[email protected]>
Gerrit-Reviewer: Matthew Jacobs <[email protected]>
Gerrit-Reviewer: Mostafa Mokhtar <[email protected]>
Gerrit-Reviewer: Thomas Tauber-Marshall <[email protected]>
Gerrit-HasComments: Yes
