Taras Bobrovytsky has posted comments on this change. Change subject: IMPALA-5036: Parquet count star optimization ......................................................................
Patch Set 2: (17 comments) http://gerrit.cloudera.org:8080/#/c/6812/2/be/src/exec/hdfs-parquet-scanner.cc File be/src/exec/hdfs-parquet-scanner.cc: Line 428: RETURN_IF_ERROR( > In most cases we're allocating way too much here. We can get a more accurat Done Line 437: int64_t* dst_slot = reinterpret_cast<int64_t*>(dst_tuple->GetSlot(0)); > The dst_tuple->GetSlot(0) looks wrong. There's no guarantee that the target Done http://gerrit.cloudera.org:8080/#/c/6812/2/fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java File fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java: Line 333: FunctionCallExpr f = (FunctionCallExpr) substitutedAgg; > remove Done Line 359: //Preconditions.checkState(mergeAggInfo_ == null); > ? Looks like I didn't undo all the changes I was making when experimenting. Fixed. http://gerrit.cloudera.org:8080/#/c/6812/2/fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java File fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java: Line 571: public FunctionCallExpr getMergeAggInputFn() { return mergeAggInputFn_; } > Do we need these getters/setters? Yes, they are used on line 617, for example. I don't think it would be a good idea to make mergeAggInputFn_ public. Line 621: if (!(substitutedFn instanceof FunctionCallExpr)) return e; > This looks like an error we should not ignore. Convert into Preconditions c Done http://gerrit.cloudera.org:8080/#/c/6812/2/fe/src/main/java/org/apache/impala/analysis/TupleDescriptor.java File fe/src/main/java/org/apache/impala/analysis/TupleDescriptor.java: Line 334: ArrayList<SlotDescriptor> materializedSlots = getMaterializedSlots(); > The previous implementation seemed cheaper, I'd prefer to leave it. Using h Done http://gerrit.cloudera.org:8080/#/c/6812/2/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java: Line 105: * TODO: pass in range restrictions. > Might be good to add a word or two about the agg optimization flow, i.e., t Done Line 134: // Set to true if the count(*) aggregation can be optimized by populating the tuple with > Set to true if the query block of this scan has a count(*) aggregation that Done Line 238: // Create two functions that we will put into an smap. We want to replace the > Integrate this into a function comment. It should describe what this functi Done Line 247: sd.setType(Type.BIGINT); > set slot as non-nullable Done Line 253: sumFn.analyze(analyzer); > use analyzeNoThrow() and remove the throws declaration Done Line 288: (getTupleDesc().getMaterializedSlots().isEmpty() || > use desc_ instead of getTupleDesc() Done http://gerrit.cloudera.org:8080/#/c/6812/2/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java File fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java: Line 1207: * If 'hdfsTblRef' only contains partition columns and 'fastPartitionKeyScans' > comment on new param Done Line 1268: * 'fastPartitionKeyScans' indicates whether to try to produce the slots with > comment new param Done Line 1512: * 'fastPartitionKeyScans' indicates whether to try to produce the slots with > comment new param Done http://gerrit.cloudera.org:8080/#/c/6812/2/fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java File fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java: Line 378: RewritesOk("count(id)", rule, null); > Also check count(1+1) and count(1+NULL) Done -- To view, visit http://gerrit.cloudera.org:8080/6812 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I536b85c014821296aed68a0c68faadae96005e62 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras Bobrovytsky <tbobrovyt...@cloudera.com> Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com> Gerrit-Reviewer: Lars Volker <l...@cloudera.com> Gerrit-Reviewer: Marcel Kornacker <mar...@cloudera.com> Gerrit-Reviewer: Mostafa Mokhtar <mmokh...@cloudera.com> Gerrit-Reviewer: Taras Bobrovytsky <tbobrovyt...@cloudera.com> Gerrit-HasComments: Yes