jihoonson commented on a change in pull request #9648: SQL: More
straightforward handling of join planning.
URL: https://github.com/apache/druid/pull/9648#discussion_r406396533
##########
File path:
sql/src/main/java/org/apache/druid/sql/calcite/rule/DruidJoinRule.java
##########
@@ -148,4 +246,148 @@ static boolean canHandleCondition(final RexNode
condition, final RelDataType lef
return retVal;
}
+
+ private static boolean isLeftExpression(final RexNode rexNode, final int
numLeftFields)
+ {
+ return
ImmutableBitSet.range(numLeftFields).contains(RelOptUtil.InputFinder.bits(rexNode));
+ }
+
+ private static boolean isRightInputRef(final RexNode rexNode, final int
numLeftFields)
+ {
+ return rexNode.isA(SqlKind.INPUT_REF) && ((RexInputRef)
rexNode).getIndex() >= numLeftFields;
+ }
+
+ @VisibleForTesting
+ static class ConditionAnalysis
+ {
+ /**
+ * Number of fields on the left-hand side. Useful for identifying if a
particular field is from on the left
+ * or right side of a join.
+ */
+ private final int numLeftFields;
+
+ /**
+ * Each equality subcondition is an equality of the form f(LeftRel) =
g(RightRel).
+ */
+ private final List<Pair<RexNode, RexInputRef>> equalitySubConditions;
+
+ /**
+ * Each literal subcondition is... a literal.
+ */
+ private final List<RexLiteral> literalSubConditions;
+
+ ConditionAnalysis(
+ int numLeftFields,
+ List<Pair<RexNode, RexInputRef>> equalitySubConditions,
+ List<RexLiteral> literalSubConditions
+ )
+ {
+ this.numLeftFields = numLeftFields;
+ this.equalitySubConditions = equalitySubConditions;
+ this.literalSubConditions = literalSubConditions;
+ }
+
+ public ConditionAnalysis pushThroughLeftProject(final Project leftProject)
+ {
+ // Pushing through the project will shift right-hand field references by
this amount.
+ final int rhsShift =
+ leftProject.getInput().getRowType().getFieldCount() -
leftProject.getRowType().getFieldCount();
+
+ return new ConditionAnalysis(
+ leftProject.getInput().getRowType().getFieldCount(),
+ equalitySubConditions
+ .stream()
Review comment:
I have some benchmark results for stream
([StreamBenchmark](https://github.com/jihoonson/druid/blob/stream-benchmark/benchmarks/src/test/java/org/apache/druid/StreamBenchmark.java)
and
[HighlyNestedStreamBenchmark](https://github.com/jihoonson/druid/blob/stream-benchmark/benchmarks/src/test/java/org/apache/druid/HighlyNestedStreamBenchmark.java)).
```Benchmark Mode Cnt Score Error
Units
StreamBenchmark.flatMapIterator avgt 10 9.584 ± 0.055 ms/op
StreamBenchmark.flatMapSequenceAccumulate avgt 10 5.908 ± 0.041 ms/op
StreamBenchmark.flatMapSequenceYielder avgt 10 22.976 ± 0.064 ms/op
StreamBenchmark.flatMapStream avgt 10 6.249 ± 0.050 ms/op
StreamBenchmark.fluentIteratorToList avgt 10 18.455 ± 0.254 ms/op
StreamBenchmark.sequenceToList avgt 10 14.326 ± 0.662 ms/op
StreamBenchmark.streamToList avgt 10 17.720 ± 0.173 ms/op
StreamBenchmark.sumIterator avgt 10 2.648 ± 0.023 ms/op
<-- native for loop
StreamBenchmark.sumIteratorFlatMap avgt 10 7.270 ± 0.095 ms/op
StreamBenchmark.sumSequence avgt 10 6.654 ± 0.137 ms/op
StreamBenchmark.sumStream avgt 10 2.162 ± 0.048 ms/op
```
```
Benchmark Mode Cnt Score
Error Units
HighlyNestedStreamBenchmark.flatMapIterator avgt 10 2413.571 ±
15.151 ms/op
HighlyNestedStreamBenchmark.flatMapSequenceAccumulate avgt 10 1157.817 ±
13.575 ms/op
HighlyNestedStreamBenchmark.flatMapSequenceYielder avgt 10 6308.657 ±
28.802 ms/op
HighlyNestedStreamBenchmark.flatMapStream avgt 10 953.539 ±
6.755 ms/op
HighlyNestedStreamBenchmark.sumIteratorFlatMap avgt 10 2129.499 ±
20.541 ms/op
HighlyNestedStreamBenchmark.sumNestedFor avgt 10 297.307 ±
0.626 ms/op <-- native for loop
HighlyNestedStreamBenchmark.sumSequence avgt 10 1503.816 ±
5.720 ms/op
HighlyNestedStreamBenchmark.sumStream avgt 10 1136.636 ±
12.353 ms/op
```
I did notice that stream is sometimes bad and these benchmarks were to see
what makes it bad. Unfortunately, these benchmarks show that stream is pretty
good for computing a sum, it can compete with even native for loop when the
stream is not highly nested. For highly nested stream, it's worse than native
for loop, but still best among others. For `toList`, stream seems worse than
others which I'm not sure how much it matters.
I'm suspecting that stream could be very efficient in these benchmarks
because the benchmark code could be easily inlined with stream. I was planning
to do another benchmark with a `map` function which is complex enough to hinder
inlining, but haven't done yet.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
With regards,
Apache Git Services
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]