> Do you have test code at hand? (I mean the boilerplate code to prepare
> the environment for "getXXX" metadata call)
The easiest thing is to run RelMetadataTest with "-Dcalcite.debug"
to see the generated code. Add extra "mq.xxx" calls and you'll be
calling into the generated handler.
I'm struggling to find an end-to-end case (i.e. one that parses,
optimizes and executes a SQL statement) that demonstrates a clear
performance difference. One test is JdbcTest.testJoinFiveWay. The new
metadata seems to reduce the run time from 17s to 15s.
By the way, that test makes 200k metadata calls. Here's how I found out:
diff --git a/core/src/main/java/org/apache/calcite/rel/metadata/JaninoRelMetadataProvider.java b/core/src/main/java/org/apache/calcite/rel/metadata/JaninoRelMetadataProvider.java
index 8e246ee..3227436 100644
--- a/core/src/main/java/org/apache/calcite/rel/metadata/JaninoRelMetadataProvider.java
+++ b/core/src/main/java/org/apache/calcite/rel/metadata/JaninoRelMetadataProvider.java
@@ -91,6 +91,8 @@
* a class that dispatches to the underlying providers.
*/
public class JaninoRelMetadataProvider implements RelMetadataProvider {
+ public static int N;
+
private final RelMetadataProvider provider;
// Constants and static fields
@@ -242,6 +244,9 @@ public static JaninoRelMetadataProvider of(RelMetadataProvider provider) {
.append(" org.apache.calcite.rel.metadata.RelMetadataQuery mq");
paramList(buff, method.e)
.append(") {\n");
+ buff.append(" System.out.println(\"").append(method.e.getName())
+ .append(" \" + (").append(JaninoRelMetadataProvider.class.getName())
+ .append(".N++));\n");
buff.append(" final java.util.List key = ")
.append(
(method.e.getParameterTypes().length < 4
Other counts:
* RelMetadataTest 9,301
* RelOptRulesTest 4,474
* JdbcTest.testJoinFiveWay 200,385
* JdbcTest.testJoinManyWay 200,457
* JdbcTest 715,204 (not including testJoinManyWay, testJoinFiveWay)
* CalciteSuite 8,248,498
> Have you tested compilation into switch over a hashcode?
I haven't tested switching using a hashCode (it sounds like you're
talking about perfect hashing, or something similar). It's definitely
well worth investigating.
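To make the idea concrete, here is a minimal sketch of how a collision-free modulus could be searched for. It is not what the dev branch does. The class name HashSwitchSketch, the methods bucket and findModulus, and the JDK classes standing in for RelNode classes are all made up for illustration. It hashes the class name rather than calling Class.hashCode(), because the latter is an identity hash that can differ between JVM runs, and it uses Math.floorMod because "%" can return a negative value.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch: search for a modulus that gives each class its own bucket. */
public class HashSwitchSketch {
  // Stand-ins for RelNode classes (assumption: any distinct classes will do).
  static final List<Class<?>> CLASSES = Arrays.<Class<?>>asList(
      String.class, Integer.class, Long.class, Double.class,
      StringBuilder.class);

  /** Non-negative bucket for a class; the name hash is stable across runs. */
  static int bucket(Class<?> c, int modulus) {
    return Math.floorMod(c.getName().hashCode(), modulus);
  }

  /** Returns the smallest modulus up to limit with no collisions, or -1. */
  static int findModulus(List<Class<?>> classes, int limit) {
    for (int m = classes.size(); m <= limit; m++) {
      Set<Integer> seen = new HashSet<>();
      boolean collisionFree = true;
      for (Class<?> c : classes) {
        if (!seen.add(bucket(c, m))) {
          collisionFree = false;
          break;
        }
      }
      if (collisionFree) {
        return m;
      }
    }
    return -1;
  }

  public static void main(String[] args) {
    int m = findModulus(CLASSES, 1000);
    System.out.println("modulus: " + m);
    for (Class<?> c : CLASSES) {
      System.out.println("  case " + bucket(c, m) + ": // " + c.getSimpleName());
    }
  }
}
```

The generated handler could then emit "switch (floorMod(hash, m))" with one case per class. Each case would still need an equality check on the class, since a class outside the known set can land in an occupied bucket.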
I also thought of replacing
switch (relClasses.indexOf(r.getClass()))
with
Map<Class, Integer> relClassIds = new IdentityHashMap<>();
...
switch (relClassIds.get(r.getClass()))
because at present we scan a list of ~80 classes each call.
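A rough, self-contained sketch of the two lookups side by side, assuming a handful of JDK classes as stand-ins for the ~80 rel classes. ClassIdLookup, idByScan, idByMap and dispatch are illustrative names, not the generated code. One difference to watch for: indexOf returns -1 for an unknown class, while Map.get returns null, which would throw a NullPointerException when the switch unboxes it, so the sketch maps null to -1 explicitly.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: linear scan versus identity-map lookup of a class id. */
public class ClassIdLookup {
  // Stand-ins for the rel classes (assumption: order defines the id).
  static final List<Class<?>> REL_CLASSES = Arrays.<Class<?>>asList(
      String.class, Integer.class, Long.class, ArrayList.class);

  // Class objects are compared by identity, so IdentityHashMap is safe here.
  static final Map<Class<?>, Integer> REL_CLASS_IDS = new IdentityHashMap<>();

  static {
    for (int i = 0; i < REL_CLASSES.size(); i++) {
      REL_CLASS_IDS.put(REL_CLASSES.get(i), i);
    }
  }

  /** Current approach: scans the list on every call. */
  static int idByScan(Object r) {
    return REL_CLASSES.indexOf(r.getClass());
  }

  /** Proposed approach: one hash lookup; unknown classes map to -1. */
  static int idByMap(Object r) {
    Integer id = REL_CLASS_IDS.get(r.getClass());
    return id == null ? -1 : id;
  }

  /** Dispatches on the id, as the generated handler's switch would. */
  static String dispatch(Object r) {
    switch (idByMap(r)) {
    case 0:
      return "string";
    case 1:
      return "integer";
    case 2:
      return "long";
    case 3:
      return "list";
    default:
      return "unknown";
    }
  }

  public static void main(String[] args) {
    System.out.println(dispatch("abc"));
    System.out.println(dispatch(1.5d));
  }
}
```

Both methods return the same id for every class in the list; only the cost per call differs.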
Julian
On Fri, Jan 22, 2016 at 11:57 AM, Vladimir Sitnikov
<[email protected]> wrote:
> Julian>My dev branch is complete and ready for review.
>
> Should we conduct some performance tests?
> It would be interesting to compare old vs new, and to identify the
> bottlenecks of the new approach.
>
> Do you have test code at hand? (I mean the boilerplate code to prepare
> the environment for "getXXX" metadata call)
> Have you planned adding perf tests?
>
> I'm not up to speed with metadata queries, however I can help with
> benchmarks/analysis.
>
> Have you tested compilation into switch over a hashcode?
> I mean
>
> switch(class.hashCode()%37) { // this 37 might be picked on a
>                               // case-by-case basis to minimize collisions
> case 1: // LogicalAggregate or HiveProject
> case 23: // LogicalProject
> ...
> }
>
> Vladimir