> Do you have test code at hand? (I mean the boilerplate code to prepare
> the environment for "getXXX" metadata call)

The easiest thing is to run RelMetadataTest; run it with
"-Dcalcite.debug" to see the generated code. Add extra "mq.xxx" calls
and you'll be calling into the generated handler.

I'm struggling to find an end-to-end case (i.e. one that parses,
optimizes and executes a SQL statement) that demonstrates a clear
performance difference. One test is JdbcTest.testJoinFiveWay. The new
metadata seems to improve runs from 17s to 15s.

By the way, that test makes 200k metadata calls. Here's how I found out:

diff --git a/core/src/main/java/org/apache/calcite/rel/metadata/JaninoRelMetadataProvider.java b/core/src/main/java/org/apache/calcite/rel/metadata/JaninoRelMetadataProvider.java
index 8e246ee..3227436 100644
--- a/core/src/main/java/org/apache/calcite/rel/metadata/JaninoRelMetadataProvider.java
+++ b/core/src/main/java/org/apache/calcite/rel/metadata/JaninoRelMetadataProvider.java
@@ -91,6 +91,8 @@
  * a class that dispatches to the underlying providers.
  */
 public class JaninoRelMetadataProvider implements RelMetadataProvider {
+  public static int N;
+
   private final RelMetadataProvider provider;

   // Constants and static fields
@@ -242,6 +244,9 @@ public static JaninoRelMetadataProvider of(RelMetadataProvider provider) {
           .append("      org.apache.calcite.rel.metadata.RelMetadataQuery mq");
       paramList(buff, method.e)
           .append(") {\n");
+      buff.append("    System.out.println(\"").append(method.e.getName())
+          .append(" \" + (").append(JaninoRelMetadataProvider.class.getName())
+          .append(".N++));\n");
       buff.append("    final java.util.List key = ")
           .append(
               (method.e.getParameterTypes().length < 4

Other counts:
* RelMetadataTest 9,301
* RelOptRulesTest 4,474
* JdbcTest.testJoinFiveWay 200,385
* JdbcTest.testJoinManyWay 200,457
* JdbcTest 715,204 (not including testJoinManyWay, testJoinFiveWay)
* CalciteSuite 8,248,498

> Have you tested compilation into switch over a hashcode?

I haven't tested switching using a hashCode (it sounds like you're
talking about perfect hashing, or something similar). It's definitely
well worth investigating.
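
For reference, a minimal sketch of how the modulus could be picked.
Everything in it is hypothetical (the class ModulusPickerSketch, the
method findCollisionFreeModulus, the sample classes); it is not code
from Calcite. Two assumptions are worth stating. Class does not
override hashCode(), so the values are identity hashes that can differ
between JVM runs, and the modulus would have to be chosen when the
handler is generated at runtime. Also, hashCode() may be negative, so
the sketch uses Math.floorMod rather than "%" to keep case labels
non-negative.

```java
import java.util.HashSet;
import java.util.Set;

class ModulusPickerSketch {
  /** Returns the smallest modulus in [hashes.length, maxModulus] under
   * which every hash lands in its own bucket, or -1 if there is none. */
  static int findCollisionFreeModulus(int[] hashes, int maxModulus) {
    for (int m = Math.max(1, hashes.length); m <= maxModulus; m++) {
      Set<Integer> seen = new HashSet<>();
      boolean collisionFree = true;
      for (int h : hashes) {
        // floorMod keeps the bucket in [0, m) even for negative hashes
        if (!seen.add(Math.floorMod(h, m))) {
          collisionFree = false;
          break;
        }
      }
      if (collisionFree) {
        return m;
      }
    }
    return -1;
  }

  public static void main(String[] args) {
    // Stand-ins for RelNode classes; the hashes vary from run to run.
    int[] hashes = {
      String.class.hashCode(),
      Integer.class.hashCode(),
      Long.class.hashCode(),
    };
    System.out.println("modulus = " + findCollisionFreeModulus(hashes, 1 << 16));
  }
}
```

If no collision-free modulus exists below the bound, each colliding
case would still need an instanceof or class-equality check, as in
Vladimir's "LogicalAggregate or HiveProject" example.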

I also thought of replacing

  switch (relClasses.indexOf(r.getClass()))

with

  Map<Class, Integer> relClassIds = new IdentityHashMap<>();
  ...
  switch (relClassIds.get(r.getClass()))

because at present we scan a list of ~80 classes each call.
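
A self-contained sketch of the two lookups side by side, under stated
assumptions: RelA, RelB and RelC are hypothetical stand-ins for RelNode
classes, and the names RelClassDispatchSketch, idByScan, idByMap and
dispatch are made up for illustration rather than taken from the
generated handler. One difference from the snippet above: unboxing the
result of Map.get directly in a switch throws NullPointerException for
an unregistered class, so the sketch maps a missing class to -1, which
is what indexOf would return.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

class RelClassDispatchSketch {
  // Hypothetical stand-ins for RelNode implementations.
  static class RelA {}
  static class RelB {}
  static class RelC {}

  static final List<Class<?>> REL_CLASSES = new ArrayList<>();
  static final Map<Class<?>, Integer> REL_CLASS_IDS = new IdentityHashMap<>();

  static {
    register(RelA.class);
    register(RelB.class);
    register(RelC.class);
  }

  /** Assigns the next ordinal to a class, keeping list and map in sync. */
  static void register(Class<?> c) {
    REL_CLASS_IDS.put(c, REL_CLASSES.size());
    REL_CLASSES.add(c);
  }

  /** Current approach: linear scan of the class list on every call. */
  static int idByScan(Object r) {
    return REL_CLASSES.indexOf(r.getClass());
  }

  /** Proposed approach: one identity-based map lookup. */
  static int idByMap(Object r) {
    Integer id = REL_CLASS_IDS.get(r.getClass());
    return id == null ? -1 : id;
  }

  static String dispatch(Object r) {
    switch (idByMap(r)) {
    case 0:
      return "RelA";
    case 1:
      return "RelB";
    case 2:
      return "RelC";
    default:
      return "unknown";
    }
  }

  public static void main(String[] args) {
    System.out.println(dispatch(new RelB()));
    System.out.println(dispatch("not a rel"));
  }
}
```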

Julian

On Fri, Jan 22, 2016 at 11:57 AM, Vladimir Sitnikov wrote:
> Julian>My dev branch is complete and ready for review.
>
> Should we conduct some performance tests?
> It would be interesting to compare old vs new, and to identify the
> bottlenecks of the new approach.
>
> Do you have test code at hand? (I mean the boilerplate code to prepare
> the environment for "getXXX" metadata call)
> Have you planned adding perf tests?
>
> I'm not up to speed with metadata queries, however I can help with
> benchmarks/analysis.
>
> Have you tested compilation into switch over a hashcode?
> I mean
>
> switch(class.hashCode()%37) { // this 37 might be picked on a
> case-by-case basis to minimize collisions
>   case 1: // LogicalAggregate or HiveProject
>   case 23: // LogicalProject
> ...
> }
>
> Vladimir
