[ 
https://issues.apache.org/jira/browse/CALCITE-3873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

neoremind updated CALCITE-3873:
-------------------------------
    Description: 
Out of curiosity, I used a flame graph to profile a simple query. The code 
snippet is shown below.
{code:java}
    String sql = "select empno, gender, name from EMPS where name = 'John'";
    Connection connection = null;
    Statement statement = null;
    try {
      Properties info = new Properties();
      info.put("model", jsonPath("smart"));
      connection = DriverManager.getConnection("jdbc:calcite:", info);
      String x = null;
      long start = System.currentTimeMillis();
      for (int i = 0; i < 50000; i++) {
        statement = connection.createStatement();
        final ResultSet resultSet = statement.executeQuery(sql);
        while (resultSet.next()) {
          x = resultSet.getInt(1)
              + resultSet.getString(2)
              + resultSet.getString(3);
        }
      }
    } catch (SQLException e) {
      e.printStackTrace();
    } finally {
      close(connection, statement);
    }
{code}
 

I attached the generated flame graph [^pic1.svg]. The time breaks down as follows:
{code:java}
3% on sql2rel,
9% on query optimization,
62% on code generation and implementation,
20% on result set iteration and checking,
… 
{code}
I hope this graph is informative. Since I started learning Calcite only 
recently, I cannot tell where to start tuning, but one small point in the graph 
caught my attention: there are many reflective invocations in 
_Prepare#trimUnusedFields_. So I spent some time trying to reduce this small 
overhead.

I optimized _ReflectiveVisitDispatcher_ by introducing a global, size-limited 
_Guava_ cache for methods, and I added full unit tests for _ReflectUtil_.
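
To illustrate the kind of lookup that such a cache avoids repeating, here is a minimal sketch of reflective visitor dispatch. It is not Calcite's _ReflectUtil_ code: the class _ReflectiveLookupSketch_, the node and visitor types, and the method _lookupVisitMethod_ are all hypothetical names chosen for this example. Without a cache, the hierarchy walk and the _getMethod_ call run on every visit.

```java
import java.lang.reflect.Method;

/** Hypothetical sketch of reflective visit dispatch; not Calcite's actual code. */
public class ReflectiveLookupSketch {
  public static class Node {}
  public static class Project extends Node {}
  public static class Filter extends Node {}

  public static class Visitor {
    public String visit(Node n) { return "node"; }
    public String visit(Project p) { return "project"; }
  }

  /**
   * Walks up the class hierarchy of the node until the visitor has a public
   * visit method that accepts exactly that type. Returns null if none exists.
   */
  static Method lookupVisitMethod(Class<?> visitorClass, Class<?> nodeClass) {
    for (Class<?> c = nodeClass; c != null; c = c.getSuperclass()) {
      try {
        return visitorClass.getMethod("visit", c);
      } catch (NoSuchMethodException e) {
        // No overload for this exact type; try the superclass next.
      }
    }
    return null;
  }

  /** Looks up the visit method reflectively, then invokes it. */
  static String dispatch(Visitor v, Node n) throws ReflectiveOperationException {
    Method m = lookupVisitMethod(v.getClass(), n.getClass());
    if (m == null) {
      throw new IllegalArgumentException("no visit method for " + n.getClass());
    }
    return (String) m.invoke(v, n);
  }

  public static void main(String[] args) throws Exception {
    Visitor v = new Visitor();
    System.out.println(dispatch(v, new Project())); // prints "project"
    System.out.println(dispatch(v, new Filter()));  // prints "node"
  }
}
```

The _Filter_ case falls back to _visit(Node)_ only after one failed _getMethod_ call, which is the repeated work that caching the resolved method removes.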

I counted the references to the methods _ReflectUtil#createMethodDispatcher_ 
and _ReflectUtil#createDispatcher_ (see below). There are 68 possible 
invocations in total, so the cache size is bounded. By caching all the methods 
for the lifetime of the process, we can eliminate the overhead of reflective 
method lookup.
{code:java}
org.apache.calcite.rel.rel2sql.RelToSqlConverter: 18 possible invocations.
org.apache.calcite.sql2rel.RelDecorrelator: 15 possible invocations.
org.apache.calcite.sql2rel.RelFieldTrimmer: 11 possible invocations.
org.apache.calcite.sql2rel.RelStructuredTypeFlattener.RewriteRelVisitor: 22 possible invocations.
org.apache.calcite.interpreter.Interpreter.CompilerImpl: 2 possible invocations.
{code}
Before the global cache was introduced, cached methods were shared only within 
a single _ReflectiveVisitDispatcher_ instance. Now different 
_ReflectiveVisitDispatcher_ instances in different threads can reuse the 
cached methods.
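
The idea of a process-wide method cache can be sketched as follows. This is an illustration under stated assumptions, not the patch itself: to stay self-contained it swaps the size-limited _Guava_ cache for an unbounded _java.util.concurrent.ConcurrentHashMap_, and the names _GlobalMethodCacheSketch_, _lookup_ and _LOOKUPS_ are hypothetical. The counter exists only to show that two threads trigger a single reflective lookup for the same key.

```java
import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch of a global method cache; not the actual Calcite patch. */
public class GlobalMethodCacheSketch {
  // Static, so the cache is shared by all callers and threads in the process.
  // Assumption: unbounded map here, where the patch uses a size-limited Guava cache.
  private static final Map<List<Object>, Optional<Method>> CACHE =
      new ConcurrentHashMap<>();

  // Counts how often the reflective lookup actually runs (for demonstration).
  static final AtomicInteger LOOKUPS = new AtomicInteger();

  /** Returns the cached method, performing the reflective lookup at most once per key. */
  static Optional<Method> lookup(Class<?> visitorClass, Class<?> argClass, String name) {
    List<Object> key = Arrays.asList(visitorClass, argClass, name);
    return CACHE.computeIfAbsent(key, k -> {
      LOOKUPS.incrementAndGet();
      for (Class<?> c = argClass; c != null; c = c.getSuperclass()) {
        try {
          return Optional.of(visitorClass.getMethod(name, c));
        } catch (NoSuchMethodException e) {
          // No overload for this exact type; try the superclass next.
        }
      }
      // Cache misses too, so a failed lookup is not repeated either.
      return Optional.empty();
    });
  }

  public static void main(String[] args) throws Exception {
    Runnable task = () -> {
      for (int i = 0; i < 1000; i++) {
        lookup(StringBuilder.class, String.class, "append");
      }
    };
    Thread t1 = new Thread(task);
    Thread t2 = new Thread(task);
    t1.start();
    t2.start();
    t1.join();
    t2.join();
    System.out.println("reflective lookups: " + LOOKUPS.get()); // prints "reflective lookups: 1"
  }
}
```

A bounded cache is the safer choice when the key space is not known in advance. With only 68 possible invocations counted above, an unbounded map would stay small as well, but that is an observation about this sketch rather than about the patch.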

See [^pic2.svg]: after tuning, _trimUnusedFields_ takes only 0.64% of the 
sampling time, compared with 1.38% previously. I think this will help in many 
other places as well.

 



> Use global caching for ReflectiveVisitDispatcher implementation
> ---------------------------------------------------------------
>
>                 Key: CALCITE-3873
>                 URL: https://issues.apache.org/jira/browse/CALCITE-3873
>             Project: Calcite
>          Issue Type: Improvement
>          Components: core
>    Affects Versions: 1.22.0
>            Reporter: neoremind
>            Priority: Minor
>         Attachments: pic1.svg, pic2.svg
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
