[ 
https://issues.apache.org/jira/browse/CALCITE-3999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17108183#comment-17108183
 ] 

Stamatis Zampetakis commented on CALCITE-3999:
----------------------------------------------

Adding a Guava cache is already an improvement from the previous 
implementation, thanks [~jcamachorodriguez]. 

More general, I was thinking that this {{DialectPool}} may not be really 
necessary, at least from a performance perspective. I was looking through the 
code and the most costly operation seems to be establishing the JDBC connection 
(which should be in the order of a few hundred milliseconds depending on the 
latency between client and DB server). However, it is rather unlikely that we 
are going to establish a connection at this point since most of the time when 
we have a {{DataSource}} we have also a connection pool. 

Moreover, most of the methods in {{DatabaseMetaData}} just return some 
hardcoded values so there are no real queries to the database. One exception is 
the validation query that may be sent to the database for validating the 
connection before returning it to from the DBCP pool.  

Taking the establish connection overhead out of the loop creating a dialect 
should be in the order of micro seconds if we drop entirely the {{DialectPool}} 
to nano seconds if we use the cache (assuming that we have cachehit). I think 
the difference between cache and no-cache for dialects can shrink even more if 
we tune better the {{DataSource}}. For instance, we can opt to not send the 
query for validating the connection every time since at small intervals it is 
unlikely that the connection is dead. 

I think I can easily provide a JMH bench to validate the above if you are 
interested. 

To sum up, dropping the {{DialectPool}} entirely may be a realistic option.  No 
additional cache, no additional locks, no soft values and additional burden to 
the GC. Looking forward for your thoughts.  

> Simplify DialectPool implementation using Guava cache
> -----------------------------------------------------
>
>                 Key: CALCITE-3999
>                 URL: https://issues.apache.org/jira/browse/CALCITE-3999
>             Project: Calcite
>          Issue Type: Improvement
>          Components: core
>            Reporter: Jesus Camacho Rodriguez
>            Assignee: Jesus Camacho Rodriguez
>            Priority: Major
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> JdbcUtils contains a pool to cache SqlDialect objects. Currently, it relies 
> on multiple maps and a synchronized {{get}} method. Although I am not very 
> familiar with that code, it seems the implementation could be made simpler 
> and more efficient by using a Guava cache. In addition, since we would not 
> have a single synchronized get method, multiple threads could concurrently 
> create dialects for distinct data sources.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to