Nice!
-- C



> On Oct 19, 2021, at 8:36 AM, James Turton <[email protected]> wrote:
> 
> Having kicked the tyres on this idea, I can report that it works nicely.  I 
> went one step further and made the default idle pool size 0, rather than 1, 
> which has the side benefit that Drill does not try to connect out at all when 
> it starts up, only upon receiving the first query (and then HikariCP 
> caches that connection for some amount of time).  The advantage here is that 
> if Drill gets restarted in the middle of the night when some JDBC data source 
> happens not to be available, that doesn't kick the storage config into the 
> disabled state.
> 
> When I send in a rapid spate of queries, the HikariCP pool grows accordingly, 
> up to the configured max.
> 
> On 2021/10/19 06:42, James Turton wrote:
>> Hi devs
>> 
>> I'd like to propose a change to the defaults for our outbound connection 
>> pool management, at least for JDBC but perhaps ultimately wherever we can 
>> manage it.  Currently we are eager about initiating outbound JDBC 
>> connections, bringing up 10 per storage config per drillbit.  For example, 
>> if a user creates 3 storage configs pointing to a single DBMS (the configs 
>> differing in their DB path and credentials, say) on a cluster of 5 drillbits 
>> then we'll bring up 10x3x5 = 150 connections as soon as we can and try to 
>> keep them up permanently.  The fixed pool size of 10 is a default we picked 
>> up from HikariCP which surely set it with application servers in mind.
>> 
>> We've had a report from the field of a MySQL server declining to provide 
>> said 150 connections, leaving the Drill user unable to proceed.  
>> Additionally, as you can imagine, almost all 150 connections will be idle 
>> most of the time for typical Drill cluster workloads.  Furthermore, while 
>> connection pools are ubiquitous in the OLTP world they are rare in the OLAP 
>> world, where the cost of creating and destroying connections is negligible 
>> compared to the cost of a single user query, while the benefits of per-user 
>> access control, resource management and session management which dedicated 
>> connections bring over shared pools are valuable.  Bringing these latter 
>> benefits to Drill's outbound JDBC connections is not in the scope of this 
>> email; the point made is only that "traditionally, OLAP environments have 
>> avoided connection pools because the losses far outweigh the gains".
>> 
>> In light of the above I suggest that we transition from eager to lazy 
>> outbound JDBC connections, more like Apache Spark (I'm told). I propose 
>> initially that we only change our *default* HikariCP configuration to 
>> maintain small, finitely scalable pools (e.g. baseline 1, up to 10) instead 
>> of fixed pools.  The HikariCP configuration is already overridable today for 
>> users that prefer the current eager connection behaviour.
>> 
>> James
>> 
> 
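
For anyone following along, here is a rough sketch of the arithmetic in the
quoted proposal, expressed with HikariCP's minimumIdle and maximumPoolSize
property names.  It uses only java.util.Properties so it runs without HikariCP
on the classpath.  The class and method names (PoolSizing, eagerDefaults,
lazyDefaults, clusterConnections) are made up for illustration, and how Drill
actually hands these properties to HikariCP is not shown here.

```java
import java.util.Properties;

public class PoolSizing {
    /** Fixed pool: HikariCP keeps minimumIdle equal to maximumPoolSize (10) unless told otherwise. */
    public static Properties eagerDefaults() {
        Properties p = new Properties();
        p.setProperty("minimumIdle", "10");
        p.setProperty("maximumPoolSize", "10");
        return p;
    }

    /** Lazy pool as described in the follow-up: no idle connections kept, growth capped at 10. */
    public static Properties lazyDefaults() {
        Properties p = new Properties();
        p.setProperty("minimumIdle", "0");
        p.setProperty("maximumPoolSize", "10");
        return p;
    }

    /** Cluster-wide connection count implied by one pool property: value x storage configs x drillbits. */
    public static int clusterConnections(Properties pool, String key, int storageConfigs, int drillbits) {
        return Integer.parseInt(pool.getProperty(key)) * storageConfigs * drillbits;
    }

    public static void main(String[] args) {
        int storageConfigs = 3; // example from the proposal
        int drillbits = 5;      // example from the proposal

        System.out.println("eager, kept open while idle: "
            + clusterConnections(eagerDefaults(), "minimumIdle", storageConfigs, drillbits));
        System.out.println("lazy, kept open while idle:  "
            + clusterConnections(lazyDefaults(), "minimumIdle", storageConfigs, drillbits));
        System.out.println("lazy, ceiling under load:    "
            + clusterConnections(lazyDefaults(), "maximumPoolSize", storageConfigs, drillbits));
    }
}
```

The ceiling under load is the same in both cases; what changes is the number
of connections held open while nothing is running.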
