Nice! -- C
> On Oct 19, 2021, at 8:36 AM, James Turton <[email protected]> wrote:
>
> Having kicked the tyres on this idea, I can report that it works nicely. I
> went one step further and made the default idle pool size 0, rather than 1,
> which has the side benefit that Drill does not try to connect out at all
> when it starts up, only upon receiving the first query (and then HikariCP
> caches that connection for some amount of time). The advantage here is that
> if Drill gets restarted in the middle of the night when some JDBC data source
> happens not to be available, that doesn't kick the storage config into the
> disabled state.
>
> When I send in a rapid spate of queries, the HikariCP pool grows accordingly,
> up to the configured max.
>
> On 2021/10/19 06:42, James Turton wrote:
>> Hi devs
>>
>> I'd like to propose a change to the defaults for our outbound connection
>> pool management, at least for JDBC but perhaps ultimately wherever we can
>> manage it. Currently we are eager about initiating outbound JDBC
>> connections, bringing up 10 per storage config per drillbit. For example,
>> if a user creates 3 storage configs pointing to a single DBMS (the configs
>> differing in their DB path and credentials, say) on a cluster of 5 drillbits
>> then we'll bring up 10 x 3 x 5 = 150 connections as soon as we can and try
>> to keep them up permanently. The fixed pool size of 10 is a default we
>> picked up from HikariCP, which surely set it with application servers in
>> mind.
>>
>> We've had a report from the field of a MySQL server declining to provide
>> said 150 connections, leaving the Drill user unable to proceed.
>> Additionally, as you can imagine, almost all 150 connections will be idle
>> most of the time for typical Drill cluster workloads. Furthermore, while
>> connection pools are ubiquitous in the OLTP world, they are rare in the OLAP
>> world, where the cost of creating and destroying connections is negligible
>> compared to the cost of a single user query, while the benefits of per-user
>> access control, resource management and session management which unpooled
>> connections bring over shared pools are valuable. Bringing these latter
>> benefits to Drill's outbound JDBC connections is not in the scope of this
>> email; the point made is only "traditionally, OLAP environments have avoided
>> connection pools because the losses far outweigh the gains".
>>
>> In light of the above I suggest that we transition from eager to lazy
>> outbound JDBC connections, more like Apache Spark (I'm told). I propose
>> initially that we only change our *default* HikariCP configuration to
>> maintain small, finitely scalable pools (e.g. baseline 1, up to 10) instead
>> of fixed pools. The HikariCP configuration is already overridable today for
>> users that prefer the current eager connection behaviour.
>>
>> James
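
For reference, the connection arithmetic in the quoted proposal can be sketched as below. This is only an illustration: the function name `outbound_connections` is hypothetical (it is not part of Drill or HikariCP), and it assumes every pool on every drillbit sits at the same size at the same moment.

```python
def outbound_connections(pool_size, storage_configs, drillbits):
    """Total outbound JDBC connections if every pool holds pool_size connections.

    Hypothetical helper for illustration; assumes one pool per storage
    config per drillbit, all at the same size.
    """
    return pool_size * storage_configs * drillbits


# Figures taken from the example in the thread: 3 storage configs, 5 drillbits.
eager_fixed = outbound_connections(10, 3, 5)   # current default: fixed pools of 10
lazy_base_one = outbound_connections(1, 3, 5)  # proposed baseline of 1 idle connection
lazy_base_zero = outbound_connections(0, 3, 5) # follow-up: idle pool size of 0
lazy_ceiling = outbound_connections(10, 3, 5)  # every pool grown to the max of 10

print("eager, held permanently:", eager_fixed)
print("lazy, idle floor (baseline 1):", lazy_base_one)
print("lazy, idle floor (baseline 0):", lazy_base_zero)
print("lazy, worst-case ceiling:", lazy_ceiling)
```

Under these assumptions the ceiling is the same in both schemes; what changes is how many connections are held open while the cluster is idle.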
