I don’t think a lot of thought went into how adapters use native objects. For any given adapter we could start a discussion about how to manage the lifecycle of those objects (e.g. with a pool, cache, or factory).
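
As one possible starting point for that discussion, here is a minimal sketch of a reference-counted client cache keyed by endpoint. None of this is existing Calcite API: `SharedClients`, `acquire`, `release` and `openCount` are hypothetical names, and the factory function stands in for whatever an adapter does today to build its native client.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical reference-counted cache of native clients, keyed by endpoint. */
class SharedClients<C extends AutoCloseable> {
  /** A cached client plus the number of outstanding acquisitions. */
  private static final class Entry<C> {
    final C client;
    int refs;
    Entry(C client) { this.client = client; }
  }

  private final Map<String, Entry<C>> entries = new HashMap<>();
  private final Function<String, C> factory;

  SharedClients(Function<String, C> factory) {
    this.factory = factory;
  }

  /** Returns the shared client for the endpoint, creating it on first use. */
  synchronized C acquire(String endpoint) {
    Entry<C> e = entries.get(endpoint);
    if (e == null) {
      e = new Entry<>(factory.apply(endpoint));
      entries.put(endpoint, e);
    }
    e.refs++;
    return e.client;
  }

  /** Drops one reference; the client is closed when the last user releases it. */
  synchronized void release(String endpoint) {
    Entry<C> e = entries.get(endpoint);
    if (e == null) {
      throw new IllegalStateException("no client for " + endpoint);
    }
    if (--e.refs == 0) {
      entries.remove(endpoint);
      try {
        e.client.close();
      } catch (Exception ex) {
        throw new RuntimeException(ex);
      }
    }
  }

  /** Number of distinct endpoints that currently hold an open client. */
  synchronized int openCount() {
    return entries.size();
  }
}
```

A cache like this only helps if something calls `release`, so it still needs a way for the application to tell the adapter that it is done with a schema.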

> On Apr 10, 2022, at 10:33 PM, James Turton <[email protected]> wrote:
>
> Hi Calcite devs!
>
> There are resource leaks affecting some Calcite adapters, including ES and
> Cassandra and probably some others, whereby Calcite internally creates
> client objects from external libraries in order to access the system in
> question and never closes those clients. The reason that there is no trivial
> fix available is that only the application knows when said clients may be
> closed, and Calcite offers the application no way to signal this to it.
>
> The case I have real world information about, ES, is a serious problem
> because the resource leak is unbounded. Here Calcite creates an ES
> "RestClient" for every call to create() in the schema factory and the
> RestClient leaks at least a file descriptor if it is not closed. Operating
> systems enforce a per-process file descriptor quota. If your application,
> Drill in our case, makes one too many calls to the ES schema factory's
> create() method, then the JVM hosting it is summarily executed by the OS.
>
> In the case of Cassandra, the situation looks less bad to me in that client
> objects are reused by Calcite. This means that if the application only ever
> wants to talk to a finite number of distinct Cassandra endpoints then the
> resource leak is bounded and, most likely, quite small. In
> https://github.com/apache/calcite/pull/2698, I've been revising a patch to
> the ES schema factory to introduce the same sort of reuse to constrain, but
> not cure, the resource leak there. The patch is unquestionably a nasty "band
> aid" and that prompted some discussion with its reviewers and a request that
> I email this list.
>
> I think Calcite might have to make a design decision, perhaps one that means
> that either
>
> * it abstains from connection management entirely in adapters, which
>   might break some of its public APIs since then applications must
>   start to pass connections in (or might they be smuggled inside the
>   operand Map?) or
> * it starts to use connections in a single-use way, freeing them
>   immediately and taking a performance hit or
> * it makes Schema, or some other, objects closeable by the application
>   and propagates these events to the adapter code responsible for
>   managing connections.
>
> Thanks
>
> James
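
To make the third option in the quoted message concrete, here is a rough sketch of a schema-like object that the application can close, with the close event propagated to the clients the adapter created for it. This is an assumption about what such a design could look like, not existing Calcite API: `CloseableSchema`, `register` and `FakeClient` are hypothetical names, and `FakeClient` merely stands in for a native client such as the ES RestClient.

```java
import java.util.ArrayList;
import java.util.List;

/** Stand-in for a native client held by an adapter (hypothetical). */
class FakeClient implements AutoCloseable {
  private boolean closed = false;

  @Override public void close() {
    closed = true;
  }

  boolean isClosed() {
    return closed;
  }
}

/** Hypothetical schema that owns the native resources created on its behalf. */
class CloseableSchema implements AutoCloseable {
  private final List<AutoCloseable> resources = new ArrayList<>();
  private boolean closed = false;

  /** Called by adapter code when it creates a client for this schema. */
  synchronized <T extends AutoCloseable> T register(T resource) {
    if (closed) {
      throw new IllegalStateException("schema already closed");
    }
    resources.add(resource);
    return resource;
  }

  /** Called by the application; closes registered resources in reverse order. */
  @Override public synchronized void close() {
    if (closed) {
      return; // closing twice is a no-op
    }
    closed = true;
    RuntimeException failure = null;
    for (int i = resources.size() - 1; i >= 0; i--) {
      try {
        resources.get(i).close();
      } catch (Exception e) {
        // Keep closing the remaining resources and report all failures at the end.
        if (failure == null) {
          failure = new RuntimeException("failed to close schema resources");
        }
        failure.addSuppressed(e);
      }
    }
    resources.clear();
    if (failure != null) {
      throw failure;
    }
  }
}
```

With something like this, an application such as Drill could hold the schema in a try-with-resources block, or close it when a storage plugin is unregistered, so the file descriptor is released at a point the application chooses.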
