I suppose the goal is to have something like the EXPLAIN ANALYZE command that exists in Postgres (and in other DBMSs under different names). It would be a really nice feature for those using the Enumerable convention. I am not aware of existing work in this area but would be curious to know if there is any.
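For concreteness, here is a rough sketch of what such instrumentation could look like around the Enumerable runtime. The ProfilingEnumerator name, the counter fields, and the idea of wrapping each operator's Enumerator are purely illustrative assumptions, not an existing Calcite API:

import org.apache.calcite.linq4j.Enumerator;

import java.util.concurrent.atomic.AtomicLong;

/**
 * Hypothetical decorator (not part of Calcite) that wraps one operator's
 * Enumerator and records how many rows it produced and how much time was
 * spent pulling them. One instance per plan node would be needed, plus a
 * way to report the counters next to the corresponding RelNode.
 */
public class ProfilingEnumerator<T> implements Enumerator<T> {
  private final Enumerator<T> delegate;
  private final AtomicLong rowCount;
  private final AtomicLong nanosInMoveNext;

  public ProfilingEnumerator(Enumerator<T> delegate,
      AtomicLong rowCount, AtomicLong nanosInMoveNext) {
    this.delegate = delegate;
    this.rowCount = rowCount;
    this.nanosInMoveNext = nanosInMoveNext;
  }

  @Override public T current() {
    return delegate.current();
  }

  @Override public boolean moveNext() {
    // Time spent here also includes time spent in the inputs' moveNext(),
    // which is part of what makes per-operator accounting tricky.
    final long start = System.nanoTime();
    final boolean hasNext = delegate.moveNext();
    nanosInMoveNext.addAndGet(System.nanoTime() - start);
    if (hasNext) {
      rowCount.incrementAndGet();
    }
    return hasNext;
  }

  @Override public void reset() {
    delegate.reset();
  }

  @Override public void close() {
    delegate.close();
  }
}

Note that the time measured this way for a node would include the time spent pulling rows from its inputs, so getting exclusive per-operator time needs extra bookkeeping; this is part of what makes the accounting non-trivial, as Mihai notes below.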
The real execution time may be quite tricky to measure accurately, but other perf counters such as row count, first/last row time, input size, etc., are probably easier to collect. As Mihai mentioned, instrumentation can get expensive, so this behavior should be configurable.

Apart from instrumentation, another way to enrich the plan with profiling insights would be to use stack-trace sampling while the query is running and then map stack traces to plan nodes based on predefined patterns. Essentially this approach is a separate tool, similar to async-profiler [1], but tailored around Calcite plan nodes.
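To make the sampling idea a bit more concrete, here is an equally rough, hypothetical sketch: it samples all thread stacks on a timer and attributes a sample to a plan-node label whenever a frame matches a predefined pattern. The PlanNodeSampler name and the example patterns (EnumerableDefaults methods) are assumptions for illustration only; a real tool would need a much more careful mapping from frames of the generated code to RelNodes, and native sampling in the style of async-profiler would be less intrusive:

import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.LongAdder;

/**
 * Hypothetical sampler (not an existing tool): periodically takes stack
 * traces of all JVM threads and attributes samples to plan-node labels
 * based on predefined frame patterns.
 */
public class PlanNodeSampler {
  // Illustrative patterns: substring of "className.methodName" -> plan-node label.
  private final Map<String, String> patterns = new LinkedHashMap<>();
  private final Map<String, LongAdder> samplesPerNode = new ConcurrentHashMap<>();
  private final ScheduledExecutorService scheduler =
      Executors.newSingleThreadScheduledExecutor();

  public PlanNodeSampler() {
    patterns.put("EnumerableDefaults.groupBy", "EnumerableAggregate");
    patterns.put("EnumerableDefaults.orderBy", "EnumerableSort");
  }

  public void start(long periodMillis) {
    scheduler.scheduleAtFixedRate(this::sampleOnce, 0, periodMillis,
        TimeUnit.MILLISECONDS);
  }

  public void stop() {
    scheduler.shutdownNow();
  }

  private void sampleOnce() {
    for (StackTraceElement[] trace : Thread.getAllStackTraces().values()) {
      for (StackTraceElement frame : trace) {
        final String key = frame.getClassName() + "." + frame.getMethodName();
        for (Map.Entry<String, String> p : patterns.entrySet()) {
          if (key.contains(p.getKey())) {
            samplesPerNode.computeIfAbsent(p.getValue(), n -> new LongAdder())
                .increment();
          }
        }
      }
    }
  }

  /** Sample counts per plan-node label, a rough proxy for time spent. */
  public Map<String, LongAdder> samples() {
    return samplesPerNode;
  }
}

Sample counts per node are only a proxy for where time is spent, but they avoid reading a timer for every record.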
All in all, this is an interesting and ambitious topic and it would be nice to see contributions in this area.

Best,
Stamatis

[1] https://github.com/async-profiler/async-profiler

On Mon, Dec 9, 2024 at 8:00 PM Mihai Budiu <mbu...@gmail.com> wrote:
>
> Calcite is really a compiler framework.
>
> It does come with a runtime, based on the Enumerable implementation. I assume this is the one you want to measure.
> The enumerable implementation uses an iterator-based design, where each "row" may actually flow through many operators. Collections are not necessarily processed operator by operator; rather, operators may pull data from their inputs row by row and emit rows to their output. Only "blocking" operators, such as sorting, may accumulate a whole collection. So accounting for time per operator may not be trivial.
>
> Moreover, the iterator-based implementation is materialized as a Java program which is compiled using the Janino compiler and then executed. It will be a significant piece of work to generate additional instrumentation code.
>
> Finally, the instrumentation code itself may take a non-negligible amount of time, so it may perturb the measurement, since data flows at such small granularities that you will have to read some timer for every record.
>
> But there is no reason it cannot be done. The first thing you should do is to inspect the generated code and think about how you could instrument it. You can put a breakpoint in RexExecutorImpl.reduce and look at the "code" variable.
>
> Mihai
>
> ________________________________
> From: Wegdan Ghazi <wegdan.gh...@teampicnic.com.INVALID>
> Sent: Monday, December 9, 2024 5:44 AM
> To: dev@calcite.apache.org <dev@calcite.apache.org>
> Subject: Deriving Calcite queries' execution statistics
>
> Hello Calcite devs,
>
> We’re trying to build a tool to help us analyze the performance of Calcite queries and give insights on better practices, beyond simply evaluating the execution time of the entire query.
>
> What we’d like to achieve is an analysis of the nodes of an executable plan, indicating the execution time as well as the number of rows produced by each `RelNode`.
>
> Example: for the query SELECT COUNT(*) FROM test_table_1 WHERE id = '1', we’d like to achieve something similar to:
>
> EnumerableAggregate(group=[{}], EXPR$0=[COUNT()]) → time = 1s, rows = 1
> EnumerableCalc(expr#0..10=[{inputs}], expr#11=['1':VARCHAR], expr#12=[=($t4, $t11)], proj#0..10=[{exprs}], $condition=[$t12]) → time = 0.5s, rows = 2
> TableScan(table=[[test_table]], filters=[null]) → time = 0.4s, rows = 3
>
> We’re thinking about devising a way to modify the generated code of `RelNode`s to extend their results with execution statistics, but we’re also curious if you know of any prior art in this area. If not, do you think that attempting to instrument the generated code is a reasonable solution to pursue?
>
> Thank you,