If you can put everything you need in moveNext(), that's great, since it involves no code generation for the instrumentation, which makes debugging much easier. You may not even need to change the RelNodes at all; after all, all of the implementations you want to instrument are already there, in EnumerableDefaults.
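For illustration, a minimal sketch of such a delegating wrapper around an Enumerable (the class name and wiring are hypothetical; only the linq4j Enumerable/Enumerator interfaces are assumed) could look like this:

    import java.util.concurrent.atomic.AtomicLong;

    import org.apache.calcite.linq4j.AbstractEnumerable;
    import org.apache.calcite.linq4j.Enumerable;
    import org.apache.calcite.linq4j.Enumerator;

    /** Hypothetical sketch: delegates to an inner Enumerable and counts the
     * rows pulled through moveNext(). */
    class CountingEnumerable<T> extends AbstractEnumerable<T> {
      private final Enumerable<T> delegate;
      private final AtomicLong rowCount = new AtomicLong();

      CountingEnumerable(Enumerable<T> delegate) {
        this.delegate = delegate;
      }

      long rowCount() {
        return rowCount.get();
      }

      @Override public Enumerator<T> enumerator() {
        final Enumerator<T> inner = delegate.enumerator();
        return new Enumerator<T>() {
          @Override public T current() {
            return inner.current();
          }

          @Override public boolean moveNext() {
            final boolean hasNext = inner.moveNext();
            if (hasNext) {
              rowCount.incrementAndGet(); // one row emitted by the wrapped node
            }
            return hasNext;
          }

          @Override public void reset() {
            inner.reset();
          }

          @Override public void close() {
            inner.close();
          }
        };
      }
    }

Per-operator timing could be layered on in the same place, for example by reading System.nanoTime() around the delegated inner.moveNext() call, at the measurement cost discussed below.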
Mihai

________________________________
From: Austin Richardson <austin.richard...@teampicnic.com.INVALID>
Sent: Wednesday, December 11, 2024 7:36 AM
To: dev@calcite.apache.org <dev@calcite.apache.org>
Subject: Re: Deriving Calcite queries' execution statistics

Indeed, our ultimate goal is to achieve something similar to what EXPLAIN ANALYZE provides for Postgres.

Regarding achieving this using the Enumerable convention: one idea we’ve started exploring at a high level is whether adding instrumentation to the moveNext() method would work. For example, given the RelNodes in the final execution plan, maybe we could wrap them with “InstrumentedRelNode”s. The InstrumentedRelNode’s Enumerable would then delegate to the inner RelNode’s Enumerable, but count (for example) the number of rows emitted from moveNext(). We can imagine this is much more complicated for timing, but we are curious if you have thoughts about whether this would work at all, as it is built on the assumption that Enumerables and moveNext() are generally used for data flow when executing a query.

Best,
Austin

On Tue, 10 Dec 2024 at 14:46, Stamatis Zampetakis <zabe...@apache.org> wrote:
>
> I suppose the goal is to have something like the EXPLAIN ANALYZE command that exists in Postgres (and in other DBMSs under different names). It would be a really nice feature for those using the Enumerable convention. I am not aware of existing work in this area but would be curious to know if there is any.
>
> The real execution time may be quite tricky to measure accurately, but other perf counters such as row counts, first/last row time, input size, etc., are probably easier to collect. As Mihai mentioned, instrumentation can get expensive, so this behavior should be configurable.
>
> Apart from instrumentation, another way to enrich the plan with profiling insights would be to use stacktrace sampling while the query is running and then create mappings between stacktraces and plan nodes based on predefined patterns. Essentially this approach is a separate tool, similar to async-profiler [1] but tailored around Calcite plan nodes.
>
> All in all, this is an interesting and ambitious topic and it would be nice to see contributions in this area.
>
> Best,
> Stamatis
>
> [1] https://github.com/async-profiler/async-profiler
>
> On Mon, Dec 9, 2024 at 8:00 PM Mihai Budiu <mbu...@gmail.com> wrote:
> >
> > Calcite is really a compiler framework.
> >
> > It does come with a runtime, based on the Enumerable implementation. I assume this is the one you want to measure.
> > The Enumerable implementation uses an iterator-based design, where each "row" may actually flow through many operators. Collections are not necessarily processed operator by operator; rather, operators may pull data from their inputs row by row and emit rows to their output. Only "blocking" operators, such as sorting, may accumulate a whole collection. So accounting for time per operator may not be trivial.
> >
> > Moreover, the iterator-based implementation is materialized as a Java program which is compiled using the Janino compiler and then executed. It will be a significant piece of work to actually generate additional instrumentation code.
> >
> > Finally, the instrumentation code itself may take a non-negligible amount of time, so it may perturb the measurement: data flows at such a small granularity that you would have to read some timer for every record.
> >
> > But there is no reason it cannot be done.
> > The first thing you should do is to inspect the generated code and think about how you could instrument it. You can put a breakpoint in RexExecutorImpl.reduce and look at the "code" variable.
> >
> > Mihai
> >
> > ________________________________
> > From: Wegdan Ghazi <wegdan.gh...@teampicnic.com.INVALID>
> > Sent: Monday, December 9, 2024 5:44 AM
> > To: dev@calcite.apache.org <dev@calcite.apache.org>
> > Subject: Deriving Calcite queries' execution statistics
> >
> > Hello Calcite devs,
> >
> > We’re trying to build a tool to help us analyze the performance of Calcite queries, to give insights into better practices beyond simply evaluating the execution time of the entire query.
> >
> > What we’d like to achieve is an analysis of the nodes of an executable plan, indicating the execution time as well as the number of rows produced for each `RelNode`.
> >
> > Example: for the query SELECT COUNT(*) FROM test_table_1 WHERE id = '1', we’d like to achieve something similar to:
> >
> > EnumerableAggregate(group=[{}], EXPR$0=[COUNT()]) → time = 1s, rows = 1
> >   EnumerableCalc(expr#0..10=[{inputs}], expr#11=['1':VARCHAR], expr#12=[=($t4, $t11)], proj#0..10=[{exprs}], $condition=[$t12]) → time = 0.5s, rows = 2
> >     TableScan(table=[[test_table]], filters=[null]) → time = 0.4s, rows = 3
> >
> > We’re thinking about devising a way to modify the generated code of `RelNode`s to extend their results with execution statistics, but we’re also curious if you know of any prior art in this area. If not, do you think that attempting to instrument the generated code is a reasonable solution to pursue?
> >
> > Thank you,
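As a programmatic alternative to the breakpoint in RexExecutorImpl.reduce, the generated Java source can also be captured at run time. A minimal sketch, assuming the Hook.JAVA_PLAN hook (which in recent Calcite versions is invoked with the generated source for Enumerable plans) and its Consumer-based addThread() overload:

    import java.util.function.Consumer;

    import org.apache.calcite.runtime.Hook;

    /** Hypothetical sketch: print the Java source Calcite generates for the
     * queries run by the given task on this thread. */
    class GeneratedCodeCapture {
      static void runWithCapture(Runnable queryTask) {
        // addThread registers the listener for the current thread only; the cast
        // pins the hook payload type to String (the generated source).
        try (Hook.Closeable ignored =
                 Hook.JAVA_PLAN.addThread((Consumer<String>) code ->
                     System.out.println("Generated plan code:\n" + code))) {
          queryTask.run(); // e.g. execute the query via a CalciteConnection statement
        }
      }
    }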