Hi,

On 2020-08-19 18:58:11 -0400, Alvaro Herrera wrote:
> On 2020-Aug-19, David Rowley wrote:
>
> > Andres' suggestion:
> >
> > regression=# explain (analyze, costs off, timing off, summary off)
> > select count(*) from tenk1 t1 inner join tenk1 t2 on
> > t1.twenty=t2.unique1;
> >                                       QUERY PLAN
> > ---------------------------------------------------------------------------------------
> >  Aggregate (actual rows=1 loops=1)
> >    ->  Nested Loop (actual rows=10000 loops=1)
> >          Cache Key: t1.twenty  Hits: 9980  Misses: 20  Evictions: 0  Overflows: 0
> >          ->  Seq Scan on tenk1 t1 (actual rows=10000 loops=1)
> >          ->  Index Scan using tenk1_unique1 on tenk1 t2 (actual rows=1 loops=20)
> >                Index Cond: (unique1 = t1.twenty)
> > (6 rows)
>
> I think it doesn't look terrible in the SubPlan case -- it kinda makes
> sense there -- but for nested loop it appears really strange.
While I'm against introducing a separate node for the caching, I'm
*not* against displaying a different node type when caching is present.
E.g. it'd be perfectly reasonable from my POV to have a 'Cached Nested
Loop' join and a plain 'Nested Loop' node in the above plan. I'd
probably still want to display the 'Cache Key' similar to your example,
but I don't see how it'd be better to display it with one more
intermediary node.

> On the performance aspect, I wonder what the overhead is, particularly
> considering Tom's point of making these nodes more expensive for cases
> with no caching.

I doubt it matters, since it's a well-predicted branch. But it's also
easy enough to just have a different Exec* function for the caching and
non-caching case, should that turn out to be a problem.

> And also, as the JIT saga continues, aren't we going to get plan trees
> recompiled too, at which point it won't matter much?

That's a fair bit out, I think. And even then it'll only help for
queries that run long enough (eventually also often enough, if we get
prepared statement JITing) to be worth JITing.

Greetings,

Andres Freund