Hi, On 2026-01-26 17:23:16 +0200, Heikki Linnakangas wrote: > On 23/01/2026 22:16, Andres Freund wrote: > > Hi, > > > > In [1] I was looking at the profile of a seqscan with a where clause that > > doesn't match any of the many rows. I was a bit saddened by where we were > > spending time. > > > > > > - The fetching of variables, as well as the null check of scandesc, in > > SeqNext() is repeated in every loop iteration of ExecScanExtended, > > despite > > that obviously not being required after the first iteration > > > > We could perhaps address this by moving the check to the callers of > > ExecScanExtended() or by extending ExecScanExtended to have an explicit > > beginscan callback that it calls after. > > For context, we're talking about this in SeqNext: > > > /* > > * get information from the estate and scan state > > */ > > scandesc = node->ss.ss_currentScanDesc; > > estate = node->ss.ps.state; > > direction = estate->es_direction; > > slot = node->ss.ss_ScanTupleSlot; > > Hmm. I guess the compiler doesn't know that the variables don't change > between calls, so it has to fetch them on every iteration. Passing them > through a 'const' pointer might give it clue, but I'm not sure how to > shoehorn that here.
My understanding is that compilers very rarely utilize const on pointers for optimization. The problem is that just because the current pointer is const doesn't mean there aren't other *non-const* pointers. And since there are a lot of external function calls involved here, there's no way the compiler could provide that that isn't the case :(. > Perhaps we should turn the ExecScanExtended() function inside out. Instead > of passing SeqNext as a callback to ExecScanExtended(), we would have a > function like this (for illustration purposes only, doesn't compile): That would be one approach, would require structural changes in a fair number of places though :/. A slightly simpler approach could be for ExecScanExtended to pass in these parameters as arguments to the callbacks. For things like estate, direction and scanslot, that makes plenty sense. It's a bit more problematic for the scan descriptor, due to the "lazy start" we have in a few places. I very briefly prototyped that (relying on the fact that all callers cast to the callback type and that passing unused arguments just works even if the function definition doesn't expect them), and that seems to do the trick. What shows up more visibly afterwards is that we set ExecScanExtended() sets econtext->ecxt_scantuple in every iteration, despite that not changing in almost all cases (I think for FDWs it could, fdwhandler.sgml just says that the scanslot "should" be used). Greetings, Andres Freund
