On Mon, Jan 6, 2025 at 10:18 PM David Rowley <dgrowle...@gmail.com> wrote: > On Sat, 21 Dec 2024 at 00:41, Amit Langote <amitlangot...@gmail.com> wrote: > > To address (1), I tried assigning specialized functions to > > PlanState.ExecProcNode in ExecInitSeqScan() based on whether qual or > > projInfo are NULL. Inspired by David Rowley’s suggestion to look at > > ExecHashJoinImpl(), I wrote variants like ExecSeqScanNoQual() (for > > qual == NULL) and ExecSeqScanNoProj() (for projInfo == NULL). These > > call a local version of ExecScan() that lives in nodeSeqScan.c, marked > > always-inline. This local copy takes qual and projInfo as arguments, > > letting compilers inline and optimize unnecessary branches away. > > I tested the performance of this and I do see close to a 5% > performance increase in TPC-H Q1. Nice.
Thanks David for looking at this. > I'm a little concerned with the method the patch takes where it copies > most of ExecScan and includes it in nodeSeqscan.c. If there are any > future changes to ExecScan, someone might forget to propagate those > changes into nodeSeqscan.c's version. What if instead you moved > ExecScan() into a header file and made it static inline? That way > callers would get their own inlined copy with the callback functions > inlined too, which for nodeSeqscan is good, since the recheck callback > does nothing. Yeah, having an inline-able version of ExecScan() in a separate header sounds better than what I proposed. > Just as an additional reason for why I think this might be a better > idea is that the patch doesn't seem to quite keep things equivalent as > in the process of having ExecSeqScanNoEPQImpl() directly call > SeqNext() without going through ExecScanFetch is that you've lost a > call to CHECK_FOR_INTERRUPTS(). Yeah, that was clearly a bug in my patch. > On the other hand, one possible drawback from making ExecScan a static > inline is that any non-core code that uses ExecScan won't get any bug > fixes if we were to fix some bug in ExecScan in a minor release unless > the extension is compiled again. That could be fixed by keeping > ExecScan as an extern function and maybe just having ExecScanExtended > as the static inline version. Yes, keeping ExecScan()'s interface unchanged seems better for the considerations you mention. > Another thing I wondered about is the naming conversion you're using > for these ExecSeqScan variant functions. > > +ExecSeqScanNoQualNoProj(PlanState *pstate) > +ExecSeqScanNoQual(PlanState *pstate) > +ExecSeqScanNoProj(PlanState *pstate) > +ExecSeqScanNoEPQ(PlanState *pstate) > > I think it's better to have a naming convention that aims to convey > what the function does do rather than what it does not do. Agreed. > I've attached my workings of what I was messing around with. It seems > to perform about the same as your version. I think maybe we'd need > some sort of execScan.h instead of where I've stuffed the functions > in. I've done that in the attached v2. > It would also be good if there was some way to give guarantees to the > compiler that a given pointer isn't NULL. For example in: > > return ExecScanExtended(&node->ss, > (ExecScanAccessMtd) SeqNext, > (ExecScanRecheckMtd) SeqRecheck, > NULL, > pstate->qual, > NULL); > > It would be good if when ExecScanExtended is inlined the compiler > wouldn't emit code for the "if (qual == NULL)" ... part. I don't know > if there's any way to do that. I thought I'd mention it in case > someone can think of a way... I guess you could add another parameter > that gets passed as a const and have the "if" test look at that > instead, that's a bit ugly though. I too am not sure of a way short of breaking ExecScanExtended() down into individual functions, each for the following cases: 1. qual != NULL && projInfo != NULL 2. qual != NULL (&& projInfo == NULL) 3. projInfo != NULL (&& qual == NULL) So basically, mirroring the variants we now have in nodeSeqScan.c to the execScan.h. To avoid inlining of the EPQ code when epqstate == NULL, rename ExecScanFetch() to ExecScanGetEPQTuple() and move the (*accessMtd) call to the caller when epqstate == NULL. CHECK_FOR_INTERRUPTS() is now repeated at every place that needs it. Attached 0002 shows a PoC of that. -- Thanks, Amit Langote
v2-0001-Refactor-ExecScan-to-inline-scan-filtering-and-pr.patch
Description: Binary data
v2-0002-Break-ExecScanExtended-into-variants-based-on-qua.patch
Description: Binary data