Re: POC: converting Lists into arrays
Hello, here are some macros for list_make, so now we can use list_make(...), not list_make1/2/3 ...

#define MACRO_ARGS(...) __VA_ARGS__
#define LIST_MAKE_1_(narg_, postfix_, ...) list_make ## narg_ ## postfix_(__VA_ARGS__)
#define LIST_MAKE_2_(...) LIST_MAKE_1_(__VA_ARGS__)
#define LIST_MAKE_3_(...) LIST_MAKE_2_(__VA_ARGS__)
#define list_make(...) LIST_MAKE_3_(MACRO_ARGS VA_ARGS_NARGS(__VA_ARGS__), /*empty*/, __VA_ARGS__)
#define list_make_int(...) LIST_MAKE_3_(MACRO_ARGS VA_ARGS_NARGS(__VA_ARGS__), _int, __VA_ARGS__)
#define list_make_oid(...) LIST_MAKE_3_(MACRO_ARGS VA_ARGS_NARGS(__VA_ARGS__), _oid, __VA_ARGS__)

The macro VA_ARGS_NARGS is defined in c.h.

How it works, for list_make_int(4,5,6):

step 1: LIST_MAKE_3_(MACRO_ARGS VA_ARGS_NARGS(4,5,6), _int, 4,5,6)
step 2: LIST_MAKE_2_(MACRO_ARGS (3), _int, 4,5,6)
step 3: LIST_MAKE_1_(3, _int, 4,5,6)
step 4: list_make3_int(4,5,6)
step 5: list_make3_impl(T_IntList, ((ListCell) {.int_value = (4)}), ((ListCell) {.int_value = (5)}), ((ListCell) {.int_value = (6)}))

Or we can define some public macros, like this:

#define MACRO_ARGS(...) __VA_ARGS__
#define MACRO_COMBIN_1(prefix_, center_, postfix_, ...) prefix_ ## center_ ## postfix_(__VA_ARGS__)
#define MACRO_COMBIN_2(...) MACRO_COMBIN_1(__VA_ARGS__)
#define MACRO_COMBIN_3(...) MACRO_COMBIN_2(__VA_ARGS__)
#define list_make(...) MACRO_COMBIN_3(list_make, MACRO_ARGS VA_ARGS_NARGS(__VA_ARGS__), /*empty*/, __VA_ARGS__)
#define list_make_int(...) MACRO_COMBIN_3(list_make, MACRO_ARGS VA_ARGS_NARGS(__VA_ARGS__), _int, __VA_ARGS__)
#define list_make_oid(...) MACRO_COMBIN_3(list_make, MACRO_ARGS VA_ARGS_NARGS(__VA_ARGS__), _oid, __VA_ARGS__)

bu...@sohu.com

From: Tom Lane
Date: 2019-02-24 10:24
To: pgsql-hackers
Subject: POC: converting Lists into arrays

For quite some years now there's been dissatisfaction with our List data structure implementation.
Because it separately palloc's each list cell, it chews up lots of memory, and it's none too cache-friendly because the cells aren't necessarily adjacent. Moreover, our typical usage is to just build a list by repeated lappend's and never modify it, so that the flexibility of having separately insertable/removable list cells is usually wasted. Every time this has come up, I've opined that the right fix is to jack up the List API and drive a new implementation underneath, as we did once before (cf commit d0b4399d81). I thought maybe it was about time to provide some evidence for that position, so attached is a POC patch that changes Lists into expansible arrays, while preserving most of their existing API.

The big-picture performance change is that this makes list_nth() a cheap O(1) operation, while lappend() is still pretty cheap; on the downside, lcons() becomes O(N), as does insertion or deletion in the middle of a list. But we don't use lcons() very much (and maybe a lot of the existing usages aren't really necessary?), while insertion/deletion in long lists is a vanishingly infrequent operation. Meanwhile, making list_nth() cheap is a *huge* win.

The most critical thing that we lose by doing this is that when a List is modified, all of its cells may need to move, which breaks a lot of code that assumes it can insert or delete a cell while hanging onto a pointer to a nearby cell. In almost every case, this takes the form of doing list insertions or deletions inside a foreach() loop, and the pointer that's invalidated is the loop's current-cell pointer. Fortunately, with list_nth() now being so cheap, we can replace these uses of foreach() with loops using an integer index variable and fetching the next list element directly with list_nth(). Most of these places were loops around list_delete_cell calls, which I replaced with a new function list_delete_nth_cell to go along with the emphasis on the integer index.
I don't claim to have found every case where that could happen, although I added debug support in list.c to force list contents to move on every list modification, and this patch does pass check-world with that support turned on. I fear that some such bugs remain, though.

There is one big way in which I failed to preserve the old API syntactically: lnext() now requires a pointer to the List as well as the current ListCell, so that it can figure out where the end of the cell array is. That requires touching something like 150 places that otherwise wouldn't have had to be touched, which is annoying, even though most of those changes are trivial.

I thought about avoiding that by requiring Lists to keep a "sentinel" value in the cell after the end of the active array, so that lnext() could look for the sentinel to detect list end. However, that idea doesn't really work, because if the list array has been moved, the spot where the sentinel had been could have been reallocated and filled with something else. So this'd offer no defense against the possibility of a stale ListCell pointer, which is something that we absolutely need ...
Re: POC: converting Lists into arrays
Alvaro Herrera writes:
> On 2019-Aug-09, Tom Lane wrote:
>> I poked at this, and attached is a patch, but again I'm not seeing
>> that there's any real performance-based argument for it.

> I'm confused. I thought that the point of doing this wasn't that we
> wanted to improve performance, but rather that we're now able to remove
> the array without *losing* performance. I mean, those arrays were there
> to improve performance for code that wanted fast access to specific list
> items, but now we have fast access to the list items without it. So a
> measurement that finds no performance difference is good news, and we
> can get rid of the now-pointless optimization ...

Yeah, fair enough, so pushed.

In principle, this change adds an extra indirection in exec_rt_fetch, so I went looking to see if there were any such calls in arguably performance-critical paths. Unsurprisingly, most calls are in executor initialization, and they tend to be adjacent to table_open() or other expensive operations, so it's pretty hard to claim that there could be any measurable hit.

However, I did notice that trigger.c uses ExecUpdateLockMode() and GetAllUpdatedColumns() in ExecBRUpdateTriggers, which executes per row, and so might be worth trying to optimize. exec_rt_fetch itself is not the main cost in either of those, but I wonder why we are doing those calculations over again for each row in the first place. I'm not excited enough about the issue to do anything right now, but the next time somebody whines about trigger-firing overhead, there might be an easy win available there.

			regards, tom lane
Re: POC: converting Lists into arrays
David Rowley writes:
> On Fri, 9 Aug 2019 at 04:24, Tom Lane wrote:
>> Has anyone got further thoughts about naming around list_concat
>> and friends?
>> If not, I'm inclined to go ahead with the concat-improvement patch as
>> proposed in [1], modulo the one improvement David spotted.
>> [1] https://www.postgresql.org/message-id/6704.1563739...@sss.pgh.pa.us

> I'm okay with the patch once that one improvement is done.

Pushed with that fix.

> I think if we want to think about freeing the 2nd input List then we
> can do that in another commit. Removing the redundant list_copy()
> calls seems quite separate from that.

The reason I was holding off is that this patch obscures the distinction between places that needed to preserve the second input (which were doing list_copy on it) and those that didn't (and weren't). If somebody wants to rethink the free-second-input business they'll now have to do a bit of software archaeology to determine which calls to change. But I don't think we're going to bother.

			regards, tom lane
Re: POC: converting Lists into arrays
On Sat, 10 Aug 2019 at 09:03, Tom Lane wrote:
> David Rowley writes:
> > On Fri, 9 Aug 2019 at 09:55, Tom Lane wrote:
> >> I still have hopes for getting rid of es_range_table_array though,
> >> and will look at that tomorrow or so.
>
> > Yes, please. I've measured that to be quite an overhead with large
> > partitioning setups. However, that was with some additional code which
> > didn't lock partitions until it was ... well too late... as it
> > turned out. But it seems pretty good to remove code that could be a
> > future bottleneck if we ever manage to do something else with the
> > locking of all partitions during UPDATE/DELETE.
>
> I poked at this, and attached is a patch, but again I'm not seeing
> that there's any real performance-based argument for it. So far
> as I can tell, if we've got a lot of RTEs in an executable plan,
> the bulk of the startup time is going into lock (re)acquisition in
> AcquirePlannerLocks, and/or permissions scanning in ExecCheckRTPerms;
> both of those have to do work for every RTE including ones that
> run-time pruning drops later on. ExecInitRangeTable just isn't on
> the radar.

In the code I tested with locally I ended up with a Bitmapset that marked which RTEs required permission checks so that ExecCheckRTPerms() could quickly skip RTEs with requiredPerms == 0. The Bitmapset was set in the planner. Note: expand_single_inheritance_child sets childrte->requiredPerms = 0, so there's nothing to do there for partitions, which is the most likely reason that the rtable list would be big. Sadly the locking is still a big overhead even with that fixed. Robert threw around some ideas in [1], but that seems like a pretty big project.

I don't think removing future bottlenecks is such a bad idea if it can be done in such a way that the code remains clean. It may serve to increase our motivation later to solve the remaining issues. We tend to go to greater lengths when there are more gains, and more gains are more easily visible by removing more bottlenecks.

Another reason to remove the es_range_table_array is that the reason it was added in the first place is no longer valid. We'd never have added it if we had array-based lists back then. (Reading below, it looks like Alvaro agrees with this too)

[1] https://www.postgresql.org/message-id/CA%2BTgmoYbtm1uuDne3rRp_uNA2RFiBwXX1ngj3RSLxOfc3oS7cQ%40mail.gmail.com

-- 
David Rowley            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: POC: converting Lists into arrays
On 2019-Aug-09, Tom Lane wrote:
> I poked at this, and attached is a patch, but again I'm not seeing
> that there's any real performance-based argument for it. So far
> as I can tell, if we've got a lot of RTEs in an executable plan,
> the bulk of the startup time is going into lock (re)acquisition in
> AcquirePlannerLocks, and/or permissions scanning in ExecCheckRTPerms;
> both of those have to do work for every RTE including ones that
> run-time pruning drops later on. ExecInitRangeTable just isn't on
> the radar.

I'm confused. I thought that the point of doing this wasn't that we wanted to improve performance, but rather that we're now able to remove the array without *losing* performance. I mean, those arrays were there to improve performance for code that wanted fast access to specific list items, but now we have fast access to the list items without it. So a measurement that finds no performance difference is good news, and we can get rid of the now-pointless optimization ...

-- 
Álvaro Herrera            https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: POC: converting Lists into arrays
David Rowley writes:
> On Fri, 9 Aug 2019 at 09:55, Tom Lane wrote:
>> I still have hopes for getting rid of es_range_table_array though,
>> and will look at that tomorrow or so.

> Yes, please. I've measured that to be quite an overhead with large
> partitioning setups. However, that was with some additional code which
> didn't lock partitions until it was ... well too late... as it
> turned out. But it seems pretty good to remove code that could be a
> future bottleneck if we ever manage to do something else with the
> locking of all partitions during UPDATE/DELETE.

I poked at this, and attached is a patch, but again I'm not seeing that there's any real performance-based argument for it. So far as I can tell, if we've got a lot of RTEs in an executable plan, the bulk of the startup time is going into lock (re)acquisition in AcquirePlannerLocks, and/or permissions scanning in ExecCheckRTPerms; both of those have to do work for every RTE including ones that run-time pruning drops later on. ExecInitRangeTable just isn't on the radar.

If we wanted to try to improve things further, it seems like we'd have to find a way to not lock unreferenced partitions at all, as you suggest above. But combining that with run-time pruning seems like it'd be pretty horrid from a system structural standpoint: if we acquire locks only during execution, what happens if we find we must invalidate the query plan?

Anyway, the attached might be worth committing just on cleanliness grounds, to avoid two-sources-of-truth issues in the executor. But it seems like there's no additional performance win here after all ... unless you've got a test case that shows differently?
regards, tom lane diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c index dbd7dd9..7f494ab 100644 --- a/src/backend/executor/execMain.c +++ b/src/backend/executor/execMain.c @@ -2790,7 +2790,6 @@ EvalPlanQualStart(EPQState *epqstate, EState *parentestate, Plan *planTree) estate->es_snapshot = parentestate->es_snapshot; estate->es_crosscheck_snapshot = parentestate->es_crosscheck_snapshot; estate->es_range_table = parentestate->es_range_table; - estate->es_range_table_array = parentestate->es_range_table_array; estate->es_range_table_size = parentestate->es_range_table_size; estate->es_relations = parentestate->es_relations; estate->es_queryEnv = parentestate->es_queryEnv; diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c index c1fc0d5..afd9beb 100644 --- a/src/backend/executor/execUtils.c +++ b/src/backend/executor/execUtils.c @@ -113,7 +113,6 @@ CreateExecutorState(void) estate->es_snapshot = InvalidSnapshot; /* caller must initialize this */ estate->es_crosscheck_snapshot = InvalidSnapshot; /* no crosscheck */ estate->es_range_table = NIL; - estate->es_range_table_array = NULL; estate->es_range_table_size = 0; estate->es_relations = NULL; estate->es_rowmarks = NULL; @@ -720,29 +719,17 @@ ExecOpenScanRelation(EState *estate, Index scanrelid, int eflags) * ExecInitRangeTable * Set up executor's range-table-related data * - * We build an array from the range table list to allow faster lookup by RTI. - * (The es_range_table field is now somewhat redundant, but we keep it to - * avoid breaking external code unnecessarily.) - * This is also a convenient place to set up the parallel es_relations array. + * In addition to the range table proper, initialize arrays that are + * indexed by rangetable index. 
*/ void ExecInitRangeTable(EState *estate, List *rangeTable) { - Index rti; - ListCell *lc; - /* Remember the range table List as-is */ estate->es_range_table = rangeTable; - /* Set up the equivalent array representation */ + /* Set size of associated arrays */ estate->es_range_table_size = list_length(rangeTable); - estate->es_range_table_array = (RangeTblEntry **) - palloc(estate->es_range_table_size * sizeof(RangeTblEntry *)); - rti = 0; - foreach(lc, rangeTable) - { - estate->es_range_table_array[rti++] = lfirst_node(RangeTblEntry, lc); - } /* * Allocate an array to store an open Relation corresponding to each @@ -753,8 +740,8 @@ ExecInitRangeTable(EState *estate, List *rangeTable) palloc0(estate->es_range_table_size * sizeof(Relation)); /* - * es_rowmarks is also parallel to the es_range_table_array, but it's - * allocated only if needed. + * es_rowmarks is also parallel to the es_range_table, but it's allocated + * only if needed. */ estate->es_rowmarks = NULL; } diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h index 1fb28b4..39c8b3b 100644 --- a/src/include/executor/executor.h +++ b/src/include/executor/executor.h @@ -535,8 +535,7 @@ extern void ExecInitRangeTable(EState *estate, List *rangeTable); static inline RangeTblEntry * exec_rt_fetch(Index rti, EState *estate) { - Assert(rti > 0 && rti <= estate->es_range_table_size); - return
Re: POC: converting Lists into arrays
On Fri, 9 Aug 2019 at 09:55, Tom Lane wrote:
> Attached is a patch that removes simple_rte_array in favor of just
> accessing the query's rtable directly. I concluded that there was
> not much point in messing with simple_rel_array or append_rel_array,
> because they are not in fact just mirrors of some List. There's no
> List at all of baserel RelOptInfos, and while we do have a list of
> AppendRelInfos, it's a compact, random-order list not one indexable
> by child relid.
>
> Having done this, though, I'm a bit discouraged about whether to commit
> it. In light testing, it's not any faster than HEAD and in complex
> queries seems to actually be a bit slower. I suspect the reason is
> that we've effectively replaced
>	root->simple_rte_array[i]
> with
>	root->parse->rtable->elements[i-1]
> and the two extra levels of indirection are taking their toll.

If there are no performance gains from this then -1 from me. We're all pretty used to it the way it is.

> I realized that it's pretty dumb not to
> just merge setup_append_rel_array into setup_simple_rel_arrays.
> The division of labor there is inconsistent with the fact that
> there's no such division in expand_planner_arrays.

ha, yeah I'd vote for merging those. It was coded that way originally until someone objected! :)

> I still have hopes for getting rid of es_range_table_array though,
> and will look at that tomorrow or so.

Yes, please. I've measured that to be quite an overhead with large partitioning setups. However, that was with some additional code which didn't lock partitions until it was ... well too late... as it turned out. But it seems pretty good to remove code that could be a future bottleneck if we ever manage to do something else with the locking of all partitions during UPDATE/DELETE.

-- 
David Rowley            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: POC: converting Lists into arrays
On Fri, 9 Aug 2019 at 04:24, Tom Lane wrote:
> I wrote:
>> Perhaps there's an argument for doing something to change the behavior
>> of list_union and list_difference and friends. Not sure --- it could
>> be a foot-gun for back-patching. I'm already worried about the risk
>> of back-patching code that assumes the new semantics of list_concat.
>> (Which might be a good argument for renaming it to something else?
>> Just not list_union, please.)
>
> Has anyone got further thoughts about naming around list_concat
> and friends?
>
> If not, I'm inclined to go ahead with the concat-improvement patch as
> proposed in [1], modulo the one improvement David spotted.
>
>			regards, tom lane
>
> [1] https://www.postgresql.org/message-id/6704.1563739...@sss.pgh.pa.us

I'm okay with the patch once that one improvement is done.

I think if we want to think about freeing the 2nd input List then we can do that in another commit. Removing the redundant list_copy() calls seems quite separate from that.

-- 
David Rowley            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: POC: converting Lists into arrays
I wrote:
> BTW, further on the subject of performance --- I'm aware of at least
> these topics for follow-on patches:
> * Fix places that are maintaining arrays parallel to Lists for
> access-speed reasons (at least simple_rte_array, append_rel_array,
> es_range_table_array).

Attached is a patch that removes simple_rte_array in favor of just accessing the query's rtable directly. I concluded that there was not much point in messing with simple_rel_array or append_rel_array, because they are not in fact just mirrors of some List. There's no List at all of baserel RelOptInfos, and while we do have a list of AppendRelInfos, it's a compact, random-order list not one indexable by child relid.

Having done this, though, I'm a bit discouraged about whether to commit it. In light testing, it's not any faster than HEAD and in complex queries seems to actually be a bit slower. I suspect the reason is that we've effectively replaced
	root->simple_rte_array[i]
with
	root->parse->rtable->elements[i-1]
and the two extra levels of indirection are taking their toll.

It'd be possible to get rid of one of those indirections by maintaining a copy of root->parse->rtable directly in PlannerInfo; but that throws away most of the intellectual appeal of not having two sources of truth to maintain, and it won't completely close the performance gap.

Other minor objections include:

* Many RTE accesses now look randomly different from adjacent RelOptInfo accesses.

* The inheritance-expansion code is a bit sloppy about how much it will expand these arrays, which means it's possible in corner cases for length(parse->rtable) to be less than root->simple_rel_array_size-1. This resulted in a crash in add_other_rels_to_query, which was assuming it could fetch a possibly-null RTE pointer from indexes up to simple_rel_array_size-1. While that wasn't hard to fix, I wonder whether any third-party code has similar assumptions.

So on the whole, I'm inclined not to do this.
There are some cosmetic bits of this patch that I do want to commit though: I found some out-of-date comments, and I realized that it's pretty dumb not to just merge setup_append_rel_array into setup_simple_rel_arrays. The division of labor there is inconsistent with the fact that there's no such division in expand_planner_arrays. I still have hopes for getting rid of es_range_table_array though, and will look at that tomorrow or so. regards, tom lane diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c index 06a2058..8bd1c47 100644 --- a/contrib/postgres_fdw/postgres_fdw.c +++ b/contrib/postgres_fdw/postgres_fdw.c @@ -2198,7 +2198,7 @@ postgresPlanDirectModify(PlannerInfo *root, } else foreignrel = root->simple_rel_array[resultRelation]; - rte = root->simple_rte_array[resultRelation]; + rte = planner_rt_fetch(resultRelation, root); fpinfo = (PgFdwRelationInfo *) foreignrel->fdw_private; /* diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c index e9ee32b..fe7f8b1 100644 --- a/src/backend/optimizer/path/allpaths.c +++ b/src/backend/optimizer/path/allpaths.c @@ -307,7 +307,7 @@ set_base_rel_sizes(PlannerInfo *root) if (rel->reloptkind != RELOPT_BASEREL) continue; - rte = root->simple_rte_array[rti]; + rte = planner_rt_fetch(rti, root); /* * If parallelism is allowable for this query in general, see whether @@ -349,7 +349,7 @@ set_base_rel_pathlists(PlannerInfo *root) if (rel->reloptkind != RELOPT_BASEREL) continue; - set_rel_pathlist(root, rel, rti, root->simple_rte_array[rti]); + set_rel_pathlist(root, rel, rti, planner_rt_fetch(rti, root)); } } @@ -1008,7 +1008,7 @@ set_append_rel_size(PlannerInfo *root, RelOptInfo *rel, continue; childRTindex = appinfo->child_relid; - childRTE = root->simple_rte_array[childRTindex]; + childRTE = planner_rt_fetch(childRTindex, root); /* * The child rel's RelOptInfo was already created during @@ -1239,7 +1239,7 @@ set_append_rel_pathlist(PlannerInfo *root, 
RelOptInfo *rel, /* Re-locate the child RTE and RelOptInfo */ childRTindex = appinfo->child_relid; - childRTE = root->simple_rte_array[childRTindex]; + childRTE = planner_rt_fetch(childRTindex, root); childrel = root->simple_rel_array[childRTindex]; /* @@ -3742,9 +3742,8 @@ print_relids(PlannerInfo *root, Relids relids) { if (!first) printf(" "); - if (x < root->simple_rel_array_size && - root->simple_rte_array[x]) - printf("%s", root->simple_rte_array[x]->eref->aliasname); + if (x <= list_length(root->parse->rtable)) + printf("%s", planner_rt_fetch(x, root)->eref->aliasname); else printf("%d", x); first = false; diff --git a/src/backend/optimizer/plan/analyzejoins.c b/src/backend/optimizer/plan/analyzejoins.c index d19ff41..2576439 100644 --- a/src/backend/optimizer/plan/analyzejoins.c +++ b/src/backend/optimizer/plan/analyzejoins.c @@ -609,7 +609,7 @@
Re: POC: converting Lists into arrays
Andres Freund writes:
> On 2019-07-31 19:40:09 -0400, Tom Lane wrote:
>> That makes the other idea (of a foreach-ish macro declaring the
>> listcell value variable) problematic, too :-(.

> Hm. One way partially around that would be using an anonymous struct
> inside the for(). Something like
>
> #define foreach_node(membertype, name, lst) \
>	for (struct {membertype *node; ListCell *lc; const List *l; int i;} name = {...}; \
>		 ...)
>
> which then would allow code like
>
> foreach_node(OpExpr, cur, list)
> {
>	do_something_with_node(cur.node);
>	foreach_delete_current(cur);
> }

I'm hesitant to change the look of our loops quite that much, mainly because it'll be a pain for back-patching. If you write some code for HEAD like this, and then have to back-patch it, you'll need to insert/change significantly more code than if it's just a matter of whether there's a ListCell variable or not.

I experimented with the "aforeach" idea I suggested upthread, to the extent of writing the macros and then converting parse_clause.c (a file chosen more or less at random) to use aforeach instead of foreach. I was somewhat surprised to find that every single foreach() did convert pleasantly. (There are several forboth's that I didn't try to do anything with, though.)

If we do go in this direction, I wouldn't suggest trying to actually do wholesale conversion of existing code like this; that seems more likely to create back-patching land mines than do anything helpful. I am slightly tempted to try to convert everyplace using foreach_delete_current, though, since those loops are different from v12 already.

Thoughts?
regards, tom lane diff --git a/src/backend/parser/parse_clause.c b/src/backend/parser/parse_clause.c index 2a6b2ff..39d8d8e 100644 --- a/src/backend/parser/parse_clause.c +++ b/src/backend/parser/parse_clause.c @@ -117,8 +117,6 @@ static Node *transformFrameOffset(ParseState *pstate, int frameOptions, void transformFromClause(ParseState *pstate, List *frmList) { - ListCell *fl; - /* * The grammar will have produced a list of RangeVars, RangeSubselects, * RangeFunctions, and/or JoinExprs. Transform each one (possibly adding @@ -128,9 +126,9 @@ transformFromClause(ParseState *pstate, List *frmList) * Note we must process the items left-to-right for proper handling of * LATERAL references. */ - foreach(fl, frmList) + aforeach(frmList) { - Node *n = lfirst(fl); + Node *n = (Node *) aforeach_current(); RangeTblEntry *rte; int rtindex; List *namespace; @@ -267,11 +265,10 @@ extractRemainingColumns(List *common_colnames, { char *colname = strVal(lfirst(lnames)); bool match = false; - ListCell *cnames; - foreach(cnames, common_colnames) + aforeach(common_colnames) { - char *ccolname = strVal(lfirst(cnames)); + char *ccolname = strVal(aforeach_current()); if (strcmp(colname, ccolname) == 0) { @@ -475,7 +472,6 @@ transformRangeFunction(ParseState *pstate, RangeFunction *r) List *coldeflists = NIL; bool is_lateral; RangeTblEntry *rte; - ListCell *lc; /* * We make lateral_only names of this level visible, whether or not the @@ -505,9 +501,9 @@ transformRangeFunction(ParseState *pstate, RangeFunction *r) * Likewise, collect column definition lists if there were any. But * complain if we find one here and the RangeFunction has one too. 
*/ - foreach(lc, r->functions) + aforeach(r->functions) { - List *pair = (List *) lfirst(lc); + List *pair = aforeach_current_node(List); Node *fexpr; List *coldeflist; Node *newfexpr; @@ -551,11 +547,9 @@ transformRangeFunction(ParseState *pstate, RangeFunction *r) fc->over == NULL && coldeflist == NIL) { -ListCell *lc; - -foreach(lc, fc->args) +aforeach(fc->args) { - Node *arg = (Node *) lfirst(lc); + Node *arg = (Node *) aforeach_current(); FuncCall *newfc; last_srf = pstate->p_last_srf; @@ -700,7 +694,6 @@ transformRangeTableFunc(ParseState *pstate, RangeTableFunc *rtf) Oid docType; RangeTblEntry *rte; bool is_lateral; - ListCell *col; char **names; int colno; @@ -743,9 +736,9 @@ transformRangeTableFunc(ParseState *pstate, RangeTableFunc *rtf) names = palloc(sizeof(char *) * list_length(rtf->columns)); colno = 0; - foreach(col, rtf->columns) + aforeach(rtf->columns) { - RangeTableFuncCol *rawc = (RangeTableFuncCol *) lfirst(col); + RangeTableFuncCol *rawc = aforeach_current_node(RangeTableFuncCol); Oid typid; int32 typmod; Node *colexpr; @@ -837,15 +830,13 @@ transformRangeTableFunc(ParseState *pstate, RangeTableFunc *rtf) /* Namespaces, if any, also need to be transformed */ if (rtf->namespaces != NIL) { - ListCell *ns; - ListCell *lc2; List *ns_uris = NIL; List *ns_names = NIL; bool default_ns_seen = false; - foreach(ns, rtf->namespaces) +
Re: POC: converting Lists into arrays
[ returning to this topic now that the CF is over ]

I wrote:
> Perhaps there's an argument for doing something to change the behavior
> of list_union and list_difference and friends. Not sure --- it could
> be a foot-gun for back-patching. I'm already worried about the risk
> of back-patching code that assumes the new semantics of list_concat.
> (Which might be a good argument for renaming it to something else?
> Just not list_union, please.)

Has anyone got further thoughts about naming around list_concat and friends?

If not, I'm inclined to go ahead with the concat-improvement patch as proposed in [1], modulo the one improvement David spotted.

			regards, tom lane

[1] https://www.postgresql.org/message-id/6704.1563739...@sss.pgh.pa.us
Re: POC: converting Lists into arrays
On Thu, 8 Aug 2019 at 12:18, Andres Freund wrote: > Hi, > > On 2019-08-08 11:36:44 +0800, Craig Ringer wrote: > > > you can only put one into the first element of a > > > for (;;). > > > > > > > Use an anonymous block outer scope? Or if not permitted even by C99 > (which > > I think it is), a do {...} while (0); hack? > > You can't easily - the problem is that there's no real way to add the > closing }, because that's after the macro. Ah, right. Hence our PG_TRY(); { } PG_CATCH(); { } PG_END_TRY(); construct in all its beauty. I should've seen that. -- Craig Ringer http://www.2ndQuadrant.com/ 2ndQuadrant - PostgreSQL Solutions for the Enterprise
Re: POC: converting Lists into arrays
Hi, On 2019-08-08 11:36:44 +0800, Craig Ringer wrote: > > you can only put one into the first element of a > > for (;;). > > > > Use an anonymous block outer scope? Or if not permitted even by C99 (which > I think it is), a do {...} while (0); hack? You can't easily - the problem is that there's no real way to add the closing }, because that's after the macro. Greetings, Andres Freund
Re: POC: converting Lists into arrays
On Thu, 1 Aug 2019 at 07:40, Tom Lane wrote: > Andres Freund writes: > > Unfortunately foreach(ListCell *lc, ...) doesn't work with the current > > definition. Which I think isn't great, because the large scopes for loop > > iteration variables imo makes the code harder to reason about. > > Totally agree. > > you can only put one into the first element of a > for (;;). > Use an anonymous block outer scope? Or if not permitted even by C99 (which I think it is), a do {...} while (0); hack? -- Craig Ringer http://www.2ndQuadrant.com/ 2ndQuadrant - PostgreSQL Solutions for the Enterprise
Re: POC: converting Lists into arrays
Hi,

On 2019-07-31 19:40:09 -0400, Tom Lane wrote:
> Andres Freund writes:
>> Unfortunately foreach(ListCell *lc, ...) doesn't work with the current
>> definition. Which I think isn't great, because the large scopes for loop
>> iteration variables imo makes the code harder to reason about.
>
> Yeah, I tried to make that possible when I redid those macros, but
> couldn't find a way :-(. Even granting that we're willing to have
> a different macro for this use-case, it doesn't seem easy, because
> you can only put one declaration into the first element of a
> for (;;).

I remember hitting that at one point and being annoyed/confused as to where that restriction came from. Probably some grammar difficulties. But still, odd.

> That makes the other idea (of a foreach-ish macro declaring the
> listcell value variable) problematic, too :-(.

Hm. One way partially around that would be using an anonymous struct inside the for(). Something like

#define foreach_node(membertype, name, lst) \
	for (struct {membertype *node; ListCell *lc; const List *l; int i;} name = {...}; \
		 ...)

which then would allow code like

foreach_node(OpExpr, cur, list)
{
	do_something_with_node(cur.node);
	foreach_delete_current(cur);
}

That's quite similar to your:

> One idea is that we could do something like
>
> foreach_variant(identifier, list_value)
> {
>	type *v = (type *) lfirst_variant(identifier);
>	...
> }
>
> where the "identifier" isn't actually a variable name but just something
> we use to construct the ForEachState variable's name. (The only reason
> we need it is to avoid confusion in cases with nested foreach's.) The
> lfirst_variant macro would fetch the correct value just by looking
> at the ForEachState, so there's no separate ListCell* variable at all.

but would still allow to avoid the variable.

Greetings,

Andres Freund
Re: POC: converting Lists into arrays
I wrote: > BTW, I think we could make equivalent macros in the old regime, > which would be a good thing because then it would be possible to > back-patch code using this notation. Oh, wait-a-second. I was envisioning that for (ListCell *anonymous__lc = ...) would work for that, but of course that requires C99, so we could only put it into v12. But that might still be worth doing. It'd mean that the backpatchability of this notation is the same as that of "for (int x = ...)", which seems worth something. regards, tom lane
Re: POC: converting Lists into arrays
I wrote: > One idea is that we could do something like > foreach_variant(identifier, list_value) > { >type *v = (type *) lfirst_variant(identifier); >... > } > where the "identifier" isn't actually a variable name but just something > we use to construct the ForEachState variable's name. (The only reason > we need it is to avoid confusion in cases with nested foreach's.) On second thought, there seems no strong reason why you should need to fetch the current value of a foreach-ish loop that's not the most closely nested one. So forget the dummy identifier, and consider this straw-man proposal: #define aforeach(list_value) ... (I'm thinking "anonymous foreach", but bikeshedding welcome.) This is just like the current version of foreach(), except it uses a fixed name for the ForEachState variable and doesn't attempt to assign to a "cell" variable. #define aforeach_current() ... Retrieves the current value of the most-closely-nested aforeach loop, based on knowing the fixed name of aforeach's loop variable. This replaces "lfirst(lc)", and we'd also need aforeach_current_int() and so on for the other variants of lfirst(). So usage would look like, say, aforeach(my_list) { type *my_value = (type *) aforeach_current(); ... } We'd also want aforeach_delete_current() and aforeach_current_index(), to provide functionality equivalent to foreach_delete_current() and foreach_current_index(). These names are a bit long, and maybe we should try to make them shorter, but more shortness might also mean less clarity. BTW, I think we could make equivalent macros in the old regime, which would be a good thing because then it would be possible to back-patch code using this notation. Thoughts? regards, tom lane
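A minimal model of this straw-man proposal, using a toy list and illustrative names (the real `ForEachState` and macros would live in pg_list.h; `aforeach_state__` is a made-up fixed name), might look like:

```c
#include <assert.h>

/* Toy stand-ins for the array-based List (hypothetical, simplified). */
typedef struct ListCell { int int_value; } ListCell;
typedef struct List { int length; ListCell *elements; } List;

typedef struct ForEachState { const List *l; int i; } ForEachState;

/*
 * Fixed-name loop state: a nested aforeach() simply shadows the outer
 * one's state variable, which is why aforeach_current_int() always
 * refers to the most closely nested loop.
 */
#define aforeach(lst) \
    for (ForEachState aforeach_state__ = {(lst), 0}; \
         aforeach_state__.i < aforeach_state__.l->length; \
         aforeach_state__.i++)

#define aforeach_current_int() \
    (aforeach_state__.l->elements[aforeach_state__.i].int_value)
#define aforeach_current_index() (aforeach_state__.i)

static int sum_ints(const List *lst)
{
    int sum = 0;

    aforeach(lst)
        sum += aforeach_current_int();
    return sum;
}
```

An aforeach_delete_current() analogue would likewise just manipulate aforeach_state__, with no separate ListCell* variable anywhere.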
Re: POC: converting Lists into arrays
Andres Freund writes: > Unfortunately foreach(ListCell *lc, ...) doesn't work with the current > definition. Which I think isn't great, because the large scopes for loop > iteration variables imo makes the code harder to reason about. Yeah, I tried to make that possible when I redid those macros, but couldn't find a way :-(. Even granting that we're willing to have a different macro for this use-case, it doesn't seem easy, because you can only put one into the first element of a for (;;). That makes the other idea (of a foreach-ish macro declaring the listcell value variable) problematic, too :-(. One idea is that we could do something like foreach_variant(identifier, list_value) { type *v = (type *) lfirst_variant(identifier); ... } where the "identifier" isn't actually a variable name but just something we use to construct the ForEachState variable's name. (The only reason we need it is to avoid confusion in cases with nested foreach's.) The lfirst_variant macro would fetch the correct value just by looking at the ForEachState, so there's no separate ListCell* variable at all. regards, tom lane
Re: POC: converting Lists into arrays
On 01/08/2019 01:04, Andres Freund wrote: Hi, On 2019-07-31 16:00:47 -0700, Andres Freund wrote: On 2019-07-31 15:57:56 -0700, Andres Freund wrote: I also wonder if a foreach version that includes the typical (Type *) var = (Type *) lfirst(lc); or (Type *) var = castNode(Type, lfirst(lc)); or OpExpr *hclause = lfirst_node(OpExpr, lc); would make it nicer to use lists. foreach_node_in(Type, name, list) could mean something like foreach(ListCell *name##_cell, list) { Type* name = lfirst_node(Type, name##_cell); } s/lfirst/linitial/ of course. Was looking at code that also used lfirst... Bullshit, of course. /me performs a tactical withdrawal into his brown paper bag. Reminds me that one advantage of macros like the second one would also be to reduce the use of the confusingly named linitial*(), helping newer hackers. But that point just had two consecutive embarrassing demonstrations... Yeah, pg_list.h is one file I never close. -- Petr Jelinek 2ndQuadrant - PostgreSQL Solutions for the Enterprise https://www.2ndQuadrant.com/
Re: POC: converting Lists into arrays
Hi, On 2019-07-31 16:00:47 -0700, Andres Freund wrote: > On 2019-07-31 15:57:56 -0700, Andres Freund wrote: > > I also wonder if a foreach version that includes the typical > > (Type *) var = (Type *) lfirst(lc); > > or > > (Type *) var = castNode(Type, lfirst(lc)); > > or > > OpExpr *hclause = lfirst_node(OpExpr, lc); > > > > would make it nicer to use lists. > > > > foreach_node_in(Type, name, list) could mean something like > > > > foreach(ListCell *name##_cell, list) > > { > > Type* name = lfirst_node(Type, name##_cell); > > } > > s/lfirst/linitial/ of course. Was looking at code that also used > lfirst... Bullshit, of course. /me performs a tactical withdrawal into his brown paper bag. > Reminds me that one advantage of macros like the second one would also > be to reduce the use of the confusingly named linitial*(), helping newer > hackers. But that point just had two consecutive embarrassing demonstrations... - Andres
Re: POC: converting Lists into arrays
Hi, On 2019-07-31 15:57:56 -0700, Andres Freund wrote: > I also wonder if a foreach version that includes the typical > (Type *) var = (Type *) lfirst(lc); > or > (Type *) var = castNode(Type, lfirst(lc)); > or > OpExpr *hclause = lfirst_node(OpExpr, lc); > > would make it nicer to use lists. > > foreach_node_in(Type, name, list) could mean something like > > foreach(ListCell *name##_cell, list) > { > Type* name = lfirst_node(Type, name##_cell); > } s/lfirst/linitial/ of course. Was looking at code that also used lfirst... Reminds me that one advantage of macros like the second one would also be to reduce the use of the confusingly named linitial*(), helping newer hackers. Greetings, Andres Freund
Re: POC: converting Lists into arrays
Hi, I was just looking at the diff for a fix, which adds a "ListCell *lc;" to function scope, even though it's only needed in a pretty narrow scope. Unfortunately foreach(ListCell *lc, ...) doesn't work with the current definition. Which I think isn't great, because the large scopes for loop iteration variables imo makes the code harder to reason about. I wonder if we could either have a different version of foreach() that allows that, or find a way to make the above work. For the latter I don't immediately have a good idea of how to accomplish that. For the former it's easy enough if we either don't include the typename (thereby looking more alien), or if we reference the name separately (making it more complicated to use). I also wonder if a foreach version that includes the typical (Type *) var = (Type *) lfirst(lc); or (Type *) var = castNode(Type, lfirst(lc)); or OpExpr *hclause = lfirst_node(OpExpr, lc); would make it nicer to use lists. foreach_node_in(Type, name, list) could mean something like foreach(ListCell *name##_cell, list) { Type* name = lfirst_node(Type, name##_cell); } (using a hypothetical foreach that supports defining the ListCell in scope, just for display simplicity's sake). Greetings, Andres Freund
Re: POC: converting Lists into arrays
Alvaro Herrera writes: > So with this patch we end up with: > list_union (copies list1, appends list2 element not already in list1) > list_concat_unique (appends list2 elements not already in list) > list_concat (appends all list2 elements) > list_concat_copy (copies list1, appends all list2 elements) > This seems a little random -- for example we end up with "union" being > the same as "concat_copy" except for the copy; and the analogy between > those two seems to exactly correspond to that between "concat_unique" > and "concat". Yeah, list_concat_unique is kind of weird here. Its header comment even points out that it's much like list_union: * This is almost the same functionality as list_union(), but list1 is * modified in-place rather than being copied. However, callers of this * function may have strict ordering expectations -- i.e. that the relative * order of those list2 elements that are not duplicates is preserved. I think that last sentence is bogus --- does anybody really think people have been careful not to assume anything about the ordering of list_union results? > I would propose to use the name list_union, with flags > being "unique" (or "uniquify" if that's a word, or even just "all" which > seems obvious to people with a SQL background), and something that > suggests "copy_first". I really dislike using "union" for something that doesn't have the same semantics as SQL's UNION (ie guaranteed duplicate elimination); so I've never been that happy with "list_union" and "list_difference". Propagating that into things that aren't doing any dup-elimination at all seems very wrong. Also, a big -1 for replacing these calls with something with extra parameter(s). That's going to be verbose, and not any more readable, and probably slower because the called code will have to figure out what to do. Perhaps there's an argument for doing something to change the behavior of list_union and list_difference and friends. 
Not sure --- it could be a foot-gun for back-patching. I'm already worried about the risk of back-patching code that assumes the new semantics of list_concat. (Which might be a good argument for renaming it to something else? Just not list_union, please.) regards, tom lane
Re: POC: converting Lists into arrays
On 2019-Jul-22, Tom Lane wrote: > David Rowley writes: > > If we do end up with another function, it might be nice to stay away > > from using "concat" in the name. I think we might struggle if there > > are too many variations on concat and there's a risk we'll use the > > wrong one. If we need this then perhaps something like > > list_append_all() might be a better choice... I'm struggling to build > > a strong opinion on this though. (I know that because I've deleted > > this paragraph 3 times and started again, each time with a different > > opinion.) > > Yeah, the name is really the sticking point here; if we could think > of a name that was easy to understand then the whole thing would be > much easier to accept. The best I've been able to come up with is > "list_join", by analogy to bms_join for bitmapsets ... but that's > not great. So with this patch we end up with:

list_union (copies list1, appends list2 elements not already in list1)
list_concat_unique (appends list2 elements not already in list1)
list_concat (appends all list2 elements)
list_concat_copy (copies list1, appends all list2 elements)

This seems a little random -- for example we end up with "union" being the same as "concat_copy" except for the copy; and the analogy between those two seems to exactly correspond to that between "concat_unique" and "concat". I would propose to use the name list_union, with flags being "unique" (or "uniquify" if that's a word, or even just "all" which seems obvious to people with a SQL background), and something that suggests "copy_first". Maybe we can offer a single name that does the four things, selecting the exact semantics with boolean flags? (We can provide the old names as macros, to avoid unnecessarily breaking other code). Also, perhaps it would make sense to put them all closer in the source file. -- Álvaro Herrera        https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
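The four behaviors under discussion can be modeled with a toy int-list. This is hypothetical and much simplified — the real functions operate on List/ListCell in src/backend/nodes/list.c, and these names mirror only their copy/uniqueness semantics:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy int-list standing in for the array-based List (hypothetical). */
typedef struct List { int length; int capacity; int *elems; } List;

static List *list_new(void)
{
    List *l = malloc(sizeof(List));
    l->length = 0;
    l->capacity = 8;
    l->elems = malloc(l->capacity * sizeof(int));
    return l;
}

static void list_append(List *l, int v)
{
    if (l->length == l->capacity)
        l->elems = realloc(l->elems, (l->capacity *= 2) * sizeof(int));
    l->elems[l->length++] = v;
}

static bool list_member(const List *l, int v)
{
    for (int i = 0; i < l->length; i++)
        if (l->elems[i] == v)
            return true;
    return false;
}

static List *list_copy(const List *l)
{
    List *r = list_new();
    for (int i = 0; i < l->length; i++)
        list_append(r, l->elems[i]);
    return r;
}

/* list_concat: destructively appends all list2 elements to list1 */
static List *list_concat(List *list1, const List *list2)
{
    for (int i = 0; i < list2->length; i++)
        list_append(list1, list2->elems[i]);
    return list1;
}

/* list_concat_copy: appends all list2 elements, modifying neither input */
static List *list_concat_copy(const List *list1, const List *list2)
{
    return list_concat(list_copy(list1), list2);
}

/* list_concat_unique: appends only list2 elements not already in list1 */
static List *list_concat_unique(List *list1, const List *list2)
{
    for (int i = 0; i < list2->length; i++)
        if (!list_member(list1, list2->elems[i]))
            list_append(list1, list2->elems[i]);
    return list1;
}

/* list_union: like list_concat_unique, but copies list1 first */
static List *list_union(const List *list1, const List *list2)
{
    return list_concat_unique(list_copy(list1), list2);
}
```

With list1 = {1, 2} and list2 = {2, 3}: list_union yields {1, 2, 3} and leaves list1 alone; list_concat_copy yields {1, 2, 2, 3} and leaves list1 alone; the non-copy variants return their (modified) first argument — exactly the copy/dedup grid in the message above.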
Re: POC: converting Lists into arrays
David Rowley writes: > I looked over this and only noted down one thing: > In estimate_path_cost_size, can you explain why list_concat_copy() is > needed here? I don't see remote_param_join_conds being used after > this, so might it be better to just get rid of remote_param_join_conds > and pass remote_conds to classifyConditions(), then just > list_concat()? Hm, you're right, remote_param_join_conds is not used after that, so we could just drop the existing list_copy() and make it remote_conds = list_concat(remote_param_join_conds, fpinfo->remote_conds); I'm disinclined to change the API of classifyConditions(), if that's what you were suggesting. >> It turns out there are a *lot* of places where list_concat() callers >> are now leaking the second input list (where before they just leaked >> that list's header). So I've got mixed emotions about the choice not >> to add a variant function that list_free's the second input. > In some of these places, for example, the calls to > generate_join_implied_equalities_normal() and > generate_join_implied_equalities_broken(), I wonder, since these are > static functions if we could just change the function signature to > accept a List to append to. I'm pretty disinclined to do that, too. Complicating function APIs for marginal performance gains isn't something that leads to understandable or maintainable code. > If we do end up with another function, it might be nice to stay away > from using "concat" in the name. I think we might struggle if there > are too many variations on concat and there's a risk we'll use the > wrong one. If we need this then perhaps something like > list_append_all() might be a better choice... I'm struggling to build > a strong opinion on this though. (I know that because I've deleted > this paragraph 3 times and started again, each time with a different > opinion.) 
Yeah, the name is really the sticking point here; if we could think of a name that was easy to understand then the whole thing would be much easier to accept. The best I've been able to come up with is "list_join", by analogy to bms_join for bitmapsets ... but that's not great. regards, tom lane
Re: POC: converting Lists into arrays
On Mon, 22 Jul 2019 at 08:01, Tom Lane wrote: > > I wrote: > >> * Rationalize places that are using combinations of list_copy and > >> list_concat, probably by inventing an additional list-concatenation > >> primitive that modifies neither input. > > > I poked around to see what we have in this department. There seem to > > be several identifiable use-cases: > > [ ... analysis ... ] > > Here's a proposed patch based on that. I added list_concat_copy() > and then simplified callers as appropriate. I looked over this and only noted down one thing: In estimate_path_cost_size, can you explain why list_concat_copy() is needed here? I don't see remote_param_join_conds being used after this, so might it be better to just get rid of remote_param_join_conds and pass remote_conds to classifyConditions(), then just list_concat()? /* * The complete list of remote conditions includes everything from * baserestrictinfo plus any extra join_conds relevant to this * particular path. */ remote_conds = list_concat_copy(remote_param_join_conds, fpinfo->remote_conds); classifyConditions() seems to create new lists, so it does not appear that you have to worry about modifying the existing lists. > It turns out there are a *lot* of places where list_concat() callers > are now leaking the second input list (where before they just leaked > that list's header). So I've got mixed emotions about the choice not > to add a variant function that list_free's the second input. On the > other hand, the leakage probably amounts to nothing significant in > all or nearly all of these places, and I'm concerned about the > readability/understandability loss of having an extra version of > list_concat. Anybody have an opinion about that? In some of these places, for example, the calls to generate_join_implied_equalities_normal() and generate_join_implied_equalities_broken(), I wonder, since these are static functions if we could just change the function signature to accept a List to append to. 
This could save us from having to perform any additional pallocs at all, so there'd be no need to free anything or worry about any leaks. The performance of the code would be improved too. There may be other cases where we can do similar, but I wouldn't vote we change signatures of non-static functions for that. If we do end up with another function, it might be nice to stay away from using "concat" in the name. I think we might struggle if there are too many variations on concat and there's a risk we'll use the wrong one. If we need this then perhaps something like list_append_all() might be a better choice... I'm struggling to build a strong opinion on this though. (I know that because I've deleted this paragraph 3 times and started again, each time with a different opinion.) -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: POC: converting Lists into arrays
David Rowley writes: > On Mon, 22 Jul 2019 at 16:37, Tom Lane wrote: >> Interesting. I wonder if bms_next_member() could be made any quicker? > I had a quick look earlier and the only thing I saw was maybe to do > the first loop differently from subsequent ones. The "w &= mask;" > does nothing useful once we're past the first bitmapword that the loop > touches. Good thought, but it would only help when we're actually iterating to later words, which happens just 1 out of 64 times in the fully-populated-bitmap case. Still, I think it might be worth pursuing to make the sparse-bitmap case faster. regards, tom lane
Re: POC: converting Lists into arrays
On Mon, 22 Jul 2019 at 16:37, Tom Lane wrote: > > David Rowley writes: > > So the bms_next_member() loop is slower when the bitmapset is fully > > populated with all subplans, but way faster when there's just 1 > > member. > > Interesting. I wonder if bms_next_member() could be made any quicker? I had a quick look earlier and the only thing I saw was maybe to do the first loop differently from subsequent ones. The "w &= mask;" does nothing useful once we're past the first bitmapword that the loop touches. Not sure what the code would look like exactly yet, or how much it would help. I'll maybe experiment a bit later, but as separate work from the other patch. > Still, I agree that this is negligible compared to the actual work > needed per live subplan, and the fact that the cost scales per live > subplan is a good thing. So no objection to this patch, but a mental > note to take another look at bms_next_member() someday. Thanks for having a look. I'll have another look and will likely push this a bit later on today if all is well. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: POC: converting Lists into arrays
David Rowley writes: > On Mon, 22 Jul 2019 at 02:45, Tom Lane wrote: >> One small question is whether it loses if most of the subplans >> are present in the bitmap. I imagine that would be close enough >> to break-even, but it might be worth trying to test to be sure. > ... > -- Test 2 (all matching subplans (8192 of them)) -- > Master version: > Time: 14825.304 ms (00:14.825) > Time: 14701.601 ms (00:14.702) > Time: 14650.969 ms (00:14.651) > Patched version: > Time: 44551.811 ms (00:44.552) > Time: 44357.915 ms (00:44.358) > Time: 43454.958 ms (00:43.455) > So the bms_next_member() loop is slower when the bitmapset is fully > populated with all subplans, but way faster when there's just 1 > member. Interesting. I wonder if bms_next_member() could be made any quicker? Still, I agree that this is negligible compared to the actual work needed per live subplan, and the fact that the cost scales per live subplan is a good thing. So no objection to this patch, but a mental note to take another look at bms_next_member() someday. regards, tom lane
Re: POC: converting Lists into arrays
On Mon, 22 Jul 2019 at 02:45, Tom Lane wrote: > One small question is whether it loses if most of the subplans > are present in the bitmap. I imagine that would be close enough > to break-even, but it might be worth trying to test to be sure. > (I'd think about breaking out just the loops in question and > testing them stand-alone, or else putting in an outer loop to > repeat them, since as you say the surrounding work probably > dominates.) My 2nd test was for when all subplans were present in the bitmap. It did show a very slight slowdown for the case where all subplans were present in the bitmapset. However, yeah, it seems like a good idea to try it a million times to help show the true cost. I did:

int x = 0;

/* Patched version */
for (j = 0; j < 100; j++)
{
    i = -1;
    while ((i = bms_next_member(validsubplans, i)) >= 0)
    {
        Plan    *initNode = (Plan *) list_nth(node->appendplans, i);

        x++;
    }
}

/* Master version */
for (j = 0; j < 100; j++)
{
    ListCell *lc;

    i = 0;
    foreach(lc, node->appendplans)
    {
        Plan    *initNode;

        if (bms_is_member(i, validsubplans))
        {
            initNode = (Plan *) lfirst(lc);
            x++;
        }
        i++;
    }
}

elog(DEBUG1, "%d", x);    /* stop the compiler optimizing away the loops */

I separately commented out each of the outer loops before performing the test again.
plan_cache_mode = force_generic_plan

-- Test 1 (one matching subplan) --
prepare q1(int) as select * from ht where a = $1;
execute q1(1);

Master version:
Time: 14441.332 ms (00:14.441)
Time: 13829.744 ms (00:13.830)
Time: 13753.943 ms (00:13.754)

Patched version:
Time: 41.250 ms
Time: 40.976 ms
Time: 40.853 ms

-- Test 2 (all matching subplans (8192 of them)) --
prepare q2 as select * from ht;
execute q2;

Master version:
Time: 14825.304 ms (00:14.825)
Time: 14701.601 ms (00:14.702)
Time: 14650.969 ms (00:14.651)

Patched version:
Time: 44551.811 ms (00:44.552)
Time: 44357.915 ms (00:44.358)
Time: 43454.958 ms (00:43.455)

So the bms_next_member() loop is slower when the bitmapset is fully populated with all subplans, but way faster when there's just 1 member. In reality, the ExecInitNode() call drowns most of it out. Plus a plan with more subnodes is going to take longer to execute and then shut down the plan afterwards too. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
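The shape of the two loops can be demonstrated with a single-word toy bitmapset. This is only a sketch: PostgreSQL's Bitmapset is a multi-word, palloc'd struct, and this version assumes GCC/Clang's __builtin_ctzll and members below bit 63 (so the mask shift cannot overflow):

```c
#include <assert.h>
#include <stdint.h>

/* Single-word toy bitmapset (the real one is a multi-word struct). */
typedef uint64_t Bitmapset;

/*
 * Return the smallest member greater than prevbit, or -1 if none.
 * Pass prevbit = -1 to start a scan.  Sketch only: assumes members
 * stay below bit 63 so the shift below cannot overflow.
 */
static int bms_next_member(Bitmapset bms, int prevbit)
{
    uint64_t    w = bms & ~((UINT64_C(1) << (prevbit + 1)) - 1);

    if (w == 0)
        return -1;
    return __builtin_ctzll(w);  /* index of lowest remaining set bit */
}

/* Visit only the set members, instead of scanning all N list slots. */
static int count_members(Bitmapset validsubplans)
{
    int n = 0;

    for (int i = -1; (i = bms_next_member(validsubplans, i)) >= 0;)
        n++;    /* nodeAppend.c would fetch list_nth(appendplans, i) here */
    return n;
}
```

In test 1 above only one of the 8192 bits is set, so the bms_next_member() loop touches a single member, while the foreach()-plus-bms_is_member() loop still walks all 8192 list cells.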
Re: POC: converting Lists into arrays
I wrote: >> * Rationalize places that are using combinations of list_copy and >> list_concat, probably by inventing an additional list-concatenation >> primitive that modifies neither input. > I poked around to see what we have in this department. There seem to > be several identifiable use-cases: > [ ... analysis ... ] Here's a proposed patch based on that. I added list_concat_copy() and then simplified callers as appropriate. It turns out there are a *lot* of places where list_concat() callers are now leaking the second input list (where before they just leaked that list's header). So I've got mixed emotions about the choice not to add a variant function that list_free's the second input. On the other hand, the leakage probably amounts to nothing significant in all or nearly all of these places, and I'm concerned about the readability/understandability loss of having an extra version of list_concat. Anybody have an opinion about that? Other than that point, I think this is pretty much good to go. regards, tom lane

diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index 548ae66..50d1f18 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -1497,7 +1497,7 @@ deparseFromExprForRel(StringInfo buf, PlannerInfo *root, RelOptInfo *foreignrel,
 		if (fpinfo->jointype == JOIN_INNER)
 		{
 			*ignore_conds = list_concat(*ignore_conds,
-										list_copy(fpinfo->joinclauses));
+										fpinfo->joinclauses);
 			fpinfo->joinclauses = NIL;
 		}
 
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index 033aeb2..b9f90e9 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -2667,8 +2667,8 @@ estimate_path_cost_size(PlannerInfo *root,
 	 * baserestrictinfo plus any extra join_conds relevant to this
 	 * particular path.
 	 */
-	remote_conds = list_concat(list_copy(remote_param_join_conds),
-							   fpinfo->remote_conds);
+	remote_conds = list_concat_copy(remote_param_join_conds,
+									fpinfo->remote_conds);
 
 	/*
 	 * Construct EXPLAIN query including the desired SELECT, FROM, and
@@ -5102,23 +5102,23 @@ foreign_join_ok(PlannerInfo *root, RelOptInfo *joinrel, JoinType jointype,
 	{
 		case JOIN_INNER:
 			fpinfo->remote_conds = list_concat(fpinfo->remote_conds,
-											   list_copy(fpinfo_i->remote_conds));
+											   fpinfo_i->remote_conds);
 			fpinfo->remote_conds = list_concat(fpinfo->remote_conds,
-											   list_copy(fpinfo_o->remote_conds));
+											   fpinfo_o->remote_conds);
 			break;
 
 		case JOIN_LEFT:
 			fpinfo->joinclauses = list_concat(fpinfo->joinclauses,
-											  list_copy(fpinfo_i->remote_conds));
+											  fpinfo_i->remote_conds);
 			fpinfo->remote_conds = list_concat(fpinfo->remote_conds,
-											   list_copy(fpinfo_o->remote_conds));
+											   fpinfo_o->remote_conds);
 			break;
 
 		case JOIN_RIGHT:
 			fpinfo->joinclauses = list_concat(fpinfo->joinclauses,
-											  list_copy(fpinfo_o->remote_conds));
+											  fpinfo_o->remote_conds);
 			fpinfo->remote_conds = list_concat(fpinfo->remote_conds,
-											   list_copy(fpinfo_i->remote_conds));
+											   fpinfo_i->remote_conds);
 			break;
 
 		case JOIN_FULL:
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index fd29927..e0af665 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -526,8 +526,8 @@ DefineIndex(Oid relationId,
 	 * is part of the key columns, and anything equal to and over is part of
 	 * the INCLUDE columns.
 	 */
-	allIndexParams = list_concat(list_copy(stmt->indexParams),
-								 list_copy(stmt->indexIncludingParams));
+	allIndexParams = list_concat_copy(stmt->indexParams,
+									  stmt->indexIncludingParams);
 	numberOfAttributes = list_length(allIndexParams);
 
 	if (numberOfAttributes <= 0)
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index 64a9e58..83337c2 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -719,8 +719,7 @@ init_sql_fcache(FmgrInfo *finfo, Oid collation, bool lazyEvalOK)
 											  fcache->pinfo, NULL);
 		queryTree_list = lappend(queryTree_list, queryTree_sublist);
-		flat_query_list = list_concat(flat_query_list,
-									  list_copy(queryTree_sublist));
+		flat_query_list = list_concat(flat_query_list, queryTree_sublist);
 	}
 
 	check_sql_fn_statements(flat_query_list);
diff --git a/src/backend/nodes/list.c b/src/backend/nodes/list.c
index 9163464..6bf13ae 100644
--- a/src/backend/nodes/list.c
+++ b/src/backend/nodes/list.c
@@ -501,12 +501,15 @@ lcons_oid(Oid datum, List *list)
 }
 
 /*
- * Concatenate list2 to the end of list1, and return list1. list1 is
- * destructively changed, list2 is not. (However, in the case of pointer
- * lists, list1 and list2 will point to the same structures.)
Re: POC: converting Lists into arrays
David Rowley writes: > On Wed, 17 Jul 2019 at 11:06, Tom Lane wrote: >> (Actually, I doubt that any of these changes will really move the >> performance needle in the real world. It's more a case of wanting >> the code to present good examples not bad ones.) > In spirit with the above, I'd quite like to fix a small bad example > that I ended up with in nodeAppend.c and nodeMergeappend.c for > run-time partition pruning. I didn't test the patch, but just by eyeball it looks sane, and I concur it should win if the bitmap is sparse. One small question is whether it loses if most of the subplans are present in the bitmap. I imagine that would be close enough to break-even, but it might be worth trying to test to be sure. (I'd think about breaking out just the loops in question and testing them stand-alone, or else putting in an outer loop to repeat them, since as you say the surrounding work probably dominates.) regards, tom lane
Re: POC: converting Lists into arrays
On Wed, 17 Jul 2019 at 11:06, Tom Lane wrote: > (Actually, I doubt that any of these changes will really move the > performance needle in the real world. It's more a case of wanting > the code to present good examples not bad ones.) In spirit with the above, I'd quite like to fix a small bad example that I ended up with in nodeAppend.c and nodeMergeappend.c for run-time partition pruning. The code in question performs a loop over a list and checks bms_is_member() on each element and only performs an action if the member is present in the Bitmapset. It would seem much more efficient just to perform a bms_next_member() type loop then just fetch the list item with list_nth(), at least this is certainly the case when only a small number of the list items are indexed by the Bitmapset. With these two loops in particular, when a large number of list items are in the set the cost of the work goes up greatly, so it does not seem unreasonable to optimise the case for when just a few match. A quick test shows that it's hardly groundbreaking performance-wise, but test 1 does seem measurable above the noise.

-- Setup
plan_cache_mode = force_generic_plan
max_locks_per_transaction = 256

create table ht (a int primary key, b int, c int) partition by hash (a);
select 'create table ht' || x::text || ' partition of ht for values with (modulus 8192, remainder ' || (x)::text || ');' from generate_series(0,8191) x;
\gexec

-- Test 1: Just one member in the Bitmapset.
test1.sql:
\set p 1
select * from ht where a = :p

Master:
$ pgbench -n -f test1.sql -T 60 -M prepared postgres
tps = 297.267191 (excluding connections establishing)
tps = 298.276797 (excluding connections establishing)
tps = 296.264459 (excluding connections establishing)
tps = 298.968037 (excluding connections establishing)
tps = 298.575684 (excluding connections establishing)

Patched:
$ pgbench -n -f test1.sql -T 60 -M prepared postgres
tps = 300.924254 (excluding connections establishing)
tps = 299.360196 (excluding connections establishing)
tps = 300.197024 (excluding connections establishing)
tps = 299.741215 (excluding connections establishing)
tps = 299.748088 (excluding connections establishing)

0.71% faster

-- Test 2: when all list items are found in the Bitmapset.
test2.sql:
select * from ht;

Master:
$ pgbench -n -f test2.sql -T 60 -M prepared postgres
tps = 12.526578 (excluding connections establishing)
tps = 12.528046 (excluding connections establishing)
tps = 12.491347 (excluding connections establishing)
tps = 12.538292 (excluding connections establishing)
tps = 12.528959 (excluding connections establishing)

Patched:
$ pgbench -n -f test2.sql -T 60 -M prepared postgres
tps = 12.503670 (excluding connections establishing)
tps = 12.516133 (excluding connections establishing)
tps = 12.404925 (excluding connections establishing)
tps = 12.514567 (excluding connections establishing)
tps = 12.541484 (excluding connections establishing)

0.21% slower

With that removed the slowness of test 1 is almost entirely in AcquireExecutorLocks() and ExecCheckRTPerms(). We'd be up close to about 30k tps instead of 300 tps if there was some solution to those problems. I think it makes sense to remove the inefficient loops and leave just the final two bottlenecks in the meantime. Patch attached. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services

list_fixups_list_nth.patch
Description: Binary data
Re: POC: converting Lists into arrays
Daniel Gustafsson writes: > For cases where an Oid list is copied and then head elements immediately > removed, as in fetch_search_path, couldn’t we instead use a counter and > list_copy_tail to avoid repeated list_delete_first calls? Perhaps, but I'm having a hard time getting excited about it. I don't think there's any evidence that fetch_search_path is a performance issue. Also, this coding requires that the *only* changes be deletion of head elements, whereas as it stands, once we've copied the list we can do what we like. > Since we’ve allowed list_delete_first on NIL for a long time, it seems > reasonable to do the same for list_delete_last even though it’s hard to come up > with a good use case for deleting the last element without inspecting the list > (a stack growing from the bottom perhaps?) Yeah, I intentionally made the edge cases the same. There's room to argue that both functions should error out on NIL, instead. I've not looked into that though, and would consider it material for a separate patch. > Looking mainly at 0001 for now, I agree that the order is insignificant. Thanks for looking! regards, tom lane
Re: POC: converting Lists into arrays
David Rowley writes:
> I've only looked at 0002. Here are my thoughts:

Thanks for looking!

> get_tables_to_cluster:
> Looks fine. It's a heap scan. Any previous order was accidental, so if
> it causes issues then we might need to think of using a more
> well-defined order for CLUSTER.

Check.

> get_rels_with_domain:
> This is a static function. Changing the order of the list seems to
> only really affect the error message that a failed domain constraint
> validation could emit. Perhaps this might break someone else's tests,
> but they should just be able to update their expected results.

Also, this is already dependent on the order of pg_depend entries, so it's not terribly stable anyhow.

> get_relation_statistics:
> RelationGetStatExtList does not seem to pay much attention to the
> order it returns its results, so I don't think the order we apply
> extended statistics was that well defined before. We always attempt to
> use the stats with the most matching columns in
> choose_best_statistics(), so I think for people to be affected they'd
> either need multiple stats with the same sets of columns or a complex
> clause that equally well matches two sets of stats, and in that case
> the other columns would be matched to the other stats later... I'd
> better check that... erm... actually that's not true. I see
> statext_mcv_clauselist_selectivity() makes no attempt to match the
> clause list to another set of stats after finding the first best
> match. I think it likely should do that.
> estimate_multivariate_ndistinct() seems to have an XXX comment
> mentioning thoughts about the stability of which stats are used, but
> nothing is done.

I figured that (a) this hasn't been around so long that anybody's expectations are frozen, and (b) if there is a meaningful difference in results then it's probably incumbent on the extstats code to do better. That seems to match your conclusions.
But I don't see any regression test changes from making this change, so at least in simple cases it doesn't matter. (As you say, any extstats changes that we conclude are needed should be a separate patch.) regards, tom lane
Re: POC: converting Lists into arrays
> On 17 Jul 2019, at 01:06, Tom Lane wrote:
> There are a bunch of places that are using list_delete_first to remove
> the next-to-process entry from a List used as a queue. In principle,
> we could invert the order of those queues and then use list_delete_last,
> but I thought this would probably be too confusing: it's natural to
> think of the front of the list as being the head of the queue. I doubt
> that any of those queues get long enough for it to be a serious
> performance problem to leave them as-is.

For cases where an Oid list is copied and then head elements immediately removed, as in fetch_search_path, couldn’t we instead use a counter and list_copy_tail to avoid repeated list_delete_first calls? Something like the attached poc.

> +List *
> +list_delete_last(List *list)
> +{
> +	check_list_invariants(list);
> +
> +	if (list == NIL)
> +		return NIL;			/* would an error be better? */

Since we’ve allowed list_delete_first on NIL for a long time, it seems reasonable to do the same for list_delete_last even though it’s hard to come up with a good use case for deleting the last element without inspecting the list (a stack growing from the bottom perhaps?). It reads better to check for NIL before calling check_list_invariants though, IMO.

Looking mainly at 0001 for now, I agree that the order is insignificant.

cheers ./daniel

fetch_search_path_listcopy.diff Description: Binary data
Re: POC: converting Lists into arrays
On Wed, 17 Jul 2019 at 11:06, Tom Lane wrote:
> 0002 changes some additional places where it's maybe a bit less safe,
> ie there's a potential for user-visible behavioral change because
> processing will occur in a different order. In particular, the proposed
> change in execExpr.c causes aggregates and window functions that are in
> the same plan node to be executed in a different order than before ---
> but it seems to me that this order is saner. (Note the change in the
> expected regression results, in a test that's intentionally sensitive to
> the execution order.) And anyway when did we guarantee anything about
> that?

I've only looked at 0002. Here are my thoughts:

get_tables_to_cluster:
Looks fine. It's a heap scan. Any previous order was accidental, so if it causes issues then we might need to think of using a more well-defined order for CLUSTER.

get_rels_with_domain:
This is a static function. Changing the order of the list seems to only really affect the error message that a failed domain constraint validation could emit. Perhaps this might break someone else's tests, but they should just be able to update their expected results.

ExecInitExprRec:
As you mention, the order of aggregate evaluation is reversed. I agree that the new order is saner. I can't think why we'd be doing it in backwards order beforehand.

get_relation_statistics:
RelationGetStatExtList does not seem to pay much attention to the order it returns its results, so I don't think the order we apply extended statistics was that well defined before. We always attempt to use the stats with the most matching columns in choose_best_statistics(), so I think for people to be affected they'd either need multiple stats with the same sets of columns or a complex clause that equally well matches two sets of stats, and in that case the other columns would be matched to the other stats later... I'd better check that... erm... actually that's not true.
I see statext_mcv_clauselist_selectivity() makes no attempt to match the clause list to another set of stats after finding the first best match. I think it likely should do that. estimate_multivariate_ndistinct() seems to have an XXX comment mentioning thoughts about the stability of which stats are used, but nothing is done.

parseCheckAggregates:
I can't see any user-visible change to this one. Not even in error messages.

estimate_num_groups:
Similar to get_relation_statistics(), I see that estimate_multivariate_ndistinct() is only called once and we don't attempt to match up the remaining clauses with more stats. I can't imagine swapping lcons for lappend here will upset anyone. The behaviour does not look well defined already. I think we should likely change the "if (estimate_multivariate_ndistinct(root, rel, &varinfos," to "while ...", then drop the else. Not for this patch though...

-- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: POC: converting Lists into arrays
I wrote: > * Look at places using lcons/list_delete_first to maintain FIFO lists. > The patch makes these O(N^2) for long lists. If we can reverse the list > order and use lappend/list_truncate instead, it'd be better. Possibly in > some places the list ordering is critical enough to make this impractical, > but I suspect it's an easy win in most. Attached are two patches that touch all the places where it seemed like an easy win to stop using lcons and/or list_delete_first. 0001 adds list_delete_last() as a mirror image to list_delete_first(), and changes all the places where it seemed 100% safe to do so (ie, there's no behavioral change because the list order is demonstrably immaterial). 0002 changes some additional places where it's maybe a bit less safe, ie there's a potential for user-visible behavioral change because processing will occur in a different order. In particular, the proposed change in execExpr.c causes aggregates and window functions that are in the same plan node to be executed in a different order than before --- but it seems to me that this order is saner. (Note the change in the expected regression results, in a test that's intentionally sensitive to the execution order.) And anyway when did we guarantee anything about that? I refrained from changing lcons to lappend in get_relation_info, because that demonstrably causes the planner to change its choices when two indexes look equally attractive, and probably people would complain about that. I think that the other changes proposed in 0002 are pretty harmless --- for example, in get_tables_to_cluster the order depends initially on the results of a seqscan of pg_index, so anybody who's expecting stability is in for rude surprises anyhow. Also, the proposed changes in plancat.c, parse_agg.c, selfuncs.c almost certainly have no user-visible effect, but maybe there could be changes at the roundoff-error level due to processing estimates in a different order? 
There are a bunch of places that are using list_delete_first to remove the next-to-process entry from a List used as a queue. In principle, we could invert the order of those queues and then use list_delete_last, but I thought this would probably be too confusing: it's natural to think of the front of the list as being the head of the queue. I doubt that any of those queues get long enough for it to be a serious performance problem to leave them as-is. (Actually, I doubt that any of these changes will really move the performance needle in the real world. It's more a case of wanting the code to present good examples not bad ones.) Thoughts? Anybody want to object to any of the changes in 0002? regards, tom lane diff --git a/src/backend/access/gist/gist.c b/src/backend/access/gist/gist.c index dfb51f6..169bf6f 100644 --- a/src/backend/access/gist/gist.c +++ b/src/backend/access/gist/gist.c @@ -1323,8 +1323,6 @@ static void gistfinishsplit(GISTInsertState *state, GISTInsertStack *stack, GISTSTATE *giststate, List *splitinfo, bool unlockbuf) { - ListCell *lc; - List *reversed; GISTPageSplitInfo *right; GISTPageSplitInfo *left; IndexTuple tuples[2]; @@ -1339,14 +1337,6 @@ gistfinishsplit(GISTInsertState *state, GISTInsertStack *stack, * left. Finally insert the downlink for the last new page and update the * downlink for the original page as one operation. */ - - /* for convenience, create a copy of the list in reverse order */ - reversed = NIL; - foreach(lc, splitinfo) - { - reversed = lcons(lfirst(lc), reversed); - } - LockBuffer(stack->parent->buffer, GIST_EXCLUSIVE); gistFindCorrectParent(state->r, stack); @@ -1354,10 +1344,10 @@ gistfinishsplit(GISTInsertState *state, GISTInsertStack *stack, * insert downlinks for the siblings from right to left, until there are * only two siblings left. 
*/ - while (list_length(reversed) > 2) + for (int pos = list_length(splitinfo) - 1; pos > 1; pos--) { - right = (GISTPageSplitInfo *) linitial(reversed); - left = (GISTPageSplitInfo *) lsecond(reversed); + right = (GISTPageSplitInfo *) list_nth(splitinfo, pos); + left = (GISTPageSplitInfo *) list_nth(splitinfo, pos - 1); if (gistinserttuples(state, stack->parent, giststate, &right->downlink, 1, @@ -1371,11 +1361,10 @@ gistfinishsplit(GISTInsertState *state, GISTInsertStack *stack, gistFindCorrectParent(state->r, stack); } /* gistinserttuples() released the lock on right->buf. */ - reversed = list_delete_first(reversed); } - right = (GISTPageSplitInfo *) linitial(reversed); - left = (GISTPageSplitInfo *) lsecond(reversed); + right = (GISTPageSplitInfo *) lsecond(splitinfo); + left = (GISTPageSplitInfo *) linitial(splitinfo); /* * Finally insert downlink for the remaining right page and update the diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c index 6c3ff76..032fab9 100644 --- a/src/backend/catalog/heap.c +++ b/src/backend/catalog/heap.c @@ -633,7 +633,7 @@
Re: POC: converting Lists into arrays
I wrote: > * Rationalize places that are using combinations of list_copy and > list_concat, probably by inventing an additional list-concatenation > primitive that modifies neither input. I poked around to see what we have in this department. There seem to be several identifiable use-cases: * Concat two Lists that are freshly built, or at least not otherwise referenced. In the old code, list_concat serves fine, leaking the second List's header but not any of its cons cells. As of 1cff1b95a, the second List's storage is leaked completely. We could imagine inventing a list_concat variant that list_free's its second input, but I'm unconvinced that that's worth the trouble. Few if any callers can't stand to leak any storage, and if there are any where it seems worth the trouble, adding an explicit list_free seems about as good as calling a variant of list_concat. (If we do want such a variant, we need a name for it. list_join, maybe, by analogy to bms_join?) * Concat two lists where there exist other pointers to the second list, but it's okay if the lists share cons cells afterwards. As of the new code, they don't actually share any storage, which seems strictly better. I don't think we need to do anything here, except go around and adjust the comments that explain that that's what's happening. * Concat two lists where there exist other pointers to the second list, and it's not okay to share storage. This is currently implemented by list_copy'ing the second argument, but we can just drop that (and adjust comments where needed). * Concat two lists where we mustn't modify either input list. This is currently implemented by list_copy'ing both arguments. I'm inclined to replace this pattern with a function like "list_concat_copy(const List *, const List *)", although settling on a suitable name might be difficult. * There's a small number of places that list_copy the first argument but not the second. 
I believe that all of these are either of the form "x = list_concat(list_copy(y), x)", ie replacing the only reference to the second argument, or are relying on the "it's okay to share storage" assumption to not copy a second argument that has other references. I think we can just replace these with list_concat_copy. We'll leak the second argument's storage in the cases where another list is being prepended onto a working list, but I doubt it's worth fussing over. (But, if this is repeated a lot of times, maybe it is worth fussing over? Conceivably you could leak O(N^2) storage while building a long working list, if you prepend many shorter lists onto it.) * Note that some places are applying copyObject() not list_copy(). In these places the idea is to make duplicates of pointed-to structures not just the list proper. These should be left alone, I think. When the copyObject is applied to the second argument, we're leaking the top-level List in the copy result, but again it's not worth fussing over. Comments? regards, tom lane
Re: POC: converting Lists into arrays
Robert Haas writes: > On Tue, Jul 16, 2019 at 10:44 AM Tom Lane wrote: >> OK, I'm outvoted, will do it that way. > I cast my vote in the other direction i.e. for sticking with qsort. Didn't see this until after pushing a commit that uses "list_sort". While composing that commit message another argument occurred to me, which is that renaming makes it absolutely sure that any external callers will notice they have an API change to deal with, no matter how forgiving their compiler is. Also, if somebody really really doesn't want to cope with the change, they can now make their own version of list_qsort (stealing it out of 1cff1b95a) and the core code won't pose a conflict. So I'm good with "list_sort" at this point. regards, tom lane
Re: POC: converting Lists into arrays
On Tue, Jul 16, 2019 at 9:01 AM Robert Haas wrote: > I cast my vote in the other direction i.e. for sticking with qsort. I do too. -- Peter Geoghegan
Re: POC: converting Lists into arrays
On Tue, Jul 16, 2019 at 10:44 AM Tom Lane wrote: > David Steele writes: > > On 7/15/19 11:07 PM, Tom Lane wrote: > >> David Rowley writes: > >>> The only thoughts I have so far here are that it's a shame that the > >>> function got called list_qsort() and not just list_sort(). > > > I agree with David -- list_sort() is better. I don't think "sort" is > > such a common stem that searching is a big issue, especially with modern > > code indexing tools. > > OK, I'm outvoted, will do it that way. I cast my vote in the other direction i.e. for sticking with qsort. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: POC: converting Lists into arrays
David Steele writes: > On 7/15/19 11:07 PM, Tom Lane wrote: >> David Rowley writes: >>> The only thoughts I have so far here are that it's a shame that the >>> function got called list_qsort() and not just list_sort(). > I agree with David -- list_sort() is better. I don't think "sort" is > such a common stem that searching is a big issue, especially with modern > code indexing tools. OK, I'm outvoted, will do it that way. regards, tom lane
Re: POC: converting Lists into arrays
On 7/15/19 11:07 PM, Tom Lane wrote:
> David Rowley writes:
>> The only thoughts I have so far here are that it's a shame that the
>> function got called list_qsort() and not just list_sort(). I don't see
>> why callers need to know anything about the sort algorithm that's being
>> used.
>
> Meh. list_qsort() is quicksort only to the extent that qsort() is
> quicksort, which in our current implementation is a bit of a lie already
> --- and, I believe, it's much more of a lie in some versions of libc. I
> don't really think of either name as promising anything about the
> underlying sort algorithm. What they do share is an API based on a
> callback comparison function, and if you are looking for uses of those,
> it's a lot easier to grep for "qsort" than some more-generic term.

I agree with David -- list_sort() is better. I don't think "sort" is such a common stem that searching is a big issue, especially with modern code indexing tools.

-- 
-David
da...@pgmasters.net
Re: POC: converting Lists into arrays
On Tue, 16 Jul 2019 at 07:49, Tom Lane wrote: > A possibly controversial point is that I made list_qsort() sort the > given list in-place, rather than building a new list as it has > historically. For every single one of the existing and new callers, > copying the input list is wasteful, because they were just leaking > the input list anyway. But perhaps somebody feels that we should > preserve the "const input" property? I thought that changing the > function to return void, as done here, might be a good idea to > ensure that callers notice its API changed --- otherwise they'd > only get a warning about incompatible signature of the passed > function pointer, which they might not notice; in fact I'm not > totally sure all compilers would even give such a warning. > > If there's not complaints about that, I'm just going to go ahead > and push this --- it seems simple enough to not need much review. The only thoughts I have so far here are that it's a shame that the function got called list_qsort() and not just list_sort(). I don't see why callers need to know anything about the sort algorithm that's being used. If we're going to break compatibility for this, should we rename the function too? -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: POC: converting Lists into arrays
I wrote: > [ list_qsort-API-change.patch ] Also, here's a follow-on patch that cleans up some crufty code in heap.c and relcache.c to use list_qsort, rather than manually doing insertions into a list that's kept ordered. The existing comments argue that that's faster than qsort for small N, but I think that's a bit questionable. And anyway, that code is definitely O(N^2) if N isn't so small, while this replacement logic is O(N log N). This incidentally removes the only two calls of lappend_cell_oid. There are no callers of lappend_cell_int, and only one of lappend_cell, and that one would be noticeably cleaner if it were rewritten to use list_insert_nth instead. So I'm a bit tempted to do so and then nuke all three of those functions, which would at least make some tiny dent in Andres' unhappiness with the messiness of the List API. Thoughts? regards, tom lane diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c index ab8a475..43d4252 100644 --- a/src/backend/catalog/heap.c +++ b/src/backend/catalog/heap.c @@ -125,7 +125,6 @@ static void SetRelationNumChecks(Relation rel, int numchecks); static Node *cookConstraint(ParseState *pstate, Node *raw_constraint, char *relname); -static List *insert_ordered_unique_oid(List *list, Oid datum); /* @@ -3384,55 +3383,19 @@ heap_truncate_find_FKs(List *relationIds) if (!list_member_oid(relationIds, con->confrelid)) continue; - /* Add referencer unless already in input or result list */ + /* Add referencer to result, unless present in input list */ if (!list_member_oid(relationIds, con->conrelid)) - result = insert_ordered_unique_oid(result, con->conrelid); + result = lappend_oid(result, con->conrelid); } systable_endscan(fkeyScan); table_close(fkeyRel, AccessShareLock); - return result; -} + /* Now sort and de-duplicate the result list */ + list_qsort(result, list_oid_cmp); + list_deduplicate_oid(result); -/* - * insert_ordered_unique_oid - * Insert a new Oid into a sorted list of Oids, preserving ordering, - * 
and eliminating duplicates - * - * Building the ordered list this way is O(N^2), but with a pretty small - * constant, so for the number of entries we expect it will probably be - * faster than trying to apply qsort(). It seems unlikely someone would be - * trying to truncate a table with thousands of dependent tables ... - */ -static List * -insert_ordered_unique_oid(List *list, Oid datum) -{ - ListCell *prev; - - /* Does the datum belong at the front? */ - if (list == NIL || datum < linitial_oid(list)) - return lcons_oid(datum, list); - /* Does it match the first entry? */ - if (datum == linitial_oid(list)) - return list; /* duplicate, so don't insert */ - /* No, so find the entry it belongs after */ - prev = list_head(list); - for (;;) - { - ListCell *curr = lnext(list, prev); - - if (curr == NULL || datum < lfirst_oid(curr)) - break;/* it belongs after 'prev', before 'curr' */ - - if (datum == lfirst_oid(curr)) - return list; /* duplicate, so don't insert */ - - prev = curr; - } - /* Insert datum into list after 'prev' */ - lappend_cell_oid(list, prev, datum); - return list; + return result; } /* diff --git a/src/backend/nodes/list.c b/src/backend/nodes/list.c index 8a7eb6a..c16b220 100644 --- a/src/backend/nodes/list.c +++ b/src/backend/nodes/list.c @@ -1298,6 +1298,34 @@ list_concat_unique_oid(List *list1, const List *list2) } /* + * Remove adjacent duplicates in a list of OIDs. + * + * It is caller's responsibility to have sorted the list to bring duplicates + * together, perhaps via list_qsort(list, list_oid_cmp). 
+ */ +void +list_deduplicate_oid(List *list) +{ + int len; + + Assert(IsOidList(list)); + len = list_length(list); + if (len > 1) + { + ListCell *elements = list->elements; + int i = 0; + + for (int j = 1; j < len; j++) + { + if (elements[i].oid_value != elements[j].oid_value) +elements[++i].oid_value = elements[j].oid_value; + } + list->length = i + 1; + } + check_list_invariants(list); +} + +/* * Free all storage in a list, and optionally the pointed-to elements */ static void @@ -1440,3 +1468,19 @@ list_qsort(List *list, list_qsort_comparator cmp) if (len > 1) qsort(list->elements, len, sizeof(ListCell), (qsort_comparator) cmp); } + +/* + * list_qsort comparator for sorting a list into ascending OID order. + */ +int +list_oid_cmp(const ListCell *p1, const ListCell *p2) +{ + Oid v1 = lfirst_oid(p1); + Oid v2 = lfirst_oid(p2); + + if (v1 < v2) + return -1; + if (v1 > v2) + return 1; + return 0; +} diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c index 3ca9a9f..13ce2b6 100644 --- a/src/backend/utils/cache/relcache.c +++ b/src/backend/utils/cache/relcache.c @@ -285,7 +285,6 @@ static TupleDesc GetPgIndexDescriptor(void); static void AttrDefaultFetch(Relation relation); static
Re: POC: converting Lists into arrays
I wrote: > BTW, further on the subject of performance --- I'm aware of at least > these topics for follow-on patches: > ... > * Adjust API for list_qsort(), as discussed, to save indirections and > avoid constructing an intermediate pointer array. I also seem to recall > one place in the planner that's avoiding using list_qsort by manually > flattening a list into an array, qsort'ing, and rebuilding the list :-( Here's a proposed patch for that. There are only two existing calls of list_qsort(), it turns out. I didn't find the described spot in the planner (I believe I was thinking of choose_bitmap_and(), but its usage isn't quite as described). However, I found about four other places that were doing pretty much exactly that, so the attached also simplifies those places to use list_qsort(). (There are some other places that could perhaps be changed also, but it would require more invasive hacking than I wanted to do here, with less-clear benefits.) A possibly controversial point is that I made list_qsort() sort the given list in-place, rather than building a new list as it has historically. For every single one of the existing and new callers, copying the input list is wasteful, because they were just leaking the input list anyway. But perhaps somebody feels that we should preserve the "const input" property? I thought that changing the function to return void, as done here, might be a good idea to ensure that callers notice its API changed --- otherwise they'd only get a warning about incompatible signature of the passed function pointer, which they might not notice; in fact I'm not totally sure all compilers would even give such a warning. If there's not complaints about that, I'm just going to go ahead and push this --- it seems simple enough to not need much review. 
regards, tom lane diff --git a/src/backend/nodes/list.c b/src/backend/nodes/list.c index ecc2911..8a7eb6a 100644 --- a/src/backend/nodes/list.c +++ b/src/backend/nodes/list.c @@ -1421,43 +1421,22 @@ list_copy_deep(const List *oldlist) /* * Sort a list as though by qsort. * - * A new list is built and returned. Like list_copy, this doesn't make - * fresh copies of any pointed-to data. + * The list is sorted in-place (this is a change from pre-v13 Postgres). * - * The comparator function receives arguments of type ListCell **. - * (XXX that's really inefficient now, but changing it seems like - * material for a standalone patch.) + * The comparator function is declared to receive arguments of type + * const ListCell *; this allows it to use lfirst() and variants + * without casting its arguments. */ -List * -list_qsort(const List *list, list_qsort_comparator cmp) +void +list_qsort(List *list, list_qsort_comparator cmp) { - int len = list_length(list); - ListCell **list_arr; - List *newlist; - ListCell *cell; - int i; - - /* Empty list is easy */ - if (len == 0) - return NIL; - - /* We have to make an array of pointers to satisfy the API */ - list_arr = (ListCell **) palloc(sizeof(ListCell *) * len); - i = 0; - foreach(cell, list) - list_arr[i++] = cell; - - qsort(list_arr, len, sizeof(ListCell *), cmp); - - /* Construct new list (this code is much like list_copy) */ - newlist = new_list(list->type, len); - - for (i = 0; i < len; i++) - newlist->elements[i] = *list_arr[i]; + typedef int (*qsort_comparator) (const void *a, const void *b); + int len; - /* Might as well free the workspace array */ - pfree(list_arr); + check_list_invariants(list); - check_list_invariants(newlist); - return newlist; + /* Nothing to do if there's less than two elements */ + len = list_length(list); + if (len > 1) + qsort(list->elements, len, sizeof(ListCell), (qsort_comparator) cmp); } diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c index 
67254c2..286d8de 100644 --- a/src/backend/optimizer/util/pathnode.c +++ b/src/backend/optimizer/util/pathnode.c @@ -52,8 +52,8 @@ typedef enum #define STD_FUZZ_FACTOR 1.01 static List *translate_sub_tlist(List *tlist, int relid); -static int append_total_cost_compare(const void *a, const void *b); -static int append_startup_cost_compare(const void *a, const void *b); +static int append_total_cost_compare(const ListCell *a, const ListCell *b); +static int append_startup_cost_compare(const ListCell *a, const ListCell *b); static List *reparameterize_pathlist_by_child(PlannerInfo *root, List *pathlist, RelOptInfo *child_rel); @@ -1239,9 +1239,8 @@ create_append_path(PlannerInfo *root, */ Assert(pathkeys == NIL); - subpaths = list_qsort(subpaths, append_total_cost_compare); - partial_subpaths = list_qsort(partial_subpaths, - append_startup_cost_compare); + list_qsort(subpaths, append_total_cost_compare); + list_qsort(partial_subpaths, append_startup_cost_compare); } pathnode->first_partial_path = list_length(subpaths); pathnode->subpaths = list_concat(subpaths,
Re: POC: converting Lists into arrays
Daniel Gustafsson writes: >> On 13 Jul 2019, at 18:32, Tom Lane wrote: >> I see from the cfbot that v8 is already broken (new call of lnext >> to be fixed). Don't really want to keep chasing a moving target, >> so unless I hear objections I'm going to adjust the additional >> spot(s) and commit this pretty soon, like tomorrow or Monday. > I just confirmed that fixing the recently introduced callsite not handled in > the patch still passes tests etc. +1 on this. Thanks for checking! I've now pushed this, with a bit of additional cleanup and comment-improvement in pg_list.h and list.c. regards, tom lane
Re: POC: converting Lists into arrays
> On 13 Jul 2019, at 18:32, Tom Lane wrote: > I see from the cfbot that v8 is already broken (new call of lnext > to be fixed). Don't really want to keep chasing a moving target, > so unless I hear objections I'm going to adjust the additional > spot(s) and commit this pretty soon, like tomorrow or Monday. I just confirmed that fixing the recently introduced callsite not handled in the patch still passes tests etc. +1 on this. cheers ./daniel
Re: POC: converting Lists into arrays
David Rowley writes: > Thanks for the speedy turnaround. I've looked at v8, as far as a diff > between the two patches and I'm happy. > I've marked as ready for committer. So ... last chance for objections? I see from the cfbot that v8 is already broken (new call of lnext to be fixed). Don't really want to keep chasing a moving target, so unless I hear objections I'm going to adjust the additional spot(s) and commit this pretty soon, like tomorrow or Monday. regards, tom lane
Re: POC: converting Lists into arrays
On Thu, 4 Jul 2019 at 06:15, Tom Lane wrote: > > David Rowley writes: > > I've now read over the entire patch and have noted down the following: > > Thanks for the review! Attached is a v8 responding to most of your > comments. Anything not quoted below I just accepted. Thanks for the speedy turnaround. I've looked at v8, as far as a diff between the two patches and I'm happy. I've marked as ready for committer. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: POC: converting Lists into arrays
David Rowley writes: > I also did some more benchmarking of the patch. ... > Which makes the patched version 2.2% faster than master on that run. BTW, further on the subject of performance --- I'm aware of at least these topics for follow-on patches: * Fix places that are maintaining arrays parallel to Lists for access-speed reasons (at least simple_rte_array, append_rel_array, es_range_table_array). * Look at places using lcons/list_delete_first to maintain FIFO lists. The patch makes these O(N^2) for long lists. If we can reverse the list order and use lappend/list_truncate instead, it'd be better. Possibly in some places the list ordering is critical enough to make this impractical, but I suspect it's an easy win in most. * Rationalize places that are using combinations of list_copy and list_concat, probably by inventing an additional list-concatenation primitive that modifies neither input. * Adjust API for list_qsort(), as discussed, to save indirections and avoid constructing an intermediate pointer array. I also seem to recall one place in the planner that's avoiding using list_qsort by manually flattening a list into an array, qsort'ing, and rebuilding the list :-( I don't think that any one of these fixes would move the needle very much on "typical" simple workloads, but it's reasonable to hope that in aggregate they'd make for a noticeable improvement. In the meantime, I'm gratified that the initial patch at least doesn't seem to have lost any ground. regards, tom lane
Re: POC: converting Lists into arrays
On 2019-Jul-03, Tom Lane wrote:
> David Rowley writes:
> > 10. I wonder if we can reduce a bit of pain for extension authors by
> > back patching a macro that wraps up a lnext() call adding a dummy
> > argument for the List.
>
> I was wondering about a variant of that yesterday; specifically,
> I thought of naming the new 2-argument function list_next() not lnext().
> Then we could add "#define list_next(l,c) lnext(c)" in the back branches.
> This would simplify back-patching code that used the new definition, and
> it might save some effort for extension authors who are trying to maintain
> cross-branch code. On the other hand, it's more keystrokes forevermore,
> and I'm not entirely convinced that code that's using lnext() isn't likely
> to need other adjustments anyway. So I didn't pull the trigger on that,
> but if people like the idea I'd be okay with doing it like that.

I was thinking about this issue too earlier today, and my conclusion is that the way you have it in v7 is fine, because lnext() callsites are not that numerous, so the cost to third-party code authors is not that high; the other arguments trump this consideration IMO. I say this as someone who curses every time he has to backpatch things across the heap_open / table_open change -- but there are a lot more calls of that.

> Indeed. I don't want to expend a lot of effort keeping it in sync
> with master over a long period, either. Opinions?

Yeah, let's get it done soon.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: POC: converting Lists into arrays
On Tue, Jul 2, 2019 at 5:12 PM Tom Lane wrote: > Oleksandr Shulgin writes: > > Not related to the diff v6..v7, but shouldn't we throw additionally a > > memset() with '\0' before calling pfree(): > > I don't see the point of that. In debug builds CLOBBER_FREED_MEMORY will > take care of it, and in non-debug builds I don't see why we'd expend > the cycles. > This is what I was wondering about, thanks for providing a reference. -- Alex
Re: POC: converting Lists into arrays
On Tue, 2 Jul 2019 at 11:27, Tom Lane wrote:
> My previous patch would have had you replace this with a loop using
> an integer list-position index. You can still do that if you like,
> but it's less change to convert the loop to a foreach(), drop the
> prev/next variables, and replace the list_delete_cell call with
> foreach_delete_current:
>
> foreach(cell, state->enterKeys)
> {
>     TrgmStateKey *existingKey = (TrgmStateKey *) lfirst(cell);
>
>     if (need to delete)
>         state->enterKeys = foreach_delete_current(state->enterKeys,
>                                                   cell);
> }
>
> So I think this is a win, and attached is v7.

It's pretty nice to get rid of those. I also like how you've handled the changes in SRFs.

I've now read over the entire patch and have noted down the following:

1. MergeAttributes() could do with a comment mentioning why you're not using foreach() on the outer loop. I almost missed the list_delete_nth_cell() call that's a few branches deep in the outer loop.

2. In expandTupleDesc(), couldn't you just change the following:

    int i;

    for (i = 0; i < offset; i++)
    {
        if (aliascell)
            aliascell = lnext(eref->colnames, aliascell);
    }

to:

    aliascell = offset < list_length(eref->colnames) ?
        list_nth_cell(eref->colnames, offset) : NULL;

3. Worth Assert(list != NIL); in new_head_cell() and new_tail_cell()?

4. Do you think it would be a good idea to document that the 'pos' arg in list_insert_nth and co must be <= list_length()? I know you've mentioned that in insert_new_cell, but that's static and list_insert_nth is not. I think it would be better not to have to chase down comments of static functions to find out how to use an external function.

5. Why does enlarge_list() return List *? Can't it just return void? I noticed this after looking at add_new_cell_after() and reading your cautionary comment and then looking at lappend_cell(). At first, it seemed that lappend_cell() could end up reallocating the List to make way for the new cell, but from looking at enlarge_list() it seems we always maintain the original allocation of the header. So why bother returning List * in that function?

6. Is there a reason to use memmove() in list_concat() rather than just memcpy()? I don't quite believe the comment you've written. As far as I can see, you're not overwriting any useful memory, so the order of the copy should not matter.

7. The last part of the following comment might not age well:

    /*
     * Note: unlike the individual-list-cell deletion functions, we never make
     * any effort to move the surviving list cells to new storage. This is
     * because none of them can move in this operation, so it's the same as
     * the old implementation in terms of what callers may assume.
     */

The old comment about extending the list seems more fitting.

9. I see your XXX comment in list_qsort(), but wouldn't it be better to invent list_qsort_internal() as a static function and have it qsort the list in place, then make list_qsort just return list_qsort_internal(list_copy(list)), and keep the XXX comment so that the fixup would just remove the list_copy()? That way, when it comes to fixing that inefficiency, we can just cut and paste the internal implementation into list_qsort(). It'll be much less churn, especially if you put the internal version directly below the external one.

10. I wonder if we can reduce a bit of pain for extension authors by back-patching a macro that wraps up an lnext() call, adding a dummy argument for the List. That way they don't have to deal with their own pre-processor version-dependent code. Downsides are we'd need to keep the macro into the future; however, it's just 1 line of code...

I also did some more benchmarking of the patch. Basically, I patched with the attached file (converted to .txt not to upset the CFbot) then ran make installcheck. This was done on an AWS m5d.large instance. The patch runs the planner 10 times then LOGs the average time of those 10 runs. Taking the sum of those averages I see:

Master:  0.482479 seconds
Patched: 0.471949 seconds

Which makes the patched version 2.2% faster than master on that run. I've resisted attaching the spreadsheet since there are almost 22k data points per run.

Apart from the 10 points above, I think the patch is good to go. I also agree with keeping the further improvements, like getting rid of the needless list_copy() in list_concat() calls, as a followup patch.

I also agree with Tom about moving quickly with this one. Reviewing it in detail took me a long time; I'd really rather not do it again after leaving it to rot for a while.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 1bcfdd67e0..750320cb3b 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@
Re: POC: converting Lists into arrays
Oleksandr Shulgin writes: > Not related to the diff v6..v7, but shouldn't we throw additionally a > memset() with '\0' before calling pfree(): I don't see the point of that. In debug builds CLOBBER_FREED_MEMORY will take care of it, and in non-debug builds I don't see why we'd expend the cycles. regards, tom lane
Re: POC: converting Lists into arrays
On Tue, Jul 2, 2019 at 1:27 AM Tom Lane wrote:
>
> So I think this is a win, and attached is v7.
>

Not related to the diff v6..v7, but shouldn't we throw additionally a memset() with '\0' before calling pfree():

+    ListCell   *newelements;
+
+    newelements = (ListCell *)
+        MemoryContextAlloc(GetMemoryChunkContext(list),
+                           new_max_len * sizeof(ListCell));
+    memcpy(newelements, list->elements,
+           list->length * sizeof(ListCell));
+    pfree(list->elements);
+    list->elements = newelements;

Or is this somehow ensured by the debug pfree() implementation, or does it work differently together with Valgrind? Otherwise it seems that the calling code can still be hanging onto a list element from a freed chunk and (rather) happily accessing it, as opposed to an almost ensured crash if that is zeroed before returning from enlarge_list().

Cheers,
--
Alex
Re: POC: converting Lists into arrays
I spent some time experimenting with the idea mentioned upthread of adding a macro to support deletion of a foreach loop's current element (adjusting the loop's state behind the scenes). This turns out to work really well: it reduces the complexity of fixing existing loops around element deletions quite a bit. Whereas in existing code you have to not use foreach() at all, and you have to track both the next list element and the previous undeleted element, now you can use foreach() and you don't have to mess with extra variables at all. A good example appears in the trgm_regexp.c changes below.

Typically we've coded such loops with a handmade expansion of foreach, like

    prev = NULL;
    cell = list_head(state->enterKeys);
    while (cell)
    {
        TrgmStateKey *existingKey = (TrgmStateKey *) lfirst(cell);

        next = lnext(cell);
        if (need to delete)
            state->enterKeys = list_delete_cell(state->enterKeys,
                                                cell, prev);
        else
            prev = cell;
        cell = next;
    }

My previous patch would have had you replace this with a loop using an integer list-position index. You can still do that if you like, but it's less change to convert the loop to a foreach(), drop the prev/next variables, and replace the list_delete_cell call with foreach_delete_current:

    foreach(cell, state->enterKeys)
    {
        TrgmStateKey *existingKey = (TrgmStateKey *) lfirst(cell);

        if (need to delete)
            state->enterKeys = foreach_delete_current(state->enterKeys,
                                                      cell);
    }

So I think this is a win, and attached is v7.

regards, tom lane

reimplement-List-as-array-7.patch.gz
Description: reimplement-List-as-array-7.patch.gz
Re: POC: converting Lists into arrays
Hi,

On 7/1/19 2:44 PM, Tom Lane wrote:
> Yup, here's a rebase against HEAD (and I also find that check-world
> shows no problems).

Thanks - no further comments.

> This is pretty much of a pain to maintain, since it changes the API
> for lnext() which is, um, a bit invasive. I'd like to make a decision
> pretty quickly on whether we're going to do this, and either commit
> this patch or abandon it.

IMHO it is an improvement over the existing API.

> > There is some unneeded MemoryContext stuff in async.c's
> > pg_listening_channels() which should be cleaned up.
>
> Yeah, there's a fair amount of follow-on cleanup that could be
> undertaken afterwards, but I've wanted to keep the patch's footprint
> as small as possible for the moment. Assuming we pull the trigger,
> I'd then go look at removing the planner's duplicative lists+arrays
> for RTEs and such as the first cleanup step. But thanks for the
> pointer to async.c, I'll check that too.

Yeah, I only called out the async.c change, as that function isn't likely to change in any of the follow-up patches.

Best regards,
Jesper
Re: POC: converting Lists into arrays
Jesper Pedersen writes: > This needs a rebase. After that check-world passes w/ and w/o > -DDEBUG_LIST_MEMORY_USAGE. Yup, here's a rebase against HEAD (and I also find that check-world shows no problems). This is pretty much of a pain to maintain, since it changes the API for lnext() which is, um, a bit invasive. I'd like to make a decision pretty quickly on whether we're going to do this, and either commit this patch or abandon it. > There is some unneeded MemoryContext stuff in async.c's > pg_listening_channels() which should be cleaned up. Yeah, there's a fair amount of follow-on cleanup that could be undertaken afterwards, but I've wanted to keep the patch's footprint as small as possible for the moment. Assuming we pull the trigger, I'd then go look at removing the planner's duplicative lists+arrays for RTEs and such as the first cleanup step. But thanks for the pointer to async.c, I'll check that too. regards, tom lane reimplement-List-as-array-6.patch.gz Description: reimplement-List-as-array-6.patch.gz
Re: POC: converting Lists into arrays
Hi,

On 5/25/19 11:48 AM, Tom Lane wrote:
> The cfbot noticed a set-but-not-used variable that my compiler hadn't
> whined about. Here's a v5 to pacify it. No other changes.

This needs a rebase. After that, check-world passes w/ and w/o -DDEBUG_LIST_MEMORY_USAGE.

There is some unneeded MemoryContext stuff in async.c's pg_listening_channels() which should be cleaned up.

Thanks for working on this, as the API is more explicit now about what is going on.

Best regards,
Jesper
Re: POC: converting Lists into arrays
David Rowley writes: > On Fri, 14 Jun 2019 at 13:54, Bruce Momjian wrote: >> Have you tested the performance impact? > I did some and posted earlier in the thread: > https://postgr.es/m/cakjs1f8h2vs8m0cgfsgfivfkjvudu5-mzo1gjb2uf0m8_9v...@mail.gmail.com > It came out only slightly slower over the whole regression test run, > which I now think is surprisingly good considering how much we've > tuned the code over the years with the assumption that List is a > singly linked list. We'll be able to get rid of things like > PlannerInfo's simple_rte_array and append_rel_array along with > EState's es_range_table_array. Yeah. I have not made any attempt at all in the current patch to re-tune the code, or clean up places that are maintaining parallel Lists and arrays (such as the ones David mentions). So it's not entirely fair to take the current state of the patch as representative of where performance would settle once we've bought into the new method. regards, tom lane
Re: POC: converting Lists into arrays
On Fri, 14 Jun 2019 at 13:54, Bruce Momjian wrote: > > On Sat, May 25, 2019 at 11:48:47AM -0400, Tom Lane wrote: > > I wrote: > > > Here's a new version of the Lists-as-arrays patch. > > > > The cfbot noticed a set-but-not-used variable that my compiler hadn't > > whined about. Here's a v5 to pacify it. No other changes. > > Have you tested the performance impact? I did some and posted earlier in the thread: https://postgr.es/m/cakjs1f8h2vs8m0cgfsgfivfkjvudu5-mzo1gjb2uf0m8_9v...@mail.gmail.com It came out only slightly slower over the whole regression test run, which I now think is surprisingly good considering how much we've tuned the code over the years with the assumption that List is a singly linked list. We'll be able to get rid of things like PlannerInfo's simple_rte_array and append_rel_array along with EState's es_range_table_array. I'm particularly happy about getting rid of es_range_table_array since initialising a plan with many partitions ends up costing quite a bit just to build that array. Run-time pruning might end up pruning all but one of those, so getting rid of something that's done per partition is pretty good. (There's also the locking, but that's another problem). -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: POC: converting Lists into arrays
On Sat, May 25, 2019 at 11:48:47AM -0400, Tom Lane wrote: > I wrote: > > Here's a new version of the Lists-as-arrays patch. > > The cfbot noticed a set-but-not-used variable that my compiler hadn't > whined about. Here's a v5 to pacify it. No other changes. Have you tested the performance impact? -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
Re: POC: converting Lists into arrays
David Rowley writes: > If we're doing an API break for this, wouldn't it be better to come up > with something that didn't have to shuffle list elements around every > time one is deleted? Agreed that as long as there's an API break anyway, we could consider more extensive changes for this use-case. But ... > For example, we could have a foreach_delete() that instead of taking a > pointer to a ListCell, it took a ListDeleteIterator which contained a > ListCell pointer and a Bitmapset, then just have a macro that marks a > list item as deleted (list_delete_current(di)) and have a final > cleanup at the end of the loop. ... I'm not quite sold on this particular idea. The amount of added bitmapset manipulation overhead seems rather substantial in comparison to the memmove work saved. It might win for cases involving very long lists with many entries being deleted in one operation, but I don't think that's a common scenario for us. It's definitely a loss when there's just one item to be deleted, which I think is a common case. (Of course, callers expecting that could just not use this multi-delete API.) > Or maybe we should worry about having the list in an inconsistent > state during the loop? e.g if the list is getting passed into a > function call to do something. Not following that? If I understand your idea correctly, the list doesn't actually get changed until the cleanup step. If we pass it to another operation that independently deletes some members meanwhile, that's trouble; but it'd be trouble for the existing code, and for my version of the patch too. FWIW, I don't really see a need to integrate this idea into the loop logic as such. You could just define it as "make a bitmap of the list indexes to delete, then call "list = list_delete_multi(list, bitmapset)". It would be helpful perhaps if we provided official access to the current list index that the foreach macro is maintaining internally. regards, tom lane
Re: POC: converting Lists into arrays
I wrote: > Here's a new version of the Lists-as-arrays patch. The cfbot noticed a set-but-not-used variable that my compiler hadn't whined about. Here's a v5 to pacify it. No other changes. regards, tom lane reimplement-List-as-array-5.patch.gz Description: reimplement-List-as-array-5.patch.gz
Re: POC: converting Lists into arrays
On Sat, 25 May 2019 at 12:53, Tom Lane wrote:
> Now, it turns out that the new formulation of foreach() is really
> strictly equivalent to
>
> for (int pos = 0; pos < list_length(list); pos++)
> {
>     whatever-type item = list_nth(list, pos);
>     ...
> }
>
> which means that it could cope fine with deletion of the current
> list element if we were to provide some supported way of not
> incrementing the list index counter. That is, instead of
> code that looks more or less like this:
>
> for (int pos = 0; pos < list_length(list); pos++)
> {
>     whatever-type item = list_nth(list, pos);
>     ...
>     if (delete_cur)
>     {
>         list = list_delete_nth_cell(list, pos);
>         pos--;    /* keep loop in sync with deletion */
>     }
> }
>
> we could write, say:
>
> foreach(lc, list)
> {
>     whatever-type item = lfirst(lc);
>     ...
>     if (delete_cur)
>     {
>         list = list_delete_cell(list, lc);
>         foreach_backup(lc);    /* keep loop in sync with deletion */
>     }
> }
>
> which is the same thing under the hood. I'm not quite sure if that way
> is better or not. It's more magical than explicitly manipulating a list
> index, but it's also shorter and therefore less subject to typos.

If we're doing an API break for this, wouldn't it be better to come up with something that didn't have to shuffle list elements around every time one is deleted?

For example, we could have a foreach_delete() that, instead of taking a pointer to a ListCell, took a ListDeleteIterator which contained a ListCell pointer and a Bitmapset, then just have a macro that marks a list item as deleted (list_delete_current(di)) and have a final cleanup at the end of the loop. The cleanup operation can still use memmove, but only move up until the next bms_next_member on the deleted set, something like (handwritten and untested):

void
list_finalize_delete(List *list, ListDeleteIterator *di)
{
    int    srcpos, curr, tarpos;

    /* Zero the source and target list position markers */
    srcpos = tarpos = 0;
    curr = -1;
    while ((curr = bms_next_member(di->deleted, curr)) >= 0)
    {
        int    n = curr - srcpos;

        if (n > 0)
        {
            memmove(&list->elements[tarpos],
                    &list->elements[srcpos],
                    n * sizeof(ListCell));
            tarpos += n;
        }
        srcpos = curr + 1;
    }
    list->length = tarpos;
}

Or maybe we should worry about having the list in an inconsistent state during the loop? e.g. if the list is getting passed into a function call to do something.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: POC: converting Lists into arrays
Here's a new version of the Lists-as-arrays patch. It's rebased up to HEAD, and I also realized that I could fix the problem with multiple evaluation of the List arguments of foreach etc. by using structure assignment. So that gets rid of a large chunk of the semantic gotchas that were in the previous patch. You still have to be careful about code that deletes list entries within a foreach() over the list --- but nearly all such code is using list_delete_cell, which means you'll have to touch it anyway because of the API change for that function.

Previously, the typical logic for deletion-within-a-loop involved either advancing or not advancing a "prev" pointer that was used with list_delete_cell. The way I've recoded that here changes those loops to use an integer list index that gets incremented or not.

Now, it turns out that the new formulation of foreach() is really strictly equivalent to

    for (int pos = 0; pos < list_length(list); pos++)
    {
        whatever-type item = list_nth(list, pos);
        ...
    }

which means that it could cope fine with deletion of the current list element if we were to provide some supported way of not incrementing the list index counter. That is, instead of code that looks more or less like this:

    for (int pos = 0; pos < list_length(list); pos++)
    {
        whatever-type item = list_nth(list, pos);
        ...
        if (delete_cur)
        {
            list = list_delete_nth_cell(list, pos);
            pos--;    /* keep loop in sync with deletion */
        }
    }

we could write, say:

    foreach(lc, list)
    {
        whatever-type item = lfirst(lc);
        ...
        if (delete_cur)
        {
            list = list_delete_cell(list, lc);
            foreach_backup(lc);    /* keep loop in sync with deletion */
        }
    }

which is the same thing under the hood. I'm not quite sure if that way is better or not. It's more magical than explicitly manipulating a list index, but it's also shorter and therefore less subject to typos.

regards, tom lane

reimplement-List-as-array-4.patch.gz
Description: reimplement-List-as-array-4.patch.gz
Re: POC: converting Lists into arrays
David Rowley writes: > On Tue, 5 Mar 2019 at 12:54, Andres Freund wrote: >> Yes, I think you have a point that progress here would be good and that >> it's worth some pain. But the names will make even less sense if we just >> shunt in an array based approach under the already obscure list >> API. > If we feel strongly about fixing that then probably it would be as > simple as renaming the functions and adding some macros with the old > names and insisting that all new or changed code use the functions and > not the macro wrappers. Meh ... Neil Conway already did a round of that back in 2004 or whenever, and I'm not especially excited about another round. I'm not really following Andres' aversion to the list API --- it's not any more obscure than a whole lot of things in Postgres. (Admittedly, as somebody who dabbled in Lisp decades ago, I might be more used to it than some folks.) regards, tom lane
Re: POC: converting Lists into arrays
David Rowley writes:
> ... With list_concat() I find that pretty scary
> anyway. Using it means we can have a valid list that does not get its
> length updated when someone appends a new item. Most users of that do
> list_copy() to sidestep that and other issues... which likely is
> something we'd want to rip out with Tom's patch.

Yeah, it's a bit OT for this patch, but I'd noticed the prevalence of locutions like list_concat(list_copy(list1), list2), and been thinking of proposing that we add some new primitives with, er, less ad-hoc behavior. The patch at hand already changes the semantics of list_concat in a somewhat saner direction, but I think there is room for a version of list_concat that treats both its inputs as const Lists.

regards, tom lane
Re: POC: converting Lists into arrays
On Tue, 5 Mar 2019 at 12:54, Andres Freund wrote:
> Yes, I think you have a point that progress here would be good and that
> it's worth some pain. But the names will make even less sense if we just
> shunt in an array based approach under the already obscure list
> API.

If we feel strongly about fixing that, then probably it would be as simple as renaming the functions and adding some macros with the old names, and insisting that all new or changed code use the functions and not the macro wrappers. That could be followed up by a final sweep in N years' time when the numbers have dwindled to a low enough level. All that code mustn't be getting modified anyway, so there's not much chance of backpatching pain. I see length() finally died in a similar way in Tom's patch.

Perhaps doing this would have people consider lcons more carefully before they use it over lappend.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: POC: converting Lists into arrays
Hi,

On 2019-03-05 12:42:47 +1300, David Rowley wrote:
> So you think targetlists are the only case to benefit from an array
> based list? (Ignoring the fact that I already showed another case)

No, that's not what I'm trying to say at all. I think there are plenty of cases where it'd be beneficial. In this subthread we're just arguing whether it's somewhat feasible to not change everything, and I'm still fairly convinced that's possible; but I'm not arguing that that's the best way.

> It's true that linked lists are certainly better for some stuff;
> list_concat() is going to get slower, lcons() too, but likely we can
> have a bonus lcons() elimination round at some point. I see quite a
> few of them that look like they could be changed to lappend(). I also
> just feel that if we insist on more here then we'll get about nothing.
> I'm also blocked on my partition performance improvement goals on
> list_nth() being O(N), so I'm keen to see progress here and do what I
> can to help with that. With list_concat() I find that pretty scary
> anyway. Using it means we can have a valid list that does not get its
> length updated when someone appends a new item. Most users of that do
> list_copy() to sidestep that and other issues... which likely is
> something we'd want to rip out with Tom's patch.

Yes, I think you have a point that progress here would be good and that it's worth some pain. But the names will make even less sense if we just shunt in an array based approach under the already obscure list API. Obviously the individual pain of that is fairly small, but over the years, and across everybody reading PG code, it's also substantial. So I'm torn.

Greetings,

Andres Freund
Re: POC: converting Lists into arrays
On Tue, 5 Mar 2019 at 11:11, Andres Freund wrote: > On 2019-03-04 16:28:40 -0500, Tom Lane wrote: > > Andres Freund writes: > > > I don't buy this. I think e.g. redisgning the way we represent > > > targetlists would be good (it's e.g. insane that we recompute > > > descriptors out of them all the time), and would reduce their allocator > > > costs. > > > > Maybe we're not on the same page here, but it seems to me that that'd be > > addressable with pretty localized changes (eg, adding more fields to > > TargetEntry, or keeping a pre-instantiated output tupdesc in each Plan > > node). But if the concern is about the amount of palloc bandwidth going > > into List cells, we're not going to be able to improve that with localized > > data structure changes; it'll take something like the patch I've proposed. > > What I'm saying is that it'd be reasonable to replace the use of list > for targetlists with 'list2' without a wholesale replacement of all the > list code, and it'd give us benefits. So you think targetlists are the only case to benefit from an array based list? (Ignoring the fact that I already showed another case) When we discover the next thing to benefit, then the replacement will be piecemeal, just the way Tom would rather not do it. I personally don't want to be up against huge resistance when I discover that turning a single usage of a List into List2 is better. We'll need to consider backpatching pain / API breakage *every single time*. A while ago I did have a go at changing some List implementations for my then proposed ArrayList and it was beyond a nightmare, as each time I changed one I realised I needed to change another. In the end, I just gave up. Think of all the places we have forboth() and forthree(), we'll need to either provide a set of macros that take various combinations of List and List2 or do some conversion beforehand. 
With respect, if you don't believe me, please take my ArrayList patch [1] and have a go at changing targetlists to use ArrayLists all the way from the parser through to the executor. I'll be interested in the diff stat once you're done.

It's true that linked lists are certainly better for some stuff; list_concat() is going to get slower, lcons() too, but likely we can have a bonus lcons() elimination round at some point. I see quite a few of them that look like they could be changed to lappend(). I also just feel that if we insist on more here then we'll get about nothing. I'm also blocked on my partition performance improvement goals on list_nth() being O(N), so I'm keen to see progress here and do what I can to help with that.

With list_concat() I find that pretty scary anyway. Using it means we can have a valid list that does not get its length updated when someone appends a new item. Most users of that do list_copy() to sidestep that and other issues... which likely is something we'd want to rip out with Tom's patch.

[1] https://www.postgresql.org/message-id/CAKJS1f_2SnXhPVa6eWjzy2O9A=ocwgd0cj-lqewpgtrwqbu...@mail.gmail.com

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: POC: converting Lists into arrays
On Mon, Mar 4, 2019 at 2:04 PM Bruce Momjian wrote: > On Mon, Mar 4, 2019 at 12:44:41PM -0500, Robert Haas wrote: > > I think that's a separate but also promising thing to attack, and I > > agree that it'd take a lot of work to get there. I don't think that > > the problem with either parse-tree-copying or list usage is that no > > performance benefits are to be had; I think it's that the amount of > > work required to get those benefits is pretty large. If it were > > otherwise, somebody likely would have done it before now. > > Stupid question, but do we use any kind of reference counter to know if > two subsystems look at a structure, and a copy is required? No, but I wonder if we could use Valgrind to enforce rules about who has the right to scribble on what, when. That could make it a lot easier to impose a new rule. -- Peter Geoghegan
Re: POC: converting Lists into arrays
On Mon, Mar 4, 2019 at 01:11:35PM -0500, Tom Lane wrote: > Robert Haas writes: > > I think the reason why you're not seeing a performance benefit is > > because the problem is not that lists are generically a more expensive > > data structure than arrays, but that there are cases when they are > > more expensive than arrays. If you only ever push/pop at the front, > > of course a list is going to be better. If you often look up elements > > by index, of course an array is going to be better. If you change > > every case where the code currently uses a list to use something else > > instead, then you're changing both the winning and losing cases. > > I don't think this argument is especially on-point, because what I'm > actually seeing is just that there aren't any list operations that > are expensive enough to make much of an overall difference in > typical queries. To the extent that an array reimplementation > reduces the palloc traffic, it'd take some load off that subsystem, > but apparently you need not-typical queries to really notice. > (And, if the real motivation is aggregate palloc savings, then yes you > really do want to replace everything...) Could it be that allocating List* structures near the structure it points to is enough of a benefit in terms of cache hits that it is a loss when moving to a List* array? -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
Re: POC: converting Lists into arrays
On Mon, Mar 4, 2019 at 12:44:41PM -0500, Robert Haas wrote: > I think that's a separate but also promising thing to attack, and I > agree that it'd take a lot of work to get there. I don't think that > the problem with either parse-tree-copying or list usage is that no > performance benefits are to be had; I think it's that the amount of > work required to get those benefits is pretty large. If it were > otherwise, somebody likely would have done it before now. Stupid question, but do we use any kind of reference counter to know if two subsystems look at a structure, and a copy is required? -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
Re: POC: converting Lists into arrays
Andres Freund writes: > I don't buy this. I think e.g. redesigning the way we represent > targetlists would be good (it's e.g. insane that we recompute > descriptors out of them all the time), and would reduce their allocator > costs. Maybe we're not on the same page here, but it seems to me that that'd be addressable with pretty localized changes (eg, adding more fields to TargetEntry, or keeping a pre-instantiated output tupdesc in each Plan node). But if the concern is about the amount of palloc bandwidth going into List cells, we're not going to be able to improve that with localized data structure changes; it'll take something like the patch I've proposed. I *have* actually done some tests of the sort you proposed, driving just the planner and not any of the rest of the system, but I still didn't find much evidence of big wins. I find it interesting that you get different results. regards, tom lane
Re: POC: converting Lists into arrays
Hi, On 2019-03-04 13:11:35 -0500, Tom Lane wrote: > The concern I have is mostly about the use of lists as core infrastructure > in parsetree, plantree, etc data structures. I think any idea that we'd > replace those piecemeal is borderline insane: it's simply not worth it > from a notational and bug-risk standpoint to glue together some parts of > those structures differently from the way other parts are glued together. I don't buy this. I think e.g. redesigning the way we represent targetlists would be good (it's e.g. insane that we recompute descriptors out of them all the time), and would reduce their allocator costs. Greetings, Andres Freund
Re: POC: converting Lists into arrays
Hi, On 2019-03-03 13:29:04 -0500, Tom Lane wrote: > The cases I've been looking at suggest to me that we'd make far > more progress on the excessive-palloc'ing front if we could redesign > things to reduce unnecessary copying of parsetrees. Right now the > planner does an awful lot of copying because of fear of unwanted > modifications of multiply-linked subtrees. I suspect that we could > reduce that overhead with some consistently enforced rules about > not scribbling on input data structures; but it'd take a lot of work > to get there, and I'm afraid it'd be easily broken :-( Given the difficulty of this task, isn't your patch actually a *good* attack on the problem? It makes copying lists considerably cheaper. As you say, a more principled answer to this problem is hard, so attacking it from the "make the constant factors smaller" side doesn't seem crazy? Greetings, Andres Freund
Re: POC: converting Lists into arrays
Hi, On 2019-03-02 18:11:43 -0500, Tom Lane wrote: > On test cases like "pg_bench -S" it seems to be pretty much within the > noise level of being the same speed as HEAD. I think that might be because its bottleneck is just elsewhere (e.g. very context switch heavy, very few lists of any length). FWIW, even just taking context switches out of the equation leads to a ~5-6% benefit in a simple statement:

DO $f$BEGIN FOR i IN 1..50 LOOP EXECUTE $s$SELECT aid, bid, abalance, filler FROM pgbench_accounts WHERE aid = 2045530;$s$;END LOOP;END;$f$;

master:
+ 6.05% postgres postgres [.] AllocSetAlloc
+ 5.52% postgres postgres [.] base_yyparse
+ 2.51% postgres postgres [.] palloc
+ 1.82% postgres postgres [.] hash_search_with_hash_value
+ 1.61% postgres postgres [.] core_yylex
+ 1.57% postgres postgres [.] SearchCatCache1
+ 1.43% postgres postgres [.] expression_tree_walker.part.4
+ 1.09% postgres postgres [.] check_stack_depth
+ 1.08% postgres postgres [.] MemoryContextAllocZeroAligned

patch v3:
+ 5.77% postgres postgres [.] base_yyparse
+ 4.88% postgres postgres [.] AllocSetAlloc
+ 1.95% postgres postgres [.] hash_search_with_hash_value
+ 1.89% postgres postgres [.] core_yylex
+ 1.64% postgres postgres [.] SearchCatCache1
+ 1.46% postgres postgres [.] expression_tree_walker.part.0
+ 1.45% postgres postgres [.] palloc
+ 1.18% postgres postgres [.] check_stack_depth
+ 1.13% postgres postgres [.] MemoryContextAllocZeroAligned
+ 1.04% postgres libc-2.28.so [.] _int_malloc
+ 1.01% postgres postgres [.] nocachegetattr

And even just pgbenching the EXECUTEd statement above gives me a reproducible ~3.5% gain when using -M simple, and ~3% when using -M prepared.
Note that when not using prepared statements (a pretty important workload, especially as long as we don't have a pooling solution that actually allows using prepared statements across connections), even after the patch most of the allocator overhead is still from list allocations, but it's near exclusively just the "create a new list" case:

+ 5.77% postgres postgres [.] base_yyparse
- 4.88% postgres postgres [.] AllocSetAlloc
   - 80.67% AllocSetAlloc
      - 68.85% AllocSetAlloc
         - 57.65% palloc
            - 50.30% new_list (inlined)
               - 37.34% lappend
                  + 12.66% pull_var_clause_walker
                  + 8.83% build_index_tlist (inlined)
                  + 8.80% make_pathtarget_from_tlist
                  + 8.73% get_quals_from_indexclauses (inlined)
                  + 8.73% distribute_restrictinfo_to_rels
                  + 8.68% RewriteQuery
                  + 8.56% transformTargetList
                  + 8.46% make_rel_from_joinlist
                  + 4.36% pg_plan_queries
                  + 4.30% add_rte_to_flat_rtable (inlined)
                  + 4.29% build_index_paths
                  + 4.23% match_clause_to_index (inlined)
                  + 4.22% expression_tree_mutator
                  + 4.14% transformFromClause
                  + 1.02% get_index_paths
               + 17.35% list_make1_impl
               + 16.56% list_make1_impl (inlined)
               + 15.87% lcons
               + 11.31% list_copy (inlined)
               + 1.58% lappend_oid
            + 12.90% expression_tree_mutator
            + 9.73% get_relation_info
            + 4.71% bms_copy (inlined)
            + 2.44% downcase_identifier
            + 2.43% heap_tuple_untoast_attr
            + 2.37% add_rte_to_flat_rtable (inlined)
            + 1.69% btbeginscan
            + 1.65% CreateTemplateTupleDesc
            + 1.61% core_yyalloc (inlined)
            + 1.59% heap_copytuple
            + 1.54% text_to_cstring (inlined)
            + 0.84% ExprEvalPushStep (inlined)
            + 0.84% ExecInitRangeTable
            + 0.84% scanner_init
            + 0.83% ExecInitRangeTable
            + 0.81% CreateQueryDesc
            + 0.81% _bt_search
            + 0.77% ExecIndexBuildScanKeys
            + 0.66% RelationGetIndexScan
            + 0.65% make_pathtarget_from_tlist

Given how hard it is to improve performance with as flatly distributed costs as the above profiles, I actually think these are quite promising results.
I'm not even convinced that it makes all that much sense to measure end-to-end performance here; it might be worthwhile to measure with a debugging function that allows exercising parsing, parse analysis, rewrite, etc. at configurable loop counts. Given the relatively evenly distributed profiles, we're going to have to make a few different improvements to make headway, and it's hard to see the benefits of individual ones if you only look at the overall numbers.
Re: POC: converting Lists into arrays
Robert Haas writes: > I think the reason why you're not seeing a performance benefit is > because the problem is not that lists are generically a more expensive > data structure than arrays, but that there are cases when they are > more expensive than arrays. If you only ever push/pop at the front, > of course a list is going to be better. If you often look up elements > by index, of course an array is going to be better. If you change > every case where the code currently uses a list to use something else > instead, then you're changing both the winning and losing cases. I don't think this argument is especially on-point, because what I'm actually seeing is just that there aren't any list operations that are expensive enough to make much of an overall difference in typical queries. To the extent that an array reimplementation reduces the palloc traffic, it'd take some load off that subsystem, but apparently you need not-typical queries to really notice. (And, if the real motivation is aggregate palloc savings, then yes you really do want to replace everything...) > Yeah, changing things individually is more work, but that's how you > get the wins without incurring the losses. The concern I have is mostly about the use of lists as core infrastructure in parsetree, plantree, etc data structures. I think any idea that we'd replace those piecemeal is borderline insane: it's simply not worth it from a notational and bug-risk standpoint to glue together some parts of those structures differently from the way other parts are glued together. regards, tom lane
Re: POC: converting Lists into arrays
On Sun, Mar 3, 2019 at 1:29 PM Tom Lane wrote: > > My main observation was from when the expression evaluation was using > > lists all over. List iteration overhead was very substantial there. But > > that's not a problem anymore, because all of those are gone now due to > > the expression rewrite. I personally wasn't actually advocating for a > > new list implementation, I was/am advocating that we should move some > > tasks over to a more optimized representation. > > I doubt that you'll get far with that; if this experiment is anything > to go by, it's going to be really hard to make the case that twiddling > the representation of widely-known data structures is worth the work > and bug hazards. I'm befuddled by this comment. Andres is arguing that we shouldn't go do a blind search-and-replace, but rather change certain things, and you're saying that's going to be really hard because twiddling the representation of widely-known data structures is really hard. But if we only change certain things, we don't *need* to twiddle the representation of a widely-known data structure. We just add a new one and convert the things that benefit from it, like I proposed upthread (and promptly got told I was wrong). I think the reason why you're not seeing a performance benefit is because the problem is not that lists are generically a more expensive data structure than arrays, but that there are cases when they are more expensive than arrays. If you only ever push/pop at the front, of course a list is going to be better. If you often look up elements by index, of course an array is going to be better. If you change every case where the code currently uses a list to use something else instead, then you're changing both the winning and losing cases. Yeah, changing things individually is more work, but that's how you get the wins without incurring the losses. I think David's results go in this direction, too. 
Code that was written on the assumption that list_nth() is slow is going to avoid using it as much as possible, and therefore no benefit is to be expected from making it fast. If the author had written the same code assuming that the underlying data structure was an array rather than a list, they might have picked a different algorithm which, as David's results show, could be a lot faster in some cases. But it's not going to come from just changing the way lists work internally; it's going to come from redesigning the algorithms that are using lists to do something better instead, as Andres's example of linearized expression evaluation also shows. > The cases I've been looking at suggest to me that we'd make far > more progress on the excessive-palloc'ing front if we could redesign > things to reduce unnecessary copying of parsetrees. Right now the > planner does an awful lot of copying because of fear of unwanted > modifications of multiply-linked subtrees. I suspect that we could > reduce that overhead with some consistently enforced rules about > not scribbling on input data structures; but it'd take a lot of work > to get there, and I'm afraid it'd be easily broken :-( I think that's a separate but also promising thing to attack, and I agree that it'd take a lot of work to get there. I don't think that the problem with either parse-tree-copying or list usage is that no performance benefits are to be had; I think it's that the amount of work required to get those benefits is pretty large. If it were otherwise, somebody likely would have done it before now. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: POC: converting Lists into arrays
On Mon, 4 Mar 2019 at 07:29, Tom Lane wrote: > > Andres Freund writes: > > I still regularly see list overhead matter in production workloads. A > > lot of it being memory allocator overhead, which is why I'm concerned > > with a rewrite that doesn't reduce the number of memory allocations. > > Well, I did that in the v3 patch, and it still hasn't moved the needle > noticeably in any test case I've tried. At this point I'm really > struggling to see a reason why we shouldn't just mark this patch rejected > and move on. If you have test cases that suggest differently, please > show them don't just handwave. I think we discussed this before, but... if this patch is not a win by itself (and we've already seen it's not really causing much in the way of regression, if any), then we need to judge it on what else we can do to exploit the new performance characteristics of List. For example list_nth() is now deadly fast. My primary interest here is getting rid of a few places where we build an array version of some List so that we can access the Nth element more quickly. What goes on in ExecInitRangeTable() is not particularly great for queries to partitioned tables with a large number of partitions where only one survives run-time pruning. I've hacked together a patch to show you what wins we can have with the new list implementation. 
Using the attached (renamed to .txt to not upset the CFbot) I get:

setup:
create table hashp (a int, b int) partition by hash (a);
select 'create table hashp'||x||' partition of hashp for values with (modulus 1, remainder '||x||');' from generate_Series(0,) x;
\gexec
alter table hashp add constraint hashp_pkey PRIMARY KEY (a);

postgresql.conf:
plan_cache_mode = force_generic_plan
max_parallel_workers_per_gather = 0
max_locks_per_transaction = 256

bench.sql:
\set p random(1,1)
select * from hashp where a = :p;

master:
tps = 189.499654 (excluding connections establishing)
tps = 195.102743 (excluding connections establishing)
tps = 194.338813 (excluding connections establishing)

your List reimplementation v3 + attached:
tps = 12852.003735 (excluding connections establishing)
tps = 12791.834617 (excluding connections establishing)
tps = 12691.515641 (excluding connections establishing)

The attached does include [1], but even with just that the performance is not as good as with the arraylist plus the follow-on exploits I added. Now that we have a much faster bms_next_member(), some form of what's in there might be okay. A profile shows that in this workload we're still spending 42% of the time, at ~12k TPS, in hash_seq_search(). That's due to LockReleaseAll() having a hard time of it because the lock table is bloated from having to build the generic plan with 10k partitions. [2] aims to fix that, so likely we'll be closer to 18k TPS, or about 100x faster. In fact, I should test that...

tps = 18763.977940 (excluding connections establishing)
tps = 18589.531558 (excluding connections establishing)
tps = 19011.295770 (excluding connections establishing)

Yip, about 100x. I think these are worthy goals to aspire to.
[1] https://commitfest.postgresql.org/22/1897/
[2] https://commitfest.postgresql.org/22/1993/
-- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services

diff --git a/src/backend/catalog/dependency.c b/src/backend/catalog/dependency.c
index 2048d71535..63554b3057 100644
--- a/src/backend/catalog/dependency.c
+++ b/src/backend/catalog/dependency.c
@@ -1610,6 +1610,7 @@ recordDependencyOnSingleRelExpr(const ObjectAddress *depender,
 	rte.rtekind = RTE_RELATION;
 	rte.relid = relId;
 	rte.relkind = RELKIND_RELATION;	/* no need for exactness here */
+	rte.delaylock = false;
 	rte.rellockmode = AccessShareLock;

 	context.rtables = list_make1(list_make1());
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 0208388af3..0ac564dd77 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -516,6 +516,7 @@ intorel_startup(DestReceiver *self, int operation, TupleDesc typeinfo)
 	rte->rtekind = RTE_RELATION;
 	rte->relid = intoRelationAddr.objectId;
 	rte->relkind = relkind;
+	rte->delaylock = false;
 	rte->rellockmode = RowExclusiveLock;
 	rte->requiredPerms = ACL_INSERT;
diff --git a/src/backend/executor/execCurrent.c b/src/backend/executor/execCurrent.c
index fe99096efc..9b93c76aca 100644
--- a/src/backend/executor/execCurrent.c
+++ b/src/backend/executor/execCurrent.c
@@ -96,13 +96,13 @@ execCurrentOf(CurrentOfExpr *cexpr,
 {
 	ExecRowMark *erm;
 	Index		i;
-
+	int			len = list_length(queryDesc->estate->es_range_table);

 	/*
 	 * Here, the query must have exactly one FOR UPDATE/SHARE reference to
 	 * the target table, and we dig the ctid info out of that.
Re: POC: converting Lists into arrays
Andres Freund writes: > On 2019-03-02 18:11:43 -0500, Tom Lane wrote: >> I wonder what test cases Andres has been looking at that convince >> him that we need a reimplementation of Lists. > My main observation was from when the expression evaluation was using > lists all over. List iteration overhead was very substantial there. But > that's not a problem anymore, because all of those are gone now due to > the expression rewrite. I personally wasn't actually advocating for a > new list implementation, I was/am advocating that we should move some > tasks over to a more optimized representation. I doubt that you'll get far with that; if this experiment is anything to go by, it's going to be really hard to make the case that twiddling the representation of widely-known data structures is worth the work and bug hazards. > I still regularly see list overhead matter in production workloads. A > lot of it being memory allocator overhead, which is why I'm concerned > with a rewrite that doesn't reduce the number of memory allocations. Well, I did that in the v3 patch, and it still hasn't moved the needle noticeably in any test case I've tried. At this point I'm really struggling to see a reason why we shouldn't just mark this patch rejected and move on. If you have test cases that suggest differently, please show them, don't just handwave. The cases I've been looking at suggest to me that we'd make far more progress on the excessive-palloc'ing front if we could redesign things to reduce unnecessary copying of parsetrees. Right now the planner does an awful lot of copying because of fear of unwanted modifications of multiply-linked subtrees. I suspect that we could reduce that overhead with some consistently enforced rules about not scribbling on input data structures; but it'd take a lot of work to get there, and I'm afraid it'd be easily broken :-( regards, tom lane
Re: POC: converting Lists into arrays
Hi, On 2019-03-02 18:11:43 -0500, Tom Lane wrote: > I wonder what test cases Andres has been looking at that convince > him that we need a reimplementation of Lists. My main observation was from when the expression evaluation was using lists all over. List iteration overhead was very substantial there. But that's not a problem anymore, because all of those are gone now due to the expression rewrite. I personally wasn't actually advocating for a new list implementation, I was/am advocating that we should move some tasks over to a more optimized representation. I still regularly see list overhead matter in production workloads. A lot of it being memory allocator overhead, which is why I'm concerned with a rewrite that doesn't reduce the number of memory allocations. And a lot of it is stuff that you won't see in pgbench - e.g. there's a lot of production queries that join a bunch of tables with a few dozen columns, where e.g. all the targetlists are much longer than what you'd see in pgbench -S. Greetings, Andres Freund
Re: POC: converting Lists into arrays
Here's a v3 incorporating Andres' idea of trying to avoid a separate palloc for the list cell array. On a 64-bit machine we can have up to five ListCells in the initial allocation without actually increasing space consumption at all compared to the old code. So only when a List grows larger than that do we need more than one palloc. I'm still having considerable difficulty convincing myself that this is enough of a win to justify the bug hazards we'll introduce, though. On test cases like "pg_bench -S" it seems to be pretty much within the noise level of being the same speed as HEAD. I did see a nice improvement in the test case described in https://www.postgresql.org/message-id/6970.1545327...@sss.pgh.pa.us but considering that that's still mostly a tight loop in match_eclasses_to_foreign_key_col, it doesn't seem very interesting as an overall figure of merit. I wonder what test cases Andres has been looking at that convince him that we need a reimplementation of Lists. regards, tom lane Attachment: reimplement-List-as-array-3.patch.gz
Re: POC: converting Lists into arrays
Here's a rebased version of the main patch. David Rowley writes: > The only thing that I did to manage to speed the patch up was to ditch > the additional NULL test in lnext(). I don't see why that's required > since lnext(NULL) would have crashed with the old implementation. I adopted this idea. I think at one point where I was fooling with different implementations for foreach(), it was necessary that lnext() be cool with a NULL input; but as things stand now, it's not. I haven't done anything else in the performance direction, but am planning to play with that next. I did run through all the list_delete_foo callers and fix the ones that were still busted. I also changed things so that with DEBUG_LIST_MEMORY_USAGE enabled, list deletions would move the data arrays around, in hopes of catching more stale-pointer problems. Depressingly, check-world still passed with that added, even before I'd fixed the bugs I found by inspection. This does not speak well for the coverage of our regression tests. regards, tom lane Attachment: reimplement-List-as-array-2.patch.gz
Re: POC: converting Lists into arrays
David Rowley writes: > On Thu, 28 Feb 2019 at 09:26, Tom Lane wrote: >> 0001 below does this. I found a couple of places that could use >> forfive(), as well. I think this is a clear legibility and >> error-proofing win, and we should just push it. > I've looked over this and I agree that it's a good idea. Reducing the > number of lnext() usages seems like a good idea in order to reduce the > footprint of the main patch. I've pushed that; thanks for reviewing! >> 0002 below does this. I'm having a hard time deciding whether this >> part is a good idea or just code churn. It might be more readable >> (especially to newbies) but I can't evaluate that very objectively. > I'm less decided on this. Yeah, I think I'm just going to drop that idea. People didn't seem very sold on list_cell_is_last() being a readability improvement, and it certainly does nothing to reduce the footprint of the main patch. I now need to rebase the main patch over what I pushed; off to do that next. regards, tom lane
Re: POC: converting Lists into arrays
Andrew Gierth writes: > To get a reliable measurement of timing changes less than around 3%, > what you have to do is this: pick some irrelevant function and add > something like an asm directive that inserts a variable number of NOPs, > and do a series of test runs with different values. Good point. If you're looking at a microbenchmark that only exercises a small amount of code, it can be way worse than that. I was reminded of this the other day while fooling with the problem discussed in https://www.postgresql.org/message-id/flat/6970.1545327...@sss.pgh.pa.us in which we were spending huge amounts of time in a tight loop in match_eclasses_to_foreign_key_col. I normally run with --enable-cassert unless I'm trying to collect performance data; so I rebuilt with --disable-cassert, and was bemused to find out that that test case ran circa 20% *slower* in the non-debug build. This is silly on its face, and even more so when you notice that match_eclasses_to_foreign_key_col itself contains no Asserts and so its machine code is unchanged by the switch. (I went to the extent of comparing .s files to verify this.) So that had to have been down to alignment/cacheline issues triggered by moving said function around. I doubt the case would be exactly reproducible on different hardware or toolchain, but another platform would likely show similar issues on some case or other. tl;dr: even a 20% difference might be nothing more than an artifact. regards, tom lane
Re: POC: converting Lists into arrays
> "David" == David Rowley writes: David> I went and had a few adventures with this patch to see if I David> could figure out why the small ~1% regression exists. Just changing the number of instructions (even in a completely unrelated place that's not called during the test) can generate performance variations of this size, even when there's no real difference. To get a reliable measurement of timing changes less than around 3%, what you have to do is this: pick some irrelevant function and add something like an asm directive that inserts a variable number of NOPs, and do a series of test runs with different values. See http://tinyurl.com/op9qg8a for an example of the kind of variation that one can get; this plot records timing runs where each different padding size was tested 3 times (non-consecutively, so you can see how repeatable the test result is for each size), each timing is actually the average of the last 10 of 11 consecutive runs of the test. To establish a 1% performance benefit or regression you need to show that there's still a difference _AFTER_ taking this kind of spooky-action-at-a-distance into account. For example, in the test shown at the link, if a substantive change to the code moved the upper and lower bounds of the output from (6091,6289) to (6030,6236) then one would be justified in claiming it as a 1% improvement. Such is the reality of modern CPUs. -- Andrew (irc:RhodiumToad)
Re: POC: converting Lists into arrays
David Rowley writes: > I went and had a few adventures with this patch to see if I could > figure out why the small ~1% regression exists. Thanks for poking around! > ... I had > suspected it was the lcons() calls being expensive because they need > to push the elements up one place each time, not something that'll > scale well with larger lists. I just did some looking around at lcons() calls, and it's hard to identify any that seem like they would be performance-critical. I did notice a number of places that think that lcons'ing an item onto a list, and later stripping it off with list_delete_first, is efficient. With the new implementation it's far cheaper to lappend and then list_truncate instead, at least if the list is long. If the list order matters then that's not an option, but I found some places where it doesn't matter so we could get an easy win. Still, it wasn't obvious that this would move the needle at all. > I then tried hacking at the foreach() macro after wondering if the > lnext() call was somehow making things difficult for the compiler to > predict what cell would come next. Yeah, my gut feeling right now is that foreach() is producing crummy code, though it's not obvious why it would need to. Maybe some micro-optimization is called for. But I've not had time to pursue it. > The only thing that I did to manage to speed the patch up was to ditch > the additional NULL test in lnext(). I don't see why that's required > since lnext(NULL) would have crashed with the old implementation. Hmmm ... > Perhaps if we're not going to see gains from the patch alone then > we'll need to tag on some of the additional stuff that will take > advantage of list_nth() being fast and test the performance of it all > again. Yeah, evaluating this patch in complete isolation is a bit unfair. Still, it'd be nice to hold the line in advance of any follow-on improvements. regards, tom lane
Re: POC: converting Lists into arrays
On Tue, 26 Feb 2019 at 18:34, Tom Lane wrote: > > David Rowley writes: > > Using the attached patch (as text file so as not to upset the CFbot), > > which basically just measures and logs the time taken to run > > pg_plan_query. ... > > Surprisingly it took 1.13% longer. I did these tests on an AWS > > md5.large instance. > > Interesting. Seems to suggest that maybe the cases I discounted > as being infrequent aren't so infrequent? Another possibility > is that the new coding adds more cycles to foreach() loops than > I'd hoped for. I went and had a few adventures with this patch to see if I could figure out why the small ~1% regression exists. Profiling did not prove very useful as I saw none of the list functions come up. I had suspected it was the lcons() calls being expensive because they need to push the elements up one place each time, not something that'll scale well with larger lists. After changing things so that a new "first" element index in the List would allow new_head_cell() to just move everything to the end of the list and mark the start of the actual data... I discovered that slowed things down further... Likely due to all the additional arithmetic work required to find the first element. I then tried hacking at the foreach() macro after wondering if the lnext() call was somehow making things difficult for the compiler to predict what cell would come next. I experimented with the following monstrosity:

for ((cell) = list_head(l);
     ((cell) && (cell) < &((List *) l)->elements[((List *) l)->first + ((List *) l)->length]) ||
      (cell = NULL) != NULL;
     cell++)

it made things worse again... It ended up much more ugly than I thought it would have, as I had to account for an empty list being NIL and the fact that we need to make cell NULL after the loop is over. I tried a few other things... I didn't agree with your memmove() in list_concat(). I think memcpy() is fine, even when the list pointers are the same, since we never overwrite any live cell values.
Strangely I found memcpy slower than memmove... ? The only thing that I did to manage to speed the patch up was to ditch the additional NULL test in lnext(). I don't see why that's required since lnext(NULL) would have crashed with the old implementation. Removing this changed the 1.13% regression into a ~0.8% regression, which at least does show that the foreach() implementation can have an effect on performance. > Anyway, it's just a POC; the main point at this stage is to be > able to make such comparisons at all. If it turns out that we > *can't* make this into a win, then all that bellyaching about > how inefficient Lists are was misinformed ... My primary concern is how much we bend over backwards because list_nth() performance is not O(1). I know from my work on partitioning that ExecInitRangeTable()'s building of es_range_table_array has a huge impact for PREPAREd plans for simple PK lookup SELECT queries to partitioned tables with a large number of partitions, where only 1 of which will survive run-time pruning. I could get the execution speed of such a query with 300 partitions to within 94% of the non-partitioned version if the rtable could be looked up O(1) in the executor natively (that, plus some changes to ExecCheckRTPerms() to have it skip rtable entries that don't require permission checks). Perhaps if we're not going to see gains from the patch alone then we'll need to tag on some of the additional stuff that will take advantage of list_nth() being fast and test the performance of it all again. Attached is the (mostly worthless) series of hacks I made to your patch. It might save someone some time if they happened to wonder the same thing as I did. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services Attachment: list_hacks.patch (binary data)
Re: POC: converting Lists into arrays
David Rowley writes:
> On Thu, 28 Feb 2019 at 09:26, Tom Lane wrote:
>> 0002 below does this. I'm having a hard time deciding whether this
>> part is a good idea or just code churn. It might be more readable
>> (especially to newbies) but I can't evaluate that very objectively.

> I'm less decided on this. Having this now means you'll need to break
> the signature of the macro the same way as you'll need to break
> lnext(). It's perhaps easier to explain in the release notes about
> lnext() having changed so that extension authors can go fix their code
> (probably they'll know already from compile failures, but ...). On
> the other hand, if the list_cell_is_last() is new, then there will be
> no calls to that in extensions anyway. Maybe it's better to do it at
> the same time as the List reimplementation to ensure nobody needs to
> change anything twice?

Yeah, I was considering the idea of setting up the macro as "list_cell_is_last(list, cell)" from the get-go, with the first argument just going unused for the moment. That would be a good way to back-patch it if we go through with this. On the other hand, if we end up not pulling the trigger on the main patch, that'd look pretty silly ...

			regards, tom lane
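The forward-compatible shape Tom describes (accepting a List argument now, even though a cons-cell implementation doesn't need it) could look roughly like this toy sketch; this is not the committed macro, and the ListCell here is an invented stand-in for the pre-patch pointer-chain cells:

```c
#include <assert.h>
#include <stddef.h>

/* Toy cons-cell list, standing in for the pre-patch List implementation */
typedef struct ListCell
{
    int              int_value;
    struct ListCell *next;
} ListCell;

#define lnext(cell)  ((cell)->next)

/*
 * Take the List as a first argument from the get-go, even though this
 * implementation can answer "is it last?" from the cell alone.  When cells
 * become array slots, "last" must be computed from the list's length, and
 * call sites written this way won't need to change again.
 */
#define list_cell_is_last(list, cell)  ((void) (list), lnext(cell) == NULL)
```

The `(void) (list)` keeps the argument evaluated-but-unused without drawing compiler warnings, which is what would make such a macro back-patchable ahead of the main patch.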
Re: POC: converting Lists into arrays
On Thu, 28 Feb 2019 at 09:26, Tom Lane wrote:
>
> I wrote:
> > I did find a number of places where getting rid of explicit lnext()
> > calls led to just plain cleaner code. Most of these were places that
> > could be using forboth() or forthree() and just weren't. There's
> > also several places that are crying out for a forfour() macro, so
> > I'm not sure why we've stubbornly refused to provide it. I'm a bit
> > inclined to just fix those things in the name of code readability,
> > independent of this patch.
>
> 0001 below does this. I found a couple of places that could use
> forfive(), as well. I think this is a clear legibility and
> error-proofing win, and we should just push it.

I've looked over this and I agree that it's a good idea. Reducing the number of lnext() usages seems like a good idea in order to reduce the footprint of the main patch.

The only thing of interest that I saw during the review was the fact that you've chosen to assign colexpr and coldefexpr before the continue in get_tablefunc(). We may not end up using those values if we find an ordinality column. I'm pretty sure it's not worth breaking the mould for that case though, but just noting it anyway.

> > I also noticed that there's quite a lot of places that are testing
> > lnext(cell) for being NULL or not. What that really means is whether
> > this cell is last in the list or not, so maybe readability would be
> > improved by defining a macro along the lines of list_cell_is_last().
> > Any thoughts about that?
>
> 0002 below does this. I'm having a hard time deciding whether this
> part is a good idea or just code churn. It might be more readable
> (especially to newbies) but I can't evaluate that very objectively.
> I'm particularly unsure about whether we need two macros; though the
> way I initially tried it with just list_cell_is_last() seemed kind of
> double-negatively confusing in the places where the test needs to be
> not-last. Also, are these macro names too long, and if so what would
> be better?

I'm less decided on this. Having this now means you'll need to break the signature of the macro the same way as you'll need to break lnext(). It's perhaps easier to explain in the release notes about lnext() having changed so that extension authors can go fix their code (probably they'll know already from compile failures, but ...). On the other hand, if the list_cell_is_last() is new, then there will be no calls to that in extensions anyway. Maybe it's better to do it at the same time as the List reimplementation to ensure nobody needs to change anything twice?

> Also: if we accept either or both of these, should we back-patch the
> macro additions, so that these new macros will be available for use
> in back-patched code? I'm not sure that forfour/forfive have enough
> use-cases to worry about that; but the is-last macros would have a
> better case for that, I think.

I see no reason not to put forfour() and forfive() in the back branches.

--
David Rowley                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: POC: converting Lists into arrays
On 2019-Feb-27, Tom Lane wrote:
> I'm particularly unsure about whether we need two macros; though the
> way I initially tried it with just list_cell_is_last() seemed kind of
> double-negatively confusing in the places where the test needs to be
> not-last. Also, are these macro names too long, and if so what would
> be better?

I think "!list_cell_is_last()" is just as readable, if not more so, than the "is_not_last" locution:

    appendStringInfoChar(buf, '\'');
    if (!list_cell_is_last(l))
        appendStringInfoString(buf, ", ");

I'd go with a single macro.

+1 for backpatching the new macros, too. I suspect extension authors are going to need to provide compatibility versions anyway, to be compilable against older minors.

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: POC: converting Lists into arrays
Robert Haas writes:
> On Wed, Feb 27, 2019 at 3:27 PM Tom Lane wrote:
>> 0001 below does this. I found a couple of places that could use
>> forfive(), as well. I think this is a clear legibility and
>> error-proofing win, and we should just push it.

> It sounds like some of these places might need a bigger restructuring
> - i.e. to iterate over a list/vector of structs with 5 members instead
> of iterating over five lists in parallel.

Meh. Most of them are iterating over parsetree substructure, eg the components of a RowCompareExpr. So we could not change them without pretty extensive infrastructure changes including a catversion bump. Also, while the separated substructure is indeed a pain in the rear in some places, it's actually better for other uses. Two examples related to RowCompareExpr:

* match_rowcompare_to_indexcol can check whether all the left-hand or right-hand expressions are nonvolatile with one easy call to contain_volatile_functions on the respective list. To do the same with a single list of sub-structs, it'd need bespoke code for each case to run through the list and consider only the correct subexpression of each sub-struct.

* expand_indexqual_rowcompare can deal with commuted-clause cases just by swapping the list pointers at the start; it doesn't have to think about it over again for each pair of elements.

So I'm not that excited about refactoring the data representation for these. I'm content (for now) with getting these places in line with the coding convention we use elsewhere for similar cases.

			regards, tom lane
Re: POC: converting Lists into arrays
On Wed, Feb 27, 2019 at 3:27 PM Tom Lane wrote:
> 0001 below does this. I found a couple of places that could use
> forfive(), as well. I think this is a clear legibility and
> error-proofing win, and we should just push it.

It sounds like some of these places might need a bigger restructuring - i.e. to iterate over a list/vector of structs with 5 members instead of iterating over five lists in parallel.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
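The restructuring being suggested here, one array of five-field structs in place of five parallel lists, might look roughly like this (a hypothetical sketch; the field types are stand-ins, not the real planner types):

```c
#include <assert.h>

/* Hypothetical per-column struct gathering what RowCompareExpr currently
 * keeps in five parallel lists (largs, rargs, opnos, opfamilies,
 * inputcollids).  The int/unsigned fields stand in for Expr* and Oid. */
typedef struct RowCompareColumn
{
    int         larg;           /* stand-in for the left-hand Expr */
    int         rarg;           /* stand-in for the right-hand Expr */
    unsigned    opno;
    unsigned    opfamily;
    unsigned    inputcollid;
} RowCompareColumn;

/* One plain array loop replaces a five-way parallel iteration */
static unsigned
max_opfamily(const RowCompareColumn *cols, int ncols)
{
    unsigned    result = 0;

    for (int i = 0; i < ncols; i++)
        if (cols[i].opfamily > result)
            result = cols[i].opfamily;
    return result;
}
```

As Tom notes in his reply, the flip side is that per-field whole-list operations (like running contain_volatile_functions over just the left-hand expressions) become harder once the fields are interleaved in one struct.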
Re: POC: converting Lists into arrays
I wrote:
> I did find a number of places where getting rid of explicit lnext()
> calls led to just plain cleaner code. Most of these were places that
> could be using forboth() or forthree() and just weren't. There's
> also several places that are crying out for a forfour() macro, so
> I'm not sure why we've stubbornly refused to provide it. I'm a bit
> inclined to just fix those things in the name of code readability,
> independent of this patch.

0001 below does this. I found a couple of places that could use forfive(), as well. I think this is a clear legibility and error-proofing win, and we should just push it.

> I also noticed that there's quite a lot of places that are testing
> lnext(cell) for being NULL or not. What that really means is whether
> this cell is last in the list or not, so maybe readability would be
> improved by defining a macro along the lines of list_cell_is_last().
> Any thoughts about that?

0002 below does this. I'm having a hard time deciding whether this part is a good idea or just code churn. It might be more readable (especially to newbies) but I can't evaluate that very objectively. I'm particularly unsure about whether we need two macros; though the way I initially tried it with just list_cell_is_last() seemed kind of double-negatively confusing in the places where the test needs to be not-last. Also, are these macro names too long, and if so what would be better?

Also: if we accept either or both of these, should we back-patch the macro additions, so that these new macros will be available for use in back-patched code? I'm not sure that forfour/forfive have enough use-cases to worry about that; but the is-last macros would have a better case for that, I think.
			regards, tom lane

diff --git a/src/backend/access/common/tupdesc.c b/src/backend/access/common/tupdesc.c
index 47e80ae..832c3e9 100644
--- a/src/backend/access/common/tupdesc.c
+++ b/src/backend/access/common/tupdesc.c
@@ -902,23 +902,12 @@ BuildDescFromLists(List *names, List *types, List *typmods, List *collations)
 	desc = CreateTemplateTupleDesc(natts);
 
 	attnum = 0;
-
-	l2 = list_head(types);
-	l3 = list_head(typmods);
-	l4 = list_head(collations);
-	foreach(l1, names)
+	forfour(l1, names, l2, types, l3, typmods, l4, collations)
 	{
 		char	   *attname = strVal(lfirst(l1));
-		Oid			atttypid;
-		int32		atttypmod;
-		Oid			attcollation;
-
-		atttypid = lfirst_oid(l2);
-		l2 = lnext(l2);
-		atttypmod = lfirst_int(l3);
-		l3 = lnext(l3);
-		attcollation = lfirst_oid(l4);
-		l4 = lnext(l4);
+		Oid			atttypid = lfirst_oid(l2);
+		int32		atttypmod = lfirst_int(l3);
+		Oid			attcollation = lfirst_oid(l4);
 
 		attnum++;
diff --git a/src/backend/executor/execExpr.c b/src/backend/executor/execExpr.c
index db3777d..7cbf9d3 100644
--- a/src/backend/executor/execExpr.c
+++ b/src/backend/executor/execExpr.c
@@ -1683,7 +1683,6 @@ ExecInitExprRec(Expr *node, ExprState *state,
 						   *l_opfamily,
 						   *l_inputcollid;
 				ListCell   *lc;
-				int			off;
 
 				/*
 				 * Iterate over each field, prepare comparisons.  To handle
@@ -1695,20 +1694,11 @@ ExecInitExprRec(Expr *node, ExprState *state,
 				Assert(list_length(rcexpr->opfamilies) == nopers);
 				Assert(list_length(rcexpr->inputcollids) == nopers);
 
-				off = 0;
-				for (off = 0,
-					 l_left_expr = list_head(rcexpr->largs),
-					 l_right_expr = list_head(rcexpr->rargs),
-					 l_opno = list_head(rcexpr->opnos),
-					 l_opfamily = list_head(rcexpr->opfamilies),
-					 l_inputcollid = list_head(rcexpr->inputcollids);
-					 off < nopers;
-					 off++,
-					 l_left_expr = lnext(l_left_expr),
-					 l_right_expr = lnext(l_right_expr),
-					 l_opno = lnext(l_opno),
-					 l_opfamily = lnext(l_opfamily),
-					 l_inputcollid = lnext(l_inputcollid))
+				forfive(l_left_expr, rcexpr->largs,
+						l_right_expr, rcexpr->rargs,
+						l_opno, rcexpr->opnos,
+						l_opfamily, rcexpr->opfamilies,
+						l_inputcollid, rcexpr->inputcollids)
 				{
 					Expr	   *left_expr = (Expr *) lfirst(l_left_expr);
 					Expr	   *right_expr = (Expr *) lfirst(l_right_expr);
diff --git a/src/backend/executor/nodeIndexscan.c b/src/backend/executor/nodeIndexscan.c
index 324356e..8b29437 100644
--- a/src/backend/executor/nodeIndexscan.c
+++ b/src/backend/executor/nodeIndexscan.c
@@ -1332,12 +1332,12 @@ ExecIndexBuildScanKeys(PlanState *planstate, Relation index,
 		{
 			/* (indexkey, indexkey, ...) op (expression, expression, ...) */
 			RowCompareExpr *rc = (RowCompareExpr *) clause;
-			ListCell   *largs_cell = list_head(rc->largs);
-			ListCell   *rargs_cell = list_head(rc->rargs);
-			ListCell   *opnos_cell = list_head(rc->opnos);
-			ListCell   *collids_cell = list_head(rc->inputcollids);
 			ScanKey		first_sub_key;
 			int			n_sub_key;
+			ListCell   *largs_cell;
+			ListCell   *rargs_cell;
+			ListCell   *opnos_cell;
+			ListCell   *collids_cell;
 
 			Assert(!isorderby);
 
@@ -1346,19 +1346,22 @@
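For readers unfamiliar with the lockstep-iteration macros the patch above relies on, a forfour() in the spirit of the existing forboth()/forthree() can be sketched over a toy cons-cell list (hypothetical code; the real macro in pg_list.h takes List pointers and starts from list_head()):

```c
#include <assert.h>
#include <stddef.h>

/* Toy cons-cell list, standing in for the pre-patch List implementation */
typedef struct ListCell
{
    int              value;
    struct ListCell *next;
} ListCell;

#define lfirst(c)   ((c)->value)
#define lnext(c)    ((c)->next)

/*
 * Walk four lists in lockstep, stopping when any of them runs out, the
 * same pattern forboth() and forthree() already use.  Here each "list"
 * argument is simply its head cell.
 */
#define forfour(c1, l1, c2, l2, c3, l3, c4, l4) \
    for (c1 = (l1), c2 = (l2), c3 = (l3), c4 = (l4); \
         c1 != NULL && c2 != NULL && c3 != NULL && c4 != NULL; \
         c1 = lnext(c1), c2 = lnext(c2), c3 = lnext(c3), c4 = lnext(c4))
```

The payoff visible in the diff is that the manual `lN = lnext(lN);` bookkeeping inside the loop body disappears, which is exactly the kind of code the main patch would otherwise have to touch.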
Re: POC: converting Lists into arrays
I wrote:
> I had an idea that perhaps is worth considering --- upthread I rejected
> the idea of deleting lnext() entirely, but what if we did so? We could
> redefine foreach() to not use it:
>
> #define foreach(cell, l) \
>     for (int cell##__index = 0; \
>          (cell = list_nth_cell(l, cell##__index)) != NULL; \
>          cell##__index++)
>
> I think this would go a very long way towards eliminating the hazards
> associated with iterating around a list-modification operation.

I spent some time trying to modify my patch to work that way, but abandoned the idea before completing it, because it became pretty clear that it is a bad idea. There are at least two issues:

1. In many places, the patch as I had it only required adding an additional parameter to lnext() calls. Removing lnext() altogether is far more invasive, requiring restructuring loop logic in many places where we otherwise wouldn't need to. Since most of the point of this proposal is to not have a larger patch footprint than we have to, that seemed like a step in the wrong direction.

2. While making foreach() work this way might possibly help in avoiding writing bad code in the first place, a loop of this form is really just about as vulnerable to being broken by list insertions/deletions as what I had before. If you don't make suitable adjustments to the integer index after an insertion/deletion then you're liable to skip over, or double-visit, some list entries; and there's nothing here to help you notice that you need to do that. Worse, doing things like this utterly destroys our chance of detecting such errors, because there's no state being kept that's in any way checkable.

I was driven to realize point 2 by noticing, while trying to get rid of some lnext calls, that I'd mostly failed in the v1 patch to fix loops that contain embedded list_delete() calls other than list_delete_cell(). This is because the debug support I'd added failed to force relocation of lists after a deletion (unlike the insertion cases). It won't take much to add such relocation, and I'll go do that; but with an integer-index-based loop implementation we've got no chance of having debug support that could catch failure to update the loop index.

So I think that idea is a failure, and going forward with the v1 approach has better chances.

I did find a number of places where getting rid of explicit lnext() calls led to just plain cleaner code. Most of these were places that could be using forboth() or forthree() and just weren't. There's also several places that are crying out for a forfour() macro, so I'm not sure why we've stubbornly refused to provide it. I'm a bit inclined to just fix those things in the name of code readability, independent of this patch.

I also noticed that there's quite a lot of places that are testing lnext(cell) for being NULL or not. What that really means is whether this cell is last in the list or not, so maybe readability would be improved by defining a macro along the lines of list_cell_is_last(). Any thoughts about that?

			regards, tom lane
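The skip-an-entry hazard described in point 2 can be reproduced in a few lines of standalone C (hypothetical code; delete_nth here is an invented stand-in for list_delete_nth_cell over an array-backed list):

```c
#include <assert.h>
#include <string.h>

/* Remove element n from an int array of length len; returns the new length.
 * The elements after n slide down one slot, as in an array-backed List. */
static int
delete_nth(int *a, int len, int n)
{
    memmove(&a[n], &a[n + 1], (len - n - 1) * sizeof(int));
    return len - 1;
}

/* Delete every even value.  The "broken" loop forgets to adjust the index
 * after a deletion, so the element that slides into slot i is never
 * visited, and nothing in the loop machinery can detect that. */
static int
delete_evens_broken(int *a, int len)
{
    for (int i = 0; i < len; i++)
        if (a[i] % 2 == 0)
            len = delete_nth(a, len, i);    /* next element now at i: skipped */
    return len;
}

/* The fixed loop backs the index up so the slid-down element is revisited */
static int
delete_evens_fixed(int *a, int len)
{
    for (int i = 0; i < len; i++)
        if (a[i] % 2 == 0)
            len = delete_nth(a, len, i--);  /* back up: nothing is skipped */
    return len;
}
```

On the input {2, 4, 6, 1}, the broken loop leaves the 4 behind because it slid into the slot the loop had already passed, which is exactly the class of silent error Tom argues an integer-index foreach() could never help detect.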
Re: POC: converting Lists into arrays
David Rowley writes:
> Using the attached patch (as text file so as not to upset the CFbot),
> which basically just measures and logs the time taken to run
> pg_plan_query. ...
> Surprisingly it took 1.13% longer. I did these tests on an AWS
> md5.large instance.

Interesting. Seems to suggest that maybe the cases I discounted as being infrequent aren't so infrequent? Another possibility is that the new coding adds more cycles to foreach() loops than I'd hoped for.

Anyway, it's just a POC; the main point at this stage is to be able to make such comparisons at all. If it turns out that we *can't* make this into a win, then all that bellyaching about how inefficient Lists are was misinformed ...

			regards, tom lane