Hi ne 26. 11. 2023 v 18:56 odesÃlatel Dmitry Dolgov <9erthali...@gmail.com> napsal:
> > On Sat, Nov 18, 2023 at 06:28:53PM +0100, Pavel Stehule wrote: > > so 18. 11. 2023 v 15:54 odesÃlatel Dmitry Dolgov <9erthali...@gmail.com> > > napsal: > > > As a side note, I'm intended to go one more time through the first few > > > patches introducing the basic functionality, and then mark it as ready > > > in CF. I can't break the patch in testing since quite long time, and > for > > > most parts the changes make sense to me. > > > > I marked pg_session_variables function as PARALLEL RESTRICTED, and did > > rebase > > So, after one week of uninterrupted evening reviews I've made it through > the first four patches :) > > It's a decent job -- more than once, looking at the code, I thought I > could construct a case when it's going to blow up, but everything was > working just fine. Yet, I think the patch still has to be reshaped a bit > before moving forward. I've got a couple proposals of different nature: > high level changes (you probably won't like some of them, but I'm sure > they're going to be useful), technical code-level improvements/comments, > and few language changes. With those changes in mind I would be > satisfied with the patch, and hopefully they would also make it easier > for a potential committer to pick it up. > > # High level proposals > > * I would suggest reducing the scope of the patch as much as possible, > and not just by trimming on the edges, but rather following Phileas > Fogg's example with the steamboat Henrietta -- get rid of all > non-essential parts. This will make this rather large patch more > approachable for others. > > For that one can concentrate only on the first two patches plus the > fourth one (memory cleanup after dropping variables), leaving DISCARD, > ON TRANSACTION END, DEFAULT, IMMUTABLE for the follow-up in the > future. > > Another thing in this context would be to evaluate plpgsql support for > this feature. You know the use case better than me, how important it > is? Is it an intrinsic part of the feature, or session variables could > be still valuable enough even without plpgsql? From what I see > postponing plgpsql will make everything about ~800 lines lighter (most > likely more), and also allow to ignore couple of concerns about the > implementation (about this later). > > * The new GUC session_variables_ambiguity_warning is definitely going to > cause many objections, it's another knob to manage very subtle > behaviour detail very few people will ever notice. I see the point > behind warning about ambiguity, so probably it makes sense to bite the > bullet and decide one way or another. The proposal is to warn always > in potentially ambiguous situations, and if concerns are high about > logging too much, maybe do the warning on lower logging levels. > > # Code-level observations > > * It feels a bit awkward to have varid assignment logic in a separate > function, what about adding an argument with varid to > CreateVariableDestReceiver? SetVariableDestReceiverVarid still could > be used for CreateDestReceiver. > > /* > * Initially create a DestReceiver object. > */ > DestReceiver * > CreateVariableDestReceiver(void) > > /* > * Set parameters for a VariableDestReceiver. > * Should be called right after creating the DestReceiver. > */ > void > SetVariableDestReceiverVarid(DestReceiver *self, Oid varid) > > * It's worth it to add a commentary here explaining why it's fine to use > InvalidOid here: > > if (pstmt->commandType != CMD_UTILITY) > - ExplainOnePlan(pstmt, into, es, query_string, paramLI, > queryEnv, > + ExplainOnePlan(pstmt, into, InvalidOid, es, query_string, > paramLI, queryEnv, > &planduration, (es->buffers ? &bufusage : > NULL)); > > My understanding is that since LetStmt is CMD_UTILITY, this branch > will never be visited for a session variable. > > * IIUC this one is introduced to exclude session variables from the normal > path with EXPR_KIND_UPDATE_TARGET: > > + EXPR_KIND_ASSIGN_VARIABLE, /* PL/pgSQL assignment target - > disallow > + * session > variables */ > > But the name doesn't sound right, maybe longer > EXPR_KIND_UPDATE_TARGET_NO_VARS is better? > > * I'm curious about this one, which exactly part does this change cover? > > @@ -4888,21 +4914,43 @@ substitute_actual_parameters_mutator(Node *node, > - if (param->paramkind != PARAM_EXTERN) > + if (param->paramkind != PARAM_EXTERN && > + param->paramkind != PARAM_VARIABLE) > elog(ERROR, "unexpected paramkind: %d", (int) > param->paramkind); > > I've commented it out, but no tests were affected. > > * Does it mean there could be theoretically two LET statements at the > same time with different command type, one CMD_UTILITY, one > CMD_SELECT? Can it cause any issues? > > + /* > + * Inside PL/pgSQL we don't want to execute LET statement as > utility > + * command, because it disallow to execute expression as simple > + * expression. So for PL/pgSQL we have extra path, and we return > SELECT. > + * Then it can be executed by exec_eval_expr. Result is dirrectly > assigned > + * to target session variable inside PL/pgSQL LET statement > handler. This > + * is extra code, extra path, but possibility to get faster > execution is > + * too attractive. > + */ > + if (stmt->plpgsql_mode) > + return query; > + > > * This probably requires more explanation, is warning the only reason > for this change? > > + * > + * The session variables should not be used as target of PL/pgSQL > assign > + * statement. So we should to use special parser expr kind, that > disallow > + * usage of session variables. This block unwanted (in this > context) > + * possible warning so target PL/pgSQL's variable shadows some > session > + * variable. > */ > target = transformExpr(pstate, (Node *) cref, > - > EXPR_KIND_UPDATE_TARGET); > + > EXPR_KIND_ASSIGN_VARIABLE); > > * It would be great to have more commentaries here: > > typedef struct > { > DestReceiver pub; > Oid varid; > Oid typid; > int32 typmod; > int typlen; > int slot_offset; > int rows; > } SVariableState; > > For example, why does it make sense to have a field rows, where we > interested to only know the fact that there is exactly one column? > > * Why there is SetSessionVariableWithSecurityCheck, but no > GetSessionVariableWithSecurityCheck? Instead, object_aclcheck is done > in standard_ExecutorStart, which looks a bit out of place. > > * pg_session_variables -- you mention it exists only for testing. What > about moving it out into a separate patch for the sake of slimming > down? It looks like it's used only in tests for "memory cleanup" > patch, maybe they could be restructured to not require this function. > > * Probably it's time to drop unnecessary historical notes, like this: > > * Note: originally we enhanced a list xact_recheck_varids here. > Unfortunately > * it was not safe and a little bit too complex, because the sinval > callback > * function can be called when we iterate over xact_recheck_varids list. > * Another issue was the possibility of being out of memory when we > enhanced > * the list. So now we just switch flag in related entry sessionvars hash > table. > * We need to iterate over hash table on every sinval message, so extra two > * iteration over this hash table is not significant overhead (and we skip > * entries that don't require recheck). Now we do not have any memory > allocation > * in the sinval handler (This note can be removed before commit). > > * The second patch "Storage for session variables and SQL interface", > mentions DISCARD command: > > /* > * There is no guarantee of sessionvars being initialized, even when > * receiving an invalidation callback, as DISCARD [ ALL | VARIABLES ] > * destroys the hash table entirely. > */ > > This command is implemented in another patch later one, so this > comment probably belong there. > > * This comment mentions a "direct access, without buffering": > > /* > * Direct access to session variable (without buffering). Because > * returned value can be used (without an assignement) after the > * referenced session variables is updated, we have to use an copy > * of stored value every time. > */ > *op->resvalue = GetSessionVariableWithTypeCheck(op->d.vparam.varid, > > op->resnull, > > op->d.vparam.vartype); > > But GetSessionVariableWithTypeCheck goes through get_session_variable > and searches in the hash table. What "buffering" means in this > context? > > * GetSessionVariableWithTypeCheck(Oid varid, bool *isNull, Oid > expected_typid) > > Should the "WithTypeCheck" part be an argument of the > GetSessionVariable? To reduce the code duplication a bit. > > * Just out of curiosity, why TopTransactionContext? > > /* > * Store domain_check extra in TopTransactionContext. When we are > in > * other transaction, the domain_check_extra cache is not valid > * anymore. > */ > if (svar->domain_check_extra_lxid != MyProc->lxid) > svar->domain_check_extra = NULL; > > domain_check(svar->value, svar->isnull, > svar->typid, &svar->domain_check_extra, > TopTransactionContext); > > * In SVariableData it would be great to have more comments around > freeval, domain_check_extra, domain_check_extra_lxid. > > * Nitpicking, but the term "shadowing" for ambiguity between a session > variable and a table column might be confusing, one can imagine there > is a connection between those two objects and one actively follows > ("shadows") the other one. > > * The second patch "Storage for session variables and SQL interface" > mentions in the documentation default and temporary variables: > > <para> > The value of a session variable is local to the current session. > Retrieving > a variable's value returns either a <literal>NULL</literal> or a > default > value, unless its value has been set to something else in the current > session using the <command>LET</command> command. The content of a > variable > is not transactional. This is the same as regular variables in PL > languages. > The session variables can be persistent or can be temporary. In both > cases, > the content of session variables is temporary and not shared (like an > content of temporary tables). > </para> > > They're implemented in the following patches, so it belongs there. > > * Nitpicking, maybe merge those two conditions together for readability? > > if (!needs_validation) > return; > > /* > * Reset, this flag here, before we start the validation. It can be > set to > * on by incomming sinval message. > */ > needs_validation = false; > > if (!sessionvars) > return; > > * This one is not very clear, what is the difference between "somewhere > inside a transaction" and "at the end of a transaction"? > > /* > * This routine can be called somewhere inside transaction or at an > transaction > * end. When atEOX argument is false, then we are inside > transaction, and we > * don't want to throw entries related to session variables dropped > in current > * transaction. > */ > > # Language topic > > Since this patch introduces a large body of documentation and > commentaries, I think it would benefit from a native speaker review. > I've stumbled upon few examples (attached with proposed wording, without > a diff extension to not confuse the CF bot), but otherwise if anyone > follows this thread, texts review is appreciated. > Thank you for your review. Next two weeks I'll not too much time to work on this patch - I have to work on some commercial work, and the week is Prague PgConf, so my reply will be slow. But after these events I'll concentrate on this patch. Regards Pavel