> On Sat, Nov 18, 2023 at 06:28:53PM +0100, Pavel Stehule wrote:
> so 18. 11. 2023 v 15:54 odesÃlatel Dmitry Dolgov <[email protected]>
> napsal:
> > As a side note, I'm intended to go one more time through the first few
> > patches introducing the basic functionality, and then mark it as ready
> > in CF. I can't break the patch in testing since quite long time, and for
> > most parts the changes make sense to me.
>
> I marked pg_session_variables function as PARALLEL RESTRICTED, and did
> rebase
So, after one week of uninterrupted evening reviews I've made it through
the first four patches :)
It's a decent job -- more than once, looking at the code, I thought I
could construct a case when it's going to blow up, but everything was
working just fine. Yet, I think the patch still has to be reshaped a bit
before moving forward. I've got a couple proposals of different nature:
high level changes (you probably won't like some of them, but I'm sure
they're going to be useful), technical code-level improvements/comments,
and few language changes. With those changes in mind I would be
satisfied with the patch, and hopefully they would also make it easier
for a potential committer to pick it up.
# High level proposals
* I would suggest reducing the scope of the patch as much as possible,
and not just by trimming on the edges, but rather following Phileas
Fogg's example with the steamboat Henrietta -- get rid of all
non-essential parts. This will make this rather large patch more
approachable for others.
For that one can concentrate only on the first two patches plus the
fourth one (memory cleanup after dropping variables), leaving DISCARD,
ON TRANSACTION END, DEFAULT, IMMUTABLE for the follow-up in the
future.
Another thing in this context would be to evaluate plpgsql support for
this feature. You know the use case better than me, how important it
is? Is it an intrinsic part of the feature, or session variables could
be still valuable enough even without plpgsql? From what I see
postponing plgpsql will make everything about ~800 lines lighter (most
likely more), and also allow to ignore couple of concerns about the
implementation (about this later).
* The new GUC session_variables_ambiguity_warning is definitely going to
cause many objections, it's another knob to manage very subtle
behaviour detail very few people will ever notice. I see the point
behind warning about ambiguity, so probably it makes sense to bite the
bullet and decide one way or another. The proposal is to warn always
in potentially ambiguous situations, and if concerns are high about
logging too much, maybe do the warning on lower logging levels.
# Code-level observations
* It feels a bit awkward to have varid assignment logic in a separate
function, what about adding an argument with varid to
CreateVariableDestReceiver? SetVariableDestReceiverVarid still could
be used for CreateDestReceiver.
/*
* Initially create a DestReceiver object.
*/
DestReceiver *
CreateVariableDestReceiver(void)
/*
* Set parameters for a VariableDestReceiver.
* Should be called right after creating the DestReceiver.
*/
void
SetVariableDestReceiverVarid(DestReceiver *self, Oid varid)
* It's worth it to add a commentary here explaining why it's fine to use
InvalidOid here:
if (pstmt->commandType != CMD_UTILITY)
- ExplainOnePlan(pstmt, into, es, query_string, paramLI, queryEnv,
+ ExplainOnePlan(pstmt, into, InvalidOid, es, query_string, paramLI,
queryEnv,
&planduration, (es->buffers ? &bufusage : NULL));
My understanding is that since LetStmt is CMD_UTILITY, this branch
will never be visited for a session variable.
* IIUC this one is introduced to exclude session variables from the normal
path with EXPR_KIND_UPDATE_TARGET:
+ EXPR_KIND_ASSIGN_VARIABLE, /* PL/pgSQL assignment target - disallow
+ * session
variables */
But the name doesn't sound right, maybe longer
EXPR_KIND_UPDATE_TARGET_NO_VARS is better?
* I'm curious about this one, which exactly part does this change cover?
@@ -4888,21 +4914,43 @@ substitute_actual_parameters_mutator(Node *node,
- if (param->paramkind != PARAM_EXTERN)
+ if (param->paramkind != PARAM_EXTERN &&
+ param->paramkind != PARAM_VARIABLE)
elog(ERROR, "unexpected paramkind: %d", (int)
param->paramkind);
I've commented it out, but no tests were affected.
* Does it mean there could be theoretically two LET statements at the
same time with different command type, one CMD_UTILITY, one
CMD_SELECT? Can it cause any issues?
+ /*
+ * Inside PL/pgSQL we don't want to execute LET statement as utility
+ * command, because it disallow to execute expression as simple
+ * expression. So for PL/pgSQL we have extra path, and we return SELECT.
+ * Then it can be executed by exec_eval_expr. Result is dirrectly
assigned
+ * to target session variable inside PL/pgSQL LET statement handler.
This
+ * is extra code, extra path, but possibility to get faster execution is
+ * too attractive.
+ */
+ if (stmt->plpgsql_mode)
+ return query;
+
* This probably requires more explanation, is warning the only reason
for this change?
+ *
+ * The session variables should not be used as target of PL/pgSQL assign
+ * statement. So we should to use special parser expr kind, that
disallow
+ * usage of session variables. This block unwanted (in this context)
+ * possible warning so target PL/pgSQL's variable shadows some session
+ * variable.
*/
target = transformExpr(pstate, (Node *) cref,
- EXPR_KIND_UPDATE_TARGET);
+ EXPR_KIND_ASSIGN_VARIABLE);
* It would be great to have more commentaries here:
typedef struct
{
DestReceiver pub;
Oid varid;
Oid typid;
int32 typmod;
int typlen;
int slot_offset;
int rows;
} SVariableState;
For example, why does it make sense to have a field rows, where we
interested to only know the fact that there is exactly one column?
* Why there is SetSessionVariableWithSecurityCheck, but no
GetSessionVariableWithSecurityCheck? Instead, object_aclcheck is done
in standard_ExecutorStart, which looks a bit out of place.
* pg_session_variables -- you mention it exists only for testing. What
about moving it out into a separate patch for the sake of slimming
down? It looks like it's used only in tests for "memory cleanup"
patch, maybe they could be restructured to not require this function.
* Probably it's time to drop unnecessary historical notes, like this:
* Note: originally we enhanced a list xact_recheck_varids here. Unfortunately
* it was not safe and a little bit too complex, because the sinval callback
* function can be called when we iterate over xact_recheck_varids list.
* Another issue was the possibility of being out of memory when we enhanced
* the list. So now we just switch flag in related entry sessionvars hash table.
* We need to iterate over hash table on every sinval message, so extra two
* iteration over this hash table is not significant overhead (and we skip
* entries that don't require recheck). Now we do not have any memory allocation
* in the sinval handler (This note can be removed before commit).
* The second patch "Storage for session variables and SQL interface",
mentions DISCARD command:
/*
* There is no guarantee of sessionvars being initialized, even when
* receiving an invalidation callback, as DISCARD [ ALL | VARIABLES ]
* destroys the hash table entirely.
*/
This command is implemented in another patch later one, so this
comment probably belong there.
* This comment mentions a "direct access, without buffering":
/*
* Direct access to session variable (without buffering). Because
* returned value can be used (without an assignement) after the
* referenced session variables is updated, we have to use an copy
* of stored value every time.
*/
*op->resvalue = GetSessionVariableWithTypeCheck(op->d.vparam.varid,
op->resnull,
op->d.vparam.vartype);
But GetSessionVariableWithTypeCheck goes through get_session_variable
and searches in the hash table. What "buffering" means in this
context?
* GetSessionVariableWithTypeCheck(Oid varid, bool *isNull, Oid expected_typid)
Should the "WithTypeCheck" part be an argument of the
GetSessionVariable? To reduce the code duplication a bit.
* Just out of curiosity, why TopTransactionContext?
/*
* Store domain_check extra in TopTransactionContext. When we are in
* other transaction, the domain_check_extra cache is not valid
* anymore.
*/
if (svar->domain_check_extra_lxid != MyProc->lxid)
svar->domain_check_extra = NULL;
domain_check(svar->value, svar->isnull,
svar->typid, &svar->domain_check_extra,
TopTransactionContext);
* In SVariableData it would be great to have more comments around
freeval, domain_check_extra, domain_check_extra_lxid.
* Nitpicking, but the term "shadowing" for ambiguity between a session
variable and a table column might be confusing, one can imagine there
is a connection between those two objects and one actively follows
("shadows") the other one.
* The second patch "Storage for session variables and SQL interface"
mentions in the documentation default and temporary variables:
<para>
The value of a session variable is local to the current session. Retrieving
a variable's value returns either a <literal>NULL</literal> or a default
value, unless its value has been set to something else in the current
session using the <command>LET</command> command. The content of a variable
is not transactional. This is the same as regular variables in PL languages.
The session variables can be persistent or can be temporary. In both cases,
the content of session variables is temporary and not shared (like an
content of temporary tables).
</para>
They're implemented in the following patches, so it belongs there.
* Nitpicking, maybe merge those two conditions together for readability?
if (!needs_validation)
return;
/*
* Reset, this flag here, before we start the validation. It can be set to
* on by incomming sinval message.
*/
needs_validation = false;
if (!sessionvars)
return;
* This one is not very clear, what is the difference between "somewhere
inside a transaction" and "at the end of a transaction"?
/*
* This routine can be called somewhere inside transaction or at an
transaction
* end. When atEOX argument is false, then we are inside transaction,
and we
* don't want to throw entries related to session variables dropped in
current
* transaction.
*/
# Language topic
Since this patch introduces a large body of documentation and
commentaries, I think it would benefit from a native speaker review.
I've stumbled upon few examples (attached with proposed wording, without
a diff extension to not confuse the CF bot), but otherwise if anyone
follows this thread, texts review is appreciated.
diff --git a/src/backend/catalog/namespace.c b/src/backend/catalog/namespace.c
index d4d1d104198..adc8b21a595 100644
--- a/src/backend/catalog/namespace.c
+++ b/src/backend/catalog/namespace.c
@@ -3398,12 +3398,11 @@ NamesFromList(List *names)
* variable name and attribute name, and we count valid combinations.
*
* Returns oid of identified variable. When last field of names list is
- * identified as an attribute, then output attrname argument is set to
- * an string of this field.
+ * identified as an attribute, then the argument attrname contains the name of
+ * this field.
*
- * When there is not any valid combination, then we are sure, so the
- * list of names cannot to identify any session variable. In this case
- * we return InvalidOid.
+ * When there is is no valid combinations, we can be sure the specified list of
+ * names cannot identify a session variable. In this case we return InvalidOid.
*
* We can find more valid combination than one.
* Example: users can have session variable x in schema y, and
diff --git a/src/backend/commands/session_variable.c
b/src/backend/commands/session_variable.c
index 88957bd6624..06c6d49c491 100644
--- a/src/backend/commands/session_variable.c
+++ b/src/backend/commands/session_variable.c
@@ -51,10 +51,10 @@ typedef struct SVariableXActDropItem
* Values of session variables are stored in the backend local memory
* inside sessionvars hash table in binary format inside a dedicated memory
* context SVariableMemoryContext. The hash key is oid
- * of related entry in pg_variable table. But unambiguity of oid is
- * not guaranteed (in long time). There can be long time not active
- * session with stored value identified by one oid, and in other session
- * related variable can be dropped, assigned oid can be released, and
+ * of related entry in pg_variable table. But long term unambiguity of oid is
+ * not guaranteed. As an example, for a value with one oid a session can be
+ * inactive for long time, while in the meantime the related session varibale
+ * can be dropped in another session, assigned oid can be released, and
* theoreticaly this oid can be assigned to different session variable.
* At the end, the reading of value stored in old session should to fail,
* because related entry in pg_variable will not be consistent with
@@ -356,11 +356,11 @@ is_session_variable_valid(SVariable svar)
}
/*
- * It does check all possibly invalid entries against the system catalog.
+ * It checks all possibly invalid entries against the system catalog.
* During this validation, the system cache can be invalidated, and the
* some sinval message can be accepted. This routine doesn't ensure
* all living entries of sessionvars will have is_valid flag, but it ensures
- * so all entries are checked once.
+ * that all entries are checked once.
*
* This routine can be called somewhere inside transaction or at an transaction
* end. When atEOX argument is false, then we are inside transaction, and we
@@ -383,7 +383,7 @@ remove_invalid_session_variables(bool atEOX)
return;
/*
- * Reset, this flag here, before we start the validation. It can be set
to
+ * Reset this flag here, before we start the validation. It can be set
to
* on by incomming sinval message.
*/
needs_validation = false;
@@ -732,8 +732,8 @@ set_session_variable(SVariable svar, Datum value, bool
isnull)
* trustable (the invalidation message was not accepted for this
variable).
* When the variable is possibly invalid, force setup.
*
- * Do not it against passed svar, it should be unchanged, when an
assignment
- * is not successful (the datumCopy can fail).
+ * Do not do it against passed svar, it should be unchanged, when an
+ * assignment is not successful (the datumCopy can fail).
*/
if (!svar->is_valid)
{
@@ -1041,7 +1041,7 @@ GetSessionVariable(Oid varid, bool *isNull, Oid *typid)
}
/*
- * Returns a copy of ths value of the session variable specified by its oid
+ * Returns a copy of the value of the session variable specified by its oid
* with a check of the expected type. Like previous GetSessionVariable, the
* caller is responsible for doing permission checks.
*/
diff --git a/src/backend/optimizer/plan/setrefs.c
b/src/backend/optimizer/plan/setrefs.c
index 212ddcb5da7..9860cf3f4d2 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -2099,7 +2099,7 @@ fix_expr_common(PlannerInfo *root, Node *node)
* Just for paranoia's sake, we make a copy of the node in either case.
*
* If it's a PARAM_VARIABLE, then we collect used session variables in
- * list root->glob->sessionVariable. We should to assign Param paramvarid
+ * list root->glob->sessionVariable. We should assign Param paramvarid
* too, and it is position of related session variable in mentioned list.
*/
static Node *
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 28523715765..c3edb30b79e 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -938,15 +938,15 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref)
/*
* Session variables are shadowed by columns, routine's
variables or
- * routine's arguments ever. We don't want to use session
variable
- * when it is not exactly shadowed, but RTE is valid like:
+ * routine's arguments. We don't want to use session variable
+ * when it is not exactly shadowed, but RTE like this is valid:
*
* CREATE TYPE T AS (c int); CREATE VARIABLE foo AS T; CREATE
TABLE
* foo(a int, b int);
*
* SELECT foo.a, foo.b, foo.c FROM foo;
*
- * This case can be messy and then we disallow it. When we
know, so
+ * This case can be messy and then we disallow it. When we know
that a
* possible variable will be shadowed, we try to identify
variable
* only when session_variables_ambiguity_warning is requested.
*/