Re: doc: Make selectivity example match wording

2022-07-17 Thread Dian M Fay
On Sat Jul 16, 2022 at 11:23 PM EDT, David G. Johnston wrote:
> Thanks for the review.  I generally like everything you said but it made me
> realize that I still didn't really understand the intent behind the
> formula.  I spent way too much time working that out for myself, then
> turned what I found useful into this v2 patch.
>
> It may need some semantic markup still but figured I'd see if the idea
> makes sense.
>
> I basically rewrote, in a bit different style, the same material into the
> code comments, then proceeded to rework the proof that was already present
> there.
>
> I did do this in somewhat of a vacuum.  I'm not inclined to learn this all
> start-to-end though.  If the abrupt style change is unwanted so be it.  I'm
> not really sure how much benefit the proof really provides.  The comments
> in the docs are probably sufficient for the code as well - just define why
> the three pieces of the formula exist and are packaged into a single
> multiplier called selectivity as an API choice.  I suspect once someone
> gets to that comment it is fair to assume some prior knowledge.
> Admittedly, I didn't really come into this that way...

Fair enough, I only know what I can glean from the comments in
eqjoinsel_inner and friends myself. I do think even this smaller change
is valuable because the current example talks about using an algorithm
based on the number of distinct values immediately after showing
n_distinct == -1, so making it clear that this case uses num_rows
instead is helpful.

"This value does get scaled in the non-unique case" again could be more
specific ("since here all values are unique, otherwise the calculation
uses num_distinct" perhaps?). But past that quibble I'm good.




Re: doc: Make selectivity example match wording

2022-07-02 Thread Dian M Fay
On Thu Jun 9, 2022 at 11:57 AM EDT, David G. Johnston wrote:
> Reposting this to its own thread.
>
> https://www.postgresql.org/message-id/flat/CAKFQuwby1aMsJDMeibaBaohgoaZhivAo4WcqHC1%3D9-GDZ3TSng%40mail.gmail.com
>
> doc: make unique non-null join selectivity example match the prose
>
> The description of the computation for the unique, non-null,
> join selectivity describes a division by the maximum of two values,
> while the example shows a multiplication by their reciprocal.  While
> equivalent the max phrasing is easier to understand; which seems
> more important here than precisely adhering to the formula used
> in the code (for which either variant is still an approximation).
>
> While both num_distinct and num_rows are equal for a unique column
> both the concept and formula use row count (10,000) and the
> field num_distinct has already been set to mean the specific value
> present in the pg_stats table (i.e, -1), so use num_rows here.

Pointing out that n_distinct = -1 is helpful but changing "because" to
"and" suggests that the missing MCV info is coincidental or a side
effect. Is there any case in which the stronger "because" wouldn't be
appropriate?

The second parenthetical (num_rows, not shown, but "tenk") took me a
minute to get since the row counts are only apparent on looking somewhat
closely at the other examples in the chapter. num_rows also isn't a
column in pg_stats which the "not shown" could be taken to imply; it's
sourced from somewhere else and only given as num_rows in this example.
How's '(as num_rowsN, 10,000 for both "tenk" example tables)'?

By "this value does get scaled in the non-unique case" do you mean it
relies on n_distinct as in the uncorrected algorithm listing? If so I
think it'd help to specify that.

You didn't take this line on but "This is, subtract the null
fraction..." omits the step of multiplying the complements of the null
fractions together before dividing.

Should n_distinct and num_rows be d in the text?




Re: [PATCH] postgres_fdw: suppress explicit casts in text:text comparisons (was: column option to override foreign types)

2021-11-11 Thread Dian M Fay
On Thu Nov 11, 2021 at 3:36 PM EST, Tom Lane wrote:
> I thought about this some more and realized exactly why I wanted to
> restrict the change to cases where the other side is a plain foreign
> Var: that way, if anything surprising happens, we can blame it
> directly on the user having declared a local column with a different
> type from the remote column.
>
> That being the case, I took a closer look at deparseVar and realized
> that we can't simply check "IsA(node, Var)": some Vars in the
> expression can belong to local tables. We need to verify that the Var
> is one that will print as a remote column reference.

Eminently reasonable all around! `git apply` insisted that the v8 patch
didn't (apply, that is), but `patch -p1` liked it fine. I've put it
through a few paces and it seems good; what needs to happen next?




Re: [PATCH] postgres_fdw: suppress explicit casts in text:text comparisons (was: column option to override foreign types)

2021-11-10 Thread Dian M Fay
On Mon Nov 8, 2021 at 4:50 PM EST, Tom Lane wrote:
> Um. I doubt that that's any safer than the v5 patch. As an example,
> casting between int4 and oid is just a RelabelType, but the comparison
> semantics change completely (signed vs. unsigned); so there's not a
> good reason to think this is constraining things more than v5 did.
>
> It might be better if you'd further restricted the structure to be only
> COERCE_IMPLICIT_CAST RelabelTypes, since we don't normally make casts
> implicit if they significantly change semantics. Also, this'd ensure
> that the operand printed for the remote server is just a bare Var
> (cf. deparseRelabelType). But even with that I'm feeling antsy about
> whether this will allow any semantic surprises.

I've split the suppression for RelabelTypes with implicit cast check
into a second patch over the core v7 change. As far as testing goes, \dC
lists implicit casts, but most of those I've tried seem to wind up
deparsing as Vars. I've been able to manifest RelabelTypes with varchar,
cidr, and remote char to local varchar, but that's about it. Any ideas
for validating it further, off the top of your head?
From cf0866241f051086cf1b19647102679149214565 Mon Sep 17 00:00:00 2001
From: Dian M Fay 
Date: Mon, 8 Nov 2021 22:52:07 -0500
Subject: [PATCH v7 1/2] Suppress explicit casts of safe Consts in postgres_fdw

Comparisons between Consts of UNKNOWN type (string literals or NULLs)
and Vars can rely on the remote server's operator resolution heuristic
to match the Const to the remote column type instead of forcing the type
from the local schema definition. This allows the local schema to
diverge usefully from the remote, for example by declaring a remote enum
column as text and eliminating the need to keep the enum synchronized
across databases.
---
 contrib/postgres_fdw/deparse.c|  75 ++--
 .../postgres_fdw/expected/postgres_fdw.out| 113 +-
 contrib/postgres_fdw/sql/postgres_fdw.sql |  18 +++
 3 files changed, 165 insertions(+), 41 deletions(-)

diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index d98bd66681..83c1749882 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -2695,9 +2695,14 @@ deparseVar(Var *node, deparse_expr_cxt *context)
  * Deparse given constant value into context->buf.
  *
  * This function has to be kept in sync with ruleutils.c's get_const_expr.
- * As for that function, showtype can be -1 to never show "::typename" 
decoration,
- * or +1 to always show it, or 0 to show it only if the constant wouldn't be 
assumed
- * to be the right type by default.
+ *
+ * As in that function, showtype can be -1 to never show "::typename"
+ * decoration, +1 to always show it, or 0 to show it only if the constant
+ * wouldn't be assumed to be the right type by default.
+ *
+ * In addition, this code allows showtype to be -2 to indicate that we should
+ * not show "::typename" decoration if the constant is printed as an untyped
+ * literal or NULL (while in other cases, behaving as for showtype == 0).
  */
 static void
 deparseConst(Const *node, deparse_expr_cxt *context, int showtype)
@@ -2707,6 +2712,7 @@ deparseConst(Const *node, deparse_expr_cxt *context, int 
showtype)
booltypIsVarlena;
char   *extval;
boolisfloat = false;
+   boolisstring = false;
boolneedlabel;
 
if (node->constisnull)
@@ -2762,13 +2768,14 @@ deparseConst(Const *node, deparse_expr_cxt *context, 
int showtype)
break;
default:
deparseStringLiteral(buf, extval);
+   isstring = true;
break;
}
 
pfree(extval);
 
-   if (showtype < 0)
-   return;
+   if (showtype == -1)
+   return; /* never print type 
label */
 
/*
 * For showtype == 0, append ::typename unless the constant will be
@@ -2788,7 +2795,13 @@ deparseConst(Const *node, deparse_expr_cxt *context, int 
showtype)
needlabel = !isfloat || (node->consttypmod >= 0);
break;
default:
-   needlabel = true;
+   if (showtype == -2)
+   {
+   /* label unless we printed it as an untyped 
string */
+   needlabel = !isstring;
+   }
+   else
+   needlabel = true;
break;
}
if (needlabel || showtype > 0)
@@ -2953,6 +2966,8 @@ deparseOpExpr(OpExpr *node, deparse_expr_cxt *context)
StringInfo  buf = context->buf;
HeapTuple   tuple;
Form_pg_operator form;
+   Expr  

Re: [PATCH] postgres_fdw: suppress explicit casts in text:text comparisons (was: column option to override foreign types)

2021-11-07 Thread Dian M Fay
On Tue Nov 2, 2021 at 7:10 PM EDT, Tom Lane wrote:
> I wrote:
> > Now that I've looked this over I'm starting to feel uncomfortable
> > again, because we can't actually be quite sure about how the remote
> > parser's heuristic will act.
>
> Actually ... we could make that a lot safer by insisting that the
> other input be a plain Var, which'd necessarily be a column of the
> foreign table. That would still cover most cases of practical
> interest, I think, and it would remove any question of whether
> implicit coercions had snuck in. It's more restrictive than I'd
> really like, but I think it's less likely to cause problems.

Here's v6! I started with restricting cast suppression to Const-Var
comparisons as you suggested. A few tests did regress (relative to the
unrestricted version) right out of the gate with comparisons to varchar
columns, since those become RelabelType nodes instead of Vars. After
reading the notes on RelabelType in primnodes.h, I *think* that that
"dummy" coercion is distinct from the operator input type coercion
you're talking about here:

> What we're checking is that leftType and rightType match, but that
> condition is applied to the inputs *after implicit type coercion to
> the operator's input types*. We can't be entirely sure about what our
> parser saw to begin with. Perhaps it'd be a good idea to strip any
> implicit coercions on the non-Const input before checking its type.

I allowed RelabelTypes over Vars to suppress casts as well. It's working
for me so far and the varchar comparison tests are back to passing, sans
casts.
From e0fe4e4b15403ffab2518f945c13ce0e22bb89a9 Mon Sep 17 00:00:00 2001
From: Dian M Fay 
Date: Sun, 7 Nov 2021 17:53:17 -0500
Subject: [PATCH v6] Suppress explicit casts of safe Consts in postgres_fdw

Comparisons between Consts of UNKNOWN type (string literals or NULLs)
and Vars can rely on the remote server's operator resolution heuristic
to match the Const to the remote column type instead of forcing the type
from the local schema definition. This allows the local schema to
diverge usefully from the remote, for example by declaring a remote enum
column as text and eliminating the need to keep the enum synchronized
across databases.
---
 contrib/postgres_fdw/deparse.c|  81 ++--
 .../postgres_fdw/expected/postgres_fdw.out| 125 --
 contrib/postgres_fdw/sql/postgres_fdw.sql |  18 +++
 3 files changed, 177 insertions(+), 47 deletions(-)

diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index d98bd66681..2868d95a03 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -2695,9 +2695,14 @@ deparseVar(Var *node, deparse_expr_cxt *context)
  * Deparse given constant value into context->buf.
  *
  * This function has to be kept in sync with ruleutils.c's get_const_expr.
- * As for that function, showtype can be -1 to never show "::typename" 
decoration,
- * or +1 to always show it, or 0 to show it only if the constant wouldn't be 
assumed
- * to be the right type by default.
+ *
+ * As in that function, showtype can be -1 to never show "::typename"
+ * decoration, +1 to always show it, or 0 to show it only if the constant
+ * wouldn't be assumed to be the right type by default.
+ *
+ * In addition, this code allows showtype to be -2 to indicate that we should
+ * not show "::typename" decoration if the constant is printed as an untyped
+ * literal or NULL (while in other cases, behaving as for showtype == 0).
  */
 static void
 deparseConst(Const *node, deparse_expr_cxt *context, int showtype)
@@ -2707,6 +2712,7 @@ deparseConst(Const *node, deparse_expr_cxt *context, int 
showtype)
booltypIsVarlena;
char   *extval;
boolisfloat = false;
+   boolisstring = false;
boolneedlabel;
 
if (node->constisnull)
@@ -2762,13 +2768,14 @@ deparseConst(Const *node, deparse_expr_cxt *context, 
int showtype)
break;
default:
deparseStringLiteral(buf, extval);
+   isstring = true;
break;
}
 
pfree(extval);
 
-   if (showtype < 0)
-   return;
+   if (showtype == -1)
+   return; /* never print type 
label */
 
/*
 * For showtype == 0, append ::typename unless the constant will be
@@ -2788,7 +2795,13 @@ deparseConst(Const *node, deparse_expr_cxt *context, int 
showtype)
needlabel = !isfloat || (node->consttypmod >= 0);
break;
default:
-   needlabel = true;
+   if (showtype == -2)
+   {
+   

Re: [PATCH] postgres_fdw: suppress explicit casts in text:text comparisons (was: column option to override foreign types)

2021-10-23 Thread Dian M Fay
On Sun Sep 5, 2021 at 6:43 PM EDT, Tom Lane wrote:
> "Dian M Fay"  writes:
> > [ 0001-Suppress-explicit-casts-of-text-constants-in-postgre.patch ]
>
> I took a quick look at this. The restriction to type text seems like
> very obviously a hack rather than something we actually want; wouldn't
> it mean we fail to act in a large fraction of the cases where we'd
> like to suppress the cast?
>
> A second problem is that I don't think the business with a
> const_showtype
> context field is safe at all. As you've implemented it here, it would
> affect the entire RHS tree, including constants far down inside complex
> expressions that have nothing to do with the top-level semantics.
> (I didn't look closely, but I wonder if the regression failure you
> mentioned is associated with that.)
>
> I think that we only want to suppress the cast in cases where
> (1) the constant is directly an operand of the operator we're
> expecting the remote parser to use its same-type heuristic for, and
> (2) the constant will be deparsed as a string literal. (If it's
> deparsed as a number, boolean, etc, then it won't be initially
> UNKNOWN, so that heuristic won't be applied.)
>
> Now point 1 means that we don't really need to mess with keeping
> state in the recursion context. If we've determined at the level
> of the OpExpr that we can do this, including checking that the
> RHS operand IsA(Const), then we can just invoke deparseConst() on
> it directly instead of recursing via deparseExpr().
>
> Meanwhile, I suspect that point 2 might be best checked within
> deparseConst() itself, as that contains both the decision and the
> mechanism about how the Const will be printed. So that suggests
> that we should invent a new showtype code telling deparseConst()
> to act this way, and then supply that code directly when we
> invoke deparseConst directly from deparseOpExpr.
>
> BTW, don't we also want to be able to optimize cases where the Const
> is on the LHS rather than the RHS?
>
> regards, tom lane

Thanks Tom, that makes way more sense! I've attached a new patch which
tests operands and makes sure one side is a Const before feeding it to
deparseConst with a new showtype code, -2. The one regression is gone,
but I've left a couple of test output discrepancies for now which
showcase lost casts on the following predicates:

* date(c5) = '1970-01-17'::date
* ctid = '(0,2)'::tid

These aren't exactly failures -- both implicit string comparisons work
just fine -- but I don't know Postgres well enough to be sure that
that's true more generally. I did try checking that the non-Const member
of the predicate is a Var; that left the date cast alone, since date(c5)
is a FuncExpr, but obviously can't do anything about the tid.

There's also an interesting case where `val::text LIKE 'foo'` works when
val is an enum column in the local table, and breaks, castless, with an
operator mismatch when it's altered to text: Postgres' statement parser
recognizes the cast as redundant and creates a Var node instead of a
RelabelType (as it will for, say, `val::varchar(10)`) before the FDW is
even in the picture. It's a little discomfiting, but I suppose a certain
level of "caveat emptor" entails when disregarding foreign types.

> (val as enum on local and remote)
> explain verbose select * from test where (val::text) like 'foo';
> 
>  Foreign Scan on public.test  (cost=100.00..169.06 rows=8 width=28)
>Output: id, val, on_day, ts, ts2
>Filter: ((test.val)::text ~~ 'foo'::text)
>Remote SQL: SELECT id, val, on_day, ts, ts2 FROM public.test
>
> (val as local text, remote enum)
> explain verbose select * from test where (val::text) like 'foo';
> 
>  Foreign Scan on public.test  (cost=100.00..122.90 rows=5 width=56)
>Output: id, val, on_day, ts, ts2
>Remote SQL: SELECT id, val, on_day, ts, ts2 FROM public.test WHERE ((val 
> ~~ 'foo'))
>
> explain verbose select * from test where (val::varchar(10)) like 'foo';
>
>  Foreign Scan on public.test  (cost=100.00..125.46 rows=5 width=56)
>Output: id, val, on_day, ts, ts2
>Remote SQL: SELECT id, val, on_day, ts, ts2 FROM public.test WHERE 
> ((val::character varying(10) ~~ 'foo'))

Outside that, deparseConst also contains a note about keeping the code
in sync with the parser (make_const in particular); from what I could
tell, I don't think there's anything in this that necessitates changes
there.
From 92ca385ff9891008e48d975a622f8b56b921 Mon Sep 17 00:00:00 2001
From: Dian M Fay 
Date: Sat, 23 Oct 2021 00:54:48 -0400
Subject: [PATCH] Suppress explicit casts of safe const operands

Adds a new showtype code -2 to deparseConst in order to skip casting for
string literals when deparseOpExpr determines that the remote server can
be relied on to match operand types.
---
 contrib/postgres_

Re: [PATCH] postgres_fdw: suppress explicit casts in text:text comparisons (was: column option to override foreign types)

2021-08-04 Thread Dian M Fay
On Sun Mar 7, 2021 at 2:37 AM EST, Dian M Fay wrote:
> > What'd be better, if we could do it, is to ship the clause in
> > the form
> > WHERE foreigncol = 'one'
> > that is, instead of plastering a cast on the Var, try to not put
> > any explicit cast on the constant. That fixes your original use
> > case better than what you've proposed, and I think it might be
> > possible to do it unconditionally instead of needing a hacky
> > column property to enable it. The reason this could be okay
> > is that it seems reasonable for postgres_fdw to rely on the
> > core parser's heuristic that an unknown-type literal is the
> > same type as what it's being compared to. So, if we are trying
> > to deparse something of the form "foreigncol operator constant",
> > and the foreigncol and constant are of the same type locally,
> > we could leave off the cast on the constant. (There might need
> > to be some restrictions about the operator taking those types
> > natively with no cast, not sure; also this doesn't apply to
> > constants that are going to be printed as non-string literals.)
> >
> > Slipping this heuristic into the code structure of deparse.c
> > might be rather messy, though. I've not looked at just how
> > to implement it.
>
> This doesn't look too bad from here, at least so far. The attached
> change adds a new const_showtype field to the deparse_expr_cxt, and
> passes that instead of the hardcoded 0 to deparseConst. deparseOpExpr
> modifies const_showtype if both sides of a binary operation are text,
> and resets it to 0 after the recursion.
>
> I restricted it to text-only after seeing a regression test fail: while
> deparsing `percentile_cont(c2/10::numeric)`, c2, an integer column, is a
> FuncExpr with a numeric return type. That matches the numeric 10, and
> without the explicit cast, integer-division-related havoc ensues. I
> don't know why it's a FuncExpr, and I don't know why it's not an int,
> but the constant is definitely a non-string, in any case.
>
> In the course of testing, I discovered that the @@ text-search operator
> works against textified enums on my stock 13.1 server (a "proper" enum
> column yields "operator does not exist"). I'm rather wary of actually
> trying to depend on that behavior, although it seems probably-safe in
> the same character set and collation.

hello again! My second version of this change (suppressing the cast
entirely as Tom suggested) seemed to slip under the radar back in March
and then other matters intervened. I'm still interested in making it
happen, though, and now that we're out of another commitfest it seems
like a good time to bring it back up. Here's a rebased patch to start.
From 3b950aa43af686acdbbaa5aada82bfed725652d7 Mon Sep 17 00:00:00 2001
From: Dian M Fay 
Date: Sun, 7 Mar 2021 02:59:14 -0500
Subject: [PATCH] Suppress explicit casts of text constants in postgres_fdw
 when the other side of a binary OpExpr is also text.

---
 contrib/postgres_fdw/deparse.c| 27 +-
 .../postgres_fdw/expected/postgres_fdw.out| 93 ---
 contrib/postgres_fdw/sql/postgres_fdw.sql |  9 ++
 3 files changed, 90 insertions(+), 39 deletions(-)

diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index 2b8f00abe7..e056c5f117 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -101,6 +101,7 @@ typedef struct deparse_expr_cxt
 * a base 
relation. */
StringInfo  buf;/* output buffer to append to */
List  **params_list;/* exprs that will become remote Params 
*/
+   int const_showtype; /* override showtype in 
deparseConst */
 } deparse_expr_cxt;
 
 #define REL_ALIAS_PREFIX   "r"
@@ -1146,6 +1147,7 @@ deparseSelectStmtForRel(StringInfo buf, PlannerInfo 
*root, RelOptInfo *rel,
context.foreignrel = rel;
context.scanrel = IS_UPPER_REL(rel) ? fpinfo->outerrel : rel;
context.params_list = params_list;
+   context.const_showtype = 0;
 
/* Construct SELECT clause */
deparseSelectSql(tlist, is_subquery, retrieved_attrs, );
@@ -1725,6 +1727,7 @@ deparseFromExprForRel(StringInfo buf, PlannerInfo *root, 
RelOptInfo *foreignrel,
context.scanrel = foreignrel;
context.root = root;
context.params_list = params_list;
+   context.const_showtype = 0;
 
appendStringInfoChar(buf, '(');
appendConditions(fpinfo->joinclauses, );
@@ -2028,6 +2031,7 @@ deparseDirectUpdateSql(StringInfo buf, PlannerInfo *root,
context.scanrel = foreignrel;
  

Re: [PATCH] postgres-fdw: column option to override foreign types

2021-03-07 Thread Dian M Fay
> What'd be better, if we could do it, is to ship the clause in
> the form
> WHERE foreigncol = 'one'
> that is, instead of plastering a cast on the Var, try to not put
> any explicit cast on the constant. That fixes your original use
> case better than what you've proposed, and I think it might be
> possible to do it unconditionally instead of needing a hacky
> column property to enable it. The reason this could be okay
> is that it seems reasonable for postgres_fdw to rely on the
> core parser's heuristic that an unknown-type literal is the
> same type as what it's being compared to. So, if we are trying
> to deparse something of the form "foreigncol operator constant",
> and the foreigncol and constant are of the same type locally,
> we could leave off the cast on the constant. (There might need
> to be some restrictions about the operator taking those types
> natively with no cast, not sure; also this doesn't apply to
> constants that are going to be printed as non-string literals.)
>
> Slipping this heuristic into the code structure of deparse.c
> might be rather messy, though. I've not looked at just how
> to implement it.

This doesn't look too bad from here, at least so far. The attached
change adds a new const_showtype field to the deparse_expr_cxt, and
passes that instead of the hardcoded 0 to deparseConst. deparseOpExpr
modifies const_showtype if both sides of a binary operation are text,
and resets it to 0 after the recursion.

I restricted it to text-only after seeing a regression test fail: while
deparsing `percentile_cont(c2/10::numeric)`, c2, an integer column, is a
FuncExpr with a numeric return type. That matches the numeric 10, and
without the explicit cast, integer-division-related havoc ensues. I
don't know why it's a FuncExpr, and I don't know why it's not an int,
but the constant is definitely a non-string, in any case.

In the course of testing, I discovered that the @@ text-search operator
works against textified enums on my stock 13.1 server (a "proper" enum
column yields "operator does not exist"). I'm rather wary of actually
trying to depend on that behavior, although it seems probably-safe in
the same character set and collation.
From f7024ae2e95bd28c47267d97d7350f43a2b6c072 Mon Sep 17 00:00:00 2001
From: Dian M Fay 
Date: Sun, 7 Mar 2021 02:59:14 -0500
Subject: [PATCH] Suppress explicit casts of text constants in postgres_fdw
 when the other side of a binary OpExpr is also text.

---
 contrib/postgres_fdw/deparse.c| 27 ++-
 .../postgres_fdw/expected/postgres_fdw.out| 81 ---
 contrib/postgres_fdw/sql/postgres_fdw.sql |  9 +++
 3 files changed, 84 insertions(+), 33 deletions(-)

diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index 6faf499f9a..738c9c1038 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -100,6 +100,7 @@ typedef struct deparse_expr_cxt
 * a base 
relation. */
StringInfo  buf;/* output buffer to append to */
List  **params_list;/* exprs that will become remote Params 
*/
+   int const_showtype; /* override showtype in 
deparseConst */
 } deparse_expr_cxt;
 
 #define REL_ALIAS_PREFIX   "r"
@@ -1018,6 +1019,7 @@ deparseSelectStmtForRel(StringInfo buf, PlannerInfo 
*root, RelOptInfo *rel,
context.foreignrel = rel;
context.scanrel = IS_UPPER_REL(rel) ? fpinfo->outerrel : rel;
context.params_list = params_list;
+   context.const_showtype = 0;
 
/* Construct SELECT clause */
deparseSelectSql(tlist, is_subquery, retrieved_attrs, );
@@ -1597,6 +1599,7 @@ deparseFromExprForRel(StringInfo buf, PlannerInfo *root, 
RelOptInfo *foreignrel,
context.scanrel = foreignrel;
context.root = root;
context.params_list = params_list;
+   context.const_showtype = 0;
 
appendStringInfoChar(buf, '(');
appendConditions(fpinfo->joinclauses, );
@@ -1897,6 +1900,7 @@ deparseDirectUpdateSql(StringInfo buf, PlannerInfo *root,
context.scanrel = foreignrel;
context.buf = buf;
context.params_list = params_list;
+   context.const_showtype = 0;
 
appendStringInfoString(buf, "UPDATE ");
deparseRelation(buf, rel);
@@ -2005,6 +2009,7 @@ deparseDirectDeleteSql(StringInfo buf, PlannerInfo *root,
context.scanrel = foreignrel;
context.buf = buf;
context.params_list = params_list;
+   context.const_showtype = 0;
 
appendStringInfoString(buf, "DELETE FROM ");
deparseRelation(buf, rel);
@@ -2392,7 +2397,7 @@ deparseExpr(Expr *node, deparse_expr_cxt

Re: [PATCH] postgres-fdw: column option to override foreign types

2021-03-04 Thread Dian M Fay
On Thu Mar 4, 2021 at 9:28 AM EST, Georgios Kokolatos wrote:
> I am afraid I will have to :-1: this patch. Of course it is possible
> that I am wrong,
> so please correct me if you, or any other reviewers, think so.
>
> The problem that is intended to be solved, upon closer inspection
> seems
> to be a
> conscious design decision rather than a problem. Even if I am wrong
> there, I am
> not certain that the proposed patch covers all the bases with respect
> to
> collations,
> build-in types, shipability etc for simple expressions, and covers any
> of more
> complicated expressions all together.

Thanks for reviewing it!

I see room for interpretation in the design here, although I have
admittedly not been looking at it for very long. `CREATE/ALTER FOREIGN
TABLE` take the user at their word about types, which only map 1:1 for a
foreign Postgres server anyway. In make_tuple_from_result_row, incoming
values start as strings until they're converted to their target types --
again, with no guarantee that those types match those on the remote
server. The docs recommend types match exactly and note the sorts of
things that can go wrong, but there's no enforcement; either what you've
cooked up works or it doesn't. And in fact, declaring local text for a
remote enum seems to work quite well right up until you try to
reference it in the `WHERE` clause.

Enum::text seems like a safe and potentially standardizable case for
postgres_fdw. As implemented, the patch goes beyond that, but it's
opt-in and the docs already warn about consequences. I haven't tested it
across collations, but right now that seems like something to look into
if the idea survives the next few messages.

> I will add that creating a view on the remote server with type
> flexibility that
> you wish and then create foreign tables against the view, might
> address
> your
> problem.

A view would address the immediate issue of the types, but itself
requires additional maintenance if/when the underlying table's schema
changes (even `SELECT *` is expanded into the current column definitions
at creation). I think it's better than copying the types, because it
moves the extra work of keeping local and remote synchronized to a
*table* modification as opposed to a *type* modification, in which
latter case it's much easier to forget about dependents. But I'd prefer
to avoid extra work anywhere!




Re: [PATCH] postgres-fdw: column option to override foreign types

2021-03-02 Thread Dian M Fay
On Tue Mar 2, 2021 at 6:50 AM EST, Ashutosh Bapat wrote:
> On Mon, Mar 1, 2021 at 12:59 PM Dian M Fay  wrote:
> >
> > Full use of a custom data type with postgres_fdw currently requires the
> > type be maintained in both the local and remote databases. `CREATE
> > FOREIGN TABLE` does not check declared types against the remote table,
> > but declaring e.g. a remote enum to be local text works only partway, as
> > seen here. A simple select query against alpha_items returns the enum
> > values as text; however, *filtering* on the column yields an error.
>
> postgres_fdw assumes that the local type declared is semantically same
> as the remote type. Ideally the enum should also be declared locally
> and used to declare type's datatype. See how to handle UDTs in
> postgres_fdw at
> https://stackoverflow.com/questions/37734170/can-the-foreign-data-wrapper-fdw-postgres-handle-the-geometry-data-type-of-postg

I'm aware, and the reason for this change is that I think it's annoying
to declare and maintain the type on the local server for the sole
purpose of accommodating a read-only foreign table that effectively
treats it like text anyway. The real scenario that prompted it is a
tickets table with status, priority, category, etc. enums. We don't have
plans to modify them any time soon, but if we do it's got to be
coordinated and deployed across two databases, all so we can use what
might as well be a text column in a single WHERE clause. Since foreign
tables can be defined over subsets of columns, reordered, and names
changed, a little opt-in flexibility with types too doesn't seem
misplaced.

Note that currently, postgres_fdw will strip casts on the WHERE column:
`where type::text = 'one'` becomes `where ((type = 'one'::text))` (the
value is cast separately). Making it respect those is another option,
but I thought including it in column configuration would be less
surprising to users who aren't aware of the difference between the local
and remote tables.




Re: TRIM_ARRAY

2021-03-01 Thread Dian M Fay
On Mon Mar 1, 2021 at 6:53 PM EST, Vik Fearing wrote:
> > This basically does what it says, and the code looks good. The
> > documentation is out of alphabetical order (trim_array should appear
> > under cardinality rather than over)) but good otherwise.
>
> Hmm. It appears between cardinality and unnest in the source code and
> also my compiled html. Can you say more about where you're seeing the
> wrong order?

I applied the patch to the latest commit, ffd3944ab9. Table 9.52 is
ordered:

array_to_string
array_upper
trim_array
cardinality
unnest

> The problem here is that postgres needs to know what the return
> type is and it can only determine that from the input.
>
> If you give the function a typed null, it returns null as expected.
>
> > The new status of this patch is: Waiting on Author
>
> I put it back to Needs Review without a new patch because I don't know
> what I would change.

I'd thought that checking v and returning null instead of raising the
error would be more friendly, should it be possible to pass an untyped
null accidentally instead of on purpose, and I couldn't rule that out.
I've got no objections other than the docs having been displaced.




[PATCH] postgres-fdw: column option to override foreign types

2021-02-28 Thread Dian M Fay
Full use of a custom data type with postgres_fdw currently requires the
type be maintained in both the local and remote databases. `CREATE
FOREIGN TABLE` does not check declared types against the remote table,
but declaring e.g. a remote enum to be local text works only partway, as
seen here. A simple select query against alpha_items returns the enum
values as text; however, *filtering* on the column yields an error.

create database alpha;
create database beta;

\c alpha

create type itemtype as enum ('one', 'two', 'three');
create table items (
  id serial not null primary key,
  type itemtype not null
);
insert into items (type) values ('one'), ('one'), ('two');

\c beta

create extension postgres_fdw;
create server alpha foreign data wrapper postgres_fdw options (dbname 'alpha', 
host 'localhost', port '5432');
create user mapping for postgres server alpha options (user 'postgres');

create foreign table alpha_items (
  id int,
  type text
) server alpha options (table_name 'items');
select * from alpha_items; -- ok
select * from alpha_items where type = 'one';

ERROR:  operator does not exist: public.itemtype = text
HINT:  No operator matches the given name and argument types. You might need to 
add explicit type casts.
CONTEXT:  remote SQL command: SELECT id, type FROM public.items WHERE ((type = 
'one'::text))

The attached changeset adds a new boolean option for postgres_fdw
foreign table columns, `use_local_type`. When true, ColumnRefs for the
relevant attribute will be deparsed with a cast to the type defined in
`CREATE FOREIGN TABLE`.

create foreign table alpha_items (
  id int,
  type text options (use_local_type 'true')
) server alpha options (table_name 'items');
select * from alpha_items where type = 'one'; -- succeeds

This builds and checks, with a new regression test and documentation.
From 09961e10daf72e2c1fbbdddb05b2940b2da14df0 Mon Sep 17 00:00:00 2001
From: Dian M Fay 
Date: Mon, 1 Mar 2021 00:44:15 -0500
Subject: [PATCH] postgres_fdw: column option to override foreign types

Enabling the use_local_type option on a foreign table column will
deparse it with a cast to the locally-declared type, allowing overrides
of foreign types not known to the local database.
---
 contrib/postgres_fdw/deparse.c | 15 ---
 contrib/postgres_fdw/expected/postgres_fdw.out | 17 +
 contrib/postgres_fdw/option.c  |  1 +
 contrib/postgres_fdw/sql/postgres_fdw.sql  | 10 ++
 doc/src/sgml/postgres-fdw.sgml | 13 +
 5 files changed, 53 insertions(+), 3 deletions(-)

diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index 6faf499f9a..c722d49316 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -2264,6 +2264,7 @@ deparseColumnRef(StringInfo buf, int varno, int varattno, 
RangeTblEntry *rte,
else
{
char   *colname = NULL;
+   bool   uselocaltype = NULL;
List   *options;
ListCell   *lc;
 
@@ -2280,10 +2281,9 @@ deparseColumnRef(StringInfo buf, int varno, int 
varattno, RangeTblEntry *rte,
DefElem*def = (DefElem *) lfirst(lc);
 
if (strcmp(def->defname, "column_name") == 0)
-   {
colname = defGetString(def);
-   break;
-   }
+   else if (strcmp(def->defname, "use_local_type") == 0)
+   uselocaltype = defGetBoolean(def);
}
 
/*
@@ -2297,6 +2297,15 @@ deparseColumnRef(StringInfo buf, int varno, int 
varattno, RangeTblEntry *rte,
ADD_REL_QUALIFIER(buf, varno);
 
appendStringInfoString(buf, quote_identifier(colname));
+
+   if (uselocaltype)
+   {
+   Oid coltype = get_atttype(rte->relid, varattno);
+   char *typname = deparse_type_name(coltype, -1);
+
+   appendStringInfoString(buf, "::");
+   appendStringInfoString(buf, typname);
+   }
}
 }
 
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out 
b/contrib/postgres_fdw/expected/postgres_fdw.out
index 0649b6b81c..c3271bac8f 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -4105,6 +4105,23 @@ ERROR:  invalid input syntax for type integer: "foo"
 CONTEXT:  processing expression at position 2 in select list
 ALTER FOREIGN TABLE ft1 ALTER COLUMN c8 TYPE user_enum;
 -- ===
+-- conversion with use_local_type
+-- ===
+ALTER FOREIGN TABLE ft1 ALTER COLUMN c8 TYPE text;
+SELECT * FROM ft1 WHERE c8

Re: [HACKERS] [PATCH] Generic type subscripting

2021-01-20 Thread Dian M Fay
On Wed Jan 20, 2021 at 2:08 PM EST, Dmitry Dolgov wrote:
> > On Wed, Jan 20, 2021 at 11:34:16AM -0500, Dian M Fay wrote:
> > > Thanks, I need to remember to not skipp doc building for testing process
> > > even for such small changes. Hope now I didn't forget anything.
> > >
> > > > On Wed, Jan 20, 2021 at 09:58:43AM -0500, Dian M Fay wrote:
> > >
> > > > Here's a full editing pass on the documentation, with v45 and Pavel's
> > > > doc-whitespaces-fix.patch applied. I also corrected a typo in one of the
> > > > added hints.
> > >
> > > Great! I've applied almost all of it, except:
> > >
> > > + A jsonb value will accept assignments to nonexistent
> > > subscript
> > > + paths as long as the nonexistent elements being traversed are all
> > > arrays.
> > >
> > > Maybe I've misunderstood the intention, but there is no requirement
> > > about arrays for creating such an empty path. I've formulated it as:
> > >
> > > + A jsonb value will accept assignments to nonexistent
> > > subscript
> > > + paths as long as the last existing path key is an object or an array.
> >
> > My intention there was to highlight the difference between:
> >
> > * SET obj['a']['b']['c'] = '"newvalue"'
> > * SET arr[0][0][3] = '"newvalue"'
> >
> > obj has to conform to {"a": {"b": {...}}} in order to receive the
> > assignment of the nested c. If it doesn't, that's the error case we
> > discussed earlier. But arr can be null, [], and so on, and any missing
> > structure [[[null, null, null, "newvalue"]]] will be created.
>
> If arr is 'null', or any other scalar value, such subscripting will work
> only one level deep because they represented internally as an array of
> one element. If arr is '[]' the path will comply by definition. So it's
> essentially the same as for objects with no particular difference. If
> such a quirk about scalars being treated like arrays is bothering, we
> could also bend it in this case as well (see the attached version).

I missed that distinction in the original UPDATE paragraph too. Here's
another revision based on v48.
From a486ee221469037b08d3663f1ec142a905406f8b Mon Sep 17 00:00:00 2001
From: Dian M Fay 
Date: Wed, 20 Jan 2021 23:36:34 -0500
Subject: [PATCH] more jsonb subscripting documentation edits

---
 doc/src/sgml/json.sgml | 40 ++--
 1 file changed, 22 insertions(+), 18 deletions(-)

diff --git a/doc/src/sgml/json.sgml b/doc/src/sgml/json.sgml
index deeb9e66e0..e16dd6973d 100644
--- a/doc/src/sgml/json.sgml
+++ b/doc/src/sgml/json.sgml
@@ -616,16 +616,17 @@ SELECT jdoc-'guid', jdoc-'name' FROM api WHERE 
jdoc @ '{"tags": ["qu
 
   
UPDATE statements may use subscripting in the
-   SET clause to modify jsonb values. Object
-   values being traversed must exist as specified by the subscript path. For
-   instance, the path val['a']['b']['c'] assumes that
-   val, val['a'], and 
val['a']['b']
-   are all objects in every record being updated 
(val['a']['b']
-   may or may not contain a field named c, as long as it's 
an
-   object). If any individual val, 
val['a'],
-   or val['a']['b'] is a non-object such as a string, a 
number,
-   or NULL, an error is raised even if other values do 
conform.
-   Array values are not subject to this restriction, as detailed below.
+   SET clause to modify jsonb values. Subscript
+   paths must be traversible for all affected values insofar as they exist. For
+   instance, the path val['a']['b']['c'] can be traversed 
all
+   the way to c if every val,
+   val['a'], and val['a']['b'] is an
+   object. If any val['a'] or 
val['a']['b']
+   is not defined, it will be created as an empty object and filled as
+   necessary. However, if any val itself or one of the
+   intermediary values is defined as a non-object such as a string, number, or
+   jsonb null, traversal cannot proceed 
so
+   an error is raised and the transaction aborted.
   
 
   
@@ -658,8 +659,9 @@ SELECT * FROM table_name WHERE jsonb_field['key'] = 
'"value"';
 
jsonb assignment via subscripting handles a few edge cases
differently from jsonb_set. When a source 
jsonb
-   is NULL, assignment via subscripting will proceed as if
-   it was an empty JSON object:
+   value is NULL, assignment via subscripting will proceed
+   as if it was an empty JSON value of the type (object or array) implied by 
the
+   subscript key:
 
 
 -- Where jsonb_field was NULL, it is now {"a": 1}
@@ -680,17 +682,19 @@ UPDATE table_name SET jsonb_field[2] = '2';
 
 
A jsonb value will accept assignments to nonexistent subscript
-   paths as long as the last existing path key is an object or an 

Re: [HACKERS] [PATCH] Generic type subscripting

2021-01-20 Thread Dian M Fay
On Wed Jan 20, 2021 at 11:22 AM EST, Dmitry Dolgov wrote:
> > On Tue Jan 19, 2021 at 1:42 PM EST, Pavel Stehule wrote:
> >
> > I found minor issues.
> >
> > Doc - missing tag
> >
> > and three whitespaces issues
> >
> > see attached patch
>
> Thanks, I need to remember to not skipp doc building for testing process
> even for such small changes. Hope now I didn't forget anything.
>
> > On Wed, Jan 20, 2021 at 09:58:43AM -0500, Dian M Fay wrote:
>
> > Here's a full editing pass on the documentation, with v45 and Pavel's
> > doc-whitespaces-fix.patch applied. I also corrected a typo in one of the
> > added hints.
>
> Great! I've applied almost all of it, except:
>
> + A jsonb value will accept assignments to nonexistent
> subscript
> + paths as long as the nonexistent elements being traversed are all
> arrays.
>
> Maybe I've misunderstood the intention, but there is no requirement
> about arrays for creating such an empty path. I've formulated it as:
>
> + A jsonb value will accept assignments to nonexistent
> subscript
> + paths as long as the last existing path key is an object or an array.

My intention there was to highlight the difference between:

* SET obj['a']['b']['c'] = '"newvalue"'
* SET arr[0][0][3] = '"newvalue"'

obj has to conform to {"a": {"b": {...}}} in order to receive the
assignment of the nested c. If it doesn't, that's the error case we
discussed earlier. But arr can be null, [], and so on, and any missing
structure [[[null, null, null, "newvalue"]]] will be created. Take 2:

A jsonb value will accept assignments to nonexistent
subscript paths as long as object key subscripts can be traversed as
described above. The final subscript is not traversed and, if it
describes a missing object key, will be created. Nested arrays will
always be created and NULL-padded according to the
path until the value can be placed appropriately.




Re: [HACKERS] [PATCH] Generic type subscripting

2021-01-20 Thread Dian M Fay
On Tue Jan 19, 2021 at 1:42 PM EST, Pavel Stehule wrote:
> Hi
>
> I found minor issues.
>
> Doc - missing tag
>
> and three whitespaces issues
>
> see attached patch
>
> Following sentence is hard to read due long nested example
>
> If the
> + path contradicts structure of modified jsonb for any
> individual
> + value (e.g. path val['a']['b']['c'] assumes keys
> + 'a' and 'b' have object values
> + assigned to them, but if val['a'] or
> + val['b'] is null, a string, or a number, then the
> path
> + contradicts with the existing structure), an error is raised even if
> other
> + values do conform.
>
> It can be divided into two sentences - predicate, and example.
>
> Regards
>
> Pavel

Here's a full editing pass on the documentation, with v45 and Pavel's
doc-whitespaces-fix.patch applied. I also corrected a typo in one of the
added hints.
From 086a34ca860e8513484d829db1cb3f0c17c4ec1e Mon Sep 17 00:00:00 2001
From: Dian M Fay 
Date: Tue, 19 Jan 2021 23:44:23 -0500
Subject: [PATCH] Revise jsonb subscripting documentation

---
 doc/src/sgml/json.sgml  | 101 +---
 src/backend/utils/adt/jsonbsubs.c   |   2 +-
 src/test/regress/expected/jsonb.out |   2 +-
 3 files changed, 49 insertions(+), 56 deletions(-)

diff --git a/doc/src/sgml/json.sgml b/doc/src/sgml/json.sgml
index 4e19fe4fb8..eb3952193a 100644
--- a/doc/src/sgml/json.sgml
+++ b/doc/src/sgml/json.sgml
@@ -605,97 +605,90 @@ SELECT jdoc-'guid', jdoc-'name' FROM api WHERE 
jdoc @ '{"tags": ["qu
  
   jsonb Subscripting
   
-   jsonb data type supports array-style subscripting expressions
-   to extract or update particular elements. It's possible to use multiple
-   subscripting expressions to extract nested values. In this case, a chain of
-   subscripting expressions follows the same rules as the
-   path argument in jsonb_set function,
-   e.g. in case of arrays it is a 0-based operation or that negative integers
-   that appear in path count from the end of JSON arrays.
-   The result of subscripting expressions is always jsonb data type.
+   The jsonb data type supports array-style subscripting 
expressions
+   to extract and modify elements. Nested values can be indicated by chaining
+   subscripting expressions, following the same rules as the 
path
+   argument in the jsonb_set function. If a 
jsonb
+   value is an array, numeric subscripts start at zero, and negative integers 
count
+   backwards from the last element of the array. Slice expressions are not 
supported.
+   The result of a subscripting expression is always of the jsonb data type.
   
 
   
UPDATE statements may use subscripting in the
-   SET clause to modify jsonb values. Every
-   affected value must conform to the path defined by the subscript(s). If the
-   path contradicts structure of modified jsonb for any individual
-   value (e.g. path val['a']['b']['c'] assumes keys
-   'a' and 'b' have object values
-   assigned to them, but if val['a'] or
-   val['b'] is null, a string, or a number, then the path
-   contradicts with the existing structure), an error is raised even if other
-   values do conform.
+   SET clause to modify jsonb values. Object
+   values being traversed must exist as specified by the subscript path. For
+   instance, the path val['a']['b']['c'] assumes that
+   val, val['a'], and 
val['a']['b']
+   are all objects in every record being updated 
(val['a']['b']
+   may or may not contain a field named c, as long as it's 
an
+   object). If any individual val, 
val['a'],
+   or val['a']['b'] is a non-object such as a string, a 
number,
+   or NULL, an error is raised even if other values do 
conform.
+   Array values are not subject to this restriction, as detailed below.
   
-  
 
+  
An example of subscripting syntax:
+
 
--- Extract value by key
+-- Extract object value by key
 SELECT ('{"a": 1}'::jsonb)['a'];
 
--- Extract nested value by key path
+-- Extract nested object value by key path
 SELECT ('{"a": {"b": {"c": 1}}}'::jsonb)['a']['b']['c'];
 
--- Extract element by index
+-- Extract array element by index
 SELECT ('[1, "2", null]'::jsonb)[1];
 
--- Update value by key, note the single quotes - the assigned value
--- needs to be of jsonb type as well
+-- Update object value by key. Note the quotes around '1': the assigned
+-- value must be of the jsonb type as well
 UPDATE table_name SET jsonb_field['key'] = '1';
 
--- This will raise an error if jsonb_field is {"a": 1}
+-- This will raise an error if any record's jsonb_field['a']['b'] is something
+-- other than an object. For example, the value {"a": 1} has no 'b' key.
 UPDATE table_name SET jsonb_field['a']['b']['c'] = '1';
 
--- Select records using where clause with subscripting. Since the result of
--- subscripting is jsonb and we basically want to compare two jsonb objects, we
-

Re: [HACKERS] [PATCH] Generic type subscripting

2021-01-14 Thread Dian M Fay
On Thu Jan 14, 2021 at 10:04 AM EST, Dmitry Dolgov wrote:
> > On Tue, Jan 12, 2021 at 08:02:59PM +0100, Pavel Stehule wrote:
> > ne 10. 1. 2021 v 19:52 odesílatel Pavel Stehule 
> > napsal:
> >
> > I tested behaviour and I didn't find anything other than the mentioned
> > issue.
> >
> > Now I can check this feature from plpgsql, and it is working. Because there
> > is no special support in plpgsql runtime, the update of jsonb is
> > significantly slower than in update of arrays, and looks so update of jsonb
> > has O(N2) cost. I don't think it is important at this moment - more
> > important is fact, so I didn't find any memory problems.
>
> Thanks for testing. Regarding updates when the structure doesn't match
> provided path as I've mentioned I don't have strong preferences, but on
> the second though probably more inclined for returning an error in this
> case. Since there are pros and cons for both suggestions, it could be
> decided by vote majority between no update (Dian) or an error (Pavel,
> me) options. Any +1 to one of the options from others?
>
> Other than that, since I've already posted the patch for returning an
> error option, it seems that the only thing left is to decide with which
> version to go.

The trigger issue (which I did verify) makes the "no update" option
unworkable imo, JavaScript's behavior notwithstanding. But it should be
called out very clearly in the documentation, since it does depart from
what people more familiar with that behavior may expect. Here's a quick
draft, based on your v44 patch:


 jsonb data type supports array-style subscripting expressions
 to extract or update particular elements. It's possible to use multiple
 subscripting expressions to extract nested values. In this case, a chain of
 subscripting expressions follows the same rules as the
 path argument in jsonb_set function,
 e.g. in case of arrays it is a 0-based operation or that negative integers
 that appear in path count from the end of JSON arrays.
 The result of subscripting expressions is always of the jsonb data type.


 UPDATE statements may use subscripting in the 
 SET clause to modify jsonb values. Every
 affected value must conform to the path defined by the subscript(s). If the
 path cannot be followed to its end for any individual value (e.g.
 val['a']['b']['c'] where val['a'] or
 val['b'] is null, a string, or a number), an error is
 raised even if other values do conform.


 An example of subscripting syntax:




Re: [HACKERS] [PATCH] Generic type subscripting

2021-01-09 Thread Dian M Fay
On Sat Jan 9, 2021 at 3:34 PM EST, Pavel Stehule wrote:
> so 9. 1. 2021 v 21:06 odesílatel Dian M Fay 
> napsal:
>
> > On Thu Jan 7, 2021 at 3:24 AM EST, Pavel Stehule wrote:
> > > čt 7. 1. 2021 v 9:15 odesílatel Dmitry Dolgov <9erthali...@gmail.com>
> > > napsal:
> > >
> > > > > On Wed, Jan 06, 2021 at 09:22:53PM +0100, Pavel Stehule wrote:
> > > > >
> > > > > this case should to raise exception - the value should be changed or
> > > > error
> > > > > should be raised
> > > > >
> > > > > postgres=# insert into foo values('{}');
> > > > > postgres=# update foo set a['a'] = '100';
> > > > > postgres=# update foo set a['a'][1] = '-1';
> > > > > postgres=# select * from foo;
> > > > > ┌┐
> > > > > │ a  │
> > > > > ╞╡
> > > > > │ {"a": 100} │
> > > > > └┘
> > > >
> > > > I was expecting this question, as I've left this like that
> > intentionally
> > > > because of two reasons:
> > > >
> > > > * Opposite to other changes, to implement this one we need to introduce
> > > >   a condition more interfering with normal processing, which raises
> > > >   performance issues for already existing functionality in jsonb_set.
> > > >
> > > > * I vaguely recall there was a similar discussion about jsonb_set with
> > > >   the similar solution.
> > > >
> > >
> > > ok.
> > >
> > > In this case I have a strong opinion so current behavior is wrong. It
> > > can
> > > mask errors. There are two correct possibilities
> > >
> > > 1. raising error - because the update try to apply index on scalar value
> > >
> > > 2. replace by array - a = {NULL, -1}
> > >
> > > But isn't possible ignore assignment
> > >
> > > What do people think about it?
> >
> > I've been following this thread looking forward to the feature and was
> > all set to come in on the side of raising an exception here, but then I
> > tried it in a JS REPL:
> >
> > ; a = {}
> > {}
> > ; a['a'] = '100'
> > '100'
> > ; a['a'][1] = -1
> > -1
> > ; a
> > { a: '100' }
> >
> > ; b = {}
> > {}
> > ; b['b'] = 100
> > 100
> > ; b['b'][1] = -1
> > -1
> > ; b
> > { b: 100 }
> >
> > Even when the value shouldn't be subscriptable _at all_, the invalid
> > assignment is ignored silently. But since the patch follows some of
> > JavaScript's more idiosyncratic leads in other respects (e.g. padding
> > out arrays with nulls when something is inserted at a higher subscript),
> > the current behavior makes at least as much sense as JavaScript's
> > canonical behavior.
> >
> > There's also the bulk update case to think about. An error makes sense
> > when there's only one tuple being affected at a time, but with 1000
> > tuples, should a few no-ops where the JSON turns out to be a structural
> > mismatch stop the rest from changing correctly? What's the alternative?
> > The only answer I've got is double-checking the structure in the WHERE
> > clause, which seems like a lot of effort to go to for something that's
> > supposed to make working with JSON easier.
> >
> > Changing the surrounding structure (e.g. turning a['a'] into an array)
> > seems much more surprising than the no-op, and more likely to have
> > unforeseen consequences in client code working with the JSON. Ignoring
> > invalid assignments -- like JavaScript itself -- seems like the best
> > solution to me.
> >
>
> We don't need 100% compatibility in possible buggy behaviour.
>
> I very much disliked the situation when the system reports ok, but the
> operation was ignored. It is pretty hard to identify bugs in this
> system.
>
> What can be the benefit or use case for this behavior? JavaScript is
> designed for use in web browsers - and a lot of technologies there are
> fault tolerant. But this is a database. I would like to know about all
> the
> errors there.

I'm thinking of the update path as a kind of implicit schema. JSON is
intentionally not bound to any schema on creation, so I don't see a
failure to enforce another schema at runtime (and outside the WHERE
clause, at that) as an error exactly.

But I looked into the bulk case a little further, and "outside the
WHERE clause" cuts both ways. The server reports an update whether or
not the JSON could have been modified, which suggests triggers will
fire for no-op updates. That's more clearly a problem.

insert into j (val) values
 ('{"a": 100}'),
 ('{"a": "200"}'),
 ('{"b": "300"}'),
 ('{"c": {"d": 400}}'),
 ('{"a": {"z": 500}}');

INSERT 0 5
update j set val['a']['z'] = '600' returning *;
val 

 {"a": 100}
 {"a": "200"}
 {"a": {"z": 600}, "b": "300"}
 {"a": {"z": 600}, "c": {"d": 400}}
 {"a": {"z": 600}}
(5 rows)

*UPDATE 5*




Re: [HACKERS] [PATCH] Generic type subscripting

2021-01-09 Thread Dian M Fay
On Thu Jan 7, 2021 at 3:24 AM EST, Pavel Stehule wrote:
> čt 7. 1. 2021 v 9:15 odesílatel Dmitry Dolgov <9erthali...@gmail.com>
> napsal:
>
> > > On Wed, Jan 06, 2021 at 09:22:53PM +0100, Pavel Stehule wrote:
> > >
> > > this case should to raise exception - the value should be changed or
> > error
> > > should be raised
> > >
> > > postgres=# insert into foo values('{}');
> > > postgres=# update foo set a['a'] = '100';
> > > postgres=# update foo set a['a'][1] = '-1';
> > > postgres=# select * from foo;
> > > ┌┐
> > > │ a  │
> > > ╞╡
> > > │ {"a": 100} │
> > > └┘
> >
> > I was expecting this question, as I've left this like that intentionally
> > because of two reasons:
> >
> > * Opposite to other changes, to implement this one we need to introduce
> >   a condition more interfering with normal processing, which raises
> >   performance issues for already existing functionality in jsonb_set.
> >
> > * I vaguely recall there was a similar discussion about jsonb_set with
> >   the similar solution.
> >
>
> ok.
>
> In this case I have a strong opinion so current behavior is wrong. It
> can
> mask errors. There are two correct possibilities
>
> 1. raising error - because the update try to apply index on scalar value
>
> 2. replace by array - a = {NULL, -1}
>
> But isn't possible ignore assignment
>
> What do people think about it?

I've been following this thread looking forward to the feature and was
all set to come in on the side of raising an exception here, but then I
tried it in a JS REPL:

; a = {}
{}
; a['a'] = '100'
'100'
; a['a'][1] = -1
-1
; a
{ a: '100' }

; b = {}
{}
; b['b'] = 100
100
; b['b'][1] = -1
-1
; b
{ b: 100 }

Even when the value shouldn't be subscriptable _at all_, the invalid
assignment is ignored silently. But since the patch follows some of
JavaScript's more idiosyncratic leads in other respects (e.g. padding
out arrays with nulls when something is inserted at a higher subscript),
the current behavior makes at least as much sense as JavaScript's
canonical behavior.

There's also the bulk update case to think about. An error makes sense
when there's only one tuple being affected at a time, but with 1000
tuples, should a few no-ops where the JSON turns out to be a structural
mismatch stop the rest from changing correctly? What's the alternative?
The only answer I've got is double-checking the structure in the WHERE
clause, which seems like a lot of effort to go to for something that's
supposed to make working with JSON easier.

Changing the surrounding structure (e.g. turning a['a'] into an array)
seems much more surprising than the no-op, and more likely to have
unforeseen consequences in client code working with the JSON. Ignoring
invalid assignments -- like JavaScript itself -- seems like the best
solution to me.