Re: [HACKERS] row_to_json bug with index only scans: empty keys!
Robert Haas writes: > On Mon, Nov 10, 2014 at 10:11 AM, Tom Lane wrote: >> BTW, has anyone got an opinion about whether to stick the full fix into >> 9.4? > I think we should put the full fix in 9.4. Since nobody spoke against that, I've put the full fix into HEAD and 9.4, and the restricted version into 9.3 and 9.2. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] row_to_json bug with index only scans: empty keys!
On 11/10/2014 10:19 AM, Robert Haas wrote: On Mon, Nov 10, 2014 at 10:11 AM, Tom Lane wrote: I wrote: Attached are patches meant for HEAD and 9.2-9.4 respectively. BTW, has anyone got an opinion about whether to stick the full fix into 9.4? The argument for, of course, is that we'd get the full fix out to the public a year sooner. The argument against is that someone might have already done compatibility testing of their application against 9.4 betas, and then get blindsided if we change this before release. I'm inclined to think that since we expect json/jsonb usage to really take off with 9.4, it might be better if we get row_to_json behaving unsurprisingly now. But I'm not dead set on it. I think we should put the full fix in 9.4. +1 cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] row_to_json bug with index only scans: empty keys!
Yes, it's late beta, but especially if we're pushing json/jsonb usage as a major feature of this release, I'd say remove surprising cases now. It's late beta, but that's what beta is for, the last chance for bug fixes, before we live w/ it for the support life of the release. The affected class are people with an app that already uses json, so 9.2 or better, have ran acceptance tests against an early beta of 9.4, and have SQL that returns madeup fieldnames instead of the appropriate aliases? Which they were probably annoyed at when they wrote the code that consumes that output in the first place. I think an item in the list of compatability changes/gotchas should catch anyone who is that on the ball as to be testing the betas anyway. Anyone pushing that envelope is going to do the same acceptance testing against the actual release before rolling to production. Ross On Mon, Nov 10, 2014 at 10:11:25AM -0500, Tom Lane wrote: > I wrote: > > Attached are patches meant for HEAD and 9.2-9.4 respectively. > > BTW, has anyone got an opinion about whether to stick the full fix into > 9.4? The argument for, of course, is that we'd get the full fix out to > the public a year sooner. The argument against is that someone might > have already done compatibility testing of their application against > 9.4 betas, and then get blindsided if we change this before release. > > I'm inclined to think that since we expect json/jsonb usage to really > take off with 9.4, it might be better if we get row_to_json behaving > unsurprisingly now. But I'm not dead set on it. > > regards, tom lane > -- Ross Reedstrom, Ph.D. reeds...@rice.edu Systems Engineer & Admin, Research Scientistphone: 713-348-6166 Connexions http://cnx.orgfax: 713-348-3665 Rice University MS-375, Houston, TX 77005 GPG Key fingerprint = F023 82C8 9B0E 2CC6 0D8E F888 D3AE 810E 88F0 BEDE -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] row_to_json bug with index only scans: empty keys!
Tom Lane-2 wrote > I wrote: >> Attached are patches meant for HEAD and 9.2-9.4 respectively. > > BTW, has anyone got an opinion about whether to stick the full fix into > 9.4? The argument for, of course, is that we'd get the full fix out to > the public a year sooner. The argument against is that someone might > have already done compatibility testing of their application against > 9.4 betas, and then get blindsided if we change this before release. > > I'm inclined to think that since we expect json/jsonb usage to really > take off with 9.4, it might be better if we get row_to_json behaving > unsurprisingly now. But I'm not dead set on it. I am not one of those people who would be blindsided but I'd much rather inconvenience our beta testers in order to get a better product in place for the masses. Beta testers read release notes and should be able to identify and fix any problematic areas of their code while simultaneously being happy for the improvements. David J. -- View this message in context: http://postgresql.1045698.n5.nabble.com/row-to-json-bug-with-index-only-scans-empty-keys-tp5826070p5826338.html Sent from the PostgreSQL - hackers mailing list archive at Nabble.com. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] row_to_json bug with index only scans: empty keys!
On Mon, Nov 10, 2014 at 10:11 AM, Tom Lane wrote: > I wrote: >> Attached are patches meant for HEAD and 9.2-9.4 respectively. > > BTW, has anyone got an opinion about whether to stick the full fix into > 9.4? The argument for, of course, is that we'd get the full fix out to > the public a year sooner. The argument against is that someone might > have already done compatibility testing of their application against > 9.4 betas, and then get blindsided if we change this before release. > > I'm inclined to think that since we expect json/jsonb usage to really > take off with 9.4, it might be better if we get row_to_json behaving > unsurprisingly now. But I'm not dead set on it. I think we should put the full fix in 9.4. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] row_to_json bug with index only scans: empty keys!
I wrote: > Attached are patches meant for HEAD and 9.2-9.4 respectively. BTW, has anyone got an opinion about whether to stick the full fix into 9.4? The argument for, of course, is that we'd get the full fix out to the public a year sooner. The argument against is that someone might have already done compatibility testing of their application against 9.4 betas, and then get blindsided if we change this before release. I'm inclined to think that since we expect json/jsonb usage to really take off with 9.4, it might be better if we get row_to_json behaving unsurprisingly now. But I'm not dead set on it. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] row_to_json bug with index only scans: empty keys!
I wrote: > We could reduce the risks involved by narrowing the cases in which > ExecEvalWholeRowVar will replace field names it got from the input. > I'd be inclined to propose: > 1. If Var is of a named composite type, use *exactly* the field names > associated with that type. (This avoids the need to possibly produce > RECORD outputs from a named-type Var, thus removing the Assert-weakening > issue.) > 2. If Var is of type RECORD, replace only empty field names with aliases > from the RTE. (This might sound inconsistent --- could you end up with > some names coming from point A and some from point B? --- but in practice > I think it would always be all-or-nothing, because the issue is whether > or not the planner bothered to attach column names to a lower-level > targetlist.) Attached are patches meant for HEAD and 9.2-9.4 respectively. The HEAD patch extends the prior patch to fix the RowExpr case I mentioned yesterday. The back-branch patch works as suggested above. I added a bunch of regression tests that document behavior in this area. The two patches include the same set of tests but have different expected output for the cases we are intentionally not fixing in back branches. If you try the back-branch regression tests on an unpatched backend, you can verify that the only cases that change behavior are ones where current sources put out empty field names. The test cases use row_to_json() so they could not be used directly before 9.2. I tried the same cases using hstore(record) in 9.1. While 9.1 does some pretty darn dubious things, it does not produce empty field names in any of these cases, so I think we probably should not consider back-patching further than 9.2. regards, tom lane diff --git a/src/backend/executor/execQual.c b/src/backend/executor/execQual.c index 7cfa63f..88af735 100644 *** a/src/backend/executor/execQual.c --- b/src/backend/executor/execQual.c *** *** 50,55 --- 50,56 #include "nodes/nodeFuncs.h" #include "optimizer/planner.h" #include "parser/parse_coerce.h" + #include "parser/parsetree.h" #include "pgstat.h" #include "utils/acl.h" #include "utils/builtins.h" *** ExecEvalWholeRowVar(WholeRowVarExprState *** 712,717 --- 713,720 { Var *variable = (Var *) wrvstate->xprstate.expr; TupleTableSlot *slot; + TupleDesc output_tupdesc; + MemoryContext oldcontext; bool needslow = false; if (isDone) *** ExecEvalWholeRowVar(WholeRowVarExprState *** 787,794 /* If so, build the junkfilter in the query memory context */ if (junk_filter_needed) { - MemoryContext oldcontext; - oldcontext = MemoryContextSwitchTo(econtext->ecxt_per_query_memory); wrvstate->wrv_junkFilter = ExecInitJunkFilter(subplan->plan->targetlist, --- 790,795 *** ExecEvalWholeRowVar(WholeRowVarExprState *** 860,869 needslow = true; /* need runtime check for null */ } ReleaseTupleDesc(var_tupdesc); } ! /* Skip the checking on future executions of node */ if (needslow) wrvstate->xprstate.evalfunc = (ExprStateEvalFunc) ExecEvalWholeRowSlow; else --- 861,920 needslow = true; /* need runtime check for null */ } + /* + * Use the variable's declared rowtype as the descriptor for the + * output values, modulo possibly assigning new column names below. In + * particular, we *must* absorb any attisdropped markings. + */ + oldcontext = MemoryContextSwitchTo(econtext->ecxt_per_query_memory); + output_tupdesc = CreateTupleDescCopy(var_tupdesc); + MemoryContextSwitchTo(oldcontext); + ReleaseTupleDesc(var_tupdesc); } + else + { + /* + * In the RECORD case, we use the input slot's rowtype as the + * descriptor for the output values, modulo possibly assigning new + * column names below. + */ + oldcontext = MemoryContextSwitchTo(econtext->ecxt_per_query_memory); + output_tupdesc = CreateTupleDescCopy(slot->tts_tupleDescriptor); + MemoryContextSwitchTo(oldcontext); + } ! /* ! * Construct a tuple descriptor for the composite values we'll produce, ! * and make sure its record type is "blessed". The main reason to do this ! * is to be sure that operations such as row_to_json() will see the ! * desired column names when they look up the descriptor from the type ! * information embedded in the composite values. ! * ! * We already got the correct physical datatype info above, but now we ! * should try to find the source RTE and adopt its column aliases, in case ! * they are different from the original rowtype's names. For example, in ! * "SELECT foo(t) FROM tab t(x,y)", the first two columns in the composite ! * output should be named "x" and "y" regardless of tab's column names. ! * ! * If we can't locate the RTE, assume the column names we've got are OK. ! * (As of this writing, the only cases where we can't locate the RT
Re: [HACKERS] row_to_json bug with index only scans: empty keys!
On 11/08/2014 04:19 PM, Ross Reedstrom wrote: I've no opinion on the not respecting aliases aspect of this, but both the hstore and json emtpy keys case breaks the format: it's duplicate keys that collapse to a single value, and expected keynames are missing. The insidious bit about this bug though is that it works fine during testing with the developers typically small data sets. It's only triggered in my case when we the plan switches to index-only. Even an index scan works fine. I can't imagine that there is code out there that _depends_ on this behavior. Just as likely to me are that there exist systems that just have "can't reproduce" bugs that would be fixed by this. No, I can't imagine it either - it's utterly broken. That's the piece that Tom is proposing to fix on the back branches. AIUI. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] row_to_json bug with index only scans: empty keys!
I've no opinion on the not respecting aliases aspect of this, but both the hstore and json emtpy keys case breaks the format: it's duplicate keys that collapse to a single value, and expected keynames are missing. The insidious bit about this bug though is that it works fine during testing with the developers typically small data sets. It's only triggered in my case when we the plan switches to index-only. Even an index scan works fine. I can't imagine that there is code out there that _depends_ on this behavior. Just as likely to me are that there exist systems that just have "can't reproduce" bugs that would be fixed by this. Ross On Sat, Nov 08, 2014 at 01:09:23PM -0500, Tom Lane wrote: > Oh, some more fun: a RowExpr that's labeled with a named composite type > (rather than RECORD) has the same issue of not respecting aliases. > Continuing previous example with table "foo": > > regression=# create view vv as select * from foo; > CREATE VIEW > regression=# select row_to_json(q) from vv q; >row_to_json > - > {"f1":1,"f2":2} > (1 row) > > regression=# select row_to_json(q) from vv q(a,b); >row_to_json > - > {"f1":1,"f2":2} > (1 row) > > So that's another case we probably want to change in HEAD but not the back > branches. > > regards, tom lane > -- Ross Reedstrom, Ph.D. reeds...@rice.edu Systems Engineer & Admin, Research Scientistphone: 713-348-6166 Connexions http://cnx.orgfax: 713-348-3665 Rice University MS-375, Houston, TX 77005 GPG Key fingerprint = F023 82C8 9B0E 2CC6 0D8E F888 D3AE 810E 88F0 BEDE -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] row_to_json bug with index only scans: empty keys!
Oh, some more fun: a RowExpr that's labeled with a named composite type (rather than RECORD) has the same issue of not respecting aliases. Continuing previous example with table "foo": regression=# create view vv as select * from foo; CREATE VIEW regression=# select row_to_json(q) from vv q; row_to_json - {"f1":1,"f2":2} (1 row) regression=# select row_to_json(q) from vv q(a,b); row_to_json - {"f1":1,"f2":2} (1 row) So that's another case we probably want to change in HEAD but not the back branches. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] row_to_json bug with index only scans: empty keys!
On 11/08/2014 12:40 PM, Tom Lane wrote: Andrew Dunstan writes: On 11/08/2014 12:14 PM, Tom Lane wrote: That would be my druthers. But given the lack of complaints, maybe we should just stick to the more-backwards-compatible behavior until someone does complain. Thoughts? Wouldn't that would mean we might not pick up the expected aliases in select row_to_json(q) from (select a,b from foo) as q(x,y) ? If so, I'm definitely in favor of fixing this now. Well, it's inconsistent now. In existing releases you might or might not get x,y: regression=# create table foo (f1 int, f2 int); CREATE TABLE regression=# insert into foo values(1,2); INSERT 0 1 regression=# select row_to_json(q) from (select f1 as a, f2 as b from foo) as q(x,y); row_to_json --- {"x":1,"y":2} (1 row) regression=# select row_to_json(q) from (select f1 as a, f2 as b from foo offset 0) as q(x,y); row_to_json - {"f1":1,"f2":2} (1 row) That seems like something that's worth fixing going forward, but it's a lot harder to make the case that we should change it in back branches. Sure. As long as we fix the empty alias issue in the back branches, just fixing this prospectively seems fine. But I don't think we should wait for more complaints. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] row_to_json bug with index only scans: empty keys!
Andrew Dunstan writes: > On 11/08/2014 12:14 PM, Tom Lane wrote: >> That would be my druthers. But given the lack of complaints, maybe we >> should just stick to the more-backwards-compatible behavior until someone >> does complain. Thoughts? > Wouldn't that would mean we might not pick up the expected aliases in > select row_to_json(q) from (select a,b from foo) as q(x,y) > ? If so, I'm definitely in favor of fixing this now. Well, it's inconsistent now. In existing releases you might or might not get x,y: regression=# create table foo (f1 int, f2 int); CREATE TABLE regression=# insert into foo values(1,2); INSERT 0 1 regression=# select row_to_json(q) from (select f1 as a, f2 as b from foo) as q(x,y); row_to_json --- {"x":1,"y":2} (1 row) regression=# select row_to_json(q) from (select f1 as a, f2 as b from foo offset 0) as q(x,y); row_to_json - {"f1":1,"f2":2} (1 row) That seems like something that's worth fixing going forward, but it's a lot harder to make the case that we should change it in back branches. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] row_to_json bug with index only scans: empty keys!
On 11/08/2014 12:14 PM, Tom Lane wrote: I assume that's what you would propose for just the stable branches, and that going forward we'd always use the aliases from the RTE? That would be my druthers. But given the lack of complaints, maybe we should just stick to the more-backwards-compatible behavior until someone does complain. Thoughts? Wouldn't that would mean we might not pick up the expected aliases in select row_to_json(q) from (select a,b from foo) as q(x,y) ? If so, I'm definitely in favor of fixing this now. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] row_to_json bug with index only scans: empty keys!
Tom Lane > Andrew Dunstan writes: >> I assume that's what you would propose for just the stable branches, and >> that going forward we'd always use the aliases from the RTE? > > That would be my druthers. But given the lack of complaints, maybe we > should just stick to the more-backwards-compatible behavior until someone > does complain. Thoughts? I think consistent use of the aliases would be less surprising and should be what we do on master. I agree that we should avoid breaking anything that might be working as intended on stable branches. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] row_to_json bug with index only scans: empty keys!
Andrew Dunstan writes: > On 11/08/2014 11:24 AM, Tom Lane wrote: >> That seems like a pretty silly move: it wouldn't actually fix anything, >> and it would break cases that perhaps are acceptable to users today. > What evidence do you have that it might be acceptable to today's users? > The only evidence we have that I know of is Ross' complaint that > indicates that it's not acceptable. The fact that we've only gotten two bug reports in however many years the failure has been possible might mean that few people encounter the case, or it might mean that it doesn't break things for their usages so badly that they need to complain. If the latter, throwing an error rather than fixing it is not going to be an improvement. > I assume that's what you would propose for just the stable branches, and > that going forward we'd always use the aliases from the RTE? That would be my druthers. But given the lack of complaints, maybe we should just stick to the more-backwards-compatible behavior until someone does complain. Thoughts? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] row_to_json bug with index only scans: empty keys!
On 11/08/2014 11:24 AM, Tom Lane wrote: Andrew Dunstan writes: On 11/08/2014 09:26 AM, Robert Haas wrote: I'm not sure whether this is safe enough to back-patch, but it seems like we should probably plan to back-patch *something*, because the status quo isn't great either. I confirm that Tom's patch does indeed fix my test case that produces empty field names. We should probably not backpatch it, as it is a behaviour change. However, I do think we should add checks in composite_to_json and hstore_from_record for empty field names, and error out if they are found. That seems like a pretty silly move: it wouldn't actually fix anything, and it would break cases that perhaps are acceptable to users today. What evidence do you have that it might be acceptable to today's users? The only evidence we have that I know of is Ross' complaint that indicates that it's not acceptable. However, We could reduce the risks involved by narrowing the cases in which ExecEvalWholeRowVar will replace field names it got from the input. I'd be inclined to propose: 1. If Var is of a named composite type, use *exactly* the field names associated with that type. (This avoids the need to possibly produce RECORD outputs from a named-type Var, thus removing the Assert-weakening issue.) 2. If Var is of type RECORD, replace only empty field names with aliases from the RTE. (This might sound inconsistent --- could you end up with some names coming from point A and some from point B? --- but in practice I think it would always be all-or-nothing, because the issue is whether or not the planner bothered to attach column names to a lower-level targetlist.) I assume that's what you would propose for just the stable branches, and that going forward we'd always use the aliases from the RTE? Or maybe I'm not quite understanding enough which cases will be covered. To the extent that I do this sounds OK. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] row_to_json bug with index only scans: empty keys!
Andrew Dunstan writes: > On 11/08/2014 09:26 AM, Robert Haas wrote: >> I'm not sure whether this is safe enough to back-patch, but it seems >> like we should probably plan to back-patch *something*, because the >> status quo isn't great either. > I confirm that Tom's patch does indeed fix my test case that produces > empty field names. > We should probably not backpatch it, as it is a behaviour change. > However, I do think we should add checks in composite_to_json and > hstore_from_record for empty field names, and error out if they are > found. That seems like a pretty silly move: it wouldn't actually fix anything, and it would break cases that perhaps are acceptable to users today. We could reduce the risks involved by narrowing the cases in which ExecEvalWholeRowVar will replace field names it got from the input. I'd be inclined to propose: 1. If Var is of a named composite type, use *exactly* the field names associated with that type. (This avoids the need to possibly produce RECORD outputs from a named-type Var, thus removing the Assert-weakening issue.) 2. If Var is of type RECORD, replace only empty field names with aliases from the RTE. (This might sound inconsistent --- could you end up with some names coming from point A and some from point B? --- but in practice I think it would always be all-or-nothing, because the issue is whether or not the planner bothered to attach column names to a lower-level targetlist.) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] row_to_json bug with index only scans: empty keys!
On 11/08/2014 09:26 AM, Robert Haas wrote: On Fri, Nov 7, 2014 at 11:27 PM, Tom Lane wrote: Thoughts? I'm not sure whether this is safe enough to back-patch, but it seems like we should probably plan to back-patch *something*, because the status quo isn't great either. I confirm that Tom's patch does indeed fix my test case that produces empty field names. We should probably not backpatch it, as it is a behaviour change. However, I do think we should add checks in composite_to_json and hstore_from_record for empty field names, and error out if they are found. That seems like an outright bug which we should defend against, including in the back branches. Giving back json or hstore objects with made up names like f1 is one thing, giving them back with empty names (which, in the hstore case will collapse everything to a single field) is far worse. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] row_to_json bug with index only scans: empty keys!
On Fri, Nov 7, 2014 at 11:27 PM, Tom Lane wrote: > Thoughts? I'm not sure whether this is safe enough to back-patch, but it seems like we should probably plan to back-patch *something*, because the status quo isn't great either. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] row_to_json bug with index only scans: empty keys!
Andrew Dunstan writes: > On 11/07/2014 04:59 PM, Tom Lane wrote: >> I think this is probably a variant of bug #11210, in which the problem is >> that tupledescs bubbled up from inheritance children never get column >> names assigned to them. I've been speculating about ways to fix that >> but I've not thought of anything that's not kinda painful. > Yeah, I've been running this down a bit with a debugger, and it looked > kinda like that. After chewing on this for awhile, it occurred to me that the problem could be solved in ExecEvalWholeRowVar, which is really the only place that produces composite datums "from scratch". We can look up the RTE that the whole-row Var references and scrape column aliases from it. The attached draft patch does that, and appears to fix both Ross' example and the one in bug #11210. Although this patch doesn't change any existing regression test outputs as far as I can find, it's not hard to think of cases where it will change the results. In particular: regression=# create table foo (f1 int, f2 int); CREATE TABLE regression=# insert into foo values(1,2); INSERT 0 1 regression=# select row_to_json(f) from foo f; row_to_json - {"f1":1,"f2":2} (1 row) regression=# select row_to_json(f) from foo f(x,y); row_to_json --- {"x":1,"y":2} (1 row) Without this patch, you get the same result from the second query as from the first. I argue that using the column aliases is clearly saner behavior; I think there have been complaints about that in the past, but am too lazy to trawl the archives right now. However, it's also clear that people might have applications that are expecting the old behavior and don't realize that it might be thought a bug. I'm hesitant to back-patch this anyway, because I'm not sure that there aren't any other unexpected side-effects. Note in particular the assertion I had to weaken in ExecEvalConvertRowtype because a whole-row Var is now capable of returning tuple datums that are marked with a different rowtype than the parser marked the Var with. That's a little bit scary ... On the positive side, this lets us remove some kluges, like the place in nodeFunctionscan.c where we were hacking the output tupledesc's column names in pretty much the same way this patch has ExecEvalWholeRowVar do. I thought that was a crock when it went in, because no other plan node is doing anything like that, so it's good to be able to remove it. Thoughts? regards, tom lane diff --git a/src/backend/executor/execQual.c b/src/backend/executor/execQual.c index 7cfa63f..3ab460c 100644 *** a/src/backend/executor/execQual.c --- b/src/backend/executor/execQual.c *** *** 50,55 --- 50,56 #include "nodes/nodeFuncs.h" #include "optimizer/planner.h" #include "parser/parse_coerce.h" + #include "parser/parsetree.h" #include "pgstat.h" #include "utils/acl.h" #include "utils/builtins.h" *** ExecEvalWholeRowVar(WholeRowVarExprState *** 712,717 --- 713,720 { Var *variable = (Var *) wrvstate->xprstate.expr; TupleTableSlot *slot; + TupleDesc output_tupdesc; + MemoryContext oldcontext; bool needslow = false; if (isDone) *** ExecEvalWholeRowVar(WholeRowVarExprState *** 787,794 /* If so, build the junkfilter in the query memory context */ if (junk_filter_needed) { - MemoryContext oldcontext; - oldcontext = MemoryContextSwitchTo(econtext->ecxt_per_query_memory); wrvstate->wrv_junkFilter = ExecInitJunkFilter(subplan->plan->targetlist, --- 790,795 *** ExecEvalWholeRowVar(WholeRowVarExprState *** 860,869 needslow = true; /* need runtime check for null */ } ReleaseTupleDesc(var_tupdesc); } ! /* Skip the checking on future executions of node */ if (needslow) wrvstate->xprstate.evalfunc = (ExprStateEvalFunc) ExecEvalWholeRowSlow; else --- 861,946 needslow = true; /* need runtime check for null */ } + /* + * Use the variable's declared rowtype as the descriptor for the + * output values, modulo possibly assigning new column names below. In + * particular, we *must* absorb any attisdropped markings. + */ + oldcontext = MemoryContextSwitchTo(econtext->ecxt_per_query_memory); + output_tupdesc = CreateTupleDescCopy(var_tupdesc); + MemoryContextSwitchTo(oldcontext); + ReleaseTupleDesc(var_tupdesc); } + else + { + /* + * In the RECORD case, we use the input slot's rowtype as the + * descriptor for the output values, modulo possibly assigning new + * column names below. + */ + oldcontext = MemoryContextSwitchTo(econtext->ecxt_per_query_memory); + output_tupdesc = CreateTupleDescCopy(slot->tts_tupleDescriptor); + MemoryContextSwitchTo(oldcontext); + } ! /* ! * Construct a tuple descriptor for the composite values we'll produce, ! * and make sure its record type is "blessed". The
Re: [HACKERS] row_to_json bug with index only scans: empty keys!
On 11/07/2014 04:59 PM, Tom Lane wrote: Andrew Dunstan writes: On 11/07/2014 10:51 AM, Ross Reedstrom wrote: row_to_json() yields empty strings for json keys if the data is fulfilled by an index only scan. Could this be a bug in lookup_rowtype_tupdesc()? I think this is probably a variant of bug #11210, in which the problem is that tupledescs bubbled up from inheritance children never get column names assigned to them. I've been speculating about ways to fix that but I've not thought of anything that's not kinda painful. Yeah, I've been running this down a bit with a debugger, and it looked kinda like that. Maybe we should look for this in places we know it matters (e.g. hstore_from_record() and composite_to_json() and raise an error if we find empty names, with a hint to do (what?). cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] row_to_json bug with index only scans: empty keys!
Andrew Dunstan writes: > On 11/07/2014 10:51 AM, Ross Reedstrom wrote: >> row_to_json() yields empty strings for json keys if the data is >> fulfilled by an index only scan. > Could this be a bug in lookup_rowtype_tupdesc()? I think this is probably a variant of bug #11210, in which the problem is that tupledescs bubbled up from inheritance children never get column names assigned to them. I've been speculating about ways to fix that but I've not thought of anything that's not kinda painful. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] row_to_json bug with index only scans: empty keys!
On 11/07/2014 11:17 AM, Andrew Dunstan wrote: On 11/07/2014 10:51 AM, Ross Reedstrom wrote: This is a serious bug in 9.3.5 and 9.4 beta3: row_to_json() yields empty strings for json keys if the data is fulfilled by an index only scan. Example: testjson=# select count(*) from document_acl; count --- 426 (1 row) testjson=# SELECT row_to_json(combined_rows) FROM ( SELECT uuid, user_id AS uid, permission FROM document_acl_text AS acl WHERE uuid = '8f774048-8936-4d7f-aa38-1974c91bbef2' ORDER BY user_id ASC, permission ASC ) as combined_rows; row_to_json - {"":"8f774048-8936-4d7f-aa38-1974c91bbef2","":"admin","":"publish"} That seems odd. Here's what the relevant code does: td = DatumGetHeapTupleHeader(composite); /* Extract rowtype info and find a tupdesc */ tupType = HeapTupleHeaderGetTypeId(td); tupTypmod = HeapTupleHeaderGetTypMod(td); tupdesc = lookup_rowtype_tupdesc(tupType, tupTypmod); ... for (i = 0; i < tupdesc->natts; i++) ... attname = NameStr(tupdesc->attrs[i]->attname); escape_json(result, attname); Could this be a bug in lookup_rowtype_tupdesc()? Further data point: There's nothing json-specific about this, BTW: andrew=# select hstore(q) from (select * from idxo order by a) q; hstore - ""=>"1" (1 row) andrew=# set enable_seqscan = true; SET andrew=# select hstore(q) from (select * from idxo order by a) q; hstore -- "a"=>"1", "b"=>"b", "c"=>"c" (1 row) So it looks like the index scan only stuff is broken somewhere. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] row_to_json bug with index only scans: empty keys!
On 11/07/2014 10:51 AM, Ross Reedstrom wrote: This is a serious bug in 9.3.5 and 9.4 beta3: row_to_json() yields empty strings for json keys if the data is fulfilled by an index only scan. Example: testjson=# select count(*) from document_acl; count --- 426 (1 row) testjson=# SELECT row_to_json(combined_rows) FROM ( SELECT uuid, user_id AS uid, permission FROM document_acl_text AS acl WHERE uuid = '8f774048-8936-4d7f-aa38-1974c91bbef2' ORDER BY user_id ASC, permission ASC ) as combined_rows; row_to_json - {"":"8f774048-8936-4d7f-aa38-1974c91bbef2","":"admin","":"publish"} That seems odd. Here's what the relevant code does: td = DatumGetHeapTupleHeader(composite); /* Extract rowtype info and find a tupdesc */ tupType = HeapTupleHeaderGetTypeId(td); tupTypmod = HeapTupleHeaderGetTypMod(td); tupdesc = lookup_rowtype_tupdesc(tupType, tupTypmod); ... for (i = 0; i < tupdesc->natts; i++) ... attname = NameStr(tupdesc->attrs[i]->attname); escape_json(result, attname); Could this be a bug in lookup_rowtype_tupdesc()? cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] row_to_json bug with index only scans: empty keys!
This is a serious bug in 9.3.5 and 9.4 beta3: row_to_json() yields empty strings for json keys if the data is fulfilled by an index only scan. Example: testjson=# select count(*) from document_acl; count --- 426 (1 row) testjson=# SELECT row_to_json(combined_rows) FROM ( SELECT uuid, user_id AS uid, permission FROM document_acl_text AS acl WHERE uuid = '8f774048-8936-4d7f-aa38-1974c91bbef2' ORDER BY user_id ASC, permission ASC ) as combined_rows; row_to_json - {"":"8f774048-8936-4d7f-aa38-1974c91bbef2","":"admin","":"publish"} (1 row) testjson=# explain SELECT row_to_json(combined_rows) FROM ( SELECT uuid, user_id AS uid, permission FROM document_acl_text AS acl WHERE uuid = '8f774048-8936-4d7f-aa38-1974c91bbef2' ORDER BY user_id ASC, permission ASC ) as combined_rows; QUERY PLAN Subquery Scan on combined_rows (cost=0.27..8.30 rows=1 width=76) -> Index Only Scan using document_acl_text_pkey on document_acl_text acl (cost=0.27..8.29 rows=1 width=52) Index Cond: (uuid = '8f774048-8936-4d7f-aa38-1974c91bbef2'::text) Planning time: 0.093 ms (4 rows) # set enable_indexonlyscan to off; SET testjson=# SELECT row_to_json(combined_rows) FROM ( SELECT uuid, user_id AS uid, permission FROM document_acl_text AS acl WHERE uuid = '8f774048-8936-4d7f-aa38-1974c91bbef2' ORDER BY user_id ASC, permission ASC ) as combined_rows; row_to_json -- {"uuid":"8f774048-8936-4d7f-aa38-1974c91bbef2","user_id":"admin","permission":"publish"} (1 row) tjson=# explain SELECT row_to_json(combined_rows) FROM ( SELECT uuid, user_id AS uid, permission FROM document_acl_text AS acl WHERE uuid = '8f774048-8936-4d7f-aa38-1974c91bbef2' ORDER BY user_id ASC, permission ASC ) as combined_rows; QUERY PLAN --- Subquery Scan on combined_rows (cost=0.27..8.30 rows=1 width=76) -> Index Scan using document_acl_text_pkey on document_acl_text acl (cost=0.27..8.29 rows=1 width=52) Index Cond: (uuid = '8f774048-8936-4d7f-aa38-1974c91bbef2'::text) Planning time: 0.095 ms (4 rows) We have a table defined as so: CREATE TYPE permission_type AS ENUM ( 'publish' ); create table "document_acl" ( "uuid" UUID, "user_id" TEXT, "permission" permission_type NOT NULL, PRIMARY KEY ("uuid", "user_id", "permission"), FOREIGN KEY ("uuid") REFERENCES document_controls ("uuid") ); The uuid and enums make no difference - I've made an all text version as well, same problem. testjson=# select version(); version - PostgreSQL 9.4beta3 on x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu 4.8.2-19ubuntu1) 4.8.2, 64-bit (1 row) Ross -- Ross Reedstrom, Ph.D. reeds...@rice.edu Systems Engineer & Admin, Research Scientistphone: 713-348-6166 Connexions http://cnx.orgfax: 713-348-3665 Rice University MS-375, Houston, TX 77005 GPG Key fingerprint = F023 82C8 9B0E 2CC6 0D8E F888 D3AE 810E 88F0 BEDE -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] row_to_json() Bug
On Feb 23, 2012, at 8:49 PM, Andrew Dunstan wrote: > Fixed, Thanks for the report. (Also fixed in my 9.1 backport). Awesome, thanks, will try it tomorrow. David smime.p7s Description: S/MIME cryptographic signature
Re: [HACKERS] row_to_json() Bug
On 02/23/2012 09:09 PM, Andrew Dunstan wrote: On 02/23/2012 08:35 PM, David E. Wheeler wrote: Looks like row_to_json() thinks 0s are nulls: postgres=# select row(0); row - (0) (1 row) postgres=# SELECT row_to_json(row(0)); row_to_json - {"f1":null} (1 row) Yeah, ouch, will fix. Fixed, Thanks for the report. (Also fixed in my 9.1 backport). cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] row_to_json() Bug
On 02/23/2012 08:35 PM, David E. Wheeler wrote: Looks like row_to_json() thinks 0s are nulls: postgres=# select row(0); row - (0) (1 row) postgres=# SELECT row_to_json(row(0)); row_to_json - {"f1":null} (1 row) Yeah, ouch, will fix. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] row_to_json() Bug
Looks like row_to_json() thinks 0s are nulls: postgres=# select row(0); row - (0) (1 row) postgres=# SELECT row_to_json(row(0)); row_to_json - {"f1":null} (1 row) Best, David -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers