Re: [HACKERS] row_to_json bug with index only scans: empty keys!

2014-11-10 Thread Tom Lane
Robert Haas  writes:
> On Mon, Nov 10, 2014 at 10:11 AM, Tom Lane  wrote:
>> BTW, has anyone got an opinion about whether to stick the full fix into
>> 9.4?

> I think we should put the full fix in 9.4.

Since nobody spoke against that, I've put the full fix into HEAD and 9.4,
and the restricted version into 9.3 and 9.2.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] row_to_json bug with index only scans: empty keys!

2014-11-10 Thread Andrew Dunstan


On 11/10/2014 10:19 AM, Robert Haas wrote:

On Mon, Nov 10, 2014 at 10:11 AM, Tom Lane  wrote:

I wrote:

Attached are patches meant for HEAD and 9.2-9.4 respectively.

BTW, has anyone got an opinion about whether to stick the full fix into
9.4?  The argument for, of course, is that we'd get the full fix out to
the public a year sooner.  The argument against is that someone might
have already done compatibility testing of their application against
9.4 betas, and then get blindsided if we change this before release.

I'm inclined to think that since we expect json/jsonb usage to really
take off with 9.4, it might be better if we get row_to_json behaving
unsurprisingly now.  But I'm not dead set on it.

I think we should put the full fix in 9.4.



+1

cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] row_to_json bug with index only scans: empty keys!

2014-11-10 Thread Ross Reedstrom
Yes, it's late beta, but especially if we're pushing json/jsonb usage
as a major feature of this release, I'd say remove surprising cases
now. It's late beta, but that's what beta is for, the last chance for
bug fixes, before we live w/ it for the support life of the release.

The affected class are people with an app that already uses json,
so 9.2 or better, have ran acceptance tests against an early beta of
9.4, and have SQL that returns madeup fieldnames instead of the appropriate
aliases? Which they were probably annoyed at when they wrote the code that
consumes that output in the first place. I think an item in the list of
compatability changes/gotchas should catch anyone who is that on the ball
as to be testing the betas anyway. Anyone pushing that envelope is going to
do the same acceptance testing against the actual release before rolling
to production.

Ross


On Mon, Nov 10, 2014 at 10:11:25AM -0500, Tom Lane wrote:
> I wrote:
> > Attached are patches meant for HEAD and 9.2-9.4 respectively.
> 
> BTW, has anyone got an opinion about whether to stick the full fix into
> 9.4?  The argument for, of course, is that we'd get the full fix out to
> the public a year sooner.  The argument against is that someone might
> have already done compatibility testing of their application against
> 9.4 betas, and then get blindsided if we change this before release.
> 
> I'm inclined to think that since we expect json/jsonb usage to really
> take off with 9.4, it might be better if we get row_to_json behaving
> unsurprisingly now.  But I'm not dead set on it.
> 
>   regards, tom lane
> 

-- 
Ross Reedstrom, Ph.D. reeds...@rice.edu
Systems Engineer & Admin, Research Scientistphone: 713-348-6166
Connexions  http://cnx.orgfax: 713-348-3665
Rice University MS-375, Houston, TX 77005
GPG Key fingerprint = F023 82C8 9B0E 2CC6 0D8E  F888 D3AE 810E 88F0 BEDE


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] row_to_json bug with index only scans: empty keys!

2014-11-10 Thread David G Johnston
Tom Lane-2 wrote
> I wrote:
>> Attached are patches meant for HEAD and 9.2-9.4 respectively.
> 
> BTW, has anyone got an opinion about whether to stick the full fix into
> 9.4?  The argument for, of course, is that we'd get the full fix out to
> the public a year sooner.  The argument against is that someone might
> have already done compatibility testing of their application against
> 9.4 betas, and then get blindsided if we change this before release.
> 
> I'm inclined to think that since we expect json/jsonb usage to really
> take off with 9.4, it might be better if we get row_to_json behaving
> unsurprisingly now.  But I'm not dead set on it.

I am not one of those people who would be blindsided but I'd much rather
inconvenience our beta testers in order to get a better product in place for
the masses.

Beta testers read release notes and should be able to identify and fix any
problematic areas of their code while simultaneously being happy for the
improvements.

David J.




--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/row-to-json-bug-with-index-only-scans-empty-keys-tp5826070p5826338.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] row_to_json bug with index only scans: empty keys!

2014-11-10 Thread Robert Haas
On Mon, Nov 10, 2014 at 10:11 AM, Tom Lane  wrote:
> I wrote:
>> Attached are patches meant for HEAD and 9.2-9.4 respectively.
>
> BTW, has anyone got an opinion about whether to stick the full fix into
> 9.4?  The argument for, of course, is that we'd get the full fix out to
> the public a year sooner.  The argument against is that someone might
> have already done compatibility testing of their application against
> 9.4 betas, and then get blindsided if we change this before release.
>
> I'm inclined to think that since we expect json/jsonb usage to really
> take off with 9.4, it might be better if we get row_to_json behaving
> unsurprisingly now.  But I'm not dead set on it.

I think we should put the full fix in 9.4.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] row_to_json bug with index only scans: empty keys!

2014-11-10 Thread Tom Lane
I wrote:
> Attached are patches meant for HEAD and 9.2-9.4 respectively.

BTW, has anyone got an opinion about whether to stick the full fix into
9.4?  The argument for, of course, is that we'd get the full fix out to
the public a year sooner.  The argument against is that someone might
have already done compatibility testing of their application against
9.4 betas, and then get blindsided if we change this before release.

I'm inclined to think that since we expect json/jsonb usage to really
take off with 9.4, it might be better if we get row_to_json behaving
unsurprisingly now.  But I'm not dead set on it.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] row_to_json bug with index only scans: empty keys!

2014-11-09 Thread Tom Lane
I wrote:
> We could reduce the risks involved by narrowing the cases in which
> ExecEvalWholeRowVar will replace field names it got from the input.
> I'd be inclined to propose:

> 1. If Var is of a named composite type, use *exactly* the field names
> associated with that type.  (This avoids the need to possibly produce
> RECORD outputs from a named-type Var, thus removing the Assert-weakening
> issue.)

> 2. If Var is of type RECORD, replace only empty field names with aliases
> from the RTE.  (This might sound inconsistent --- could you end up with
> some names coming from point A and some from point B? --- but in practice
> I think it would always be all-or-nothing, because the issue is whether
> or not the planner bothered to attach column names to a lower-level
> targetlist.)

Attached are patches meant for HEAD and 9.2-9.4 respectively.  The HEAD
patch extends the prior patch to fix the RowExpr case I mentioned
yesterday.  The back-branch patch works as suggested above.  I added a
bunch of regression tests that document behavior in this area.  The two
patches include the same set of tests but have different expected output
for the cases we are intentionally not fixing in back branches.  If you
try the back-branch regression tests on an unpatched backend, you can
verify that the only cases that change behavior are ones where current
sources put out empty field names.

The test cases use row_to_json() so they could not be used directly before
9.2.  I tried the same cases using hstore(record) in 9.1.  While 9.1 does
some pretty darn dubious things, it does not produce empty field names in
any of these cases, so I think we probably should not consider
back-patching further than 9.2.

regards, tom lane

diff --git a/src/backend/executor/execQual.c b/src/backend/executor/execQual.c
index 7cfa63f..88af735 100644
*** a/src/backend/executor/execQual.c
--- b/src/backend/executor/execQual.c
***
*** 50,55 
--- 50,56 
  #include "nodes/nodeFuncs.h"
  #include "optimizer/planner.h"
  #include "parser/parse_coerce.h"
+ #include "parser/parsetree.h"
  #include "pgstat.h"
  #include "utils/acl.h"
  #include "utils/builtins.h"
*** ExecEvalWholeRowVar(WholeRowVarExprState
*** 712,717 
--- 713,720 
  {
  	Var		   *variable = (Var *) wrvstate->xprstate.expr;
  	TupleTableSlot *slot;
+ 	TupleDesc	output_tupdesc;
+ 	MemoryContext oldcontext;
  	bool		needslow = false;
  
  	if (isDone)
*** ExecEvalWholeRowVar(WholeRowVarExprState
*** 787,794 
  			/* If so, build the junkfilter in the query memory context */
  			if (junk_filter_needed)
  			{
- MemoryContext oldcontext;
- 
  oldcontext = MemoryContextSwitchTo(econtext->ecxt_per_query_memory);
  wrvstate->wrv_junkFilter =
  	ExecInitJunkFilter(subplan->plan->targetlist,
--- 790,795 
*** ExecEvalWholeRowVar(WholeRowVarExprState
*** 860,869 
  needslow = true;	/* need runtime check for null */
  		}
  
  		ReleaseTupleDesc(var_tupdesc);
  	}
  
! 	/* Skip the checking on future executions of node */
  	if (needslow)
  		wrvstate->xprstate.evalfunc = (ExprStateEvalFunc) ExecEvalWholeRowSlow;
  	else
--- 861,920 
  needslow = true;	/* need runtime check for null */
  		}
  
+ 		/*
+ 		 * Use the variable's declared rowtype as the descriptor for the
+ 		 * output values, modulo possibly assigning new column names below. In
+ 		 * particular, we *must* absorb any attisdropped markings.
+ 		 */
+ 		oldcontext = MemoryContextSwitchTo(econtext->ecxt_per_query_memory);
+ 		output_tupdesc = CreateTupleDescCopy(var_tupdesc);
+ 		MemoryContextSwitchTo(oldcontext);
+ 
  		ReleaseTupleDesc(var_tupdesc);
  	}
+ 	else
+ 	{
+ 		/*
+ 		 * In the RECORD case, we use the input slot's rowtype as the
+ 		 * descriptor for the output values, modulo possibly assigning new
+ 		 * column names below.
+ 		 */
+ 		oldcontext = MemoryContextSwitchTo(econtext->ecxt_per_query_memory);
+ 		output_tupdesc = CreateTupleDescCopy(slot->tts_tupleDescriptor);
+ 		MemoryContextSwitchTo(oldcontext);
+ 	}
  
! 	/*
! 	 * Construct a tuple descriptor for the composite values we'll produce,
! 	 * and make sure its record type is "blessed".  The main reason to do this
! 	 * is to be sure that operations such as row_to_json() will see the
! 	 * desired column names when they look up the descriptor from the type
! 	 * information embedded in the composite values.
! 	 *
! 	 * We already got the correct physical datatype info above, but now we
! 	 * should try to find the source RTE and adopt its column aliases, in case
! 	 * they are different from the original rowtype's names.  For example, in
! 	 * "SELECT foo(t) FROM tab t(x,y)", the first two columns in the composite
! 	 * output should be named "x" and "y" regardless of tab's column names.
! 	 *
! 	 * If we can't locate the RTE, assume the column names we've got are OK.
! 	 * (As of this writing, the only cases where we can't locate the RT

Re: [HACKERS] row_to_json bug with index only scans: empty keys!

2014-11-08 Thread Andrew Dunstan


On 11/08/2014 04:19 PM, Ross Reedstrom wrote:

I've no opinion on the not respecting aliases aspect of this, but both
the hstore and json emtpy keys case breaks the format: it's duplicate keys
that collapse to a single value, and expected keynames are missing.

The insidious bit about this bug though is that it works fine during testing
with the developers typically small data sets. It's only triggered in my case
when we the plan switches to index-only. Even an index scan works fine. I can't
imagine that there is code out there that _depends_ on this behavior. Just as
likely to me are that there exist systems that just have "can't reproduce" bugs
that would be fixed by this.





No, I can't imagine it either - it's utterly broken. That's the piece 
that Tom is proposing to fix on the back branches. AIUI.


cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] row_to_json bug with index only scans: empty keys!

2014-11-08 Thread Ross Reedstrom
I've no opinion on the not respecting aliases aspect of this, but both
the hstore and json emtpy keys case breaks the format: it's duplicate keys
that collapse to a single value, and expected keynames are missing.

The insidious bit about this bug though is that it works fine during testing
with the developers typically small data sets. It's only triggered in my case
when we the plan switches to index-only. Even an index scan works fine. I can't
imagine that there is code out there that _depends_ on this behavior. Just as
likely to me are that there exist systems that just have "can't reproduce" bugs
that would be fixed by this.

Ross


On Sat, Nov 08, 2014 at 01:09:23PM -0500, Tom Lane wrote:
> Oh, some more fun: a RowExpr that's labeled with a named composite type
> (rather than RECORD) has the same issue of not respecting aliases.
> Continuing previous example with table "foo":
> 
> regression=# create view vv as select * from foo;
> CREATE VIEW
> regression=# select row_to_json(q) from vv q;
>row_to_json   
> -
>  {"f1":1,"f2":2}
> (1 row)
> 
> regression=# select row_to_json(q) from vv q(a,b);
>row_to_json   
> -
>  {"f1":1,"f2":2}
> (1 row)
> 
> So that's another case we probably want to change in HEAD but not the back
> branches.
> 
>   regards, tom lane
> 

-- 
Ross Reedstrom, Ph.D. reeds...@rice.edu
Systems Engineer & Admin, Research Scientistphone: 713-348-6166
Connexions  http://cnx.orgfax: 713-348-3665
Rice University MS-375, Houston, TX 77005
GPG Key fingerprint = F023 82C8 9B0E 2CC6 0D8E  F888 D3AE 810E 88F0 BEDE


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] row_to_json bug with index only scans: empty keys!

2014-11-08 Thread Tom Lane
Oh, some more fun: a RowExpr that's labeled with a named composite type
(rather than RECORD) has the same issue of not respecting aliases.
Continuing previous example with table "foo":

regression=# create view vv as select * from foo;
CREATE VIEW
regression=# select row_to_json(q) from vv q;
   row_to_json   
-
 {"f1":1,"f2":2}
(1 row)

regression=# select row_to_json(q) from vv q(a,b);
   row_to_json   
-
 {"f1":1,"f2":2}
(1 row)

So that's another case we probably want to change in HEAD but not the back
branches.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] row_to_json bug with index only scans: empty keys!

2014-11-08 Thread Andrew Dunstan


On 11/08/2014 12:40 PM, Tom Lane wrote:

Andrew Dunstan  writes:

On 11/08/2014 12:14 PM, Tom Lane wrote:

That would be my druthers.  But given the lack of complaints, maybe we
should just stick to the more-backwards-compatible behavior until someone
does complain.  Thoughts?

Wouldn't that would mean we might not pick up the expected aliases in
  select row_to_json(q) from (select a,b from foo) as q(x,y)
? If so, I'm definitely in favor of fixing this now.

Well, it's inconsistent now.  In existing releases you might or might not
get x,y:

regression=# create table foo (f1 int, f2 int);
CREATE TABLE
regression=# insert into foo values(1,2);
INSERT 0 1
regression=# select row_to_json(q) from (select f1 as a, f2 as b from foo) as 
q(x,y);
   row_to_json
---
  {"x":1,"y":2}
(1 row)

regression=# select row_to_json(q) from (select f1 as a, f2 as b from foo 
offset 0) as q(x,y);
row_to_json
-
  {"f1":1,"f2":2}
(1 row)

That seems like something that's worth fixing going forward, but it's a
lot harder to make the case that we should change it in back branches.




Sure. As long as we fix the empty alias issue in the back branches, just 
fixing this prospectively seems fine. But I don't think we should wait 
for more complaints.


cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] row_to_json bug with index only scans: empty keys!

2014-11-08 Thread Tom Lane
Andrew Dunstan  writes:
> On 11/08/2014 12:14 PM, Tom Lane wrote:
>> That would be my druthers.  But given the lack of complaints, maybe we
>> should just stick to the more-backwards-compatible behavior until someone
>> does complain.  Thoughts?

> Wouldn't that would mean we might not pick up the expected aliases in
>  select row_to_json(q) from (select a,b from foo) as q(x,y)
> ? If so, I'm definitely in favor of fixing this now.

Well, it's inconsistent now.  In existing releases you might or might not
get x,y:

regression=# create table foo (f1 int, f2 int);
CREATE TABLE
regression=# insert into foo values(1,2);
INSERT 0 1
regression=# select row_to_json(q) from (select f1 as a, f2 as b from foo) as 
q(x,y);
  row_to_json  
---
 {"x":1,"y":2}
(1 row)

regression=# select row_to_json(q) from (select f1 as a, f2 as b from foo 
offset 0) as q(x,y);
   row_to_json   
-
 {"f1":1,"f2":2}
(1 row)

That seems like something that's worth fixing going forward, but it's a
lot harder to make the case that we should change it in back branches.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] row_to_json bug with index only scans: empty keys!

2014-11-08 Thread Andrew Dunstan


On 11/08/2014 12:14 PM, Tom Lane wrote:

I assume that's what you would propose for just the stable branches, and
that going forward we'd always use the aliases from the RTE?

That would be my druthers.  But given the lack of complaints, maybe we
should just stick to the more-backwards-compatible behavior until someone
does complain.  Thoughts?





Wouldn't that would mean we might not pick up the expected aliases in

select row_to_json(q) from (select a,b from foo) as q(x,y)

? If so, I'm definitely in favor of fixing this now.

cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] row_to_json bug with index only scans: empty keys!

2014-11-08 Thread Kevin Grittner
Tom Lane 
> Andrew Dunstan  writes:

>> I assume that's what you would propose for just the stable branches, and
>> that going forward we'd always use the aliases from the RTE?
>
> That would be my druthers.  But given the lack of complaints, maybe we
> should just stick to the more-backwards-compatible behavior until someone
> does complain.  Thoughts?

I think consistent use of the aliases would be less surprising and
should be what we do on master.  I agree that we should avoid
breaking anything that might be working as intended on stable
branches.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] row_to_json bug with index only scans: empty keys!

2014-11-08 Thread Tom Lane
Andrew Dunstan  writes:
> On 11/08/2014 11:24 AM, Tom Lane wrote:
>> That seems like a pretty silly move: it wouldn't actually fix anything,
>> and it would break cases that perhaps are acceptable to users today.

> What evidence do you have that it might be acceptable to today's users? 
> The only evidence we have that I know of is Ross' complaint that 
> indicates that it's not acceptable.

The fact that we've only gotten two bug reports in however many years the
failure has been possible might mean that few people encounter the case,
or it might mean that it doesn't break things for their usages so badly
that they need to complain.  If the latter, throwing an error rather than
fixing it is not going to be an improvement.

> I assume that's what you would propose for just the stable branches, and 
> that going forward we'd always use the aliases from the RTE?

That would be my druthers.  But given the lack of complaints, maybe we
should just stick to the more-backwards-compatible behavior until someone
does complain.  Thoughts?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] row_to_json bug with index only scans: empty keys!

2014-11-08 Thread Andrew Dunstan


On 11/08/2014 11:24 AM, Tom Lane wrote:

Andrew Dunstan  writes:

On 11/08/2014 09:26 AM, Robert Haas wrote:

I'm not sure whether this is safe enough to back-patch, but it seems
like we should probably plan to back-patch *something*, because the
status quo isn't great either.

I confirm that Tom's patch does indeed fix my test case that produces
empty field names.
We should probably not backpatch it, as it is a behaviour change.
However, I do think we should add checks in composite_to_json and
hstore_from_record for empty field names, and error out if they are
found.

That seems like a pretty silly move: it wouldn't actually fix anything,
and it would break cases that perhaps are acceptable to users today.


What evidence do you have that it might be acceptable to today's users? 
The only evidence we have that I know of is Ross' complaint that 
indicates that it's not acceptable.


However,



We could reduce the risks involved by narrowing the cases in which
ExecEvalWholeRowVar will replace field names it got from the input.
I'd be inclined to propose:

1. If Var is of a named composite type, use *exactly* the field names
associated with that type.  (This avoids the need to possibly produce
RECORD outputs from a named-type Var, thus removing the Assert-weakening
issue.)

2. If Var is of type RECORD, replace only empty field names with aliases
from the RTE.  (This might sound inconsistent --- could you end up with
some names coming from point A and some from point B? --- but in practice
I think it would always be all-or-nothing, because the issue is whether
or not the planner bothered to attach column names to a lower-level
targetlist.)




I assume that's what you would propose for just the stable branches, and 
that going forward we'd always use the aliases from the RTE? Or maybe 
I'm not quite understanding enough which cases will be covered. To the 
extent that I do this sounds OK.


cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] row_to_json bug with index only scans: empty keys!

2014-11-08 Thread Tom Lane
Andrew Dunstan  writes:
> On 11/08/2014 09:26 AM, Robert Haas wrote:
>> I'm not sure whether this is safe enough to back-patch, but it seems
>> like we should probably plan to back-patch *something*, because the
>> status quo isn't great either.

> I confirm that Tom's patch does indeed fix my test case that produces 
> empty field names.

> We should probably not backpatch it, as it is a behaviour change. 
> However, I do think we should add checks in composite_to_json and 
> hstore_from_record for empty field names, and error out if they are 
> found.

That seems like a pretty silly move: it wouldn't actually fix anything,
and it would break cases that perhaps are acceptable to users today.

We could reduce the risks involved by narrowing the cases in which
ExecEvalWholeRowVar will replace field names it got from the input.
I'd be inclined to propose:

1. If Var is of a named composite type, use *exactly* the field names
associated with that type.  (This avoids the need to possibly produce
RECORD outputs from a named-type Var, thus removing the Assert-weakening
issue.)

2. If Var is of type RECORD, replace only empty field names with aliases
from the RTE.  (This might sound inconsistent --- could you end up with
some names coming from point A and some from point B? --- but in practice
I think it would always be all-or-nothing, because the issue is whether
or not the planner bothered to attach column names to a lower-level
targetlist.)

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] row_to_json bug with index only scans: empty keys!

2014-11-08 Thread Andrew Dunstan


On 11/08/2014 09:26 AM, Robert Haas wrote:

On Fri, Nov 7, 2014 at 11:27 PM, Tom Lane  wrote:

Thoughts?

I'm not sure whether this is safe enough to back-patch, but it seems
like we should probably plan to back-patch *something*, because the
status quo isn't great either.




I confirm that Tom's patch does indeed fix my test case that produces 
empty field names.


We should probably not backpatch it, as it is a behaviour change. 
However, I do think we should add checks in composite_to_json and 
hstore_from_record for empty field names, and error out if they are 
found. That seems like an outright bug which we should defend against, 
including in the back branches. Giving back json or hstore objects with 
made up names like f1 is one thing, giving them back with empty names 
(which, in the hstore case will collapse everything to a single field) 
is far worse.


cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] row_to_json bug with index only scans: empty keys!

2014-11-08 Thread Robert Haas
On Fri, Nov 7, 2014 at 11:27 PM, Tom Lane  wrote:
> Thoughts?

I'm not sure whether this is safe enough to back-patch, but it seems
like we should probably plan to back-patch *something*, because the
status quo isn't great either.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] row_to_json bug with index only scans: empty keys!

2014-11-07 Thread Tom Lane
Andrew Dunstan  writes:
> On 11/07/2014 04:59 PM, Tom Lane wrote:
>> I think this is probably a variant of bug #11210, in which the problem is
>> that tupledescs bubbled up from inheritance children never get column
>> names assigned to them.  I've been speculating about ways to fix that
>> but I've not thought of anything that's not kinda painful.

> Yeah, I've been running this down a bit with a debugger, and it looked 
> kinda like that.

After chewing on this for awhile, it occurred to me that the problem could
be solved in ExecEvalWholeRowVar, which is really the only place that
produces composite datums "from scratch".  We can look up the RTE that
the whole-row Var references and scrape column aliases from it.  The
attached draft patch does that, and appears to fix both Ross' example
and the one in bug #11210.

Although this patch doesn't change any existing regression test outputs
as far as I can find, it's not hard to think of cases where it will change
the results.  In particular:

regression=# create table foo (f1 int, f2 int);
CREATE TABLE
regression=# insert into foo values(1,2);
INSERT 0 1
regression=# select row_to_json(f) from foo f;
   row_to_json   
-
 {"f1":1,"f2":2}
(1 row)

regression=# select row_to_json(f) from foo f(x,y);
  row_to_json  
---
 {"x":1,"y":2}
(1 row)

Without this patch, you get the same result from the second query as
from the first.  I argue that using the column aliases is clearly saner
behavior; I think there have been complaints about that in the past,
but am too lazy to trawl the archives right now.  However, it's also
clear that people might have applications that are expecting the old
behavior and don't realize that it might be thought a bug.

I'm hesitant to back-patch this anyway, because I'm not sure that there
aren't any other unexpected side-effects.  Note in particular the
assertion I had to weaken in ExecEvalConvertRowtype because a whole-row
Var is now capable of returning tuple datums that are marked with a
different rowtype than the parser marked the Var with.  That's a little
bit scary ...

On the positive side, this lets us remove some kluges, like the place in
nodeFunctionscan.c where we were hacking the output tupledesc's column
names in pretty much the same way this patch has ExecEvalWholeRowVar do.
I thought that was a crock when it went in, because no other plan node
is doing anything like that, so it's good to be able to remove it.

Thoughts?

regards, tom lane

diff --git a/src/backend/executor/execQual.c b/src/backend/executor/execQual.c
index 7cfa63f..3ab460c 100644
*** a/src/backend/executor/execQual.c
--- b/src/backend/executor/execQual.c
***
*** 50,55 
--- 50,56 
  #include "nodes/nodeFuncs.h"
  #include "optimizer/planner.h"
  #include "parser/parse_coerce.h"
+ #include "parser/parsetree.h"
  #include "pgstat.h"
  #include "utils/acl.h"
  #include "utils/builtins.h"
*** ExecEvalWholeRowVar(WholeRowVarExprState
*** 712,717 
--- 713,720 
  {
  	Var		   *variable = (Var *) wrvstate->xprstate.expr;
  	TupleTableSlot *slot;
+ 	TupleDesc	output_tupdesc;
+ 	MemoryContext oldcontext;
  	bool		needslow = false;
  
  	if (isDone)
*** ExecEvalWholeRowVar(WholeRowVarExprState
*** 787,794 
  			/* If so, build the junkfilter in the query memory context */
  			if (junk_filter_needed)
  			{
- MemoryContext oldcontext;
- 
  oldcontext = MemoryContextSwitchTo(econtext->ecxt_per_query_memory);
  wrvstate->wrv_junkFilter =
  	ExecInitJunkFilter(subplan->plan->targetlist,
--- 790,795 
*** ExecEvalWholeRowVar(WholeRowVarExprState
*** 860,869 
  needslow = true;	/* need runtime check for null */
  		}
  
  		ReleaseTupleDesc(var_tupdesc);
  	}
  
! 	/* Skip the checking on future executions of node */
  	if (needslow)
  		wrvstate->xprstate.evalfunc = (ExprStateEvalFunc) ExecEvalWholeRowSlow;
  	else
--- 861,946 
  needslow = true;	/* need runtime check for null */
  		}
  
+ 		/*
+ 		 * Use the variable's declared rowtype as the descriptor for the
+ 		 * output values, modulo possibly assigning new column names below. In
+ 		 * particular, we *must* absorb any attisdropped markings.
+ 		 */
+ 		oldcontext = MemoryContextSwitchTo(econtext->ecxt_per_query_memory);
+ 		output_tupdesc = CreateTupleDescCopy(var_tupdesc);
+ 		MemoryContextSwitchTo(oldcontext);
+ 
  		ReleaseTupleDesc(var_tupdesc);
  	}
+ 	else
+ 	{
+ 		/*
+ 		 * In the RECORD case, we use the input slot's rowtype as the
+ 		 * descriptor for the output values, modulo possibly assigning new
+ 		 * column names below.
+ 		 */
+ 		oldcontext = MemoryContextSwitchTo(econtext->ecxt_per_query_memory);
+ 		output_tupdesc = CreateTupleDescCopy(slot->tts_tupleDescriptor);
+ 		MemoryContextSwitchTo(oldcontext);
+ 	}
  
! 	/*
! 	 * Construct a tuple descriptor for the composite values we'll produce,
! 	 * and make sure its record type is "blessed".  The

Re: [HACKERS] row_to_json bug with index only scans: empty keys!

2014-11-07 Thread Andrew Dunstan


On 11/07/2014 04:59 PM, Tom Lane wrote:

Andrew Dunstan  writes:

On 11/07/2014 10:51 AM, Ross Reedstrom wrote:

row_to_json() yields empty strings for json keys if the data is
fulfilled by an index only scan.

Could this be a bug in lookup_rowtype_tupdesc()?

I think this is probably a variant of bug #11210, in which the problem is
that tupledescs bubbled up from inheritance children never get column
names assigned to them.  I've been speculating about ways to fix that
but I've not thought of anything that's not kinda painful.


Yeah, I've been running this down a bit with a debugger, and it looked 
kinda like that. Maybe we should look for this in places we know it 
matters (e.g. hstore_from_record() and composite_to_json() and raise an 
error if we find empty names, with a hint to do (what?).


cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] row_to_json bug with index only scans: empty keys!

2014-11-07 Thread Tom Lane
Andrew Dunstan  writes:
> On 11/07/2014 10:51 AM, Ross Reedstrom wrote:
>> row_to_json() yields empty strings for json keys if the data is
>> fulfilled by an index only scan.

> Could this be a bug in lookup_rowtype_tupdesc()?

I think this is probably a variant of bug #11210, in which the problem is
that tupledescs bubbled up from inheritance children never get column
names assigned to them.  I've been speculating about ways to fix that
but I've not thought of anything that's not kinda painful.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] row_to_json bug with index only scans: empty keys!

2014-11-07 Thread Andrew Dunstan


On 11/07/2014 11:17 AM, Andrew Dunstan wrote:


On 11/07/2014 10:51 AM, Ross Reedstrom wrote:

This is a serious bug in 9.3.5 and 9.4 beta3:

row_to_json() yields empty strings for json keys if the data is
fulfilled by an index only scan.

Example:

testjson=# select count(*) from document_acl;
  count
---
426
(1 row)

testjson=# SELECT row_to_json(combined_rows) FROM (
SELECT uuid, user_id AS uid, permission
FROM document_acl_text AS acl
WHERE uuid = '8f774048-8936-4d7f-aa38-1974c91bbef2'
ORDER BY user_id ASC, permission ASC
) as combined_rows;
  row_to_json
-
{"":"8f774048-8936-4d7f-aa38-1974c91bbef2","":"admin","":"publish"}



That seems odd. Here's what the relevant code does:

td = DatumGetHeapTupleHeader(composite);

/* Extract rowtype info and find a tupdesc */
tupType = HeapTupleHeaderGetTypeId(td);
tupTypmod = HeapTupleHeaderGetTypMod(td);
tupdesc = lookup_rowtype_tupdesc(tupType, tupTypmod);

   ...
for (i = 0; i < tupdesc->natts; i++)
 ...

attname = NameStr(tupdesc->attrs[i]->attname);
escape_json(result, attname);

Could this be a bug in lookup_rowtype_tupdesc()?






Further data point:

There's nothing json-specific about this, BTW:

   andrew=# select hstore(q) from (select * from idxo order by a) q;
 hstore
   -
 ""=>"1"
   (1 row)

   andrew=# set enable_seqscan = true;
   SET
   andrew=# select hstore(q) from (select * from idxo order by a) q;
hstore
   --
 "a"=>"1", "b"=>"b", "c"=>"c"
   (1 row)


So it looks like the index scan only stuff is broken somewhere.

cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] row_to_json bug with index only scans: empty keys!

2014-11-07 Thread Andrew Dunstan


On 11/07/2014 10:51 AM, Ross Reedstrom wrote:

This is a serious bug in 9.3.5 and 9.4 beta3:

row_to_json() yields empty strings for json keys if the data is
fulfilled by an index only scan.

Example:

testjson=# select count(*) from document_acl;
  count
---
426
(1 row)

testjson=# SELECT row_to_json(combined_rows) FROM (
SELECT uuid, user_id AS uid, permission
FROM document_acl_text AS acl
WHERE uuid = '8f774048-8936-4d7f-aa38-1974c91bbef2'
ORDER BY user_id ASC, permission ASC
) as combined_rows;
  row_to_json
-
  {"":"8f774048-8936-4d7f-aa38-1974c91bbef2","":"admin","":"publish"}



That seems odd. Here's what the relevant code does:

td = DatumGetHeapTupleHeader(composite);

/* Extract rowtype info and find a tupdesc */
tupType = HeapTupleHeaderGetTypeId(td);
tupTypmod = HeapTupleHeaderGetTypMod(td);
tupdesc = lookup_rowtype_tupdesc(tupType, tupTypmod);

   ...
for (i = 0; i < tupdesc->natts; i++)
 ...

attname = NameStr(tupdesc->attrs[i]->attname);
escape_json(result, attname);

Could this be a bug in lookup_rowtype_tupdesc()?


cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] row_to_json bug with index only scans: empty keys!

2014-11-07 Thread Ross Reedstrom
This is a serious bug in 9.3.5 and 9.4 beta3:

row_to_json() yields empty strings for json keys if the data is
fulfilled by an index only scan.

Example:

testjson=# select count(*) from document_acl;
 count 
---
   426
(1 row)

testjson=# SELECT row_to_json(combined_rows) FROM (
SELECT uuid, user_id AS uid, permission
FROM document_acl_text AS acl
WHERE uuid = '8f774048-8936-4d7f-aa38-1974c91bbef2'
ORDER BY user_id ASC, permission ASC
) as combined_rows;
 row_to_json 
-
 {"":"8f774048-8936-4d7f-aa38-1974c91bbef2","":"admin","":"publish"}
(1 row)

testjson=# explain SELECT row_to_json(combined_rows) FROM (
SELECT uuid, user_id AS uid, permission
FROM document_acl_text AS acl
WHERE uuid = '8f774048-8936-4d7f-aa38-1974c91bbef2'
ORDER BY user_id ASC, permission ASC
) as combined_rows;
   QUERY PLAN   


 Subquery Scan on combined_rows  (cost=0.27..8.30 rows=1 width=76)
   ->  Index Only Scan using document_acl_text_pkey on document_acl_text acl  
(cost=0.27..8.29 rows=1 width=52)
 Index Cond: (uuid = '8f774048-8936-4d7f-aa38-1974c91bbef2'::text)
 Planning time: 0.093 ms
(4 rows)

# set enable_indexonlyscan to off;
SET
testjson=# SELECT row_to_json(combined_rows) FROM (
SELECT uuid, user_id AS uid, permission
FROM document_acl_text AS acl
WHERE uuid = '8f774048-8936-4d7f-aa38-1974c91bbef2'
ORDER BY user_id ASC, permission ASC
) as combined_rows;
   row_to_json  
  
--
 
{"uuid":"8f774048-8936-4d7f-aa38-1974c91bbef2","user_id":"admin","permission":"publish"}
(1 row)

tjson=# explain SELECT row_to_json(combined_rows) FROM (
SELECT uuid, user_id AS uid, permission
FROM document_acl_text AS acl
WHERE uuid = '8f774048-8936-4d7f-aa38-1974c91bbef2'
ORDER BY user_id ASC, permission ASC
) as combined_rows;
QUERY PLAN  
   
---
 Subquery Scan on combined_rows  (cost=0.27..8.30 rows=1 width=76)
   ->  Index Scan using document_acl_text_pkey on document_acl_text acl  
(cost=0.27..8.29 rows=1 width=52)
 Index Cond: (uuid = '8f774048-8936-4d7f-aa38-1974c91bbef2'::text)
 Planning time: 0.095 ms
(4 rows)

We have a table defined as so:

CREATE TYPE permission_type AS ENUM (
   'publish'
);

create table "document_acl" (
   "uuid" UUID,
   "user_id" TEXT,
   "permission" permission_type NOT NULL,
   PRIMARY KEY ("uuid", "user_id", "permission"),
   FOREIGN KEY ("uuid") REFERENCES document_controls ("uuid")
);

The uuid and enums make no difference - I've made an all text version as well,
same problem.

testjson=# select version();
 version
 
-
 PostgreSQL 9.4beta3 on x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu 
4.8.2-19ubuntu1) 4.8.2, 64-bit
(1 row)

Ross
-- 
Ross Reedstrom, Ph.D. reeds...@rice.edu
Systems Engineer & Admin, Research Scientistphone: 713-348-6166
Connexions  http://cnx.orgfax: 713-348-3665
Rice University MS-375, Houston, TX 77005
GPG Key fingerprint = F023 82C8 9B0E 2CC6 0D8E  F888 D3AE 810E 88F0 BEDE


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] row_to_json() Bug

2012-02-23 Thread David E. Wheeler
On Feb 23, 2012, at 8:49 PM, Andrew Dunstan wrote:

> Fixed, Thanks for the report. (Also fixed in my 9.1 backport).

Awesome, thanks, will try it tomorrow.

David



smime.p7s
Description: S/MIME cryptographic signature


Re: [HACKERS] row_to_json() Bug

2012-02-23 Thread Andrew Dunstan



On 02/23/2012 09:09 PM, Andrew Dunstan wrote:



On 02/23/2012 08:35 PM, David E. Wheeler wrote:

Looks like row_to_json() thinks 0s are nulls:

 postgres=# select row(0);
  row
 -
  (0)
 (1 row)

 postgres=# SELECT row_to_json(row(0));
  row_to_json
 -
  {"f1":null}
 (1 row)




Yeah, ouch, will fix.




Fixed, Thanks for the report. (Also fixed in my 9.1 backport).

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] row_to_json() Bug

2012-02-23 Thread Andrew Dunstan



On 02/23/2012 08:35 PM, David E. Wheeler wrote:

Looks like row_to_json() thinks 0s are nulls:

 postgres=# select row(0);
  row
 -
  (0)
 (1 row)

 postgres=# SELECT row_to_json(row(0));
  row_to_json
 -
  {"f1":null}
 (1 row)




Yeah, ouch, will fix.

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] row_to_json() Bug

2012-02-23 Thread David E. Wheeler
Looks like row_to_json() thinks 0s are nulls:

postgres=# select row(0);
 row 
-
 (0)
(1 row)

postgres=# SELECT row_to_json(row(0));
 row_to_json 
-
 {"f1":null}
(1 row)

Best,

David


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers