Re: [HACKERS] Confusing EXPLAIN output in case of inherited tables

2012-09-18 Thread Tom Lane
I got interested in this problem again now that we have a user complaint
about it (bug #7553).

Ashutosh Bapat ashutosh.ba...@enterprisedb.com writes:
 On Wed, Feb 1, 2012 at 11:01 PM, Robert Haas robertmh...@gmail.com wrote:
 On Wed, Feb 1, 2012 at 12:23 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 ... It seems
 reasonable to me to try to handle relation renames but draw the line
 at disambiguating column names.  But others might find that distinction
 artificial.

 I sure do.

 For me the relation names problem and column aliases problems are two
 independent problems.

I think they are independent problems, and I also think that people are
far less likely to trip over column-name problems in practice.  Columns
of a table are not independent objects and so people aren't so likely
to think they can just rename them freely.  Moreover, if you rename
columns that are used in views, you can get breakage of things like
USING or NATURAL joins, and that is something we *cannot* provide a
workaround for --- it's a failure inherent in the language definition.

As far as the relation-rename problem goes, I propose that what we
should do is have ruleutils.c invent nonconflicting fake aliases for
each RTE in the query tree.  This would allow getting rid of some of the
dubious heuristics in get_variable: it should just print the chosen
alias and be done.  (It still has to do something different for unnamed
joins, but we can leave that part alone I think.)

We can do this as follows:

1. If there's a user-assigned alias, use that.  (It's possible this is
not unique within the query, but that's okay because any actual
variable reference must be to the most closely nested such RTE.)

2. Otherwise, if the relation's current name doesn't conflict with
any previously-assigned alias, use that.

3. Otherwise, append something (underscore and some digits probably)
to the relation's current name to construct a string not matching any
previously-assigned alias.

This might result in printouts that are a bit uglier than the old way
in such cases, but anybody who's offended can select their own aliases.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusing EXPLAIN output in case of inherited tables

2012-02-01 Thread Ashutosh Bapat
Looking at the code, it seems that the fake aliases (eref) for relations
(may be views as well) are not generated per say, but they do not get
changed when the relation name changes OR in case of inherited tables, they
do not get changed when the inheritance is expanded
(expand_inherited_rtentry). So, there is not question of generating them so
as to not collide with other aliases in the query. However I did not find
answers to these questions
1. What is the use of eref in case of relation when the relation name
itself can be provided by the reloid?
2. Can we use schema qualified relation name in get_from_clause_item() and
get_variable() instead of use eref-aliasname. I have noticed that the
logic in get_rte_attribute_name() gives preference to the column names in
catalog tables over eref-colnames.

Anyone?

On Mon, Jan 30, 2012 at 10:26 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Ashutosh Bapat ashutosh.ba...@enterprisedb.com writes:
  So, as I understand we have two problems here
  1. Prefixing schemaname to the fake alises if there is another RTE with
  same name. There may not be a relation with that name (fake alias name
  given) in the schema chosen as prefix.
  2. Fake aliases themselves can be conflicting.

 Well, the issue is more that a fake alias might unintentionally collide
 with a regular alias elsewhere in the query.  There's no guard against
 that in the current behavior, and ISTM there needs to be.

 The one possibly-simplifying thing about this whole issue is that we
 needn't cater for references that couldn't have been made in the
 original query.  For instance, if the inner and outer queries both have
 explicit aliases tx, it's impossible for the inner query to have
 referred to any columns of the outer tx --- so we don't have to try to
 make it possible in the dumped form.

  If I understand correctly, if we solve the second problem, first problem
  will not occur. Is that correct?

 I don't believe that the problem has anything to do with the names of
 other tables that might happen to exist in the database.  It's a matter
 of what RTE names/aliases are exposed for variable references in
 different parts of the query.

regards, tom lane




-- 
Best Wishes,
Ashutosh Bapat
EntepriseDB Corporation
The Enterprise Postgres Company


Re: [HACKERS] Confusing EXPLAIN output in case of inherited tables

2012-02-01 Thread Tom Lane
Ashutosh Bapat ashutosh.ba...@enterprisedb.com writes:
 Looking at the code, it seems that the fake aliases (eref) for relations
 (may be views as well) are not generated per say, but they do not get
 changed when the relation name changes OR in case of inherited tables, they
 do not get changed when the inheritance is expanded
 (expand_inherited_rtentry). So, there is not question of generating them so
 as to not collide with other aliases in the query.

Well, what I was considering was exactly generating new aliases that
don't collide with anything else in the query.  The fact that the code
doesn't do that now doesn't mean we can't make it do that.

 However I did not find answers to these questions
 1. What is the use of eref in case of relation when the relation name
 itself can be provided by the reloid?

eref is stored mainly so that parsing code doesn't have to repeatedly
look up what the effective RTE name is.  The alias field is meant to
represent whether there was an AS clause or not, and if so exactly what
it said.  So eref is a derived result whereas alias is essentially raw
grammar output.  Because of the possibility that the relation gets
renamed, it's probably best if we don't rely on eref anymore after
initial parsing of a query, ie ruleutils.c probably shouldn't use it.
(Too lazy to go check right now if that's already true, but it seems
like a good goal to pursue if we're going to change this code.)

 2. Can we use schema qualified relation name in get_from_clause_item() and
 get_variable() instead of use eref-aliasname.

No.  If there is an alias, it is flat wrong to use the relation name
instead, with or without schema name.  You might want to go study the
SQL spec a bit in this area.

 I have noticed that the
 logic in get_rte_attribute_name() gives preference to the column names in
 catalog tables over eref-colnames.

Hm.  What it should probably do is look at alias first, and if the alias
field doesn't specify a column name, then go to the catalogs to get the
current name.


Thinking about this some more, it seems like there are ways for a user
to shoot himself in the foot pretty much irretrievably.  Consider

CREATE TABLE t (x int);
CREATE VIEW v AS SELECT y FROM t t(y);
ALTER TABLE t ADD COLUMN y int;

On dump and reload, we'll have

CREATE TABLE t (x int, y int);
CREATE VIEW v AS SELECT y FROM t t(y);

and now the CREATE VIEW will fail, complaining (correctly) that the
column reference y is ambiguous.  Should ruleutils be expected to take
it upon itself to prevent that?  We could conceive of fixing it by
inventing column aliases out of whole cloth:

CREATE VIEW v AS SELECT y FROM t t(y, the_other_y);

but that seems a little much, not to mention that such a view definition
would be horribly confusing to work with.  On the other hand it isn't
all that far beyond what I had in mind of inventing relation aliases
to cure relation-name conflicts.  Should we take the existence of such
cases as evidence that we shouldn't try hard in this area?  It seems
reasonable to me to try to handle relation renames but draw the line
at disambiguating column names.  But others might find that distinction
artificial.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusing EXPLAIN output in case of inherited tables

2012-02-01 Thread Robert Haas
On Wed, Feb 1, 2012 at 12:23 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 On the other hand it isn't
 all that far beyond what I had in mind of inventing relation aliases
 to cure relation-name conflicts.  Should we take the existence of such
 cases as evidence that we shouldn't try hard in this area?  It seems
 reasonable to me to try to handle relation renames but draw the line
 at disambiguating column names.  But others might find that distinction
 artificial.

I sure do.

I mean, in Oracle, if you rename a table or column involved in a view,
then the view breaks.  Blammo!  The reference is by object name, not
by some internal identifier a la OID.  If you put back an object with
the correct name (either the original one or a different one), you can
re-enable the view.

We've decide that we don't want that behavior: instead, our references
are to the object itself rather than to the name of the object.
Renaming the object doesn't change what the reference points to.  But
given that position, it seems to me that we ought to be willing to
work pretty hard to make sure that when we dump-and-reload the
database, things stay sane.  Otherwise, we're sort of in this
unsatisfying in-between place where references are *mostly* by
internal identifier but everyone once in a while it falls apart and
name collisions can break everything.  Yech!

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusing EXPLAIN output in case of inherited tables

2012-02-01 Thread Ashutosh Bapat
On Wed, Feb 1, 2012 at 10:53 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Ashutosh Bapat ashutosh.ba...@enterprisedb.com writes:
  Looking at the code, it seems that the fake aliases (eref) for relations
  (may be views as well) are not generated per say, but they do not get
  changed when the relation name changes OR in case of inherited tables,
 they
  do not get changed when the inheritance is expanded
  (expand_inherited_rtentry). So, there is not question of generating them
 so
  as to not collide with other aliases in the query.

 Well, what I was considering was exactly generating new aliases that
 don't collide with anything else in the query.  The fact that the code
 doesn't do that now doesn't mean we can't make it do that.

  However I did not find answers to these questions
  1. What is the use of eref in case of relation when the relation name
  itself can be provided by the reloid?

 eref is stored mainly so that parsing code doesn't have to repeatedly
 look up what the effective RTE name is.  The alias field is meant to
 represent whether there was an AS clause or not, and if so exactly what
 it said.  So eref is a derived result whereas alias is essentially raw
 grammar output.  Because of the possibility that the relation gets
 renamed, it's probably best if we don't rely on eref anymore after
 initial parsing of a query, ie ruleutils.c probably shouldn't use it.
 (Too lazy to go check right now if that's already true, but it seems
 like a good goal to pursue if we're going to change this code.)

  2. Can we use schema qualified relation name in get_from_clause_item()
 and
  get_variable() instead of use eref-aliasname.

 No.  If there is an alias, it is flat wrong to use the relation name
 instead, with or without schema name.  You might want to go study the
 SQL spec a bit in this area.


To clarify matters a bit, item 2 is in conjunction with item 1. Aliases, if
provided, are output irrespective of whether we get the relation name from
eref or catalogs.  ruleutils should just ignore eref (for RTE_RELATION
only) and get the relation name from given OID.



  I have noticed that the
  logic in get_rte_attribute_name() gives preference to the column names in
  catalog tables over eref-colnames.

 Hm.  What it should probably do is look at alias first, and if the alias
 field doesn't specify a column name, then go to the catalogs to get the
 current name.


It does give preference to aliases today. I compared preferences of
colnames in eref and that obtained from catalogs.



 Thinking about this some more, it seems like there are ways for a user
 to shoot himself in the foot pretty much irretrievably.  Consider

 CREATE TABLE t (x int);
 CREATE VIEW v AS SELECT y FROM t t(y);
 ALTER TABLE t ADD COLUMN y int;

 On dump and reload, we'll have

 CREATE TABLE t (x int, y int);
 CREATE VIEW v AS SELECT y FROM t t(y);

 and now the CREATE VIEW will fail, complaining (correctly) that the
 column reference y is ambiguous.  Should ruleutils be expected to take
 it upon itself to prevent that?  We could conceive of fixing it by
 inventing column aliases out of whole cloth:

 CREATE VIEW v AS SELECT y FROM t t(y, the_other_y);

 but that seems a little much, not to mention that such a view definition
 would be horribly confusing to work with.  On the other hand it isn't
 all that far beyond what I had in mind of inventing relation aliases
 to cure relation-name conflicts.  Should we take the existence of such
 cases as evidence that we shouldn't try hard in this area?  It seems
 reasonable to me to try to handle relation renames but draw the line
 at disambiguating column names.  But others might find that distinction
 artificial.


I agree. The example of the colnames was only to show that the preference
alias  relation information from catalogs  eref exists somewhere in the
code.



regards, tom lane




-- 
Best Wishes,
Ashutosh Bapat
EntepriseDB Corporation
The Enterprise Postgres Company


Re: [HACKERS] Confusing EXPLAIN output in case of inherited tables

2012-02-01 Thread Ashutosh Bapat
On Wed, Feb 1, 2012 at 11:01 PM, Robert Haas robertmh...@gmail.com wrote:

 On Wed, Feb 1, 2012 at 12:23 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  On the other hand it isn't
  all that far beyond what I had in mind of inventing relation aliases
  to cure relation-name conflicts.  Should we take the existence of such
  cases as evidence that we shouldn't try hard in this area?  It seems
  reasonable to me to try to handle relation renames but draw the line
  at disambiguating column names.  But others might find that distinction
  artificial.

 I sure do.

 I mean, in Oracle, if you rename a table or column involved in a view,
 then the view breaks.  Blammo!  The reference is by object name, not
 by some internal identifier a la OID.  If you put back an object with
 the correct name (either the original one or a different one), you can
 re-enable the view.

 We've decide that we don't want that behavior: instead, our references
 are to the object itself rather than to the name of the object.
 Renaming the object doesn't change what the reference points to.  But
 given that position, it seems to me that we ought to be willing to
 work pretty hard to make sure that when we dump-and-reload the
 database, things stay sane.  Otherwise, we're sort of in this
 unsatisfying in-between place where references are *mostly* by
 internal identifier but everyone once in a while it falls apart and
 name collisions can break everything.  Yech!


For me the relation names problem and column aliases problems are two
independent problems. While the first one looks easy to fix, the other
problem may be hard to solve. We can solve the first problem and things
will be better than what we have today. If you agree, I will provide a
patch to fix the relation names problems by ignoring the eref (for
RTE_RELATION only) in ruleutils.


 --
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company




-- 
Best Wishes,
Ashutosh Bapat
EntepriseDB Corporation
The Enterprise Postgres Company


Re: [HACKERS] Confusing EXPLAIN output in case of inherited tables

2012-01-30 Thread Ashutosh Bapat
Thanks Tom for giving a stronger case. I found the problem whille looking
at inherited tables, and didn't think beyond inherited tables. Since
inherited tables are expanded when subquery planner is invoked, I thought
the problem will occur only in Explain output as we won't generate queries,
that can be used elsewhere after/during planning.

So, as I understand we have two problems here
1. Prefixing schemaname to the fake alises if there is another RTE with
same name. There may not be a relation with that name (fake alias name
given) in the schema chosen as prefix.
2. Fake aliases themselves can be conflicting.

If I understand correctly, if we solve the second problem, first problem
will not occur. Is that correct?

On Sat, Jan 28, 2012 at 8:08 AM, Tom Lane t...@sss.pgh.pa.us wrote:

 Robert Haas robertmh...@gmail.com writes:
  It's a feature, not a bug, that we schema-qualify names when VERBOSE
  is specified.  That was done on purpose for the benefit of external
  tools that might need this information to disambiguate which object is
  being referenced.

  Table *aliases*, of course, should not be schema-qualified, but I
  don't think that's what we're doing.  You could make it more clear by
  including an alias in the query, like this:

  explain verbose select * into table ramp from road hwy where name ~
 '.*Ramp';

 I think you are both focusing on the wrong thing.  There is a lot of
 squishiness in what EXPLAIN prints out, since SQL notation is not always
 well suited to what an execution plan actually does.  But this code has
 a hard and fast requirement that it dump view definitions correctly,
 else pg_dump doesn't work.  And after looking at this I think Ashutosh
 has in fact found a bug.  Consider this example:

 regression=# create schema s1;
 CREATE SCHEMA
 regression=# create schema s2;
 CREATE SCHEMA
 regression=# create table s1.t1 (f1 int);
 CREATE TABLE
 regression=# create table s2.t1 (f1 int);
 CREATE TABLE
 regression=# create view v1 as
 regression-#   select * from s1.t1 where exists (
 regression(# select 1 from s2.t1 where s2.t1.f1 = s1.t1.f1
 regression(#   );
 CREATE VIEW
 regression=# \d+ v1
   View public.v1
  Column |  Type   | Modifiers | Storage | Description
 +-+---+-+-
  f1 | integer |   | plain   |
 View definition:
  SELECT t1.f1
   FROM s1.t1
  WHERE (EXISTS ( SELECT 1
   FROM s2.t1
  WHERE t1.f1 = s1.t1.f1));

 regression=# alter table s2.t1 rename to tx;
 ALTER TABLE
 regression=# \d+ v1
   View public.v1
  Column |  Type   | Modifiers | Storage | Description
 +-+---+-+-
  f1 | integer |   | plain   |
 View definition:
  SELECT t1.f1
   FROM s1.t1
  WHERE (EXISTS ( SELECT 1
   FROM s2.tx t1
  WHERE t1.f1 = s1.t1.f1));

 Both of the above displays of the view are formally correct, in that the
 variables will be taken to refer to the correct upper or lower RTE.
 But let's change that back and rename the other table:

 regression=# alter table s2.tx rename to t1;
 ALTER TABLE
 regression=# alter table s1.t1 rename to tx;
 ALTER TABLE
 regression=# \d+ v1
   View public.v1
  Column |  Type   | Modifiers | Storage | Description
 +-+---+-+-
  f1 | integer |   | plain   |
 View definition:
  SELECT t1.f1
   FROM s1.tx t1
  WHERE (EXISTS ( SELECT 1
   FROM s2.t1
  WHERE t1.f1 = s1.t1.f1));

 This is just plain wrong, as you'll see if you try to execute that
 query:

 regression=# SELECT t1.f1
 regression-#FROM s1.tx t1
 regression-#   WHERE (EXISTS ( SELECT 1
 regression(#FROM s2.t1
 regression(#   WHERE t1.f1 = s1.t1.f1));
 ERROR:  invalid reference to FROM-clause entry for table t1
 LINE 5:   WHERE t1.f1 = s1.t1.f1));
^
 HINT:  There is an entry for table t1, but it cannot be referenced
 from this part of the query.

 (The HINT is a bit confused here, but the query is certainly invalid.)

 So what we have here is a potential failure to dump and reload view
 definitions, which is a lot more critical in my book than whether
 EXPLAIN's output is confusing.

 If we stick with the existing rule for attaching a fake alias to renamed
 RTEs, I think that Ashutosh's patch or something like it is probably
 appropriate, because the variable-printing code ought to be in step with
 the RTE-printing code.  Unfortunately, I think the hack to attach a fake
 alias to renamed RTEs creates some issues of its own.  Consider

select * from s1.t1
  where exists (select 1 from s2.t2 t1 where t1.f1 = s1.t1.f1);

 If s1.t1 is now renamed to s1.tx, it is still possible to express
 the same semantics:

select * from s1.tx
  where exists (select 1 from s2.t2 t1 where t1.f1 = s1.tx.f1);

 But when we attach a fake alias, it's broken:


Re: [HACKERS] Confusing EXPLAIN output in case of inherited tables

2012-01-30 Thread Tom Lane
Ashutosh Bapat ashutosh.ba...@enterprisedb.com writes:
 So, as I understand we have two problems here
 1. Prefixing schemaname to the fake alises if there is another RTE with
 same name. There may not be a relation with that name (fake alias name
 given) in the schema chosen as prefix.
 2. Fake aliases themselves can be conflicting.

Well, the issue is more that a fake alias might unintentionally collide
with a regular alias elsewhere in the query.  There's no guard against
that in the current behavior, and ISTM there needs to be.

The one possibly-simplifying thing about this whole issue is that we
needn't cater for references that couldn't have been made in the
original query.  For instance, if the inner and outer queries both have
explicit aliases tx, it's impossible for the inner query to have
referred to any columns of the outer tx --- so we don't have to try to
make it possible in the dumped form.

 If I understand correctly, if we solve the second problem, first problem
 will not occur. Is that correct?

I don't believe that the problem has anything to do with the names of
other tables that might happen to exist in the database.  It's a matter
of what RTE names/aliases are exposed for variable references in
different parts of the query.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusing EXPLAIN output in case of inherited tables

2012-01-30 Thread Ashutosh Bapat
I don't believe that the problem has anything to do with the names of

 other tables that might happen to exist in the database.  It's a matter
 of what RTE names/aliases are exposed for variable references in
 different parts of the query.


Names of other tables come into picture when we schema qualify the fake
aliases in the generated query. See examples in first post.


regards, tom lane




-- 
Best Wishes,
Ashutosh Bapat
EntepriseDB Corporation
The Enterprise Postgres Company


Re: [HACKERS] Confusing EXPLAIN output in case of inherited tables

2012-01-27 Thread Robert Haas
On Wed, Jan 11, 2012 at 6:43 AM, Ashutosh Bapat
ashutosh.ba...@enterprisedb.com wrote:
 Hi,
 After running regression, I ran EXPLAIN on one of the queries in regression
 (test create_misc) and got following output
 regression=# explain verbose select * into table ramp from road where name ~
 '.*Ramp';
  QUERY
 PLAN
 
  Result  (cost=0.00..154.00 rows=841 width=67)
    Output: public.road.name, public.road.thepath
    -  Append  (cost=0.00..154.00 rows=841 width=67)
  -  Seq Scan on public.road  (cost=0.00..135.05 rows=418 width=67)
    Output: public.road.name, public.road.thepath
    Filter: (public.road.name ~ '.*Ramp'::text)
  -  Seq Scan on public.ihighway road  (cost=0.00..14.99 rows=367
 width=67)
     ^
    Output: public.road.name, public.road.thepath
    ^^,   ^^
    Filter: (public.road.name ~ '.*Ramp'::text)
  ^^^
  -  Seq Scan on public.shighway road  (cost=0.00..3.96 rows=56
 width=67)
    Output: public.road.name, public.road.thepath
    Filter: (public.road.name ~ '.*Ramp'::text)
 (12 rows)

 regression=# \d+ road
     Table public.road
  Column  | Type | Modifiers | Storage  | Stats target | Description
 -+--+---+--+--+-
  name    | text |   | extended |  |
  thepath | path |   | extended |  |
 Indexes:
     rix btree (name)
 Child tables: ihighway,
   shighway
 Has OIDs: no

 Table road has children ihighway and shighway as seen in the \d+
 output above. The EXPLAIN output of Seq Scan node on children has
 public.road as prefix for variables. public.road could imply the parent
 table road and thus can cause confusion, as to what's been referreed, the
 columns of parent table or child table. In the EXPLAIN output children
 tables have road as alias (as against public.road). The alias comes from
 RangeTblEntry-eref-aliasname. It might be better to have road as prefix
 in the variable names over public.road.

It's a feature, not a bug, that we schema-qualify names when VERBOSE
is specified.  That was done on purpose for the benefit of external
tools that might need this information to disambiguate which object is
being referenced.

Table *aliases*, of course, should not be schema-qualified, but I
don't think that's what we're doing.  You could make it more clear by
including an alias in the query, like this:

explain verbose select * into table ramp from road hwy where name ~ '.*Ramp';

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusing EXPLAIN output in case of inherited tables

2012-01-27 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 It's a feature, not a bug, that we schema-qualify names when VERBOSE
 is specified.  That was done on purpose for the benefit of external
 tools that might need this information to disambiguate which object is
 being referenced.

 Table *aliases*, of course, should not be schema-qualified, but I
 don't think that's what we're doing.  You could make it more clear by
 including an alias in the query, like this:

 explain verbose select * into table ramp from road hwy where name ~ '.*Ramp';

I think you are both focusing on the wrong thing.  There is a lot of
squishiness in what EXPLAIN prints out, since SQL notation is not always
well suited to what an execution plan actually does.  But this code has
a hard and fast requirement that it dump view definitions correctly,
else pg_dump doesn't work.  And after looking at this I think Ashutosh
has in fact found a bug.  Consider this example:

regression=# create schema s1;
CREATE SCHEMA
regression=# create schema s2;
CREATE SCHEMA
regression=# create table s1.t1 (f1 int);
CREATE TABLE
regression=# create table s2.t1 (f1 int);
CREATE TABLE
regression=# create view v1 as
regression-#   select * from s1.t1 where exists (
regression(# select 1 from s2.t1 where s2.t1.f1 = s1.t1.f1
regression(#   );
CREATE VIEW
regression=# \d+ v1
   View public.v1
 Column |  Type   | Modifiers | Storage | Description 
+-+---+-+-
 f1 | integer |   | plain   | 
View definition:
 SELECT t1.f1
   FROM s1.t1
  WHERE (EXISTS ( SELECT 1
   FROM s2.t1
  WHERE t1.f1 = s1.t1.f1));

regression=# alter table s2.t1 rename to tx;
ALTER TABLE
regression=# \d+ v1
   View public.v1
 Column |  Type   | Modifiers | Storage | Description 
+-+---+-+-
 f1 | integer |   | plain   | 
View definition:
 SELECT t1.f1
   FROM s1.t1
  WHERE (EXISTS ( SELECT 1
   FROM s2.tx t1
  WHERE t1.f1 = s1.t1.f1));

Both of the above displays of the view are formally correct, in that the
variables will be taken to refer to the correct upper or lower RTE.
But let's change that back and rename the other table:

regression=# alter table s2.tx rename to t1;
ALTER TABLE
regression=# alter table s1.t1 rename to tx;
ALTER TABLE
regression=# \d+ v1
   View public.v1
 Column |  Type   | Modifiers | Storage | Description 
+-+---+-+-
 f1 | integer |   | plain   | 
View definition:
 SELECT t1.f1
   FROM s1.tx t1
  WHERE (EXISTS ( SELECT 1
   FROM s2.t1
  WHERE t1.f1 = s1.t1.f1));

This is just plain wrong, as you'll see if you try to execute that
query:

regression=# SELECT t1.f1
regression-#FROM s1.tx t1
regression-#   WHERE (EXISTS ( SELECT 1
regression(#FROM s2.t1
regression(#   WHERE t1.f1 = s1.t1.f1));
ERROR:  invalid reference to FROM-clause entry for table t1
LINE 5:   WHERE t1.f1 = s1.t1.f1));
^
HINT:  There is an entry for table t1, but it cannot be referenced
from this part of the query.

(The HINT is a bit confused here, but the query is certainly invalid.)

So what we have here is a potential failure to dump and reload view
definitions, which is a lot more critical in my book than whether
EXPLAIN's output is confusing.

If we stick with the existing rule for attaching a fake alias to renamed
RTEs, I think that Ashutosh's patch or something like it is probably
appropriate, because the variable-printing code ought to be in step with
the RTE-printing code.  Unfortunately, I think the hack to attach a fake
alias to renamed RTEs creates some issues of its own.  Consider

select * from s1.t1
  where exists (select 1 from s2.t2 t1 where t1.f1 = s1.t1.f1);

If s1.t1 is now renamed to s1.tx, it is still possible to express
the same semantics:

select * from s1.tx
  where exists (select 1 from s2.t2 t1 where t1.f1 = s1.tx.f1);

But when we attach a fake alias, it's broken:

select * from s1.tx t1
  where exists (select 1 from s2.t2 t1 where t1.f1 = ?.f1);

There is no way to reference the outer RTE anymore from the subquery,
because the conflicting lower alias masks it.

We may be between a rock and a hard place though, because it's not that
hard to demonstrate cases where not adding a fake alias breaks it too:

select * from s1.t1 tx
  where exists (select 1 from s2.t1 where s2.t1.f1 = tx.f1);

If s2.t1 is renamed to s2.tx, there's no longer any way to reference the
upper alias tx, unless you alias the lower RTE to some different name.
I think that when we put in the fake-alias behavior, we made a value
judgment that this type of situation was more common than the other,
but I'm not really sure why.

Maybe what we need to do instead is create totally-made-up, unique
aliases when something like this 

[HACKERS] Confusing EXPLAIN output in case of inherited tables

2012-01-11 Thread Ashutosh Bapat
Hi,
After running regression, I ran EXPLAIN on one of the queries in regression
(test create_misc) and got following output
regression=# explain verbose select * into table ramp from road where name
~ '.*Ramp';
 QUERY
PLAN

 Result  (cost=0.00..154.00 rows=841 width=67)
   Output: public.road.name, public.road.thepath
   -  Append  (cost=0.00..154.00 rows=841 width=67)
 -  Seq Scan on public.road  (cost=0.00..135.05 rows=418 width=67)
   Output: public.road.name, public.road.thepath
   Filter: (public.road.name ~ '.*Ramp'::text)
 -  Seq Scan on public.ihighway road  (cost=0.00..14.99 rows=367
width=67)
^
   Output: public.road.name, public.road.thepath
   ^^,   ^^
   Filter: (public.road.name ~ '.*Ramp'::text)
 ^^^
 -  Seq Scan on public.shighway road  (cost=0.00..3.96 rows=56
width=67)
   Output: public.road.name, public.road.thepath
   Filter: (public.road.name ~ '.*Ramp'::text)
(12 rows)

regression=# \d+ road
Table public.road
 Column  | Type | Modifiers | Storage  | Stats target | Description
-+--+---+--+--+-
 name| text |   | extended |  |
 thepath | path |   | extended |  |
Indexes:
rix btree (name)
Child tables: ihighway,
  shighway
Has OIDs: no

Table road has children ihighway and shighway as seen in the \d+
output above. The EXPLAIN output of Seq Scan node on children has
public.road as prefix for variables. public.road could imply the parent
table road and thus can cause confusion, as to what's been referreed, the
columns of parent table or child table. In the EXPLAIN output children
tables have road as alias (as against public.road). The alias comes
from RangeTblEntry-eref-aliasname. It might be better to have road as
prefix in the variable names over public.road.

The reason why this happens is the code in get_variable()
3865 /* Exceptions occur only if the RTE is alias-less */
3866 if (rte-alias == NULL)
3867 {
3868 if (rte-rtekind == RTE_RELATION)
3869 {
3870 /*
3871  * It's possible that use of the bare refname would find
another
3872  * more-closely-nested RTE, or be ambiguous, in which case
we need
3873  * to specify the schemaname to avoid these errors.
3874  */
3875 if (find_rte_by_refname(rte-eref-aliasname, context) !=
rte)
3876 schemaname =
get_namespace_name(get_rel_namespace(rte-relid));
3877 }

If there is no alias, we find out the schema name and later add it to the
prefix. In the inherited table case, we are actually creating a kind of
alias for the children table and thus we should not find out the schema
name and add it to the prefix. This case has been taken care of in
get_from_clause_item(),
6505 else if (rte-rtekind == RTE_RELATION 
6506 strcmp(rte-eref-aliasname,
get_relation_name(rte-relid)) != 0)
6507 {
6508 /*
6509  * Apparently the rel has been renamed since the rule was
made.
6510  * Emit a fake alias clause so that variable references
will still
6511  * work.  This is not a 100% solution but should work in
most
6512  * reasonable situations.
6513  */
6514 appendStringInfo(buf,  %s,
6515  quote_identifier(rte-eref-aliasname));
6516 gavealias = true;
6517 }

I see similar code in ExplainTargetRel()
1778 if (objectname == NULL ||
1779 strcmp(rte-eref-aliasname, objectname) != 0)
1780 appendStringInfo(es-str,  %s,
1781  quote_identifier(rte-eref-aliasname));

Based on this, here is patch to not add schemaname in the prefix for a
variable.

I have run make check. All except inherit.sql passed. The expected output
change is included in the patch.

-- 
Best Wishes,
Ashutosh Bapat
EntepriseDB Corporation
The Enterprise Postgres Company
diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
index 9ad54c5..2e87183 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -3858,21 +3858,26 @@ get_variable(Var *var, int levelsup, bool showstar, deparse_context *context)
 		return NULL;
 	}
 
 	/* Identify names to use */
 	schemaname = NULL;			/* default assumptions */
 	refname = rte-eref-aliasname;
 
 	/* Exceptions occur only if the RTE is alias-less */
 	if (rte-alias == NULL)
 	{
-		if (rte-rtekind == RTE_RELATION)
+		/*
+		 * If the rel has been renamed since the rule was made, that's
+		 * 

Re: [HACKERS] Confusing EXPLAIN output in case of inherited tables

2012-01-11 Thread Chetan Suttraway
On Wed, Jan 11, 2012 at 5:13 PM, Ashutosh Bapat 
ashutosh.ba...@enterprisedb.com wrote:

 Hi,
 After running regression, I ran EXPLAIN on one of the queries in
 regression (test create_misc) and got following output
 regression=# explain verbose select * into table ramp from road where name
 ~ '.*Ramp';
  QUERY
 PLAN

 
  Result  (cost=0.00..154.00 rows=841 width=67)
Output: public.road.name, public.road.thepath
-  Append  (cost=0.00..154.00 rows=841 width=67)
  -  Seq Scan on public.road  (cost=0.00..135.05 rows=418 width=67)
Output: public.road.name, public.road.thepath
Filter: (public.road.name ~ '.*Ramp'::text)
  -  Seq Scan on public.ihighway road  (cost=0.00..14.99 rows=367
 width=67)
 ^
Output: public.road.name, public.road.thepath
^^,   ^^
Filter: (public.road.name ~ '.*Ramp'::text)
  ^^^
  -  Seq Scan on public.shighway road  (cost=0.00..3.96 rows=56
 width=67)
Output: public.road.name, public.road.thepath
Filter: (public.road.name ~ '.*Ramp'::text)
 (12 rows)

 regression=# \d+ road
 Table public.road
  Column  | Type | Modifiers | Storage  | Stats target | Description
 -+--+---+--+--+-
  name| text |   | extended |  |
  thepath | path |   | extended |  |
 Indexes:
 rix btree (name)
 Child tables: ihighway,
   shighway
 Has OIDs: no

 Table road has children ihighway and shighway as seen in the \d+
 output above. The EXPLAIN output of Seq Scan node on children has
 public.road as prefix for variables. public.road could imply the parent
 table road and thus can cause confusion, as to what's been referreed, the
 columns of parent table or child table. In the EXPLAIN output children
 tables have road as alias (as against public.road). The alias comes
 from RangeTblEntry-eref-aliasname. It might be better to have road as
 prefix in the variable names over public.road.

 The reason why this happens is the code in get_variable()
 3865 /* Exceptions occur only if the RTE is alias-less */
 3866 if (rte-alias == NULL)
 3867 {
 3868 if (rte-rtekind == RTE_RELATION)
 3869 {
 3870 /*
 3871  * It's possible that use of the bare refname would find
 another
 3872  * more-closely-nested RTE, or be ambiguous, in which
 case we need
 3873  * to specify the schemaname to avoid these errors.
 3874  */
 3875 if (find_rte_by_refname(rte-eref-aliasname, context) !=
 rte)
 3876 schemaname =
 get_namespace_name(get_rel_namespace(rte-relid));
 3877 }

 If there is no alias, we find out the schema name and later add it to the
 prefix. In the inherited table case, we are actually creating a kind of
 alias for the children table and thus we should not find out the schema
 name and add it to the prefix. This case has been taken care of in
 get_from_clause_item(),
 6505 else if (rte-rtekind == RTE_RELATION 
 6506 strcmp(rte-eref-aliasname,
 get_relation_name(rte-relid)) != 0)
 6507 {
 6508 /*
 6509  * Apparently the rel has been renamed since the rule was
 made.
 6510  * Emit a fake alias clause so that variable references
 will still
 6511  * work.  This is not a 100% solution but should work in
 most
 6512  * reasonable situations.
 6513  */
 6514 appendStringInfo(buf,  %s,
 6515  quote_identifier(rte-eref-aliasname));
 6516 gavealias = true;
 6517 }

 I see similar code in ExplainTargetRel()
 1778 if (objectname == NULL ||
 1779 strcmp(rte-eref-aliasname, objectname) != 0)
 1780 appendStringInfo(es-str,  %s,
 1781  quote_identifier(rte-eref-aliasname));

 Based on this, here is patch to not add schemaname in the prefix for a
 variable.

 I have run make check. All except inherit.sql passed. The expected output
 change is included in the patch.

 --
 Best Wishes,
 Ashutosh Bapat
 EntepriseDB Corporation
 The Enterprise Postgres Company



 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers


A table can inherit from one or more parent table. So in that case,
qualifying schema/table name
helps in finding out where the column is coming from.

Regards,
Chetan

-- 
EnterpriseDB Corporation
The Enterprise PostgreSQL Company

 Website: www.enterprisedb.com
EnterpriseDB 

Re: [HACKERS] Confusing EXPLAIN output in case of inherited tables

2012-01-11 Thread Ashutosh Bapat
On Wed, Jan 11, 2012 at 5:25 PM, Chetan Suttraway 
chetan.suttra...@enterprisedb.com wrote:



 On Wed, Jan 11, 2012 at 5:13 PM, Ashutosh Bapat 
 ashutosh.ba...@enterprisedb.com wrote:

 Hi,
 After running regression, I ran EXPLAIN on one of the queries in
 regression (test create_misc) and got following output
 regression=# explain verbose select * into table ramp from road where
 name ~ '.*Ramp';
  QUERY
 PLAN

 
  Result  (cost=0.00..154.00 rows=841 width=67)
Output: public.road.name, public.road.thepath
-  Append  (cost=0.00..154.00 rows=841 width=67)
  -  Seq Scan on public.road  (cost=0.00..135.05 rows=418
 width=67)
Output: public.road.name, public.road.thepath
Filter: (public.road.name ~ '.*Ramp'::text)
  -  Seq Scan on public.ihighway road  (cost=0.00..14.99 rows=367
 width=67)
 ^
Output: public.road.name, public.road.thepath
^^,   ^^
Filter: (public.road.name ~ '.*Ramp'::text)
  ^^^
  -  Seq Scan on public.shighway road  (cost=0.00..3.96 rows=56
 width=67)
Output: public.road.name, public.road.thepath
Filter: (public.road.name ~ '.*Ramp'::text)
 (12 rows)

 regression=# \d+ road
 Table public.road
  Column  | Type | Modifiers | Storage  | Stats target | Description
 -+--+---+--+--+-
  name| text |   | extended |  |
  thepath | path |   | extended |  |
 Indexes:
 rix btree (name)
 Child tables: ihighway,
   shighway
 Has OIDs: no

 Table road has children ihighway and shighway as seen in the \d+
 output above. The EXPLAIN output of Seq Scan node on children has
 public.road as prefix for variables. public.road could imply the parent
 table road and thus can cause confusion, as to what's been referreed, the
 columns of parent table or child table. In the EXPLAIN output children
 tables have road as alias (as against public.road). The alias comes
 from RangeTblEntry-eref-aliasname. It might be better to have road as
 prefix in the variable names over public.road.

 The reason why this happens is the code in get_variable()
 3865 /* Exceptions occur only if the RTE is alias-less */
 3866 if (rte-alias == NULL)
 3867 {
 3868 if (rte-rtekind == RTE_RELATION)
 3869 {
 3870 /*
 3871  * It's possible that use of the bare refname would find
 another
 3872  * more-closely-nested RTE, or be ambiguous, in which
 case we need
 3873  * to specify the schemaname to avoid these errors.
 3874  */
 3875 if (find_rte_by_refname(rte-eref-aliasname, context)
 != rte)
 3876 schemaname =
 get_namespace_name(get_rel_namespace(rte-relid));
 3877 }

 If there is no alias, we find out the schema name and later add it to the
 prefix. In the inherited table case, we are actually creating a kind of
 alias for the children table and thus we should not find out the schema
 name and add it to the prefix. This case has been taken care of in
 get_from_clause_item(),
 6505 else if (rte-rtekind == RTE_RELATION 
 6506 strcmp(rte-eref-aliasname,
 get_relation_name(rte-relid)) != 0)
 6507 {
 6508 /*
 6509  * Apparently the rel has been renamed since the rule
 was made.
 6510  * Emit a fake alias clause so that variable references
 will still
 6511  * work.  This is not a 100% solution but should work in
 most
 6512  * reasonable situations.
 6513  */
 6514 appendStringInfo(buf,  %s,
 6515  quote_identifier(rte-eref-aliasname));
 6516 gavealias = true;
 6517 }

 I see similar code in ExplainTargetRel()
 1778 if (objectname == NULL ||
 1779 strcmp(rte-eref-aliasname, objectname) != 0)
 1780 appendStringInfo(es-str,  %s,
 1781  quote_identifier(rte-eref-aliasname));

 Based on this, here is patch to not add schemaname in the prefix for a
 variable.

 I have run make check. All except inherit.sql passed. The expected output
 change is included in the patch.

 --
 Best Wishes,
 Ashutosh Bapat
 EntepriseDB Corporation
 The Enterprise Postgres Company



 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers


 A table can inherit from one or more parent table. So in that case,
 qualifying schema/table name
 helps in finding out where the column is coming from.


Do you have any example