Re: [HACKERS] Confusing EXPLAIN output in case of inherited tables
I got interested in this problem again now that we have a user complaint about it (bug #7553). Ashutosh Bapat ashutosh.ba...@enterprisedb.com writes: On Wed, Feb 1, 2012 at 11:01 PM, Robert Haas robertmh...@gmail.com wrote: On Wed, Feb 1, 2012 at 12:23 PM, Tom Lane t...@sss.pgh.pa.us wrote: ... It seems reasonable to me to try to handle relation renames but draw the line at disambiguating column names. But others might find that distinction artificial. I sure do. For me the relation names problem and column aliases problems are two independent problems. I think they are independent problems, and I also think that people are far less likely to trip over column-name problems in practice. Columns of a table are not independent objects and so people aren't so likely to think they can just rename them freely. Moreover, if you rename columns that are used in views, you can get breakage of things like USING or NATURAL joins, and that is something we *cannot* provide a workaround for --- it's a failure inherent in the language definition. As far as the relation-rename problem goes, I propose that what we should do is have ruleutils.c invent nonconflicting fake aliases for each RTE in the query tree. This would allow getting rid of some of the dubious heuristics in get_variable: it should just print the chosen alias and be done. (It still has to do something different for unnamed joins, but we can leave that part alone I think.) We can do this as follows: 1. If there's a user-assigned alias, use that. (It's possible this is not unique within the query, but that's okay because any actual variable reference must be to the most closely nested such RTE.) 2. Otherwise, if the relation's current name doesn't conflict with any previously-assigned alias, use that. 3. Otherwise, append something (underscore and some digits probably) to the relation's current name to construct a string not matching any previously-assigned alias. This might result in printouts that are a bit uglier than the old way in such cases, but anybody who's offended can select their own aliases. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Confusing EXPLAIN output in case of inherited tables
Looking at the code, it seems that the fake aliases (eref) for relations (may be views as well) are not generated per say, but they do not get changed when the relation name changes OR in case of inherited tables, they do not get changed when the inheritance is expanded (expand_inherited_rtentry). So, there is not question of generating them so as to not collide with other aliases in the query. However I did not find answers to these questions 1. What is the use of eref in case of relation when the relation name itself can be provided by the reloid? 2. Can we use schema qualified relation name in get_from_clause_item() and get_variable() instead of use eref-aliasname. I have noticed that the logic in get_rte_attribute_name() gives preference to the column names in catalog tables over eref-colnames. Anyone? On Mon, Jan 30, 2012 at 10:26 PM, Tom Lane t...@sss.pgh.pa.us wrote: Ashutosh Bapat ashutosh.ba...@enterprisedb.com writes: So, as I understand we have two problems here 1. Prefixing schemaname to the fake alises if there is another RTE with same name. There may not be a relation with that name (fake alias name given) in the schema chosen as prefix. 2. Fake aliases themselves can be conflicting. Well, the issue is more that a fake alias might unintentionally collide with a regular alias elsewhere in the query. There's no guard against that in the current behavior, and ISTM there needs to be. The one possibly-simplifying thing about this whole issue is that we needn't cater for references that couldn't have been made in the original query. For instance, if the inner and outer queries both have explicit aliases tx, it's impossible for the inner query to have referred to any columns of the outer tx --- so we don't have to try to make it possible in the dumped form. If I understand correctly, if we solve the second problem, first problem will not occur. Is that correct? I don't believe that the problem has anything to do with the names of other tables that might happen to exist in the database. It's a matter of what RTE names/aliases are exposed for variable references in different parts of the query. regards, tom lane -- Best Wishes, Ashutosh Bapat EntepriseDB Corporation The Enterprise Postgres Company
Re: [HACKERS] Confusing EXPLAIN output in case of inherited tables
Ashutosh Bapat ashutosh.ba...@enterprisedb.com writes: Looking at the code, it seems that the fake aliases (eref) for relations (may be views as well) are not generated per say, but they do not get changed when the relation name changes OR in case of inherited tables, they do not get changed when the inheritance is expanded (expand_inherited_rtentry). So, there is not question of generating them so as to not collide with other aliases in the query. Well, what I was considering was exactly generating new aliases that don't collide with anything else in the query. The fact that the code doesn't do that now doesn't mean we can't make it do that. However I did not find answers to these questions 1. What is the use of eref in case of relation when the relation name itself can be provided by the reloid? eref is stored mainly so that parsing code doesn't have to repeatedly look up what the effective RTE name is. The alias field is meant to represent whether there was an AS clause or not, and if so exactly what it said. So eref is a derived result whereas alias is essentially raw grammar output. Because of the possibility that the relation gets renamed, it's probably best if we don't rely on eref anymore after initial parsing of a query, ie ruleutils.c probably shouldn't use it. (Too lazy to go check right now if that's already true, but it seems like a good goal to pursue if we're going to change this code.) 2. Can we use schema qualified relation name in get_from_clause_item() and get_variable() instead of use eref-aliasname. No. If there is an alias, it is flat wrong to use the relation name instead, with or without schema name. You might want to go study the SQL spec a bit in this area. I have noticed that the logic in get_rte_attribute_name() gives preference to the column names in catalog tables over eref-colnames. Hm. What it should probably do is look at alias first, and if the alias field doesn't specify a column name, then go to the catalogs to get the current name. Thinking about this some more, it seems like there are ways for a user to shoot himself in the foot pretty much irretrievably. Consider CREATE TABLE t (x int); CREATE VIEW v AS SELECT y FROM t t(y); ALTER TABLE t ADD COLUMN y int; On dump and reload, we'll have CREATE TABLE t (x int, y int); CREATE VIEW v AS SELECT y FROM t t(y); and now the CREATE VIEW will fail, complaining (correctly) that the column reference y is ambiguous. Should ruleutils be expected to take it upon itself to prevent that? We could conceive of fixing it by inventing column aliases out of whole cloth: CREATE VIEW v AS SELECT y FROM t t(y, the_other_y); but that seems a little much, not to mention that such a view definition would be horribly confusing to work with. On the other hand it isn't all that far beyond what I had in mind of inventing relation aliases to cure relation-name conflicts. Should we take the existence of such cases as evidence that we shouldn't try hard in this area? It seems reasonable to me to try to handle relation renames but draw the line at disambiguating column names. But others might find that distinction artificial. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Confusing EXPLAIN output in case of inherited tables
On Wed, Feb 1, 2012 at 12:23 PM, Tom Lane t...@sss.pgh.pa.us wrote: On the other hand it isn't all that far beyond what I had in mind of inventing relation aliases to cure relation-name conflicts. Should we take the existence of such cases as evidence that we shouldn't try hard in this area? It seems reasonable to me to try to handle relation renames but draw the line at disambiguating column names. But others might find that distinction artificial. I sure do. I mean, in Oracle, if you rename a table or column involved in a view, then the view breaks. Blammo! The reference is by object name, not by some internal identifier a la OID. If you put back an object with the correct name (either the original one or a different one), you can re-enable the view. We've decide that we don't want that behavior: instead, our references are to the object itself rather than to the name of the object. Renaming the object doesn't change what the reference points to. But given that position, it seems to me that we ought to be willing to work pretty hard to make sure that when we dump-and-reload the database, things stay sane. Otherwise, we're sort of in this unsatisfying in-between place where references are *mostly* by internal identifier but everyone once in a while it falls apart and name collisions can break everything. Yech! -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Confusing EXPLAIN output in case of inherited tables
On Wed, Feb 1, 2012 at 10:53 PM, Tom Lane t...@sss.pgh.pa.us wrote: Ashutosh Bapat ashutosh.ba...@enterprisedb.com writes: Looking at the code, it seems that the fake aliases (eref) for relations (may be views as well) are not generated per say, but they do not get changed when the relation name changes OR in case of inherited tables, they do not get changed when the inheritance is expanded (expand_inherited_rtentry). So, there is not question of generating them so as to not collide with other aliases in the query. Well, what I was considering was exactly generating new aliases that don't collide with anything else in the query. The fact that the code doesn't do that now doesn't mean we can't make it do that. However I did not find answers to these questions 1. What is the use of eref in case of relation when the relation name itself can be provided by the reloid? eref is stored mainly so that parsing code doesn't have to repeatedly look up what the effective RTE name is. The alias field is meant to represent whether there was an AS clause or not, and if so exactly what it said. So eref is a derived result whereas alias is essentially raw grammar output. Because of the possibility that the relation gets renamed, it's probably best if we don't rely on eref anymore after initial parsing of a query, ie ruleutils.c probably shouldn't use it. (Too lazy to go check right now if that's already true, but it seems like a good goal to pursue if we're going to change this code.) 2. Can we use schema qualified relation name in get_from_clause_item() and get_variable() instead of use eref-aliasname. No. If there is an alias, it is flat wrong to use the relation name instead, with or without schema name. You might want to go study the SQL spec a bit in this area. To clarify matters a bit, item 2 is in conjunction with item 1. Aliases, if provided, are output irrespective of whether we get the relation name from eref or catalogs. ruleutils should just ignore eref (for RTE_RELATION only) and get the relation name from given OID. I have noticed that the logic in get_rte_attribute_name() gives preference to the column names in catalog tables over eref-colnames. Hm. What it should probably do is look at alias first, and if the alias field doesn't specify a column name, then go to the catalogs to get the current name. It does give preference to aliases today. I compared preferences of colnames in eref and that obtained from catalogs. Thinking about this some more, it seems like there are ways for a user to shoot himself in the foot pretty much irretrievably. Consider CREATE TABLE t (x int); CREATE VIEW v AS SELECT y FROM t t(y); ALTER TABLE t ADD COLUMN y int; On dump and reload, we'll have CREATE TABLE t (x int, y int); CREATE VIEW v AS SELECT y FROM t t(y); and now the CREATE VIEW will fail, complaining (correctly) that the column reference y is ambiguous. Should ruleutils be expected to take it upon itself to prevent that? We could conceive of fixing it by inventing column aliases out of whole cloth: CREATE VIEW v AS SELECT y FROM t t(y, the_other_y); but that seems a little much, not to mention that such a view definition would be horribly confusing to work with. On the other hand it isn't all that far beyond what I had in mind of inventing relation aliases to cure relation-name conflicts. Should we take the existence of such cases as evidence that we shouldn't try hard in this area? It seems reasonable to me to try to handle relation renames but draw the line at disambiguating column names. But others might find that distinction artificial. I agree. The example of the colnames was only to show that the preference alias relation information from catalogs eref exists somewhere in the code. regards, tom lane -- Best Wishes, Ashutosh Bapat EntepriseDB Corporation The Enterprise Postgres Company
Re: [HACKERS] Confusing EXPLAIN output in case of inherited tables
On Wed, Feb 1, 2012 at 11:01 PM, Robert Haas robertmh...@gmail.com wrote: On Wed, Feb 1, 2012 at 12:23 PM, Tom Lane t...@sss.pgh.pa.us wrote: On the other hand it isn't all that far beyond what I had in mind of inventing relation aliases to cure relation-name conflicts. Should we take the existence of such cases as evidence that we shouldn't try hard in this area? It seems reasonable to me to try to handle relation renames but draw the line at disambiguating column names. But others might find that distinction artificial. I sure do. I mean, in Oracle, if you rename a table or column involved in a view, then the view breaks. Blammo! The reference is by object name, not by some internal identifier a la OID. If you put back an object with the correct name (either the original one or a different one), you can re-enable the view. We've decide that we don't want that behavior: instead, our references are to the object itself rather than to the name of the object. Renaming the object doesn't change what the reference points to. But given that position, it seems to me that we ought to be willing to work pretty hard to make sure that when we dump-and-reload the database, things stay sane. Otherwise, we're sort of in this unsatisfying in-between place where references are *mostly* by internal identifier but everyone once in a while it falls apart and name collisions can break everything. Yech! For me the relation names problem and column aliases problems are two independent problems. While the first one looks easy to fix, the other problem may be hard to solve. We can solve the first problem and things will be better than what we have today. If you agree, I will provide a patch to fix the relation names problems by ignoring the eref (for RTE_RELATION only) in ruleutils. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Best Wishes, Ashutosh Bapat EntepriseDB Corporation The Enterprise Postgres Company
Re: [HACKERS] Confusing EXPLAIN output in case of inherited tables
Thanks Tom for giving a stronger case. I found the problem whille looking at inherited tables, and didn't think beyond inherited tables. Since inherited tables are expanded when subquery planner is invoked, I thought the problem will occur only in Explain output as we won't generate queries, that can be used elsewhere after/during planning. So, as I understand we have two problems here 1. Prefixing schemaname to the fake alises if there is another RTE with same name. There may not be a relation with that name (fake alias name given) in the schema chosen as prefix. 2. Fake aliases themselves can be conflicting. If I understand correctly, if we solve the second problem, first problem will not occur. Is that correct? On Sat, Jan 28, 2012 at 8:08 AM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: It's a feature, not a bug, that we schema-qualify names when VERBOSE is specified. That was done on purpose for the benefit of external tools that might need this information to disambiguate which object is being referenced. Table *aliases*, of course, should not be schema-qualified, but I don't think that's what we're doing. You could make it more clear by including an alias in the query, like this: explain verbose select * into table ramp from road hwy where name ~ '.*Ramp'; I think you are both focusing on the wrong thing. There is a lot of squishiness in what EXPLAIN prints out, since SQL notation is not always well suited to what an execution plan actually does. But this code has a hard and fast requirement that it dump view definitions correctly, else pg_dump doesn't work. And after looking at this I think Ashutosh has in fact found a bug. Consider this example: regression=# create schema s1; CREATE SCHEMA regression=# create schema s2; CREATE SCHEMA regression=# create table s1.t1 (f1 int); CREATE TABLE regression=# create table s2.t1 (f1 int); CREATE TABLE regression=# create view v1 as regression-# select * from s1.t1 where exists ( regression(# select 1 from s2.t1 where s2.t1.f1 = s1.t1.f1 regression(# ); CREATE VIEW regression=# \d+ v1 View public.v1 Column | Type | Modifiers | Storage | Description +-+---+-+- f1 | integer | | plain | View definition: SELECT t1.f1 FROM s1.t1 WHERE (EXISTS ( SELECT 1 FROM s2.t1 WHERE t1.f1 = s1.t1.f1)); regression=# alter table s2.t1 rename to tx; ALTER TABLE regression=# \d+ v1 View public.v1 Column | Type | Modifiers | Storage | Description +-+---+-+- f1 | integer | | plain | View definition: SELECT t1.f1 FROM s1.t1 WHERE (EXISTS ( SELECT 1 FROM s2.tx t1 WHERE t1.f1 = s1.t1.f1)); Both of the above displays of the view are formally correct, in that the variables will be taken to refer to the correct upper or lower RTE. But let's change that back and rename the other table: regression=# alter table s2.tx rename to t1; ALTER TABLE regression=# alter table s1.t1 rename to tx; ALTER TABLE regression=# \d+ v1 View public.v1 Column | Type | Modifiers | Storage | Description +-+---+-+- f1 | integer | | plain | View definition: SELECT t1.f1 FROM s1.tx t1 WHERE (EXISTS ( SELECT 1 FROM s2.t1 WHERE t1.f1 = s1.t1.f1)); This is just plain wrong, as you'll see if you try to execute that query: regression=# SELECT t1.f1 regression-#FROM s1.tx t1 regression-# WHERE (EXISTS ( SELECT 1 regression(#FROM s2.t1 regression(# WHERE t1.f1 = s1.t1.f1)); ERROR: invalid reference to FROM-clause entry for table t1 LINE 5: WHERE t1.f1 = s1.t1.f1)); ^ HINT: There is an entry for table t1, but it cannot be referenced from this part of the query. (The HINT is a bit confused here, but the query is certainly invalid.) So what we have here is a potential failure to dump and reload view definitions, which is a lot more critical in my book than whether EXPLAIN's output is confusing. If we stick with the existing rule for attaching a fake alias to renamed RTEs, I think that Ashutosh's patch or something like it is probably appropriate, because the variable-printing code ought to be in step with the RTE-printing code. Unfortunately, I think the hack to attach a fake alias to renamed RTEs creates some issues of its own. Consider select * from s1.t1 where exists (select 1 from s2.t2 t1 where t1.f1 = s1.t1.f1); If s1.t1 is now renamed to s1.tx, it is still possible to express the same semantics: select * from s1.tx where exists (select 1 from s2.t2 t1 where t1.f1 = s1.tx.f1); But when we attach a fake alias, it's broken:
Re: [HACKERS] Confusing EXPLAIN output in case of inherited tables
Ashutosh Bapat ashutosh.ba...@enterprisedb.com writes: So, as I understand we have two problems here 1. Prefixing schemaname to the fake alises if there is another RTE with same name. There may not be a relation with that name (fake alias name given) in the schema chosen as prefix. 2. Fake aliases themselves can be conflicting. Well, the issue is more that a fake alias might unintentionally collide with a regular alias elsewhere in the query. There's no guard against that in the current behavior, and ISTM there needs to be. The one possibly-simplifying thing about this whole issue is that we needn't cater for references that couldn't have been made in the original query. For instance, if the inner and outer queries both have explicit aliases tx, it's impossible for the inner query to have referred to any columns of the outer tx --- so we don't have to try to make it possible in the dumped form. If I understand correctly, if we solve the second problem, first problem will not occur. Is that correct? I don't believe that the problem has anything to do with the names of other tables that might happen to exist in the database. It's a matter of what RTE names/aliases are exposed for variable references in different parts of the query. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Confusing EXPLAIN output in case of inherited tables
I don't believe that the problem has anything to do with the names of other tables that might happen to exist in the database. It's a matter of what RTE names/aliases are exposed for variable references in different parts of the query. Names of other tables come into picture when we schema qualify the fake aliases in the generated query. See examples in first post. regards, tom lane -- Best Wishes, Ashutosh Bapat EntepriseDB Corporation The Enterprise Postgres Company
Re: [HACKERS] Confusing EXPLAIN output in case of inherited tables
On Wed, Jan 11, 2012 at 6:43 AM, Ashutosh Bapat ashutosh.ba...@enterprisedb.com wrote: Hi, After running regression, I ran EXPLAIN on one of the queries in regression (test create_misc) and got following output regression=# explain verbose select * into table ramp from road where name ~ '.*Ramp'; QUERY PLAN Result (cost=0.00..154.00 rows=841 width=67) Output: public.road.name, public.road.thepath - Append (cost=0.00..154.00 rows=841 width=67) - Seq Scan on public.road (cost=0.00..135.05 rows=418 width=67) Output: public.road.name, public.road.thepath Filter: (public.road.name ~ '.*Ramp'::text) - Seq Scan on public.ihighway road (cost=0.00..14.99 rows=367 width=67) ^ Output: public.road.name, public.road.thepath ^^, ^^ Filter: (public.road.name ~ '.*Ramp'::text) ^^^ - Seq Scan on public.shighway road (cost=0.00..3.96 rows=56 width=67) Output: public.road.name, public.road.thepath Filter: (public.road.name ~ '.*Ramp'::text) (12 rows) regression=# \d+ road Table public.road Column | Type | Modifiers | Storage | Stats target | Description -+--+---+--+--+- name | text | | extended | | thepath | path | | extended | | Indexes: rix btree (name) Child tables: ihighway, shighway Has OIDs: no Table road has children ihighway and shighway as seen in the \d+ output above. The EXPLAIN output of Seq Scan node on children has public.road as prefix for variables. public.road could imply the parent table road and thus can cause confusion, as to what's been referreed, the columns of parent table or child table. In the EXPLAIN output children tables have road as alias (as against public.road). The alias comes from RangeTblEntry-eref-aliasname. It might be better to have road as prefix in the variable names over public.road. It's a feature, not a bug, that we schema-qualify names when VERBOSE is specified. That was done on purpose for the benefit of external tools that might need this information to disambiguate which object is being referenced. Table *aliases*, of course, should not be schema-qualified, but I don't think that's what we're doing. You could make it more clear by including an alias in the query, like this: explain verbose select * into table ramp from road hwy where name ~ '.*Ramp'; -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Confusing EXPLAIN output in case of inherited tables
Robert Haas robertmh...@gmail.com writes: It's a feature, not a bug, that we schema-qualify names when VERBOSE is specified. That was done on purpose for the benefit of external tools that might need this information to disambiguate which object is being referenced. Table *aliases*, of course, should not be schema-qualified, but I don't think that's what we're doing. You could make it more clear by including an alias in the query, like this: explain verbose select * into table ramp from road hwy where name ~ '.*Ramp'; I think you are both focusing on the wrong thing. There is a lot of squishiness in what EXPLAIN prints out, since SQL notation is not always well suited to what an execution plan actually does. But this code has a hard and fast requirement that it dump view definitions correctly, else pg_dump doesn't work. And after looking at this I think Ashutosh has in fact found a bug. Consider this example: regression=# create schema s1; CREATE SCHEMA regression=# create schema s2; CREATE SCHEMA regression=# create table s1.t1 (f1 int); CREATE TABLE regression=# create table s2.t1 (f1 int); CREATE TABLE regression=# create view v1 as regression-# select * from s1.t1 where exists ( regression(# select 1 from s2.t1 where s2.t1.f1 = s1.t1.f1 regression(# ); CREATE VIEW regression=# \d+ v1 View public.v1 Column | Type | Modifiers | Storage | Description +-+---+-+- f1 | integer | | plain | View definition: SELECT t1.f1 FROM s1.t1 WHERE (EXISTS ( SELECT 1 FROM s2.t1 WHERE t1.f1 = s1.t1.f1)); regression=# alter table s2.t1 rename to tx; ALTER TABLE regression=# \d+ v1 View public.v1 Column | Type | Modifiers | Storage | Description +-+---+-+- f1 | integer | | plain | View definition: SELECT t1.f1 FROM s1.t1 WHERE (EXISTS ( SELECT 1 FROM s2.tx t1 WHERE t1.f1 = s1.t1.f1)); Both of the above displays of the view are formally correct, in that the variables will be taken to refer to the correct upper or lower RTE. But let's change that back and rename the other table: regression=# alter table s2.tx rename to t1; ALTER TABLE regression=# alter table s1.t1 rename to tx; ALTER TABLE regression=# \d+ v1 View public.v1 Column | Type | Modifiers | Storage | Description +-+---+-+- f1 | integer | | plain | View definition: SELECT t1.f1 FROM s1.tx t1 WHERE (EXISTS ( SELECT 1 FROM s2.t1 WHERE t1.f1 = s1.t1.f1)); This is just plain wrong, as you'll see if you try to execute that query: regression=# SELECT t1.f1 regression-#FROM s1.tx t1 regression-# WHERE (EXISTS ( SELECT 1 regression(#FROM s2.t1 regression(# WHERE t1.f1 = s1.t1.f1)); ERROR: invalid reference to FROM-clause entry for table t1 LINE 5: WHERE t1.f1 = s1.t1.f1)); ^ HINT: There is an entry for table t1, but it cannot be referenced from this part of the query. (The HINT is a bit confused here, but the query is certainly invalid.) So what we have here is a potential failure to dump and reload view definitions, which is a lot more critical in my book than whether EXPLAIN's output is confusing. If we stick with the existing rule for attaching a fake alias to renamed RTEs, I think that Ashutosh's patch or something like it is probably appropriate, because the variable-printing code ought to be in step with the RTE-printing code. Unfortunately, I think the hack to attach a fake alias to renamed RTEs creates some issues of its own. Consider select * from s1.t1 where exists (select 1 from s2.t2 t1 where t1.f1 = s1.t1.f1); If s1.t1 is now renamed to s1.tx, it is still possible to express the same semantics: select * from s1.tx where exists (select 1 from s2.t2 t1 where t1.f1 = s1.tx.f1); But when we attach a fake alias, it's broken: select * from s1.tx t1 where exists (select 1 from s2.t2 t1 where t1.f1 = ?.f1); There is no way to reference the outer RTE anymore from the subquery, because the conflicting lower alias masks it. We may be between a rock and a hard place though, because it's not that hard to demonstrate cases where not adding a fake alias breaks it too: select * from s1.t1 tx where exists (select 1 from s2.t1 where s2.t1.f1 = tx.f1); If s2.t1 is renamed to s2.tx, there's no longer any way to reference the upper alias tx, unless you alias the lower RTE to some different name. I think that when we put in the fake-alias behavior, we made a value judgment that this type of situation was more common than the other, but I'm not really sure why. Maybe what we need to do instead is create totally-made-up, unique aliases when something like this
[HACKERS] Confusing EXPLAIN output in case of inherited tables
Hi, After running regression, I ran EXPLAIN on one of the queries in regression (test create_misc) and got following output regression=# explain verbose select * into table ramp from road where name ~ '.*Ramp'; QUERY PLAN Result (cost=0.00..154.00 rows=841 width=67) Output: public.road.name, public.road.thepath - Append (cost=0.00..154.00 rows=841 width=67) - Seq Scan on public.road (cost=0.00..135.05 rows=418 width=67) Output: public.road.name, public.road.thepath Filter: (public.road.name ~ '.*Ramp'::text) - Seq Scan on public.ihighway road (cost=0.00..14.99 rows=367 width=67) ^ Output: public.road.name, public.road.thepath ^^, ^^ Filter: (public.road.name ~ '.*Ramp'::text) ^^^ - Seq Scan on public.shighway road (cost=0.00..3.96 rows=56 width=67) Output: public.road.name, public.road.thepath Filter: (public.road.name ~ '.*Ramp'::text) (12 rows) regression=# \d+ road Table public.road Column | Type | Modifiers | Storage | Stats target | Description -+--+---+--+--+- name| text | | extended | | thepath | path | | extended | | Indexes: rix btree (name) Child tables: ihighway, shighway Has OIDs: no Table road has children ihighway and shighway as seen in the \d+ output above. The EXPLAIN output of Seq Scan node on children has public.road as prefix for variables. public.road could imply the parent table road and thus can cause confusion, as to what's been referreed, the columns of parent table or child table. In the EXPLAIN output children tables have road as alias (as against public.road). The alias comes from RangeTblEntry-eref-aliasname. It might be better to have road as prefix in the variable names over public.road. The reason why this happens is the code in get_variable() 3865 /* Exceptions occur only if the RTE is alias-less */ 3866 if (rte-alias == NULL) 3867 { 3868 if (rte-rtekind == RTE_RELATION) 3869 { 3870 /* 3871 * It's possible that use of the bare refname would find another 3872 * more-closely-nested RTE, or be ambiguous, in which case we need 3873 * to specify the schemaname to avoid these errors. 3874 */ 3875 if (find_rte_by_refname(rte-eref-aliasname, context) != rte) 3876 schemaname = get_namespace_name(get_rel_namespace(rte-relid)); 3877 } If there is no alias, we find out the schema name and later add it to the prefix. In the inherited table case, we are actually creating a kind of alias for the children table and thus we should not find out the schema name and add it to the prefix. This case has been taken care of in get_from_clause_item(), 6505 else if (rte-rtekind == RTE_RELATION 6506 strcmp(rte-eref-aliasname, get_relation_name(rte-relid)) != 0) 6507 { 6508 /* 6509 * Apparently the rel has been renamed since the rule was made. 6510 * Emit a fake alias clause so that variable references will still 6511 * work. This is not a 100% solution but should work in most 6512 * reasonable situations. 6513 */ 6514 appendStringInfo(buf, %s, 6515 quote_identifier(rte-eref-aliasname)); 6516 gavealias = true; 6517 } I see similar code in ExplainTargetRel() 1778 if (objectname == NULL || 1779 strcmp(rte-eref-aliasname, objectname) != 0) 1780 appendStringInfo(es-str, %s, 1781 quote_identifier(rte-eref-aliasname)); Based on this, here is patch to not add schemaname in the prefix for a variable. I have run make check. All except inherit.sql passed. The expected output change is included in the patch. -- Best Wishes, Ashutosh Bapat EntepriseDB Corporation The Enterprise Postgres Company diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c index 9ad54c5..2e87183 100644 --- a/src/backend/utils/adt/ruleutils.c +++ b/src/backend/utils/adt/ruleutils.c @@ -3858,21 +3858,26 @@ get_variable(Var *var, int levelsup, bool showstar, deparse_context *context) return NULL; } /* Identify names to use */ schemaname = NULL; /* default assumptions */ refname = rte-eref-aliasname; /* Exceptions occur only if the RTE is alias-less */ if (rte-alias == NULL) { - if (rte-rtekind == RTE_RELATION) + /* + * If the rel has been renamed since the rule was made, that's + *
Re: [HACKERS] Confusing EXPLAIN output in case of inherited tables
On Wed, Jan 11, 2012 at 5:13 PM, Ashutosh Bapat ashutosh.ba...@enterprisedb.com wrote: Hi, After running regression, I ran EXPLAIN on one of the queries in regression (test create_misc) and got following output regression=# explain verbose select * into table ramp from road where name ~ '.*Ramp'; QUERY PLAN Result (cost=0.00..154.00 rows=841 width=67) Output: public.road.name, public.road.thepath - Append (cost=0.00..154.00 rows=841 width=67) - Seq Scan on public.road (cost=0.00..135.05 rows=418 width=67) Output: public.road.name, public.road.thepath Filter: (public.road.name ~ '.*Ramp'::text) - Seq Scan on public.ihighway road (cost=0.00..14.99 rows=367 width=67) ^ Output: public.road.name, public.road.thepath ^^, ^^ Filter: (public.road.name ~ '.*Ramp'::text) ^^^ - Seq Scan on public.shighway road (cost=0.00..3.96 rows=56 width=67) Output: public.road.name, public.road.thepath Filter: (public.road.name ~ '.*Ramp'::text) (12 rows) regression=# \d+ road Table public.road Column | Type | Modifiers | Storage | Stats target | Description -+--+---+--+--+- name| text | | extended | | thepath | path | | extended | | Indexes: rix btree (name) Child tables: ihighway, shighway Has OIDs: no Table road has children ihighway and shighway as seen in the \d+ output above. The EXPLAIN output of Seq Scan node on children has public.road as prefix for variables. public.road could imply the parent table road and thus can cause confusion, as to what's been referreed, the columns of parent table or child table. In the EXPLAIN output children tables have road as alias (as against public.road). The alias comes from RangeTblEntry-eref-aliasname. It might be better to have road as prefix in the variable names over public.road. The reason why this happens is the code in get_variable() 3865 /* Exceptions occur only if the RTE is alias-less */ 3866 if (rte-alias == NULL) 3867 { 3868 if (rte-rtekind == RTE_RELATION) 3869 { 3870 /* 3871 * It's possible that use of the bare refname would find another 3872 * more-closely-nested RTE, or be ambiguous, in which case we need 3873 * to specify the schemaname to avoid these errors. 3874 */ 3875 if (find_rte_by_refname(rte-eref-aliasname, context) != rte) 3876 schemaname = get_namespace_name(get_rel_namespace(rte-relid)); 3877 } If there is no alias, we find out the schema name and later add it to the prefix. In the inherited table case, we are actually creating a kind of alias for the children table and thus we should not find out the schema name and add it to the prefix. This case has been taken care of in get_from_clause_item(), 6505 else if (rte-rtekind == RTE_RELATION 6506 strcmp(rte-eref-aliasname, get_relation_name(rte-relid)) != 0) 6507 { 6508 /* 6509 * Apparently the rel has been renamed since the rule was made. 6510 * Emit a fake alias clause so that variable references will still 6511 * work. This is not a 100% solution but should work in most 6512 * reasonable situations. 6513 */ 6514 appendStringInfo(buf, %s, 6515 quote_identifier(rte-eref-aliasname)); 6516 gavealias = true; 6517 } I see similar code in ExplainTargetRel() 1778 if (objectname == NULL || 1779 strcmp(rte-eref-aliasname, objectname) != 0) 1780 appendStringInfo(es-str, %s, 1781 quote_identifier(rte-eref-aliasname)); Based on this, here is patch to not add schemaname in the prefix for a variable. I have run make check. All except inherit.sql passed. The expected output change is included in the patch. -- Best Wishes, Ashutosh Bapat EntepriseDB Corporation The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers A table can inherit from one or more parent table. So in that case, qualifying schema/table name helps in finding out where the column is coming from. Regards, Chetan -- EnterpriseDB Corporation The Enterprise PostgreSQL Company Website: www.enterprisedb.com EnterpriseDB
Re: [HACKERS] Confusing EXPLAIN output in case of inherited tables
On Wed, Jan 11, 2012 at 5:25 PM, Chetan Suttraway chetan.suttra...@enterprisedb.com wrote: On Wed, Jan 11, 2012 at 5:13 PM, Ashutosh Bapat ashutosh.ba...@enterprisedb.com wrote: Hi, After running regression, I ran EXPLAIN on one of the queries in regression (test create_misc) and got following output regression=# explain verbose select * into table ramp from road where name ~ '.*Ramp'; QUERY PLAN Result (cost=0.00..154.00 rows=841 width=67) Output: public.road.name, public.road.thepath - Append (cost=0.00..154.00 rows=841 width=67) - Seq Scan on public.road (cost=0.00..135.05 rows=418 width=67) Output: public.road.name, public.road.thepath Filter: (public.road.name ~ '.*Ramp'::text) - Seq Scan on public.ihighway road (cost=0.00..14.99 rows=367 width=67) ^ Output: public.road.name, public.road.thepath ^^, ^^ Filter: (public.road.name ~ '.*Ramp'::text) ^^^ - Seq Scan on public.shighway road (cost=0.00..3.96 rows=56 width=67) Output: public.road.name, public.road.thepath Filter: (public.road.name ~ '.*Ramp'::text) (12 rows) regression=# \d+ road Table public.road Column | Type | Modifiers | Storage | Stats target | Description -+--+---+--+--+- name| text | | extended | | thepath | path | | extended | | Indexes: rix btree (name) Child tables: ihighway, shighway Has OIDs: no Table road has children ihighway and shighway as seen in the \d+ output above. The EXPLAIN output of Seq Scan node on children has public.road as prefix for variables. public.road could imply the parent table road and thus can cause confusion, as to what's been referreed, the columns of parent table or child table. In the EXPLAIN output children tables have road as alias (as against public.road). The alias comes from RangeTblEntry-eref-aliasname. It might be better to have road as prefix in the variable names over public.road. The reason why this happens is the code in get_variable() 3865 /* Exceptions occur only if the RTE is alias-less */ 3866 if (rte-alias == NULL) 3867 { 3868 if (rte-rtekind == RTE_RELATION) 3869 { 3870 /* 3871 * It's possible that use of the bare refname would find another 3872 * more-closely-nested RTE, or be ambiguous, in which case we need 3873 * to specify the schemaname to avoid these errors. 3874 */ 3875 if (find_rte_by_refname(rte-eref-aliasname, context) != rte) 3876 schemaname = get_namespace_name(get_rel_namespace(rte-relid)); 3877 } If there is no alias, we find out the schema name and later add it to the prefix. In the inherited table case, we are actually creating a kind of alias for the children table and thus we should not find out the schema name and add it to the prefix. This case has been taken care of in get_from_clause_item(), 6505 else if (rte-rtekind == RTE_RELATION 6506 strcmp(rte-eref-aliasname, get_relation_name(rte-relid)) != 0) 6507 { 6508 /* 6509 * Apparently the rel has been renamed since the rule was made. 6510 * Emit a fake alias clause so that variable references will still 6511 * work. This is not a 100% solution but should work in most 6512 * reasonable situations. 6513 */ 6514 appendStringInfo(buf, %s, 6515 quote_identifier(rte-eref-aliasname)); 6516 gavealias = true; 6517 } I see similar code in ExplainTargetRel() 1778 if (objectname == NULL || 1779 strcmp(rte-eref-aliasname, objectname) != 0) 1780 appendStringInfo(es-str, %s, 1781 quote_identifier(rte-eref-aliasname)); Based on this, here is patch to not add schemaname in the prefix for a variable. I have run make check. All except inherit.sql passed. The expected output change is included in the patch. -- Best Wishes, Ashutosh Bapat EntepriseDB Corporation The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers A table can inherit from one or more parent table. So in that case, qualifying schema/table name helps in finding out where the column is coming from. Do you have any example