Re: [HACKERS] LEFT JOIN LATERAL can remove rows from LHS
On 06/18/2013 01:52 AM, Jeremy Evans wrote: Maybe I am misunderstanding how LATERAL is supposed to work, but my expectation is that doing a LEFT JOIN should not remove rows from the LHS. I have added this to the list of 9.3 blockers. https://wiki.postgresql.org/wiki/PostgreSQL_9.3_Open_Items -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] LEFT JOIN LATERAL can remove rows from LHS
Maybe I am misunderstanding how LATERAL is supposed to work, but my expectation is that doing a LEFT JOIN should not remove rows from the LHS. I would expect all of the following select queries would return a single row, but that isn't the case: CREATE TABLE i (n integer); CREATE TABLE j (n integer); INSERT INTO i VALUES (10); SELECT * FROM i LEFT JOIN LATERAL (SELECT * FROM j) j ON true; n | n +--- 10 | (1 row) SELECT * FROM i LEFT JOIN LATERAL (SELECT * FROM j WHERE (i.n = j.n)) j ON true; n | n ---+--- (0 rows) INSERT INTO j VALUES (10); SELECT * FROM i LEFT JOIN LATERAL (SELECT * FROM j WHERE (i.n = j.n)) j ON true; n | n + 10 | 10 (1 row) SELECT * FROM i LEFT JOIN LATERAL (SELECT * FROM j WHERE (i.n = j.n)) j ON false; n | n ---+--- (0 rows) Is the error in PostgreSQL or my understanding of LATERAL subqueries? Please CC me when responding as I don't currently subscribe to the list. Thanks, Jeremy -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] LEFT JOIN LATERAL can remove rows from LHS
On 06/18/2013 01:52 AM, Jeremy Evans wrote: Maybe I am misunderstanding how LATERAL is supposed to work, but my expectation is that doing a LEFT JOIN should not remove rows from the LHS. I would expect all of the following select queries would return a single row, but that isn't the case: CREATE TABLE i (n integer); CREATE TABLE j (n integer); INSERT INTO i VALUES (10); SELECT * FROM i LEFT JOIN LATERAL (SELECT * FROM j) j ON true; n | n +--- 10 | (1 row) SELECT * FROM i LEFT JOIN LATERAL (SELECT * FROM j WHERE (i.n = j.n)) j ON true; n | n ---+--- (0 rows) INSERT INTO j VALUES (10); SELECT * FROM i LEFT JOIN LATERAL (SELECT * FROM j WHERE (i.n = j.n)) j ON true; n | n + 10 | 10 (1 row) SELECT * FROM i LEFT JOIN LATERAL (SELECT * FROM j WHERE (i.n = j.n)) j ON false; n | n ---+--- (0 rows) This is a bug. If you block the optimizer from rearranging the lateral join condition, it gives the correct answer: No blocking: SELECT * FROM i LEFT JOIN LATERAL (SELECT * FROM j WHERE (i.n = j.n)) j ON true; n | n ---+--- (0 rows) QUERY PLAN - Nested Loop Left Join (cost=0.00..65.01 rows=12 width=8) (actual time=0.027..0.027 rows=0 loops=1) Filter: (i.n = j.n) Rows Removed by Filter: 1 - Seq Scan on i (cost=0.00..1.01 rows=1 width=4) (actual time=0.013..0.015 rows=1 loops=1) - Seq Scan on j (cost=0.00..34.00 rows=2400 width=4) (actual time=0.001..0.001 rows=0 loops=1) Total runtime: 0.084 ms (6 rows) Blocking: SELECT * FROM i LEFT JOIN LATERAL (SELECT * FROM j WHERE (i.n = j.n) OFFSET 0) j ON true; n | n +--- 10 | (1 row) QUERY PLAN - Nested Loop Left Join (cost=0.00..41.25 rows=12 width=8) (actual time=0.014..0.015 rows=1 loops=1) - Seq Scan on i (cost=0.00..1.01 rows=1 width=4) (actual time=0.006..0.007 rows=1 loops=1) - Seq Scan on j (cost=0.00..40.00 rows=12 width=4) (actual time=0.001..0.001 rows=0 loops=1) Filter: (i.n = n) Total runtime: 0.057 ms (5 rows) Vik -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] LEFT JOIN used in psql describe.c
Bruce Momjian wrote: Tom Lane wrote: Greg Sabino Mullane [EMAIL PROTECTED] writes: Does anyone know why so many LEFT JOINs are used in psql/describe.c to join to the pg_namespace table, like here: Yes, pg_relnamespace is definitely not null. I've actually already removed the left joins from my \df patch, since I had to rewrite some of the queries anyway. If this is wrong, please let me know of course! I think the idea was to be certain to show every pg_proc entry (or other catalog for other \d commands), no matter how badly broken the catalog interrelationships might be. If there's not an unarguable reason for eliminating the left joins I'd be inclined to keep it like that. What does an inner join buy here, other than brittleness? (Yeah, I have the perspective of a developer who deals with broken situations every day. So?) If we have problems with the system catalogs, I don't see how this join has a high probability of catching the problem. If there was some known problem of the join not always working, I could see the use of LEFT JOIN, but there isn't, so it just seems confusing, and these queries are used by others as models of how to do system joins, so could confuse our users as well. I think the LEFT JOIN should be removed unless there is a known problem, and if one shows up, we can re-add them later. I still think that the LEFT JOINs used in psql system queries is confusing and perhaps adds performance overhead while adding little reliability, but no one else seems to think so so I will drop the idea. -- Bruce Momjian| http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] LEFT JOIN used in psql describe.c
On Tue, 10 May 2005, Bruce Momjian wrote: Bruce Momjian wrote: Tom Lane wrote: Greg Sabino Mullane [EMAIL PROTECTED] writes: Does anyone know why so many LEFT JOINs are used in psql/describe.c to join to the pg_namespace table, like here: Yes, pg_relnamespace is definitely not null. I've actually already removed the left joins from my \df patch, since I had to rewrite some of the queries anyway. If this is wrong, please let me know of course! I think the idea was to be certain to show every pg_proc entry (or other catalog for other \d commands), no matter how badly broken the catalog interrelationships might be. If there's not an unarguable reason for eliminating the left joins I'd be inclined to keep it like that. What does an inner join buy here, other than brittleness? (Yeah, I have the perspective of a developer who deals with broken situations every day. So?) If we have problems with the system catalogs, I don't see how this join has a high probability of catching the problem. If there was some known problem of the join not always working, I could see the use of LEFT JOIN, but there isn't, so it just seems confusing, and these queries are used by others as models of how to do system joins, so could confuse our users as well. I think the LEFT JOIN should be removed unless there is a known problem, and if one shows up, we can re-add them later. I still think that the LEFT JOINs used in psql system queries is confusing and perhaps adds performance overhead while adding little reliability, but no one else seems to think so so I will drop the idea. I'm a bit confused here, but I believe Tom (at least how I read it) was agreeing with you about pulling the LEFT JOIN out ... I think the LEFT JOIN should be removed unless there is a known problem, and if one shows up, we can re-add them later. ... or am I mis-quoting? Marc G. Fournier Hub.Org Networking Services (http://www.hub.org) Email: [EMAIL PROTECTED] Yahoo!: yscrappy ICQ: 7615664 ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] LEFT JOIN used in psql describe.c
Marc G. Fournier wrote: On Tue, 10 May 2005, Bruce Momjian wrote: Bruce Momjian wrote: Tom Lane wrote: Greg Sabino Mullane [EMAIL PROTECTED] writes: Does anyone know why so many LEFT JOINs are used in psql/describe.c to join to the pg_namespace table, like here: Yes, pg_relnamespace is definitely not null. I've actually already removed the left joins from my \df patch, since I had to rewrite some of the queries anyway. If this is wrong, please let me know of course! I think the idea was to be certain to show every pg_proc entry (or other catalog for other \d commands), no matter how badly broken the catalog interrelationships might be. If there's not an unarguable reason for eliminating the left joins I'd be inclined to keep it like that. What does an inner join buy here, other than brittleness? (Yeah, I have the perspective of a developer who deals with broken situations every day. So?) If we have problems with the system catalogs, I don't see how this join has a high probability of catching the problem. If there was some known problem of the join not always working, I could see the use of LEFT JOIN, but there isn't, so it just seems confusing, and these queries are used by others as models of how to do system joins, so could confuse our users as well. I think the LEFT JOIN should be removed unless there is a known problem, and if one shows up, we can re-add them later. I still think that the LEFT JOINs used in psql system queries is confusing and perhaps adds performance overhead while adding little reliability, but no one else seems to think so so I will drop the idea. I'm a bit confused here, but I believe Tom (at least how I read it) was agreeing with you about pulling the LEFT JOIN out ... I think the LEFT JOIN should be removed unless there is a known problem, and if one shows up, we can re-add them later. ... or am I mis-quoting? I am actually quoting myself in the posting, so the words are mine, not Tom's. The basic issue is code simplicity vs. robustness, and I am leaning to the former because there is no known robustness problem. It is actually opposite of our opinions on checking for unreferenced files, where I want robustness (because it is a known problem) and Tom wants simplicity (though he is flexible on this), so it seems the two of us switch sides occasionally. :-) -- Bruce Momjian| http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] LEFT JOIN used in psql describe.c
Tom Lane wrote: Greg Sabino Mullane [EMAIL PROTECTED] writes: Does anyone know why so many LEFT JOINs are used in psql/describe.c to join to the pg_namespace table, like here: Yes, pg_relnamespace is definitely not null. I've actually already removed the left joins from my \df patch, since I had to rewrite some of the queries anyway. If this is wrong, please let me know of course! I think the idea was to be certain to show every pg_proc entry (or other catalog for other \d commands), no matter how badly broken the catalog interrelationships might be. If there's not an unarguable reason for eliminating the left joins I'd be inclined to keep it like that. What does an inner join buy here, other than brittleness? (Yeah, I have the perspective of a developer who deals with broken situations every day. So?) If we have problems with the system catalogs, I don't see how this join has a high probability of catching the problem. If there was some known problem of the join not always working, I could see the use of LEFT JOIN, but there isn't, so it just seems confusing, and these queries are used by others as models of how to do system joins, so could confuse our users as well. I think the LEFT JOIN should be removed unless there is a known problem, and if one shows up, we can re-add them later. -- Bruce Momjian| http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
[HACKERS] LEFT JOIN used in psql describe.c
Does anyone know why so many LEFT JOINs are used in psql/describe.c to join to the pg_namespace table, like here: printfPQExpBuffer(buf, SELECT c.oid,\n n.nspname,\n c.relname\n FROM pg_catalog.pg_class c\n LEFT JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace\n); processNamePattern(buf, pattern, false, false, n.nspname, c.relname, NULL, pg_catalog.pg_table_is_visible(c.oid)); I thought a pg_class row always pointed to a valid pg_namespace row because of our dependency restrictions. -- Bruce Momjian| http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] LEFT JOIN used in psql describe.c
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Does anyone know why so many LEFT JOINs are used in psql/describe.c to join to the pg_namespace table, like here: I thought a pg_class row always pointed to a valid pg_namespace row because of our dependency restrictions. Yes, pg_relnamespace is definitely not null. I've actually already removed the left joins from my \df patch, since I had to rewrite some of the queries anyway. If this is wrong, please let me know of course! The patch has actually been done for some time now, but the tab completion part of it got tricky with things like \df tab and \dfS tab, since the code pretty much assumes that the only differentiation of system/non-system objects occurs in pg_class objects. I'll try to get back to it next week, once DBD::Pg 1.41 is finished up. - -- Greg Sabino Mullane [EMAIL PROTECTED] PGP Key: 0x14964AC8 200504012315 http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8 -BEGIN PGP SIGNATURE- iD8DBQFCThy7vJuQZxSWSsgRAuNRAKClDG8QDxnX7LJMWqODtGqKnClpQQCfaZZ8 UwEpCmHJOyfSOuF0MAkQ7xg= =blNw -END PGP SIGNATURE- ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] LEFT JOIN used in psql describe.c
Greg Sabino Mullane [EMAIL PROTECTED] writes: Does anyone know why so many LEFT JOINs are used in psql/describe.c to join to the pg_namespace table, like here: Yes, pg_relnamespace is definitely not null. I've actually already removed the left joins from my \df patch, since I had to rewrite some of the queries anyway. If this is wrong, please let me know of course! I think the idea was to be certain to show every pg_proc entry (or other catalog for other \d commands), no matter how badly broken the catalog interrelationships might be. If there's not an unarguable reason for eliminating the left joins I'd be inclined to keep it like that. What does an inner join buy here, other than brittleness? (Yeah, I have the perspective of a developer who deals with broken situations every day. So?) regards, tom lane ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
[HACKERS] LEFT JOIN ...
Morning ... I'm trying to wrack my brain over something here, and no matter how I try and look at it, I'm drawing a blank ... I have two tables that are dependent on each other: notes (86736 tuples) and note_links (173473 tuples) The relationship is that one note can have several 'ppl' link'd to it ... I have a third table: calendar (11014 tuples) ... those calendar entries link to a note. So you have something like: personA --- personB --|-- note_links -- notes --[maybe]-- calendar entry personC --- now, the query I'm workign with is: SELECT n.note, n.nid, n.type, c.act_type, c.status, nl.contact_lvl, CASE WHEN c.act_start IS NULL THEN date_part('epoch', n.added) ELSE date_part('epoch', c.act_start) END AS start FROM note_links nl, notes n LEFT JOIN calendar c ON (n.nid = c.nid) WHERE (n.type = 'A' OR n.type = 'N' OR n.type = 'H' OR n.type = 'C') AND (nl.id = 15748 AND contact_lvl = 'company') AND n.nid = nl.nid ORDER BY start DESC; Which explains out as: NOTICE: QUERY PLAN: Sort (cost=7446.32..7446.32 rows=1 width=88) - Nested Loop (cost=306.52..7446.31 rows=1 width=88) - Index Scan using note_links_id on note_links nl (cost=0.00..3.49 rows=1 width=16) - Materialize (cost=6692.63..6692.63 rows=60015 width=72) - Hash Join (cost=306.52..6692.63 rows=60015 width=72) - Seq Scan on notes n (cost=0.00..2903.98 rows=60015 width=36) - Hash (cost=206.22..206.22 rows=10122 width=36) - Seq Scan on calendar c (cost=0.00..206.22 rows=10122 width=36) EXPLAIN and takes forever to run ... Now, if I eliminate the LEFT JOIN part of the above, *one* tuple is returned ... so even with the LEFT JOIN, only *one* tuple is going to be returned ... Is there some way to write the above so that it evaluates: WHERE (n.type = 'A' OR n.type = 'N' OR n.type = 'H' OR n.type = 'C') AND (nl.id = 15748 AND contact_lvl = 'company') AND n.nid = nl.nid first, so that it only has to do the LEFT JOIN on the *one* n.nid that is returned, instead of the 86736 that are in the table? Marc G. Fournier ICQ#7615664 IRC Nick: Scrappy Systems Administrator @ hub.org primary: [EMAIL PROTECTED] secondary: scrappy@{freebsd|postgresql}.org ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] LEFT JOIN in pg_dumpall is a bug
Tom Lane writes: $PSQL -d template1 -At -F ' ' \ -c "SELECT datname, usename, pg_encoding_to_char(d.encoding), datistemplate, datpath FROM pg_database d LEFT JOIN pg_shadow u ON (datdba = usesysid) WHERE datallowconn;" | \ while read DATABASE DBOWNER ENCODING ISTEMPLATE DBPATH; do Oops, you're right, the read won't keep the columns straight. Come to think of it, it would do the wrong thing for empty-string datname or usename, too, It won't actually work to restore such a setup, because zero-length identifiers are no longer allowed. Is there a more robust way of reading the data into the script? Provided that 'cut' is portable, then this works for me: TAB=' ' # tab here $PSQL -d template1 -At -F "$TAB" \ -c "SELECT datname, usename, pg_encoding_to_char(d.encoding), datistemplate, datpath FROM pg_database d LEFT JOIN pg_shadow u ON (datdba = usesysid) WHERE datallowconn;" | \ while read THINGS; do DATABASE=`echo "$THINGS" | cut -f 1` DBOWNER=`echo "$THINGS" | cut -f 2` ENCODING=`echo "$THINGS" | cut -f 3` ISTEMPLATE=`echo "$THINGS" | cut -f 4` DBPATH=`echo "$THINGS" | cut -f 5` If 'cut' is not portable, then I don't believe you can do it with IFS-based word splitting, because two adjacent separator characters don't seem to indicate an empty field but are instead taken as one separator. I think I'd rather see a warning, though, and let the script try to dump the DB anyway. Maybe for databases without an owner, but not for empty database or user names. -- Peter Eisentraut [EMAIL PROTECTED] http://yi.org/peter-e/
Re: [HACKERS] LEFT JOIN in pg_dumpall is a bug
Peter Eisentraut [EMAIL PROTECTED] writes: Is there a more robust way of reading the data into the script? Provided that 'cut' is portable, then this works for me: My old copy of Horton's _Portable C Software_ says that cut(1) is a SysV-ism adopted by POSIX. At that time (1990) it wasn't portable, and he recommended using awk or sed instead. If you think depending on POSIX utilities is OK, then use cut. I'd recommend sed, though. The GNU coding standards for Makefiles suggest not depending on programs outside this set: cat cmp cp diff echo egrep expr false grep install-info ln ls mkdir mv pwd rm rmdir sed sleep sort tar test touch true regards, tom lane
Re: [HACKERS] LEFT JOIN in pg_dumpall is a bug
Tom Lane writes: If you think depending on POSIX utilities is OK, then use cut. I'd recommend sed, though. This has gotten pretty silly: TAB=' ' # tab here $PSQL -d template1 -At -F "$TAB" \ -c "SELECT datname, usename, pg_encoding_to_char(d.encoding), datistemplate, datpath, 'x' FROM pg_database d LEFT JOIN pg_shadow u ON (datdba = usesysid) WHERE datallowconn;" | \ while read RECORDS; do DATABASE=`echo "x$RECORDS" | sed "s/^x\([^$TAB]*\).*/\1/"` DBOWNER=`echo "x$RECORDS" | sed "s/^x[^$TAB]*$TAB\([^$TAB]*\).*/\1/"` ENCODING=`echo "x$RECORDS" | sed "s/^x[^$TAB]*$TAB[^$TAB]*$TAB\([^$TAB]*\).*/\1/"` ISTEMPLATE=`echo "x$RECORDS" | sed "s/^x[^$TAB]*$TAB[^$TAB]*$TAB[^$TAB]*$TAB\([^$TAB]*\).*/\1/"` DBPATH=`echo "x$RECORDS" | sed "s/^x[^$TAB]*$TAB[^$TAB]*$TAB[^$TAB]*$TAB[^$TAB]*$TAB\([^$TAB]*\).*/\1/"` I'm not sure whether this is actually an overall improvement. I'm tempted to just coalesce(usename, {some default user}) instead. -- Peter Eisentraut [EMAIL PROTECTED] http://yi.org/peter-e/
Re: [HACKERS] LEFT JOIN in pg_dumpall is a bug
Peter Eisentraut [EMAIL PROTECTED] writes: I'm not sure whether this is actually an overall improvement. I'm tempted to just coalesce(usename, {some default user}) instead. I thought about that to begin with, but figured you wouldn't like it ;-) regards, tom lane
Re: [HACKERS] LEFT JOIN in pg_dumpall is a bug
Peter Eisentraut [EMAIL PROTECTED] writes: This snippet in pg_dumpall $PSQL -d template1 -At -F ' ' \ -c "SELECT datname, usename, pg_encoding_to_char(d.encoding), datistemplate, datpath FROM pg_database d LEFT JOIN pg_shadow u ON (datdba = usesysid) WHERE datallowconn;" | \ while read DATABASE DBOWNER ENCODING ISTEMPLATE DBPATH; do won't actually work if there indeed happens to be a database without a valid owner, because the 'read' command will take ENCODING as the dba name. Oops, you're right, the read won't keep the columns straight. Come to think of it, it would do the wrong thing for empty-string datname or usename, too, and it's only because datpath is the last column that we haven't noticed it doing the wrong thing on empty datpath. Is there a more robust way of reading the data into the script? I guess the real question is, what should be done in this case? I think it might be better to error out and let the user fix his database before backing it up. Possibly. The prior state of the code (before I put in the LEFT JOIN) would silently ignore any database with no matching user, which was definitely NOT a good idea. I think I'd rather see a warning, though, and let the script try to dump the DB anyway. (At a glance, I think pg_dump also has some problems with these sort of constellations.) Yes, there are a number of places where pg_dump should be doing outer joins and isn't. I think Tatsuo is at work on that. regards, tom lane
[HACKERS] left join bug?
hi, there! test=# create table a(id int primary key); NOTICE: CREATE TABLE/PRIMARY KEY will create implicit index 'a_pkey' for table 'a' CREATE test=# create table b(id int references a); NOTICE: CREATE TABLE will create implicit trigger(s) for FOREIGN KEY check(s) CREATE test=# insert into a values(45); INSERT 34924 1 test=# insert into a values(43); INSERT 34925 1 test=# insert into a values(34); INSERT 34926 1 test=# select a.id, b.id from a left join b using(id); id | id + 43 | 45 | (2 rows) test=# select * from a; id 45 43 34 (3 rows) test=# select * from b; id (0 rows) test=# insert into b values(34); INSERT 34927 1 test=# select a.id, b.id from a left join b using(id); id | id + 34 | 34 43 | 45 | (3 rows) test=# lark:~$psql --version psql (PostgreSQL) 7.1beta1 contains readline, history, multibyte support [...] lark:~$uname -a FreeBSD xxx 4.2-STABLE FreeBSD 4.2-STABLE #0: Wed Dec 6 17:16:57 NOVT 2000 xxx:/usr/obj/usr/src/sys/alf i386 sorry, if it has already been fixed /fjoe