Re: [HACKERS] Confusing results with lateral references

2015-12-09 Thread Robert Haas
On Fri, Dec 4, 2015 at 10:23 AM, Tom Lane  wrote:
> Ashutosh Bapat  writes:
>> I am seeing different results with two queries which AFAIU have same
>> semantics and hence are expected to give same results.
>
>> postgres=# select * from t1, (select distinct val, val2 from t2) t2ss where 
>> t1.val = t2ss.val for update of t1;
>
>> postgres=# select * from t1, lateral (select distinct val, val2 from t2 
>> where t2.val = t1.val) t2ss for update of t1;
>
> (I renamed your inline sub-selects to avoid confusion between them and the
> table t2.)
>
> I'm skeptical that those should be claimed to have identical semantics.
>
> In the first example, after we've found the join row (1,1,1,1), we block
> to see if the pending update on t1 will commit.  After it does, we recheck
> the join condition using the updated row from t1 (and the original row
> from t2ss).  The condition fails, so the updated row is not output.

Check.

> The same thing happens in the second example, ie, we consider the updated
> row from t1 and the non-updated row from t2ss (NOT t2).  There are no join
> conditions to recheck (in the outer query level), so the row passes, and
> we output it.

What's surprising is that t2.val = t1.val isn't rechecked here.  I
think that's not really possible, because of the DISTINCT operation,
which prevents us from identifying a single row from t2 that accounts
for the subquery's output row.  Not sure whether it would work without
the DISTINCT.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusing results with lateral references

2015-12-04 Thread Tom Lane
Ashutosh Bapat  writes:
> I am seeing different results with two queries which AFAIU have same
> semantics and hence are expected to give same results.

> postgres=# select * from t1, (select distinct val, val2 from t2) t2ss where 
> t1.val = t2ss.val for update of t1;

> postgres=# select * from t1, lateral (select distinct val, val2 from t2 where 
> t2.val = t1.val) t2ss for update of t1;

(I renamed your inline sub-selects to avoid confusion between them and the
table t2.)

I'm skeptical that those should be claimed to have identical semantics.

In the first example, after we've found the join row (1,1,1,1), we block
to see if the pending update on t1 will commit.  After it does, we recheck
the join condition using the updated row from t1 (and the original row
from t2ss).  The condition fails, so the updated row is not output.

The same thing happens in the second example, ie, we consider the updated
row from t1 and the non-updated row from t2ss (NOT t2).  There are no join
conditions to recheck (in the outer query level), so the row passes, and
we output it.

If you'd allowed the FOR UPDATE to propagate into the sub-select, then the
sub-select's conditions would be considered as needing rechecks ... of
course, that would require removing the DISTINCT.

This example does show that a lateral reference to a FOR UPDATE table from
a non-FOR-UPDATE subselect has confusing behavior.  Maybe we ought to
forbid that.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusing results with lateral references

2015-12-04 Thread Tom Lane
Ashutosh Bapat  writes:
> There's another seemingly wrong result, not with lateral, but with FOR
> UPDATE.

[ shrug... ]  You're getting the post-update images of the two join
rows that would have been reported without FOR UPDATE.  This one is
definitely not a bug.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusing results with lateral references

2015-12-03 Thread Ashutosh Bapat
On Fri, Dec 4, 2015 at 10:58 AM, Amit Langote  wrote:

> On 2015/12/03 21:26, Ashutosh Bapat wrote:
> > Session 1
> > postgres=# begin;
> > BEGIN
> > postgres=# update t1 set val = 2 where val2 = 1;
> > UPDATE 1
> >
> > Session 2
> > postgres=# select * from t1 left join t2 on (t1.val = t2.val) for update
> of
> > t1;
> >
> > query waits
> >
> > Session 1
> > postgres=# commit;
> > COMMIT
> >
> >
> > Session 2 query returns two rows
> > select * from t1 left join t2 on (t1.val = t2.val) for update of t1;
> >  val | val2 | val | val2
> > -+--+-+--
> >2 |1 | |
> >2 |1 | |
> > (2 rows)
> >
> > It's confusing to see two rows from left join result when the table
> really
> > has only a single row. Is this behaviour expected?
>
> Maybe it is. Because the other table still has two (1, 1) rows, LockRows's
> subplan would still produce two rows in result, no?
>
>
Documentation at
http://www.postgresql.org/docs/9.4/static/queries-table-expressions.html
says
(T1) LEFT OUTER JOIN (T2)

First, an inner join is performed. Then, *for each row in T1* that does not
satisfy the join condition with any row in T2, *a joined row is added* with
null values in columns of T2. Thus, the joined table always has at least
one row for each row in T1.
So there should be only one row for each row of outer table that didn't
join with the inner table. IOW a join with no joining rows should have same
number of rows as outer table.


> Thanks,
> Amit
>
>
>


-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


Re: [HACKERS] Confusing results with lateral references

2015-12-03 Thread Amit Langote
On 2015/12/03 21:26, Ashutosh Bapat wrote:
> Session 1
> postgres=# begin;
> BEGIN
> postgres=# update t1 set val = 2 where val2 = 1;
> UPDATE 1
> 
> Session 2
> postgres=# select * from t1 left join t2 on (t1.val = t2.val) for update of
> t1;
> 
> query waits
> 
> Session 1
> postgres=# commit;
> COMMIT
> 
> 
> Session 2 query returns two rows
> select * from t1 left join t2 on (t1.val = t2.val) for update of t1;
>  val | val2 | val | val2
> -+--+-+--
>2 |1 | |
>2 |1 | |
> (2 rows)
> 
> It's confusing to see two rows from left join result when the table really
> has only a single row. Is this behaviour expected?

Maybe it is. Because the other table still has two (1, 1) rows, LockRows's
subplan would still produce two rows in result, no?

Thanks,
Amit




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusing results with lateral references

2015-12-03 Thread Ashutosh Bapat
There's another seemingly wrong result, not with lateral, but with FOR
UPDATE.

postgres=# select * from t1;
 val | val2
-+--
   1 |1
(1 row)

postgres=# select * from t2;
 val | val2
-+--
   1 |1
   2 |2
   1 |1
(3 rows)

Session 1
postgres=# begin;
BEGIN
postgres=# update t1 set val = 2 where val2 = 1;
UPDATE 1

Session 2
postgres=# select * from t1 left join t2 on (t1.val = t2.val) for update of
t1;

query waits

Session 1
postgres=# commit;
COMMIT


Session 2 query returns two rows
select * from t1 left join t2 on (t1.val = t2.val) for update of t1;
 val | val2 | val | val2
-+--+-+--
   2 |1 | |
   2 |1 | |
(2 rows)

It's confusing to see two rows from left join result when the table really
has only a single row. Is this behaviour expected?

On Thu, Dec 3, 2015 at 3:49 PM, Ashutosh Bapat <
ashutosh.ba...@enterprisedb.com> wrote:

> Hi,
>
> I am seeing different results with two queries which AFAIU have same
> semantics and hence are expected to give same results.
>
> postgres=# \d t1
>   Table "public.t1"
>  Column |  Type   | Modifiers
> +-+---
>  val| integer |
>  val2   | integer |
>
> postgres=# \d t2
>   Table "public.t2"
>  Column |  Type   | Modifiers
> +-+---
>  val| integer |
>  val2   | integer |
>
> There's no data in the table to start with.
>
> postgres=# insert into t1 values (1, 1);
> postgres=# insert into t2 values (1, 1), (2, 2);
>
> Session 1
> postgres=# begin;
> BEGIN
> postgres=# update t1 set val = 2 where val2 = 1;
> UPDATE 1
>
> Session 2
> postgres=# select * from t1, (select distinct val, val2 from t2) t2 where
> t1.val = t2.val for update of t1;
>
> query waits here because of FOR UPDATE clause
>
> Session 1
> postgres=# commit;
> COMMIT
>
> Session 2 gives no rows
> postgres=# select * from t1, (select distinct val, val2 from t2) t2 where
> t1.val = t2.val for update of t1;
> val | val2 | val | val2
> -+--+-+--
> (0 rows)
>
>
> Reset values of t1
> postgres=# update t1 set val = 1 where val2 = 1;
> UPDATE 1
>
> Session 1
> postgres=# begin;
> BEGIN
> postgres=# update t1 set val = 2 where val2 = 1;
> UPDATE 1
>
> Session 2
> postgres=# select * from t1, lateral (select distinct val, val2 from t2
> where t2.val = t1.val) t2 for update of t1;
>
> query waits here
>
> Session 1
> postgres=# commit;
> COMMIT
>
> Session 2 gives results of the query
> postgres=# select * from t1, lateral (select distinct val, val2 from t2
> where t2.val = t1.val) t2 for update of t1;
>  val | val2 | val | val2
> -+--+-+--
>2 |1 |   1 |1
> (1 row)
>
> AFAIU, both the queries
>
> select * from t1, (select distinct val, val2 from t2) t2 where t1.val =
> t2.val for update of t1;
>
> AND
>
> select * from t1, lateral (select distinct val, val2 from t2 where t2.val
> = t1.val) t2 for update of t1;
>
> have same semantic and should give same results.
>
> Is seeing different results expected behaviour?
> --
> Best Wishes,
> Ashutosh Bapat
> EnterpriseDB Corporation
> The Postgres Database Company
>



-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company