Normally we'll only ever remove a LEFT JOIN relation if it's unused
and there's no possibility that the join would cause row duplication.
To check that the join wouldn't cause row duplicate we make use of
proofs, such as unique indexes, or for sub-queries, we make use of
DISTINCT and GROUP BY clauses.

There's another case that we don't handle, and it's VERY simple to test for.

Quite simply, it seems we could remove the join in cases such as:

create table t1 (id int primary key);
create table t2 (id int primary key, b int not null);

insert into t2 values(1,1),(2,1);
insert into t1 values(1);

select distinct t1.* from t1 left join t2 on t1.id=t2.b;


select t1.id from t1 left join t2 on t1.id=t2.b GROUP BY t1.id;

but not:

select t1.id,count(*) from t1 left join t2 on t1.id=t2.b GROUP BY t1.id;

In this case, the join *can* cause row duplicates, but the distinct or
group by would filter these out again anyway, so in these cases, we'd
not only get the benefit of not joining but also not having to remove
the duplicate rows caused by the join.

Given how simple the code is to support this, it seems to me to be
worth handling.

A patch to do this is attached.

 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Attachment: 0001-Support-removing-LEFT-JOINs-with-DISTINCT-GROUP-BY.patch
Description: Binary data

Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:

Reply via email to