Re: Poor man's partitioned index .... not being used?

2019-03-22 Thread David Rowley
On Fri, 22 Mar 2019 at 07:57, Gunther  wrote:
> foo=# PREPARE testplan(int) AS
> foo-#SELECT * FROM Test WHERE mod(id,2) = mod($1,2) AND id = $1;
> PREPARE
> foo=# EXPLAIN EXECUTE testplan(8934);
> QUERY PLAN
> --
>  Index Only Scan using test_pk0 on test  (cost=0.42..8.44 rows=1 width=4)
>Index Cond: (id = 8934)
> (2 rows)
>
> That's quite alright actually. Now the questions is, could we use this in a 
> nested loop query plan? That's where I think it can't work:

Not really. In that case, the parameters were replaced with the
specified values (a.k.a custom plan).  That happens for the first 5
executions of a prepared statement, and in this case likely the
planner will continue to use the custom plan since the generic plan
won't know that the partial index is okay to use and the plan costs
would likely go up enough that the custom plan would continue to be
favoured.

> foo=# SET enable_mergejoin TO off;
> SET
> foo=# EXPLAIN SELECT * FROM Test2 a LEFT OUTER JOIN Test b ON( mod(b.id,2) = 
> mod(a.id,2) AND b.id = a.id) LIMIT 10;
>QUERY PLAN
> 
>  Limit  (cost=0.00..102516.78 rows=10 width=8)
>->  Nested Loop Left Join  (cost=0.00..38238760.24 rows=3730 width=8)
>  Join Filter: ((b.id = a.id) AND (mod(b.id, 2) = mod(a.id, 2)))
>  ->  Seq Scan on test2 a  (cost=0.00..54.30 rows=3730 width=4)
>  ->  Materialize  (cost=0.00..9056.93 rows=388129 width=4)
>->  Seq Scan on test b  (cost=0.00..5599.29 rows=388129 
> width=4)
> (6 rows)
>
> It looks like it doesn't want to evaluate the mod(a.id, 2) before it moves to 
> the index query for the nested loop.

Whether partial indexes can be used are not is determined using only
quals that can be applied at the scan level.  In this case your qual
is a join qual, and since no other qual exists that can be evaluated
at the scan level where the index can be used, then it's not
considered.  In any case, nothing there guarantees that one of your
indexes will match all records. For it to work, both of you indexes
would have to be scanned. It's not clear why you think that would be
any better than scanning just one index. I imagine it would only ever
be a win if you could eliminate one of the index scans with some qual
that guarantees that the index can't contain any records matching your
query.

> I wonder if there was a way of marking such expressions as safe in the query, 
> like suggesting a certain evaluation order, i.e.,
>
> SELECT * FROM Test2 a LEFT OUTER JOIN Test b ON(mod(b.id,2) = 
> EVAL(mod(a.id,2)) AND b.id = a.id) LIMIT 10;
>
> It's OK though. It just goes to show that in a case like this, it is best to 
> just go with the partitioned table anyway.

It sounds like you might want something like partition-wise join that
exists in PG11.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: EXPLAIN PLAN for DELETE CASCADE or DELETE not using pkey index despite EXPLAINing that it would?

2019-03-22 Thread Gunther



On 3/21/2019 17:16, Tom Lane wrote:

Gunther  writes:

I have 250 rows to delete, but they are a target to a bunch of child
tables with foreign key on delete cascade.
EXPLAIN DELETE FROM Foo WHERE id = (SELECT fooId FROM Garbage);
shows me that it uses the nested loop by Foo_pkey index to find the 250
items from Garbage to be deleted.
But once that starts, I see HUGE amount of read activity from the
tablespace Foo_main that contains the Foo table, and only the Foo table,
not the Foo_pkey, not any other index, not any other child table, not
even the toast table for Foo is contained in that tablespace (I have the
toast table diverted with symlinks to another volume).

I'm betting you neglected to index the referencing column for one
or more of the foreign keys involved.  You can get away with that
as long as you're not concerned with the speed of DELETE ...

regards, tom lane


I had the same suspicion. But firstly my schema is generated 
automatically and all foreign keys have the indexes.


But what is even more stunning is that the table where this massive read 
activity happens is the Foo heap table. I verified that by using strace 
where all the massive amounts of reads are on those files for the main 
Foo table. And this doesn't make sense, since any foreign key targets 
its primary key. The foreign keys of the child tables are also indexed 
and there is no io on the volumes that hold these child tables, nor is 
the io on the volume that holds the Foo_pkey.






Re: EXPLAIN PLAN for DELETE CASCADE or DELETE not using pkey index despite EXPLAINing that it would?

2019-03-22 Thread Justin Pryzby
On Thu, Mar 21, 2019 at 03:31:42PM -0400, Gunther wrote:
> Hi,
> 
> I have 250 rows to delete, but they are a target to a bunch of child tables
> with foreign key on delete cascade.
> 
> EXPLAIN DELETE FROM Foo WHERE id = (SELECT fooId FROM Garbage);

Probably because:
https://www.postgresql.org/docs/current/ddl-constraints.html#DDL-CONSTRAINTS-FK
"Since a DELETE of a row from the referenced table [...] will require a scan of
the referencing table for rows matching the old value, it is often a good idea
to index the referencing columns too."

Can you show "\d+ foo", specifically its FKs ?

Justin