Re: UNION ALL

2019-08-16 Thread Tom Lane
Mark Pasterkamp  writes:
> I am comparing two queries, q1 and q2 respectively.
> Query q1 is the original query and q2 is an attempt to reduce the cost of
> execution via leveraging the materialized view ci_t_15.
> ...
> Running explain analyze on both queries I get the following execution plans.

Huh ... I wonder why the planner decided to try to parallelize Q2 (and not
Q1)?  That seems like a pretty stupid choice, because if I'm reading the
plan correctly (I might not be) each worker process has to read all of
the "title" table and build its own copy of the hash table.  That seems
likely to swamp whatever performance gain might come from parallelizing
the scan of cast_info --- which is likely to be not much, anyway, on a
laptop with probably-not-great disk I/O bandwidth.

In any case, whether that decision was good or bad, making it differently
renders the performance of Q1 and Q2 not very comparable.  It'd be worth
disabling parallelism (SET max_parallel_workers_per_gather = 0) and
retrying Q2 to get a more apples-to-apples comparison.

Another bit of advice is to increase work_mem, so the hashes don't
have to be split into quite so many batches.

I'm noting also that your queries aren't giving the same results ---
Q2 reports returning fewer rows overall.  Do you have rows where
title.production_year is null, perhaps?

regards, tom lane




Re: UNION ALL

2019-08-15 Thread Ibrar Ahmed
On Fri, Aug 16, 2019 at 12:16 AM <066ce...@free.fr> wrote:

> Generally speaking, when executing UNION ; a DISTINCT is run afterward on
> the resultset.
>
> So, if you're sure that each part of UNION cannot return a line returned
> by another one, you may use UNION ALL, you'll cut the cost of the final
> implicit DISTINCT.
>
>
> - Mail original -
> De: "Mark Pasterkamp" 
> À: pgsql-hackers@lists.postgresql.org
> Envoyé: Jeudi 15 Août 2019 20:37:06
> Objet: UNION ALL
>
>
> Dear all,
>
>
> I was wondering if someone could help me understands what a union all
> actually does.
>
>
> For my thesis I am using Apache Calcite to rewrite queries into using
> materialized views which I then give to a Postgres database.
> For some queries, this means that they will be rewritten in a UNION ALL
> style query between an expression and a table scan of a materialized view.
> However, contrary to what I expected, the UNION ALL query is actually a
> lot slower.
>
>
> As an example, say I have 2 tables: actor and movie. Furthermore, there is
> also a foreign key index on movie to actor.
> I also have a materialized view with the join of these 2 tables for all
> movies <= 2015 called A.
> Now, if I want to query all entries in the join between actor and movie, I
> would assume that a UNION ALL between the join of actor and movie for
> movies >2015 and A is faster than executing the original query..
> If I look at the explain analyze part, I can certainly see a reduction in
> cost up until the UNION ALL part, which carries a respective cost more than
> negating the cost reduction up to a point where I might as well not use the
> existing materialized view.
>
>
> I have some trouble understanding this phenomenon.
> One thought which came to my mind was that perhaps UNION ALL might create
> a temporary table containing both result sets, and then do a table scan and
> return that result.
>
> this would greatly increase IO cost which could attribute to the problem.
> However, I am really not sure what UNION ALL actually does to append both
> result sets so I was wondering if someone would be able to help me out with
> this.
>
>
>
>
> Mark
>
>
> 066ce...@free.fr:  Please, avoid top-posting. It makes harder to follow
the
discussion.

-- 
Ibrar Ahmed


Re: UNION ALL

2019-08-15 Thread 066ce286
Generally speaking, when executing UNION ; a DISTINCT is run afterward on the 
resultset.

So, if you're sure that each part of UNION cannot return a line returned by 
another one, you may use UNION ALL, you'll cut the cost of the final implicit 
DISTINCT.


- Mail original -
De: "Mark Pasterkamp" 
À: pgsql-hackers@lists.postgresql.org
Envoyé: Jeudi 15 Août 2019 20:37:06
Objet: UNION ALL


Dear all, 


I was wondering if someone could help me understands what a union all actually 
does. 


For my thesis I am using Apache Calcite to rewrite queries into using 
materialized views which I then give to a Postgres database. 
For some queries, this means that they will be rewritten in a UNION ALL style 
query between an expression and a table scan of a materialized view. 
However, contrary to what I expected, the UNION ALL query is actually a lot 
slower. 


As an example, say I have 2 tables: actor and movie. Furthermore, there is also 
a foreign key index on movie to actor. 
I also have a materialized view with the join of these 2 tables for all movies 
<= 2015 called A. 
Now, if I want to query all entries in the join between actor and movie, I 
would assume that a UNION ALL between the join of actor and movie for movies 
>2015 and A is faster than executing the original query.. 
If I look at the explain analyze part, I can certainly see a reduction in cost 
up until the UNION ALL part, which carries a respective cost more than negating 
the cost reduction up to a point where I might as well not use the existing 
materialized view. 


I have some trouble understanding this phenomenon. 
One thought which came to my mind was that perhaps UNION ALL might create a 
temporary table containing both result sets, and then do a table scan and 
return that result. 

this would greatly increase IO cost which could attribute to the problem. 
However, I am really not sure what UNION ALL actually does to append both 
result sets so I was wondering if someone would be able to help me out with 
this. 




Mark




Re: UNION ALL

2019-08-15 Thread Tom Lane
Mark Pasterkamp  writes:
> I was wondering if someone could help me understands what a union all
> actually does.

Generally speaking, it runs the first query and then the second query.
You'd really need to provide a lot more detail for anyone to say more
than that.

https://wiki.postgresql.org/wiki/Slow_Query_Questions

regards, tom lane