[PERFORM] PostgreSQL seems to create inefficient plans in simple conditional joins

2016-01-30 Thread Hedayat Vatankhah
Dear all,
First of all, I should apologize if my email doesn't follow all the guidelines;
I'm trying to follow them!

If referencing links is OK, you can find the full description of
the issue at:
http://dba.stackexchange.com/questions/127082/postgresql-seems-to-create-inefficient-plans-in-simple-conditional-joins

It contains table definitions, queries, EXPLAIN/EXPLAIN ANALYZE output, and
a description of the test conditions. But I'll provide a summary of the
planning issue below.

I'm using PostgreSQL 9.3. I've run VACUUM ANALYZE on the DB, and it has
not been modified since.

Consider these tables:
CREATE TABLE t1
(
  id bigint NOT NULL DEFAULT nextval('ids_seq'::regclass),
  total integer NOT NULL,
  price integer NOT NULL,
  CONSTRAINT pk_t1 PRIMARY KEY (id)
)

CREATE TABLE t2
(
  id bigint NOT NULL,
  category smallint NOT NULL,
  CONSTRAINT pk_t2 PRIMARY KEY (id),
  CONSTRAINT fk_id FOREIGN KEY (id)
  REFERENCES t1 (id) MATCH SIMPLE
  ON UPDATE NO ACTION ON DELETE NO ACTION
)

Personally, I expect both queries below to perform exactly the same:

SELECT
  t1.id, *
FROM
  t1
INNER JOIN
  t2 ON t1.id = t2.id
WHERE t1.id > -9223372036513411363;

And:

SELECT
  t1.id, *
FROM
  t1
INNER JOIN
  t2 ON t1.id = t2.id
WHERE t1.id > -9223372036513411363 AND t2.id > -9223372036513411363;

Unfortunately, they do not. PostgreSQL creates different plans for these
queries, which results in very poor performance for the first one compared
to the second. (I'm testing against a DB with around 350 million rows in
t1, and slightly fewer in t2.)

EXPLAIN output:
First query: http://explain.depesz.com/s/uauk
Second query: http://explain.depesz.com/s/uQd

The problem with the plan for the first query is that it limits the
index scan on t1 with the WHERE condition, but doesn't do the same for t2.

A similar behavior happens if you replace INNER JOIN with LEFT JOIN,
or if you use "USING (id) WHERE id > -9223372036513411363" instead
of "ON ...".
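Spelled out, the variants mentioned above look like this (both show the same planner behaviour as the original query):

```sql
-- LEFT JOIN variant:
SELECT t1.id, *
FROM t1
LEFT JOIN t2 ON t1.id = t2.id
WHERE t1.id > -9223372036513411363;

-- USING variant:
SELECT *
FROM t1
INNER JOIN t2 USING (id)
WHERE id > -9223372036513411363;
```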

But it is important to get the first query right. Consider that I want to
create a view on a SELECT statement (without the condition) to simplify
building queries on the data. If the view exposes a single id column, a
SELECT on the view with such a condition on the id column will produce a
query similar to the first one. Given this problem, I would have to expose
both ID columns in the view so that queries can add the condition on each
of them. Now imagine what happens when many tables are joined together on
the ID column...
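As a concrete sketch of the view scenario just described (the view name and column list are illustrative only):

```sql
-- A view hiding the join, exposing a single id column:
CREATE VIEW v_joined AS
SELECT t1.id, t1.total, t1.price, t2.category
FROM t1
INNER JOIN t2 ON t1.id = t2.id;

-- Filtering the view produces a query of the first form,
-- so the condition is not propagated to t2's index scan:
SELECT * FROM v_joined WHERE id > -9223372036513411363;
```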

Is there anything wrong with my queries, or with me expecting both queries
to be the same? Can I do anything so that PostgreSQL will behave similarly
for the first query? Or is this fixed in newer versions?

Thanks in advance,
Hedayat


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Hash join gets slower as work_mem increases?

2016-01-30 Thread Tomas Vondra

Hi,

On 01/29/2016 04:17 PM, Albe Laurenz wrote:

I have a query that runs *slower* if I increase work_mem.

The execution plans are identical in both cases, except that a temp file
is used when work_mem is smaller.

The relevant lines of EXPLAIN ANALYZE output are:

With work_mem='100MB':
->  Hash Join  (cost=46738.74..285400.61 rows=292 width=8) (actual 
time=4296.986..106087.683 rows=187222 loops=1)
   Hash Cond: ("*SELECT* 1_2".postadresse_id = p.postadresse_id)
   Buffers: shared hit=1181177 dirtied=1, temp read=7232 written=7230

With work_mem='500MB':
->  Hash Join  (cost=46738.74..285400.61 rows=292 width=8) (actual 
time=3802.849..245970.049 rows=187222 loops=1)
   Hash Cond: ("*SELECT* 1_2".postadresse_id = p.postadresse_id)
   Buffers: shared hit=1181175 dirtied=111

I ran operf on both backends, and they look quite similar, except that the
number of samples is different (this is "opreport -c" output):

CPU: Intel Sandy Bridge microarchitecture, speed 2899.8 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask 
of 0x00 (No unit mask) count 9
samples  %image name   symbol name
---
   112   0.0019  postgres ExecProcNode
   3020116  49.9904  postgres ExecScanHashBucket
   3021162  50.0077  postgres ExecHashJoin
3020116  92.8440  postgres ExecScanHashBucket
   3020116  49.9207  postgres ExecScanHashBucket [self]
   3020116  49.9207  postgres ExecScanHashBucket
   8190  0.1354  vmlinux  apic_timer_interrupt

What could be an explanation for this?
Is this known behaviour?


There are a number of possible causes for such behavior, but it's quite
impossible to say whether this is an example of one of them, as you have
not posted the interesting parts of the explain plan. Also, knowing the
PostgreSQL version would be useful.


I don't think the example you posted is due to exceeding the on-CPU cache,
as that's just a few MB per socket, so even the smaller work_mem is
significantly larger.


What I'd expect to be the issue here is an underestimate of the hash table
size, resulting in too few buckets and thus long chains of tuples that
need to be searched sequentially. Smaller work_mem values usually limit
the length of those chains in favor of batching.


Please post the whole explain plan - especially the info about the number
of buckets/batches and the Hash node details.
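For reference, the requested details appear on the Hash node in EXPLAIN (ANALYZE, BUFFERS) output, roughly like this (the figures shown are illustrative only):

```sql
-- EXPLAIN (ANALYZE, BUFFERS) <your query>;
-- The Hash node reports the numbers of interest, e.g.:
--   ->  Hash  (cost=... rows=... width=...) (actual time=... rows=... loops=1)
--         Buckets: 1024  Batches: 16  Memory Usage: 3543kB
```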


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [PERFORM] PostgreSQL seems to create inefficient plans in simple conditional joins

2016-01-30 Thread David Rowley
On 31 January 2016 at 01:30, Hedayat Vatankhah  wrote:
> Personally, I expect both queries below to perform exactly the same:
>
> SELECT
> t1.id, *
> FROM
> t1
> INNER JOIN
> t2 ON t1.id = t2.id
> where t1.id > -9223372036513411363;
>
> And:
>
> SELECT
> t1.id, *
> FROM
> t1
> INNER JOIN
> t2 ON t1.id = t2.id
> where t1.id > -9223372036513411363 and t2.id > -9223372036513411363;
>
> Unfortunately, they do not. PostgreSQL creates different plans for these
> queries, which results in very poor performance for the first one compared
> to the second (What I'm testing against is a DB with around 350 million
> rows in t1, and slightly less in t2).
>
> EXPLAIN output:
> First query: http://explain.depesz.com/s/uauk
> Second query: http://explain.depesz.com/s/uQd

Yes, unfortunately you've done about the only thing that you can do,
and that's to include both conditions in the query. Is there some
special reason why you can't just write the t2.id > ... condition in
the query too? Or is the query generated dynamically by some software
that you have no control over?

I'd personally quite like to see improvements in this area, and I even
wrote a patch [1] which fixes this problem. The problem I had when
proposing the fix was that I was unable to report details about how many
people are hit by this planner limitation. The patch I proposed caused a
very small impact on planning time for many queries, and was thought by
many not to apply in enough cases to be worth slowing down queries which
cannot possibly benefit. Of course I agree with this; I've no interest in
slowing down planning on queries, but at the same time I understand the
annoyingly poor optimisation in this area.

Although please remember that the patch I proposed was merely a first-draft
proposal, not for production use.

[1] 
http://www.postgresql.org/message-id/flat/cakjs1f9fk_x_5hkcpcseimy16owe3empmmgsgwlckkj_rw9...@mail.gmail.com#cakjs1f9fk_x_5hkcpcseimy16owe3empmmgsgwlckkj_rw9...@mail.gmail.com

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [PERFORM] PostgreSQL seems to create inefficient plans in simple conditional joins

2016-01-30 Thread Vitalii Tymchyshyn
It may be more for -hackers, but I often hear "this won't be used because of
the planning time increase". As far as I know, we already have statistics on
real query time after a few runs, which are used to decide whether the plan
should be switched.
Could these statistics be used to apply advanced planning features to
relatively long-running queries? E.g. a parameter like
sophisticated_planning_l1_threshold=500ms: if a query runs over this
threshold, replan it with more sophisticated features taking a few more
millis. Possibly different levels could be introduced. Also allow setting
the threshold to 0, saying "apply to all queries right away".
Another good option is to threshold against cumulative query time. E.g. if
a query has accumulated many runs of 0.5 ms each, it may be beneficial to
spend a few extra millis planning to get 0.2 ms each.
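To make the proposal concrete, such settings might look roughly like this (these parameter names are purely hypothetical; no such GUCs exist in PostgreSQL):

```
# Hypothetical GUCs sketching the proposal above -- not real PostgreSQL settings.
sophisticated_planning_l1_threshold = 500ms  # replan with extra optimisations past this runtime
sophisticated_planning_l2_threshold = 5s     # a further level with costlier planning
# Setting a threshold to 0 would mean "apply to all queries right away".
```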

Best regards, Vitalii Tymchyshyn

Sat, 30 Jan 2016 10:57, David Rowley wrote:

> On 31 January 2016 at 01:30, Hedayat Vatankhah 
> wrote:
> > Personally, I expect both queries below to perform exactly the same:
> >
> > SELECT
> > t1.id, *
> > FROM
> > t1
> > INNER JOIN
> > t2 ON t1.id = t2.id
> > where t1.id > -9223372036513411363;
> >
> > And:
> >
> > SELECT
> > t1.id, *
> > FROM
> > t1
> > INNER JOIN
> > t2 ON t1.id = t2.id
> > where t1.id > -9223372036513411363 and t2.id > -9223372036513411363;
> >
> > Unfortunately, they do not. PostgreSQL creates different plans for these
> > queries, which results in very poor performance for the first one
> compared
> > to the second (What I'm testing against is a DB with around 350 million
> > rows in t1, and slightly less in t2).
> >
> > EXPLAIN output:
> > First query: http://explain.depesz.com/s/uauk
> > Second query: http://explain.depesz.com/s/uQd
>
> Yes, unfortunately you've done about the only thing that you can do,
> and that's just include both conditions in the query. Is there some
> special reason why you can't just write the t2.id > ... condition in
> the query too? or is the query generated dynamically by some software
> that you have no control over?
>
> I'd personally quite like to see improvements in this area, and even
> wrote a patch [1] which fixes this problem too. The problem I had when
> proposing the fix for this was that I was unable to report details
> about how many people are hit by this planner limitation. The patch I
> proposed caused a very small impact on planning time for many queries,
> and was thought by many not to apply in enough cases for it to be
> worth slowing down queries which cannot possibly benefit. Of course I
> agree with this, I've no interest in slowing down planning on queries,
> but at the same time understand the annoying poor optimisation in this
> area.
>
> Although please remember the patch I proposed was merely a first draft
> proposal. Not for production use.
>
> [1]
> http://www.postgresql.org/message-id/flat/cakjs1f9fk_x_5hkcpcseimy16owe3empmmgsgwlckkj_rw9...@mail.gmail.com#cakjs1f9fk_x_5hkcpcseimy16owe3empmmgsgwlckkj_rw9...@mail.gmail.com
>
> --
>  David Rowley   http://www.2ndQuadrant.com/
>  PostgreSQL Development, 24x7 Support, Training & Services
>
>
>


Re: [PERFORM] PostgreSQL seems to create inefficient plans in simple conditional joins

2016-01-30 Thread David Rowley
On 31 January 2016 at 06:14, Vitalii Tymchyshyn  wrote:
> It may be more for -hackers, but I often hear "this wont be used because of
> planning time increase". Now as I know we have statistics on real query time
> after few runs that is used to decide if plan should be switched.
> Can this statistics be used to apply advanced planning features for
> relatively long running queries? E.g. a parameter like
> sophisticated_planning_l1_threshold=500ms. If query runs over this
> threshold, replan it with more sophisticated features taking few more
> millis. Possibly different levels can be introduced. Also allow to set
> threshold to 0, saying "apply to all queries right away".
> Another good option is to threshold against cumulative query time. E.g. if
> there was 1 runs 0.5 millis each, it may be beneficial to spend few
> millis to get 0.2 millis each.

I agree with you. I was recently working with long-running queries on
a large 3TB database. I discovered that a new optimisation was possible,
and wrote a patch to implement it. In testing, the extra work which the
optimiser performed took 7 microseconds, and this saved 6 hours of
execution time. Now, I've never been much of an investor in my life,
but a 3-billion-times return on an investment seems quite favourable.
Of course, that's quite an extreme case, but it's hard to ignore that the
benefit is still significant in less extreme cases.
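As a quick sanity check on that ratio, using the figures quoted above:

```sql
-- 6 hours saved, expressed in microseconds, divided by the
-- 7 microseconds of extra planning work:
SELECT (6 * 60 * 60 * 1000000.0) / 7 AS times_return;
-- about 3.09 billion
```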

The idea you've mentioned here is very similar to what I brought up at
the developer meeting a few days ago; see the AOB section in [1].

Unfortunately I didn't really get many of the right people on my
side with it, and some wanted examples of specific patches, which is
completely not what I wanted to talk about. I was aiming more for some
agreement on generic infrastructure to do exactly as you describe.

[1]  https://wiki.postgresql.org/wiki/FOSDEM/PGDay_2016_Developer_Meeting


-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [PERFORM] PostgreSQL seems to create inefficient plans in simple conditional joins

2016-01-30 Thread Vitalii Tymchyshyn
Well, as far as I can see it was just a few phrases, unless I'm missing
something. Maybe it's worth bringing it to -hackers for a wider discussion?

Best regards, Vitalii Tymchyshyn

Sat, 30 Jan 2016 12:31, David Rowley wrote:

> On 31 January 2016 at 06:14, Vitalii Tymchyshyn  wrote:
> > It may be more for -hackers, but I often hear "this wont be used because
> of
> > planning time increase". Now as I know we have statistics on real query
> time
> > after few runs that is used to decide if plan should be switched.
> > Can this statistics be used to apply advanced planning features for
> > relatively long running queries? E.g. a parameter like
> > sophisticated_planning_l1_threshold=500ms. If query runs over this
> > threshold, replan it with more sophisticated features taking few more
> > millis. Possibly different levels can be introduced. Also allow to set
> > threshold to 0, saying "apply to all queries right away".
> > Another good option is to threshold against cumulative query time. E.g.
> if
> > there was 1 runs 0.5 millis each, it may be beneficial to spend few
> > millis to get 0.2 millis each.
>
> I agree with you. I recently was working with long running queries on
> a large 3TB database. I discovered a new optimisation was possible,
> and wrote a patch to implement. On testing the extra work which the
> optimiser performed took 7 microseconds, and this saved 6 hours of
> execution time. Now, I've never been much of an investor in my life,
> but a 3 billion times return on an investment seems quite favourable.
> Of course, that's quite an extreme case, but it's hard to ignore the
> benefit is still significant in less extreme cases.
>
> The idea you've mentioned here is very similar to what I brought up at
> the developer meeting a few days ago, see AOB section in [1]
>
> Unfortunately I didn't really get many of the correct people on my
> side with it, and some wanted examples of specific patches, which is
> completely not what I wanted to talk about. I was more aiming for some
> agreement for generic infrastructure to do exactly as you describe.
>
> [1]  https://wiki.postgresql.org/wiki/FOSDEM/PGDay_2016_Developer_Meeting
>
>
> --
>  David Rowley   http://www.2ndQuadrant.com/
>  PostgreSQL Development, 24x7 Support, Training & Services
>
>
>


Re: [PERFORM] PostgreSQL seems to create inefficient plans in simple conditional joins

2016-01-30 Thread Hedayat Vatankhah

Hi,

David Rowley wrote on Sun, 31 Jan 2016 04:57:04 +1300:

On 31 January 2016 at 01:30, Hedayat Vatankhah  wrote:

Personally, I expect both queries below to perform exactly the same:

SELECT
 t1.id, *
FROM
 t1
INNER JOIN
 t2 ON t1.id = t2.id
 where t1.id > -9223372036513411363;

And:

SELECT
 t1.id, *
FROM
 t1
INNER JOIN
 t2 ON t1.id = t2.id
 where t1.id > -9223372036513411363 and t2.id > -9223372036513411363;

Unfortunately, they do not. PostgreSQL creates different plans for these
queries, which results in very poor performance for the first one compared
to the second (What I'm testing against is a DB with around 350 million
rows in t1, and slightly less in t2).

EXPLAIN output:
First query: http://explain.depesz.com/s/uauk
Second query: http://explain.depesz.com/s/uQd

Yes, unfortunately you've done about the only thing that you can do,
and that's just include both conditions in the query. Is there some
special reason why you can't just write the t2.id > ... condition in
the query too? or is the query generated dynamically by some software
that you have no control over?
I can, but it would make my application code much more complex. I was
hoping to hide the complexity of the DB data model in the DB itself
using views, triggers, etc. If I have to add such conditions, the query
generator in my application code will be more complex, and the internal
structure of the DB will certainly be visible to it.


I'm working on re-designing a DB which can grow large and slow, as I guess
we can find a more optimal design before trying optimizations like
materialized views and other common techniques. I've found two completely
different approaches to such problems: de-normalizing data, and highly
normalizing data (6NF), as in the Anchor Modeling approach. I decided to
experiment with something similar to the latter (not that extreme!),
especially since our current design was not that normalized, and it
performs poorly. I'm investigating why it performs so badly with my
queries, and this problem was one of the reasons. In such a design, views
are used to present the JOIN of many tables as a single table, so that
using the model is easy and transparent. But a single table usually
doesn't have 10 ID columns (which can change as the model changes) on
which you must repeat every condition to get acceptable results!
While it can be done, it is quite annoying: the application must know how
many tables are joined together, and repeat the condition for all such
columns. The problem becomes worse when you are creating a relation
between two different IDs of different data, e.g. relating customer info
(composed by joining 5 tables) with info about the items (s)he bought
(composed by joining 3 tables).


Anyway, it seems that this is what I should implement in my application 
code. I just hope that adding explicit conditions for each joined table 
will not turn off any other optimizations!
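For example, the workaround just described amounts to exposing every join key in the view (names here are illustrative only):

```sql
CREATE VIEW v_all_ids AS
SELECT t1.id AS t1_id, t2.id AS t2_id, t1.total, t1.price, t2.category
FROM t1
INNER JOIN t2 ON t1.id = t2.id;

-- Callers must repeat the condition on every exposed key:
SELECT * FROM v_all_ids
WHERE t1_id > -9223372036513411363
  AND t2_id > -9223372036513411363;
```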


Such an optimization seemed so natural to me that I didn't believe
PostgreSQL wouldn't understand that a condition on the ID applies to all
id columns in a JOINed query, so I simplified my query step by step until
I reached the minimal problematic query, which is very similar to the one
I posted here. It was at this point that I finally realized that maybe
PostgreSQL really doesn't understand it, and I was ... shocked!




I'd personally quite like to see improvements in this area, and even
wrote a patch [1] which fixes this problem too. The problem I had when
proposing the fix for this was that I was unable to report details
about how many people are hit by this planner limitation. The patch I
proposed caused a very small impact on planning time for many queries,
and was thought by many not to apply in enough cases for it to be
worth slowing down queries which cannot possibly benefit. Of course I
agree with this, I've no interest in slowing down planning on queries,
but at the same time understand the annoying poor optimisation in this
area.

Although please remember the patch I proposed was merely a first draft
proposal. Not for production use.

[1] 
http://www.postgresql.org/message-id/flat/cakjs1f9fk_x_5hkcpcseimy16owe3empmmgsgwlckkj_rw9...@mail.gmail.com#cakjs1f9fk_x_5hkcpcseimy16owe3empmmgsgwlckkj_rw9...@mail.gmail.com


That's great, I might consider experimenting with this too.

Regards,
Hedayat



