Thanks for the reply.

I increased the statistics target on av.customer_id as suggested, and the 
n_distinct value reported by pg_stats went from 202,333 to 904,097.  There are 
11.2 million distinct customer_ids across the 14.8 million vehicle records.  
Rerunning the query showed no significant change in run time (624 seconds vs. 
639 seconds) - plan is at 

I went through the query looking for fields used in joins and conditions and 
applied the same steps to 7 other fields across 4 of the tables.  Most 
n_distinct values did not change much, but two changed from roughly 1 million 
to a value between -1 and 0, which seems better given n_distinct's definition 
(a negative value means a fraction of the total row count rather than an 
absolute count).  This improved the query a little: from 624 seconds down to 
511 seconds.  That plan is at https://explain.depesz.com/s/te50.  This is the 
same query that ran in 272 seconds on 9.3 with the same data and the previous 
statistics settings.
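For reference, the per-column steps described above look roughly like this 
(table and column names are illustrative stand-ins for the actual tables 
behind the av alias):

```sql
-- Raise the per-column statistics target (default 100, max 10000);
-- "autos_vehicle" is a placeholder for the real table name.
ALTER TABLE autos_vehicle ALTER COLUMN customer_id SET STATISTICS 1000;

-- Re-gather statistics so pg_stats reflects the new target.
ANALYZE autos_vehicle;

-- Check the resulting estimate; a negative n_distinct is a fraction of
-- the row count, so -0.75 on 14.8M rows means ~11.1M distinct values.
SELECT tablename, attname, n_distinct
FROM pg_stats
WHERE tablename = 'autos_vehicle' AND attname = 'customer_id';
```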

It has now been decided to try upgrading to 9.4, as that is the minimum 
version that supports Django 1.11 (which we are trying to upgrade a backend 
service to).  The hope is that whatever feature we have not configured 
properly in 9.6 is not there in 9.4.

On 11/6/17, 9:21 AM, "Justin Pryzby" <pry...@telsasoft.com> wrote:

    On Mon, Nov 06, 2017 at 01:18:00PM +0000, Adam Torres wrote:
    > Good morning all,
    > We have a problem with performance after upgrading from 9.3 to 9.6 where 
certain queries take 9 times longer to run.  On our initial attempt to upgrade, 
we noticed the system as a whole was taking longer to run through normal daily 
processes.  The query with the largest run time was picked to act as a 
measuring stick.
    > https://explain.depesz.com/s/z71u
    > Planning time: 8.218 ms
    > Execution time: 639319.525 ms
    > Same query as run on 9.3
    > https://explain.depesz.com/s/gjN3
    > Total runtime: 272897.150 ms
    Actually it looks to me like both query plans are poor..
    ..because of this:
    | Hash Join (cost=85,086.25..170,080.80 ROWS=40 width=115) (actual 
time=32.673..84.427 ROWS=13,390 loops=1)
    |    Hash Cond: (av.customer_id = cc_1.id)
    If there are a large number of distinct customer_ids (maybe with nearly 
    uniform frequencies), it might help to increase the statistics target on 
    that column; same for cc_1.id.  And re-analyze those tables (are they 
    large??).  Then see if statistics improve:
    SELECT (SELECT sum(x) FROM unnest(most_common_freqs) x) frac_MCV,
        tablename, attname, n_distinct, array_length(most_common_vals,1) n_mcv
    FROM pg_stats
    WHERE tablename~'customers_customer' AND attname='id'
    GROUP BY 1,2,3,4,5 ORDER BY 1;
    Goal is to get at least an accurate value for n_distinct (but preferably 
    also storing the most frequent IDs).  I wouldn't bother re-running the 
    query unless you find that increasing the stats target causes the plan to 
    change.
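If a higher statistics target still leaves n_distinct badly underestimated, 
PostgreSQL (9.0 and later) also allows pinning the estimate directly as a 
per-column attribute option; a sketch, with an illustrative table name:

```sql
-- Override the planner's n_distinct estimate for the join column.
-- A negative value is a fraction of the row count (here ~76%, matching
-- 11.2M distinct customer_ids over 14.8M rows); table name is a placeholder.
ALTER TABLE autos_vehicle ALTER COLUMN customer_id SET (n_distinct = -0.76);

-- The override takes effect at the next ANALYZE of the table.
ANALYZE autos_vehicle;
```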

Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)