Re: [PERFORM] yet another q

Samuel Gendler Wed, 18 Aug 2010 23:50:52 -0700

On Wed, Aug 18, 2010 at 11:14 PM, Samuel Gendler
<sgend...@ideasculptor.com>wrote:


> Please forgive the barrage of questions.  I'm just learning how to tune
> things in postgres and I've still got a bit of learning curve to get over,
> apparently.  I have done a lot of reading, though, I swear.
>
> I've got two identical queries except for a change of one condition which
> cuts the number of rows in half - which also has the effect of eliminating
> one partition from consideration (partitioned over time and I cut the time
> period in half).  The query plans are considerably different as a result.
> The net result is that the fast query is 5x faster than the slow query.  I'm
> curious if the alternate query plan is inherently faster or is it just a
> case of the algorithm scaling worse than linearly with the row count, which
> certainly wouldn't be surprising.  The big win, for me, is that the sort
> uses vastly less memory.  The slow plan requires work_mem to be 1500MB to
> even make it 5x worse.  With a more reasonable work_mem (400MB), it drops to
> something like 15x worse because it has to sort on disk.
>
> fast plan: http://explain.depesz.com/s/iZ
> slow plan: http://explain.depesz.com/s/Dv2
>
> query:
>
>
> EXPLAIN ANALYZE SELECT
>            t_lookup.display_name as group,
>            to_char(t_fact.time, 'DD/MM HH24:MI') as category,
>            substring(t_lookup.display_name from 1 for 20) as label,
>            round(sum(t_fact.total_ms)/sum(t_fact.count)) as value
>        FROM
>            portal.providers t_lookup,
>            day_scale_radar_performance_fact t_fact
>        WHERE
>            t_fact.probe_type_num < 3
>            and t_lookup.provider_id = t_fact.provider_id
>            and t_lookup.provider_owner_customer_id =
> t_fact.provider_owner_customer_id
>            and t_fact.provider_owner_customer_id = 0
>            and t_fact.time between timezone('UTC', '2010-08-18 15:00:00') -
> interval '30 day' and timezone('UTC', '2010-08-18 15:00:00')
>        GROUP BY
>            t_fact.provider_owner_customer_id, t_fact.provider_id,
>            t_lookup.display_name,
>            t_fact.time
>        ORDER BY
>            t_fact.time
>
> table structure:
>
>        Table "perf_reporting.abstract_radar_performance_fact"
>           Column           |            Type             | Modifiers
> ----------------------------+-----------------------------+-----------
> count                      | bigint                      | not null
> total_ms                   | bigint                      | not null
> time                       | timestamp without time zone | not null
> market_num                 | integer                     | not null
> country_num                | integer                     | not null
> autosys_num                | integer                     | not null
> provider_owner_zone_id     | integer                     | not null
> provider_owner_customer_id | integer                     | not null
> provider_id                | integer                     | not null
> probe_type_num             | integer                     | not null
>
> with individual indexes on the everything from time to the bottom on the
> child tables
>
> and
>
>
>                             Table "portal.providers"
>           Column           |            Type             |       Modifiers
>
>
> ----------------------------+-----------------------------+------------------------
> btime                      | timestamp without time zone | not null default
> now()
> mtime                      | timestamp without time zone | not null default
> now()
> version                    | integer                     | not null default
> 1
> provider_id                | integer                     | not null
> provider_owner_zone_id     | integer                     | not null
> provider_owner_customer_id | integer                     | not null
> provider_category_id       | integer                     | not null
> name                       | character varying(255)      | not null
> display_name               | character varying(255)      | not null
>
> with indexes on every column with name ending in '_id'
>
>
It gets more complicated:

When I dropped to a query over 15 days instead of 30 days, I saw a huge bump
in performance (about 16 secs), the query plan for which is here:

http://explain.depesz.com/s/iaf

note: the query is identical to the one below, but with the interval changed
to 15 days from 30 days, which also keeps the query within a single
partition.  Note that the sort requires almost no memory and occurs after
the aggregation.  I thought my problems were solved, since reducing the
normal window over which queries are performed is something the app can
tolerate.

However, if I keep the same 15 day window (so row count is approximately the
same), but change the time window start date by 2 days (still keeping the
entire query within the same partition), I get a completely different query
plan.  There is effectively no difference between the two queries other than
the start date of the time window in the where clause, but one executes in
twice the time (35 secs or thereabouts).

http://explain.depesz.com/s/LA

Just for completeness' sake, I changed the query such that it is still 15
days, but this time crosses a partition boundary.  The plan is very similar
to the previous one and executes in about the same time (35 secs or so)

http://explain.depesz.com/s/Aqw

Statistics are up to date and were performed with default_statistics_target
= 100

Is there any way I can force the more efficient HashAggregate then sort plan
instead of sort then GroupAggregate?

Re: [PERFORM] yet another q

Reply via email to