Re: [HACKERS] Planner avoidance of index only scans for partial indexes

Merlin Moncure Fri, 17 Aug 2012 08:36:37 -0700

On Thu, Aug 16, 2012 at 5:20 PM, Josh Berkus <j...@agliodbs.com> wrote:
> I'm mainly looking for numbers for the 9.2 release.  Like, "up to 14X
> speedup on data warehousing queries".


OK:
I work in the multi family industry and we have a large database which
we are organizing into an on-demand OLAP style data delivery project.
The data is organized into a fact table like this:
(PropertyId, Floorplan, UnitType, TimeSeries, AggregationTypeCode,
MetricCode, MetricValue)

Where metric value contains a single numeric value for the calculated
metric (say, average rent).  The other fields point at the identifying
criteria for the metric: property it pertains to, etc.  TimeSeries
represents a point in time: It's a string that is 'Y2012M01',
'Y2011Q3', etc.

The table is partitioned on a two year basis.  The 2010_2011 partition
has 37million records and is only expected to grow as we add new
properties and metrics.  One of the important questions this table has
to answer is to gather metric values like this:

SELECT PropertyId, AGG(MetricValue) FROM PrepertyMetric
WHERE Floorplan = w and UnitType = x and TimeSeries = y and MetricCode = z;

And that query might return 7k-15k records depending. It is extremely
important to be fast and cache warm-up is a huge issue for us as we
have various nightly processes that blow the cache out.  The main
interface to the function is a routine called GetChartData which runs
the query above N times looking back in time from a known point
(typically 12 times) and returns arrays which get converted to json
when going out the door.

On the 9.2 database I organized one of the partitions (only) to
utilize IOS and started capturing timings when I came in this morning
with a completely cold cache.  Partition tables have been chilled as
part of the nightly build.

hese are 100% real world results.  output data is 3 parallel arrays of size 12.

9.1 Run 1 (Cold), 12M partition
Time: 5147.000 ms

9.1 Run 2 (Warm), 12M partition
Time: 219.000 ms

9.2 Run 1 (Cold) 12M partition, Index Only scan
Time: 257.000 ms

9.2 Run 2 (Warm) 2M partition, Index Only scan
Time: 92.000 ms

9.1 Run 1 (Cold) 37M partition
Time: 22074.000 ms

9.1 Run 2 (Warm) 37M partition
Time: 435.000 ms

9.2 Run 1 (Cold) 37M partition (not IOS!)
Time: 7629.000 ms

9.2 Run 2 (Warm) 37M partition (not IOS!)
Time: 183.000 ms

The takeaway is that for this query I'm seeing end to end query
execution improvement of 58% in the worst case (warm cache) and 20x or
more in the best case when the cache is cold: that 22 sec time is
unpredictable and can run into the minutes as the index randomizes the
i/o to the heap (the table is not clustered on this index nor can it
be).   This 'best case' is extremely important to us as the generally
the first thing the user sees when firing up the application the first
time and with 9.2 it's a 'night and day' experience.  Here is the
bottom line in 9.1 vs 9.2 IOS:

9.1:
ysanalysis=# explain (buffers, analyze) SELECT ...
                                       QUERY PLAN
----------------------------------------------------------------------------------------
 Result  (cost=0.00..0.26 rows=1 width=0) (actual
time=219.979..219.979 rows=1 loops=1)
   Buffers: shared hit=176227
 Total runtime: 219.990 ms

9.2:
ysanalysis=# explain (buffers, analyze) SELECT ...
                                      QUERY PLAN
--------------------------------------------------------------------------------------
 Result  (cost=0.00..0.26 rows=1 width=0) (actual time=92.909..92.909
rows=1 loops=1)
   Buffers: shared hit=27198
 Total runtime: 92.921 ms

That's going to translate into less cache pressure and overall cpu
efficiency.  It's a also a good technique to manage i/o patterns
especially for those of us that don't have the luxury of an all-SSD
storage unit (my backend is a
generally-good-but-you-never-know-performance enterprise SAN).  Good
knowledge of the database internals is helpful and precise indexing is
a must.

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Planner avoidance of index only scans for partial indexes

Reply via email to