Re: [PERFORM] Query running slow

PFC Wed, 16 Apr 2008 16:15:52 -0700

On Wed, 16 Apr 2008 23:14:11 +0200, samantha mahindrakar<[EMAIL PROTECTED]> wrote:

Hi....
Iam finding the following query is working a bit slow:

EXECUTE '(SELECT ARRAY(SELECT DISTINCT date_part(''day'',measurement_start)

FROM ' || gettablestring(dates)|| '
WHERE lane_id IN (' || lanesidarr || ')))'
INTO temparr;


This function is trying to find all the days in a prticular month
whihc has data for the particular lane and put it in an array...which
can be used later.
gettablestring(dates) returns the partition name from which the data
needs to be extracted. These partitions have index on the
measurement_start field.
lanesidarr is a lane number. The partition has an index on this field to.
Could anyone give me some hints???/

OK so I guess you have one partition per month since there is no month inyour WHERE.If this is a table which hasn't got much write activity (probably thecase for last month's partition, for instance), CLUSTER it on somethingappropriate that you use often in queries, like lane_id here.And you can use SELECT foo GROUP BY foo, this will use a hash, it isfaster than a sort.

        Example :

CREATE TABLE blop AS SELECT '2008-01-01'::TIMESTAMP + ((n%30)*'1DAY'::INTERVAL) AS t FROM generate_series(1,100000) AS n;

ALTER TABLE blop ADD d DATE NULL;
UPDATE blop SET d=t;
VACUUM FULL ANALYZE blop;

-- Now blop contains 100K timestamps and 100K dates from the month 2008-01

EXPLAIN ANALYZE SELECT DISTINCT EXTRACT( DAY from t )  FROM blop;
                                                       QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------

Unique (cost=10051.82..10551.82 rows=30 width=8) (actualtime=221.740..289.801 rows=30 loops=1)-> Sort (cost=10051.82..10301.82 rows=100000 width=8) (actualtime=221.737..250.911 rows=100000 loops=1)

         Sort Key: (date_part('day'::text, t))
         Sort Method:  quicksort  Memory: 5955kB

-> Seq Scan on blop (cost=0.00..1747.00 rows=100000 width=8)(actual time=0.021..115.254 rows=100000 loops=1)

 Total runtime: 290.237 ms
(6 lignes)

Temps : 290,768 ms

EXPLAIN ANALYZE SELECT EXTRACT( DAY from t ) AS day FROM blop GROUP BY day;
                                                    QUERY PLAN
-------------------------------------------------------------------------------------------------------------------

HashAggregate (cost=1997.00..1997.38 rows=30 width=8) (actualtime=198.375..198.390 rows=30 loops=1)-> Seq Scan on blop (cost=0.00..1747.00 rows=100000 width=8) (actualtime=0.021..129.779 rows=100000 loops=1)

 Total runtime: 198.437 ms
(3 lignes)

Temps : 198,894 ms

==> Hash is faster than Sort

EXPLAIN ANALYZE SELECT d FROM blop GROUP BY d;
                                                    QUERY PLAN
------------------------------------------------------------------------------------------------------------------

HashAggregate (cost=1747.00..1747.30 rows=30 width=4) (actualtime=101.829..101.842 rows=30 loops=1)-> Seq Scan on blop (cost=0.00..1497.00 rows=100000 width=4) (actualtime=0.012..33.428 rows=100000 loops=1)

 Total runtime: 101.905 ms
(3 lignes)

Temps : 102,516 ms

==> Not computing the EXTRACT is faster obviously

(actually EXPLAIN ANALYZE adds some overhead, the query really takes 60 ms)


        If you have an index lane_id, measurement_date, you can always do :

for day in 1..31:
        find 1 row with which has this day
reutrn the days you found











--
Sent via pgsql-performance mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] Query running slow

Reply via email to