On Wed, 16 Apr 2008 23:14:11 +0200, samantha mahindrakar <[EMAIL PROTECTED]> wrote:

Hi....
Iam finding the following query is working a bit slow:
EXECUTE '(SELECT ARRAY(SELECT DISTINCT date_part(''day'', measurement_start)
FROM ' || gettablestring(dates)|| '
WHERE lane_id IN (' || lanesidarr || ')))'
INTO temparr;

This function is trying to find all the days in a prticular month
whihc has data for the particular lane and put it in an array...which
can be used later.
gettablestring(dates) returns the partition name from which the data
needs to be extracted. These partitions have index on the
measurement_start field.
lanesidarr is a lane number. The partition has an index on this field to.
Could anyone give me some hints???/

OK so I guess you have one partition per month since there is no month in your WHERE. If this is a table which hasn't got much write activity (probably the case for last month's partition, for instance), CLUSTER it on something appropriate that you use often in queries, like lane_id here. And you can use SELECT foo GROUP BY foo, this will use a hash, it is faster than a sort.
        Example :

CREATE TABLE blop AS SELECT '2008-01-01'::TIMESTAMP + ((n%30)*'1 DAY'::INTERVAL) AS t FROM generate_series(1,100000) AS n;
ALTER TABLE blop ADD d DATE NULL;
UPDATE blop SET d=t;
VACUUM FULL ANALYZE blop;

-- Now blop contains 100K timestamps and 100K dates from the month 2008-01

EXPLAIN ANALYZE SELECT DISTINCT EXTRACT( DAY from t )  FROM blop;
                                                       QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
Unique (cost=10051.82..10551.82 rows=30 width=8) (actual time=221.740..289.801 rows=30 loops=1) -> Sort (cost=10051.82..10301.82 rows=100000 width=8) (actual time=221.737..250.911 rows=100000 loops=1)
         Sort Key: (date_part('day'::text, t))
         Sort Method:  quicksort  Memory: 5955kB
-> Seq Scan on blop (cost=0.00..1747.00 rows=100000 width=8) (actual time=0.021..115.254 rows=100000 loops=1)
 Total runtime: 290.237 ms
(6 lignes)

Temps : 290,768 ms

EXPLAIN ANALYZE SELECT EXTRACT( DAY from t ) AS day FROM blop GROUP BY day;
                                                    QUERY PLAN
-------------------------------------------------------------------------------------------------------------------
HashAggregate (cost=1997.00..1997.38 rows=30 width=8) (actual time=198.375..198.390 rows=30 loops=1) -> Seq Scan on blop (cost=0.00..1747.00 rows=100000 width=8) (actual time=0.021..129.779 rows=100000 loops=1)
 Total runtime: 198.437 ms
(3 lignes)

Temps : 198,894 ms

==> Hash is faster than Sort

EXPLAIN ANALYZE SELECT d FROM blop GROUP BY d;
                                                    QUERY PLAN
------------------------------------------------------------------------------------------------------------------
HashAggregate (cost=1747.00..1747.30 rows=30 width=4) (actual time=101.829..101.842 rows=30 loops=1) -> Seq Scan on blop (cost=0.00..1497.00 rows=100000 width=4) (actual time=0.012..33.428 rows=100000 loops=1)
 Total runtime: 101.905 ms
(3 lignes)

Temps : 102,516 ms

==> Not computing the EXTRACT is faster obviously

(actually EXPLAIN ANALYZE adds some overhead, the query really takes 60 ms)


        If you have an index lane_id, measurement_date, you can always do :

for day in 1..31:
        find 1 row with which has this day
reutrn the days you found











--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Reply via email to