Re: [pmacct-discussion] PostgreSQL performance

Wim Kerkhoff Wed, 31 May 2006 14:08:16 -0700

Sven Anderson wrote:
> Hi all,
>
> I want to share with you an interesting experience I just had.
>
> I did the following SELECT:
>
> SELECT "ip_dst",SUM("bytes"),SUM("packets"),SUM("flows") FROM
> "tcom_v5_20060530" WHERE "stamp_inserted">='2006-05-30 00:00:00' AND
> "stamp_inserted"<'2006-05-30 02:00:00' AND "port_src"='53' AND
> "port_dst"='53' AND "ip_src"='xxx.xxx.xxx.xxx' GROUP BY "ip_dst" ORDER BY
> SUM("bytes") DESC LIMIT '10';
>
> which seemed to never end (>30 minutes).
>
> I had only two indexes on this table:
>     "tcom_v5_20060530_idx" btree (ip_src, ip_dst, stamp_inserted)
>     "tcom_v5_20060530_stamp_idx" btree (stamp_inserted)
>
> After dropping the tcom_v5_20060530_idx, the query was answered in 20 seconds.
>   
Try run the query as EXPLAIN SELECT ... to see how it's using indexes.


I have also put some links to resources about PostgreSQL optimization here:
    http://nyetwork.org/wiki/PostgreSQL

I think the (ip_src, ip_dst, stamp_inserted) index is probably so large 
that it's useless - it's probably has references to every single row. 
Your query is only using ip_src, so why have ip_dst in the index? Might 
be better to create seperate indexes then.

If you only are doing hourly ranges in the WHERE, then you can make the 
index more efficient by doing something like

tcom_v5_20060530_idx2 btree (date_trunc('hour', stamp_inserted), ip_src)

Then change the WHERE statements to use date_trunc as well.

Just some ideas, HTH!

Wim



_______________________________________________
pmacct-discussion mailing list
http://www.pmacct.net/#mailinglists

Re: [pmacct-discussion] PostgreSQL performance

Reply via email to