I agree with Wim. Use explain to see how the indexes are being hit. IIRC
that with btree index the order of the index elements is important.
I.E. stamp_inserted, ip_src, ip_dst may work better since the query
optimizer can gather all the "time" branches that meet your query, then
look at the ip_src leaves of those branches to come up with the result
set. The order it by the ip_dst leaves.
With the current index it has to gather all the ip_src branches that
meet the query, then look at the time leaves to come up with the answer
set. Then order then by the ip_dst leaves.
Also, src_port is not in any of the indexes so the db will have to do a
scan of all the initial results to see which also meet that criteria.
When you deleted that index the db gathered all the stamp_stamp inserted
branches then did a scan of those records for ip_src and port_dst. Still
having to to a scan but on a much smaller result set apparently.
The science / voodoo of indexes and query optimizers is mind boggling.
There are several articles and books on just this one subject. Good
indexes, optimized queries and a "smart" optimizer can make a HUGH
difference in db performance.
Chris
Wim Kerkhoff wrote:
> Sven Anderson wrote:
>> Hi all,
>>
>> I want to share with you an interesting experience I just had.
>>
>> I did the following SELECT:
>>
>> SELECT "ip_dst",SUM("bytes"),SUM("packets"),SUM("flows") FROM
>> "tcom_v5_20060530" WHERE "stamp_inserted">='2006-05-30 00:00:00' AND
>> "stamp_inserted"<'2006-05-30 02:00:00' AND "port_src"='53' AND
>> "port_dst"='53' AND "ip_src"='xxx.xxx.xxx.xxx' GROUP BY "ip_dst" ORDER BY
>> SUM("bytes") DESC LIMIT '10';
>>
>> which seemed to never end (>30 minutes).
>>
>> I had only two indexes on this table:
>> "tcom_v5_20060530_idx" btree (ip_src, ip_dst, stamp_inserted)
>> "tcom_v5_20060530_stamp_idx" btree (stamp_inserted)
>>
>> After dropping the tcom_v5_20060530_idx, the query was answered in 20
>> seconds.
>>
> Try run the query as EXPLAIN SELECT ... to see how it's using indexes.
>
> I have also put some links to resources about PostgreSQL optimization here:
> http://nyetwork.org/wiki/PostgreSQL
>
> I think the (ip_src, ip_dst, stamp_inserted) index is probably so large
> that it's useless - it's probably has references to every single row.
> Your query is only using ip_src, so why have ip_dst in the index? Might
> be better to create seperate indexes then.
>
> If you only are doing hourly ranges in the WHERE, then you can make the
> index more efficient by doing something like
>
> tcom_v5_20060530_idx2 btree (date_trunc('hour', stamp_inserted), ip_src)
>
> Then change the WHERE statements to use date_trunc as well.
>
> Just some ideas, HTH!
>
> Wim
>
>
>
> _______________________________________________
> pmacct-discussion mailing list
> http://www.pmacct.net/#mailinglists
>
>
_______________________________________________
pmacct-discussion mailing list
http://www.pmacct.net/#mailinglists