I agree with Wim. Use explain to see how the indexes are being hit. IIRC 
that with btree index the order of the index elements is important.

I.E. stamp_inserted, ip_src, ip_dst may work better since the query 
optimizer can gather all the "time" branches that meet your query, then 
look at the ip_src leaves of those branches to come up with the result 
set. The order it by the ip_dst leaves.

With the current index it has to gather all the ip_src branches that 
meet the query, then look at the time leaves to come up with the answer 
set. Then order then by the ip_dst leaves.

Also, src_port is not in any of the indexes so the db will have to do a 
scan of all the initial results to see which also meet that criteria.

When you deleted that index the db gathered all the stamp_stamp inserted 
branches then did a scan of those records for ip_src and port_dst. Still 
having to to a scan but on a much smaller result set apparently.

The science / voodoo of indexes and query optimizers is mind boggling. 
  There are several articles and books on just this one subject. Good 
indexes, optimized queries and a "smart" optimizer can make a HUGH 
difference in db performance.

Chris

Wim Kerkhoff wrote:
> Sven Anderson wrote:
>> Hi all,
>>
>> I want to share with you an interesting experience I just had.
>>
>> I did the following SELECT:
>>
>> SELECT "ip_dst",SUM("bytes"),SUM("packets"),SUM("flows") FROM
>> "tcom_v5_20060530" WHERE "stamp_inserted">='2006-05-30 00:00:00' AND
>> "stamp_inserted"<'2006-05-30 02:00:00' AND "port_src"='53' AND
>> "port_dst"='53' AND "ip_src"='xxx.xxx.xxx.xxx' GROUP BY "ip_dst" ORDER BY
>> SUM("bytes") DESC LIMIT '10';
>>
>> which seemed to never end (>30 minutes).
>>
>> I had only two indexes on this table:
>>     "tcom_v5_20060530_idx" btree (ip_src, ip_dst, stamp_inserted)
>>     "tcom_v5_20060530_stamp_idx" btree (stamp_inserted)
>>
>> After dropping the tcom_v5_20060530_idx, the query was answered in 20 
>> seconds.
>>   
> Try run the query as EXPLAIN SELECT ... to see how it's using indexes.
> 
> I have also put some links to resources about PostgreSQL optimization here:
>     http://nyetwork.org/wiki/PostgreSQL
> 
> I think the (ip_src, ip_dst, stamp_inserted) index is probably so large 
> that it's useless - it's probably has references to every single row. 
> Your query is only using ip_src, so why have ip_dst in the index? Might 
> be better to create seperate indexes then.
> 
> If you only are doing hourly ranges in the WHERE, then you can make the 
> index more efficient by doing something like
> 
> tcom_v5_20060530_idx2 btree (date_trunc('hour', stamp_inserted), ip_src)
> 
> Then change the WHERE statements to use date_trunc as well.
> 
> Just some ideas, HTH!
> 
> Wim
> 
> 
> 
> _______________________________________________
> pmacct-discussion mailing list
> http://www.pmacct.net/#mailinglists
> 
> 

_______________________________________________
pmacct-discussion mailing list
http://www.pmacct.net/#mailinglists

Reply via email to