[PERFORM] Performance tradeoff

2005-03-02 Thread Shawn Chisholm
Hi All, I am wondering about the relative performance of "insert into table1 select distinct a,b from ..." and "insert into table1 select a,b from ... group by a,b" when querying tables of different sizes (10K, 100K, 1s, 10s, 100s of millions of rows). The distinct way tends to sort/unique and
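
For readers following along, here is a minimal sketch of the two statement forms being compared. It is not taken from the thread: the source table `src`, the two target tables, and the sample data are hypothetical, and Python's built-in sqlite3 module is used only so the sketch runs without a server. It shows that both forms insert the same set of rows; it says nothing about which one is faster on PostgreSQL.

```python
import sqlite3

# sqlite3 is a stand-in here (assumption); the thread is about PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src (a INTEGER, b INTEGER);             -- hypothetical source table
    CREATE TABLE table1_distinct (a INTEGER, b INTEGER); -- target for the DISTINCT form
    CREATE TABLE table1_grouped (a INTEGER, b INTEGER);  -- target for the GROUP BY form
""")

# Sample data with many duplicate (a, b) pairs.
rows = [(i % 5, i % 3) for i in range(1000)]
conn.executemany("INSERT INTO src VALUES (?, ?)", rows)

# Form 1: de-duplicate with SELECT DISTINCT.
conn.execute("INSERT INTO table1_distinct SELECT DISTINCT a, b FROM src")

# Form 2: de-duplicate with GROUP BY on the same columns.
conn.execute("INSERT INTO table1_grouped SELECT a, b FROM src GROUP BY a, b")

# Read both targets back in a fixed order so they can be compared.
distinct_rows = sorted(conn.execute("SELECT a, b FROM table1_distinct"))
grouped_rows = sorted(conn.execute("SELECT a, b FROM table1_grouped"))
print(distinct_rows == grouped_rows, len(distinct_rows))
```

To compare actual cost on PostgreSQL, one option is to run EXPLAIN ANALYZE on each SELECT against tables of the sizes in question and compare the reported plans and timings.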

Re: [PERFORM] Performance tradeoff

2005-03-02 Thread Josh Berkus
Shawn,
> I can also change the schema to a certain extent, so would it be worthwhile to put indices on the queried tables (or refactor them) hoping the distinct does an index scan instead of sort... would the query planner take advantage of that?
Use the GROUP BY, with an index on the grouped