Tom Lane wrote:
Ole Langbehn [EMAIL PROTECTED] writes:
> What do you think about the idea of a UniqueSort which would do
> sort + unique in one pass?
> This is what Oracle does, and it is quite fast with it...

Hashing is at least as fast, if not faster.

regards, tom lane
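As a sketch of the two strategies being compared here (plain Python standing in for the executor; this is a toy model, not PostgreSQL's actual code):

```python
# Toy model of the two DISTINCT strategies under discussion.
import random

rows = [random.randrange(255) for _ in range(100_000)]  # stand-in for a column

# Sort-based "UniqueSort": sort, then drop adjacent duplicates in one pass.
def sort_unique(values):
    out = []
    for v in sorted(values):
        if not out or out[-1] != v:
            out.append(v)
    return out

# Hash-based strategy: one pass through the data, no sort;
# output order is arbitrary unless you sort the (small) result.
def hash_unique(values):
    return list(set(values))

assert sorted(hash_unique(rows)) == sort_unique(rows)
```

The hash pass is O(n) over the input versus O(n log n) for the sort, which is the basis of the "at least as fast" claim; the sort-based plan wins only when the output must be ordered anyway.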
The really tricky part is that a DISTINCT ON needs to know about a first()
aggregate. And to make optimal use of indexes, a last() aggregate as well.
And ideally the planner/executor needs to know something is magic about
first()/last() (and potentially min()/max() at some point) [...]
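The first()-per-group behavior being described can be sketched in Python (a toy model of the semantics, not the proposed executor support):

```python
# What DISTINCT ON (k) ... ORDER BY k, v computes: the first() value per
# key from sorted input. last() falls out of the same pass by keeping the
# final row of each group instead of the first.
from itertools import groupby
from operator import itemgetter

rows = [("b", 2), ("a", 3), ("a", 1), ("b", 5)]
ordered = sorted(rows)  # ORDER BY k, v

first_per_key = {k: next(g)[1] for k, g in groupby(ordered, key=itemgetter(0))}
last_per_key = {k: list(g)[-1][1] for k, g in groupby(ordered, key=itemgetter(0))}
# first_per_key == {"a": 1, "b": 2}; last_per_key == {"a": 3, "b": 5}
```

An index scan in key order delivers the rows pre-sorted, which is why first()/last() could exploit indexes if the planner knew they were "magic".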
> Hashing is at least as fast, if not faster.

Probably quite a bit faster if the dataset is not huge...
UniqueSort would be useful for GROUP BY x ORDER BY x, though.
I don't really think it would be a useful plan anyway. What *would* be
useful is to support HashAggregate as an implementation alternative for
DISTINCT --- currently I believe we only consider that for GROUP BY.
The DISTINCT planning code is fairly old and crufty and hasn't been
redesigned.
Pierre-Frédéric Caillaud [EMAIL PROTECTED] writes:
> Present state is that DISTINCT and UNION are slow with or without using
> the GROUP BY trick. Including the index skip scan in the planning options
> would only happen when appropriate cases are detected. This detection [...]
Pierre-Frédéric Caillaud [EMAIL PROTECTED] writes:
> I see this as a minor annoyance only because I can write GROUP BY
> instead of DISTINCT and get the speed boost. It probably annoys people
> trying to port applications to postgres, though, forcing them to rewrite
> their queries.

Yeah, [...]
You could try:

  explain analyze select land from customer_dim group by land;

It will be a lot faster, but I can't make it use the index on my machine...

Example:

  create table dummy as
    (select id, id % 255 as number from [a large table with 1M rows]);

so we have a table [...]
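A runnable reconstruction of this experiment, using Python's sqlite3 so it can be tried anywhere (the 1M-row source table is simulated at smaller scale; the table and column names are taken from the example above):

```python
# Rebuild the "dummy" table from the example: ids with id % 255 as number,
# then check that DISTINCT and the GROUP BY trick return the same result set.
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE dummy (id INTEGER, number INTEGER)")
cur.executemany(
    "INSERT INTO dummy VALUES (?, ?)",
    ((i, i % 255) for i in range(100_000)),  # scaled down from 1M rows
)

distinct_rows = cur.execute("SELECT DISTINCT number FROM dummy").fetchall()
group_rows = cur.execute("SELECT number FROM dummy GROUP BY number").fetchall()

# Same answer either way; the thread is about which *plan* gets chosen.
assert sorted(distinct_rows) == sorted(group_rows)
assert len(distinct_rows) == 255
```

The point of the thread is that both queries are semantically identical, yet the planner of that era considered HashAggregate only for the GROUP BY form.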
On Wednesday, October 6, 2004 at 12:19, Pierre-Frédéric Caillaud wrote:
> You could try:
> explain analyze select land from customer_dim group by land;
> It will be a lot faster but I can't make it use the index on my machine...

This already speeds up my queries to about 1/4th of the time, which is [...]
Pierre-Frédéric Caillaud [EMAIL PROTECTED] writes:
> I don't know WHY (oh why) postgres does not use this kind of strategy
> when distinct'ing an indexed field... Anybody got an idea?

Well, there are two questions here. Why, given the current plans available,
does postgres choose a sequential scan? [...]
There are even three questions here:
- given that 'SELECT DISTINCT field FROM table' is exactly the same as
  'SELECT field FROM table GROUP BY field', postgres could transform the
  first into the second and avoid itself a (potentially killer) sort.
[...]

On my example the table was not too [...]
Greg Stark [EMAIL PROTECTED] writes:
> why isn't a skip index scan plan available? Well, nobody's written the
> code yet.

I don't really think it would be a useful plan anyway. What *would* be
useful is to support HashAggregate as an implementation alternative for
DISTINCT --- currently I believe we only consider that for GROUP BY.
Tom Lane [EMAIL PROTECTED] writes:
> Greg Stark [EMAIL PROTECTED] writes:
>> why isn't a skip index scan plan available? Well, nobody's written the
>> code yet.
> I don't really think it would be a useful plan anyway.

Well, it would clearly be useful in this test case, where the column has a
small number of distinct values.
Greg Stark [EMAIL PROTECTED] writes:
> But regardless of how uncommon it is, it could be considered important in
> another sense: when you need it there really isn't any alternative. It's
> an algorithmic improvement with no bound on the performance difference.

[ shrug... ] There are an infinite [...]
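For reference, the skip index scan asked about here can be emulated in plain SQL with a recursive CTE: the well-known loose-index-scan workaround, which does roughly one index probe per distinct value instead of visiting every row. (Recursive CTEs arrived in PostgreSQL 8.4, after this thread.) Sketched here with sqlite3 so it runs standalone; the table and column names reuse the dummy example:

```python
# Emulate a skip ("loose") index scan: the recursive CTE jumps from each
# distinct value to the next-larger one via the index on number, so work is
# proportional to the number of distinct values, not the number of rows.
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE dummy (id INTEGER PRIMARY KEY, number INTEGER)")
cur.execute("CREATE INDEX dummy_number_idx ON dummy (number)")
cur.executemany("INSERT INTO dummy VALUES (?, ?)",
                ((i, i % 255) for i in range(10_000)))

cur.execute("""
    WITH RECURSIVE t(number) AS (
        SELECT min(number) FROM dummy
        UNION ALL
        SELECT (SELECT min(number) FROM dummy WHERE number > t.number)
        FROM t WHERE t.number IS NOT NULL
    )
    SELECT number FROM t WHERE number IS NOT NULL
""")
skip_scan = [r[0] for r in cur.fetchall()]
assert skip_scan == list(range(255))  # 255 index probes, not 10,000 rows
```

This illustrates Greg Stark's point: when the distinct-value count is tiny relative to the row count, the gap between this and a full scan grows without bound.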