> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:pgsql-hackers-
> [EMAIL PROTECTED] On Behalf Of Alvaro Herrera
> Sent: Friday, April 13, 2007 4:24 PM
> To: Tom Lane
> Cc: [EMAIL PROTECTED]; PostgreSQL Performance; Steve
> Subject: Re: [HACKERS] choose_bitmap_and again (was Re: [PERFORM]
> Strangely Variable Query Performance)
> Tom Lane wrote:
> > One idea I thought about was to sort by index scan cost, using
> > selectivity only as a tiebreaker for cost, rather than the other way
> > around as is currently done.  This seems fairly plausible because
> > indexscans that are cheaper than other indexscans likely return fewer
> > rows too, and so selectivity is already accounted for to some extent ---
> > at least you can't have an enormously worse selectivity at lower cost,
> > whereas Steve's example proves it doesn't work the other way.  But I'm
> > worried about breaking the reasoning about redundant indexes that's
> > mentioned in the comments.
> >
> > Another alternative that would respond to the immediate problem is to
> > maintain the current sort order, but as we come to each index, consider
> > using that one alone, and throw away whatever AND we might have built up
> > if that one alone beats the AND-so-far.  This seems more conservative,
> > as it's unlikely to break any cases that work well now, but on the other
> > hand it feels like plastering another wart atop a structure that's
> > already rather rickety.
> >
> > Has anyone got any thoughts about the best way to do this?
> How about doing both: sort the indexes by index scan cost; then pick the
> first index on the list and start adding indexes when they lower the
> cost.  When adding each index, consider it by itself against the
> already stacked indexes.  If the cost is lower, put this index at the
> top of the list, and restart the algorithm (after the sorting step, of
> course).
> I think the concern about condition redundancy should be attacked
> separately.  How about just comparing whether they have common prefixes
> of conditions?  I admit I don't understand what would happen with
> indexes defined like (lower(A), B, C) versus (A, B), for example.
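
For concreteness, the ordering being discussed might look something like
the sketch below.  IndexPathInfo and its fields are hypothetical
stand-ins for illustration only, not the planner's actual structures
(the real logic lives in choose_bitmap_and()):

#include <stdlib.h>

typedef struct
{
    double          cost;          /* estimated index scan cost */
    double          selectivity;   /* estimated fraction of rows returned */
} IndexPathInfo;

/* Sort by cost first; fall back to selectivity only to break ties. */
static int
cmp_cost_then_selectivity(const void *a, const void *b)
{
    const IndexPathInfo *pa = (const IndexPathInfo *) a;
    const IndexPathInfo *pb = (const IndexPathInfo *) b;

    if (pa->cost < pb->cost)
        return -1;
    if (pa->cost > pb->cost)
        return 1;
    /* equal cost: prefer the more selective path (smaller fraction) */
    if (pa->selectivity < pb->selectivity)
        return -1;
    if (pa->selectivity > pb->selectivity)
        return 1;
    return 0;
}

/* e.g.  qsort(paths, npaths, sizeof(IndexPathInfo), cmp_cost_then_selectivity); */

The greedy step would then walk the sorted list, ANDing in each path
only when it lowers the combined cost, and comparing each path alone
against the AND built so far.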

Instead of sorting, I suggest the quickselect() algorithm, which is O(n)
on average.  Probably, if the list is small, it won't matter much, but it
might offer some tangible benefit.

Here is an example of the algorithm:

#include <stdlib.h>
typedef double  Etype; /* Season to taste. */

extern Etype    RandomSelect(Etype * A, size_t p, size_t r, size_t i);
extern size_t   RandRange(size_t a, size_t b);
extern size_t   RandomPartition(Etype * A, size_t p, size_t r);
extern size_t   Partition(Etype * A, size_t p, size_t r);

/*
** In the following code, every reference to CLR means:
**    "Introduction to Algorithms"
**    By Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest
**    ISBN 0-07-013143-0
*/

/* CLR, page 187 */
Etype RandomSelect(Etype A[], size_t p, size_t r, size_t i)
{
    size_t          q,
                    k;

    if (p == r)
        return A[p];

    q = RandomPartition(A, p, r);
    k = q - p + 1;               /* number of elements in A[p..q] */

    if (i <= k)
        return RandomSelect(A, p, q, i);
    else
        return RandomSelect(A, q + 1, r, i - k);
}

size_t RandRange(size_t a, size_t b)
{
    size_t          c = (size_t) ((double) rand() /
                                  ((double) RAND_MAX + 1) * (b - a));

    return c + a;
}

/* CLR, page 162 */
size_t RandomPartition(Etype A[], size_t p, size_t r)
{
    size_t          i = RandRange(p, r);
    Etype           Temp;

    /* Swap a randomly chosen element into the pivot position. */
    Temp = A[p];
    A[p] = A[i];
    A[i] = Temp;
    return Partition(A, p, r);
}

/* CLR, page 154 */
size_t Partition(Etype A[], size_t p, size_t r)
{
    Etype           x,
                    temp;
    size_t          i,
                    j;

    x = A[p];
    i = p - 1;
    j = r + 1;

    for (;;) {
        do {
            j--;
        } while (!(A[j] <= x));
        do {
            i++;
        } while (!(A[i] >= x));
        if (i < j) {
            temp = A[i];
            A[i] = A[j];
            A[j] = temp;
        } else
            return j;
    }
}