Re: [PERFORM] Query only slow on first run

Craig James Wed, 28 Nov 2007 06:57:50 -0800

tmp wrote:

what exactly is that
"random_number" column
A random float that is initialized when the row is created and nevermodified afterwards. The physical row ordering will clearly not matchthe random_number ordering. However, other queries uses a row orderingby the primary key so I don't think it would make much sense to make theindex on random_number a clustering index just in order to speed up thissingle query.
 and why are you desirous of ordering by it?
In order to simulate a random pick of K rows. See [1].


A trick that I used is to sample the random column once, and create a much 
smaller table of the first N rows, where N is the sample size you want, and use 
that.

If you need a different N samples each time, you can create a temporary table, 
put your random N rows into that, do an ANALYZE, and then join to this smaller 
table.  The overall performance can be MUCH faster even though you're creating 
and populating a whole table, than the plan that Postgres comes up with. This 
seems wrong-headed (why shouldn't Postgres be able to be as efficient on its 
own?), but it works.

Craig


---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

              http://www.postgresql.org/docs/faq

Re: [PERFORM] Query only slow on first run

Reply via email to