On Mon, Apr 23, 2012 at 08:34:44PM +0300, Ants Aasma wrote:
> On Mon, Apr 23, 2012 at 4:37 PM, Sandro Santilli <s...@keybit.net> wrote:
> > I'd love to see enhanced CTID operators, to fetch all visible tuples in a 
> > page
> > using a tidscan.  Something like: WHERE ctid =~ '(501,*)' or a ctidrange.
> Among other things, this would enable user-space implementation of
> tablesample. Given the operator =~(tid, int) that matches the page
> number and planner/executor integration so that it results in a TID
> scan, you would need the following functions:
> random_pages(tbl regclass, samples int) returns int[]
> aggregate function:
> reservoir_sample(item anyelement, samples int) returns anyarray
> Implementations for both of the functions could be adapted from analyze.c.
> Then tablesample could be implemented with the following query:
> SELECT (SELECT reservoir_sample(some_table, 50) AS samples
>    FROM some_table WHERE ctid =~ ANY (rnd_pgtids))
> FROM random_pages('some_table', 50) AS rnd_pgtids;
> Actually, now that I think about it, it could actually be implemented
> without any modifications to core at some cost to efficiency.
> random_pages would have to return tid[] that contains for each
> generated pagenumber all possible tids on that page.

This is exactly what I'm after.
I've actually started crafting such a TableSample function and I'm in the
process to refine the signature so your suggested interface above is 
very useful, thanks !

But I don't understand the reservoir_sample call, what is it supposed to do ?
And how flexibly "anyarray" return would be ? Could you return arbitrary 
typed rowtypes from it ?

> By making the building blocks available users get more flexibility.
> The downside would be that we can't automatically make better sampling
> methods available.

One approach doesn't preclude the other. TABLESAMPLE will still be useful,
also for SQL compliance.


  |   __/  |    Delivering high quality PostGIS 2.0 !
  |  / 2.0 |    http://strk.keybit.net - http://vizzuality.com

Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:

Reply via email to