On 10 April 2015 at 15:26, Peter Eisentraut <pete...@gmx.net> wrote:

> What is your intended use case for this feature?

Likely use cases are:

* Limits on the number of rows in the sample. Some research colleagues have published a new mathematical analysis that will allow a lower limit than previously considered.
* Time limits on sampling. This allows data visualisation approaches to gain approximate answers in real time.
* Stratified sampling. Anything with some kind of filtering, lifting or bias. Allows filtering out known incomplete data.
* Limits on sample error.

Later use cases would allow custom aggregates to work together with custom sampling methods, so we might work our way towards

i) a SUM() function that provides the right answer even when used with a sample scan,
ii) custom aggregates that report the sample error, allowing you to get both AVG() and AVG_STDERR().

That would be technically possible with what we have here, but I think a lot more thought is required yet.

These have all come out of detailed discussions with two different groups of data mining researchers.

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, RemoteDBA, Training & Services