Re: [HACKERS] Bug? Small samples in TABLESAMPLE SYSTEM returns zero rows

2015-08-07 Thread Josh Berkus
Petr, Just user-tested SYSTEM_ROWS and SYSTEM_TIME. They work as expected. Useful! -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription:

Re: [HACKERS] Bug? Small samples in TABLESAMPLE SYSTEM returns zero rows

2015-08-06 Thread Josh Berkus
On 08/06/2015 01:10 PM, Simon Riggs wrote: Given, user-stated probability of accessing a block of P and N total blocks, there are a few ways to implement block sampling. 1. Test P for each block individually. This gives a range of possible results, with 0 blocks being possible outcome,

Re: [HACKERS] Bug? Small samples in TABLESAMPLE SYSTEM returns zero rows

2015-08-06 Thread Alvaro Herrera
Petr Jelinek wrote: On 2015-08-06 22:25, Josh Berkus wrote: If there is no appropriate place, I'll just write a blog. There is a blog post on 2ndQ blog page which tries to describe the sampling methods visually, not sure if it's more obvious from that or not. It's somewhat broken on planet

Re: [HACKERS] Bug? Small samples in TABLESAMPLE SYSTEM returns zero rows

2015-08-06 Thread Simon Riggs
On 6 August 2015 at 20:14, Josh Berkus j...@agliodbs.com wrote: This table has around 185 rows per page. As the sample size goes up, the number times I get zero rows goes down, but those results seem to still include data pages with zero rows. For example, here's a series of results from a

Re: [HACKERS] Bug? Small samples in TABLESAMPLE SYSTEM returns zero rows

2015-08-06 Thread Josh Berkus
On 08/06/2015 01:14 PM, Josh Berkus wrote: On 08/06/2015 01:10 PM, Simon Riggs wrote: Given, user-stated probability of accessing a block of P and N total blocks, there are a few ways to implement block sampling. 1. Test P for each block individually. This gives a range of possible results,

Re: [HACKERS] Bug? Small samples in TABLESAMPLE SYSTEM returns zero rows

2015-08-06 Thread Simon Riggs
On 6 August 2015 at 21:14, Josh Berkus j...@agliodbs.com wrote: On 08/06/2015 01:10 PM, Simon Riggs wrote: Given, user-stated probability of accessing a block of P and N total blocks, there are a few ways to implement block sampling. 1. Test P for each block individually. This gives a

Re: [HACKERS] Bug? Small samples in TABLESAMPLE SYSTEM returns zero rows

2015-08-06 Thread Petr Jelinek
On 2015-08-06 22:25, Josh Berkus wrote: On 08/06/2015 01:19 PM, Simon Riggs wrote: For me, the docs seem exactly correct. The mathematical implications of that just aren't recorded explicitly. Well, for the SELECT page, all we need is the following (one changed sentence): The SYSTEM method

Re: [HACKERS] Bug? Small samples in TABLESAMPLE SYSTEM returns zero rows

2015-08-06 Thread Josh Berkus
On 08/06/2015 12:45 PM, Tom Lane wrote: Simon Riggs si...@2ndquadrant.com writes: On 6 August 2015 at 20:14, Josh Berkus j...@agliodbs.com wrote: Speaking from a user perspective, SYSTEM seems broken to me. I can't imagine using it for anything with a that degree of variation in the number

Re: [HACKERS] Bug? Small samples in TABLESAMPLE SYSTEM returns zero rows

2015-08-06 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes: On 6 August 2015 at 20:14, Josh Berkus j...@agliodbs.com wrote: Speaking from a user perspective, SYSTEM seems broken to me. I can't imagine using it for anything with a that degree of variation in the number of results returned, especially if it's

Re: [HACKERS] Bug? Small samples in TABLESAMPLE SYSTEM returns zero rows

2015-08-06 Thread Petr Jelinek
On 2015-08-06 22:17, Josh Berkus wrote: On 08/06/2015 01:14 PM, Josh Berkus wrote: On 08/06/2015 01:10 PM, Simon Riggs wrote: Given, user-stated probability of accessing a block of P and N total blocks, there are a few ways to implement block sampling. 1. Test P for each block individually.

Re: [HACKERS] Bug? Small samples in TABLESAMPLE SYSTEM returns zero rows

2015-08-06 Thread Josh Berkus
On 08/06/2015 01:19 PM, Simon Riggs wrote: On 6 August 2015 at 21:14, Josh Berkus j...@agliodbs.com mailto:j...@agliodbs.com wrote: On 08/06/2015 01:10 PM, Simon Riggs wrote: Given, user-stated probability of accessing a block of P and N total blocks, there are a few ways to

Re: [HACKERS] Bug? Small samples in TABLESAMPLE SYSTEM returns zero rows

2015-08-06 Thread Simon Riggs
On 6 August 2015 at 20:14, Josh Berkus j...@agliodbs.com wrote: The results of 0, 185 and 925 are not. It really seems like SYSTEM is treating 0.04% as a maximum, but taking a random number of data pages somewhere around that maximum, using math which can choose numbers of pages far outside

[HACKERS] Bug? Small samples in TABLESAMPLE SYSTEM returns zero rows

2015-08-06 Thread Josh Berkus
Version: 9.5alpha2 Issue: when requesting small samples, SYSTEM often returns zero rows, and sometimes returns unexpected numbers of rows. Example: create table thous ( id int, val text ); insert into thous select i, i::text || '-val' from generate_series(1,10) as gs(i); analyze; This is