On Thu, Apr 9, 2015 at 5:12 PM, Michael Paquier <michael.paqu...@gmail.com> wrote: > On Thu, Apr 9, 2015 at 4:30 AM, Jeff Janes <jeff.ja...@gmail.com> wrote: >> On Mon, Apr 6, 2015 at 11:02 AM, Petr Jelinek <p...@2ndquadrant.com> wrote: >>> >>> On 06/04/15 14:30, Petr Jelinek wrote: >>>> >>>> On 06/04/15 11:02, Simon Riggs wrote: >>>> >>>>> Are we ready for a final detailed review and commit? >>>>> >>>> >>>> I plan to send v12 in the evening with some additional changes that came >>>> up from Amit's comments + some improvements to error reporting. I think >>>> it will be ready then. >>>> >>> >>> Ok so here it is. >>> >>> Changes vs v11: >>> - changed input parameter list to expr_list >>> - improved error reporting, particularly when input parameters are wrong >>> - fixed SELECT docs to show correct syntax and mention that there can be >>> more sampling methods >>> - added name of the sampling method to the explain output - I don't like >>> the code much there as it has to look into RTE but on the other hand I don't >>> want to create new scan node just so it can hold the name of the sampling >>> method for explain >>> - made views containing TABLESAMPLE clause not autoupdatable >>> - added PageIsAllVisible() check before trying to check for tuple >>> visibility >>> - some typo/white space fixes >> >> >> >> Compiler warnings: >> explain.c: In function 'ExplainNode': >> explain.c:861: warning: 'sname' may be used uninitialized in this function >> >> >> Docs spellings: >> >> "PostgreSQL distrribution" extra r. >> >> "The optional parameter REPEATABLE acceps" accepts. But I don't know that >> 'accepts' is the right word. It makes the seed value sound optional to >> REPEATABLE. >> >> "each block having same chance" should have "the" before "same". >> >> "Both of those sampling methods currently...". I think it should be "these" >> not "those", as this sentence is immediately after their introduction, not >> at a distance. >> >> "...tuple contents and decides if to return in, or zero if none" Something >> here is confusing. "return it", not "return in"? >> >> Other comments: >> >> Do we want tab completions for psql? (I will never remember how to spell >> BERNOULLI). > > Yes. I think so. > >> Needs a Cat version bump. > > The committer who will pick up this patch will normally do it. > > Patch 1 is simple enough and looks fine to me. > > Regarding patch 2... > I found for now some typos: > + <title><structname>pg_tabesample_method</structname></title> > + <productname>PostgreSQL</productname> distrribution: > > Also, I am wondering if the sampling logic based on block analysis is > actually correct, for example for now this fails and I think that we > should support it: > =# with query_select as (select generate_series(1, 10) as a) select > query_select.a from query_select tablesample system (100.0/11) > REPEATABLE (9999); > ERROR: 42P01: relation "query_select" does not exist > > How does the SQL spec define exactly TABLESAMPLE? Shouldn't we get a > sample from a result set? > Thoughts?
Just to be clear, the example above being misleading... Doing table sampling using SYSTEM at physical level makes sense. In this case I think that we should properly error out when trying to use this method on something not present at physical level. But I am not sure that this restriction applies to BERNOUILLI: you may want to apply it on other things than physical relations, like views or results of WITH clauses. Also, based on the fact that we support custom sampling methods, I think that it should be up to the sampling method to define on what kind of objects it supports sampling, and where it supports sampling fetching, be it page-level fetching or analysis from an existing set of tuples. Looking at the patch, TABLESAMPLE is just allowed on tables and matviews, this limitation is too restrictive IMO. -- Michael -- Sent via pgsql-hackers mailing list (email@example.com) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers