On Fri, Apr 3, 2015 at 3:06 AM, Petr Jelinek <p...@2ndquadrant.com> wrote: > > Hi, > > so here is version 11. >
Thanks, patch looks much better, but I think still few more things needs to discussed/fixed. 1. +tablesample_clause: + TABLESAMPLE ColId '(' func_arg_list ')' opt_repeatable_clause Why do you want to allow func_arg_list? Basically if user tries to pass multiple arguments in TABLESAMPLE method's clause like (10,20), then I think that should be caught in grammer level as an error. SQL - 2003 specs says that argument to REPAEATABLE and TABLESAMPLE method is same <numeric value expr> It seems to me that you want to allow it to make it extendable to user defined Tablesample methods, but not sure if it is right to use func_arg_list for the same because sample method arguments can have different definition than function arguments. Now even if we want to keep sample method arguments same as function arguments that sounds like a separate discussion. In general, I feel you already have good basic infrastructure for supportting other sample methods, but it is better to keep the new DDL's for doing the same as a separate patch than this patch, as that way we can reduce the scope of this patch, OTOH if you or others feel that it is mandatory to have new DDL's support for other tablesample methods then we have to deal with this now itself. 2. postgres=# explain update test_tablesample TABLESAMPLE system(30) set id = 2; ERROR: syntax error at or near "TABLESAMPLE" LINE 1: explain update test_tablesample TABLESAMPLE system(30) set i... postgres=# Delete from test_tablesample TABLESAMPLE system(30); ERROR: syntax error at or near "TABLESAMPLE" LINE 1: Delete from test_tablesample TABLESAMPLE system(30); Isn't TABLESAMPLE clause suppose to work with Update/Delete statements? 3. +static RangeTblEntry * +transformTableSampleEntry(ParseState *pstate, RangeTableSample *r) .. + parser_errposition(pstate, + exprLocation((Node *) r)))); Better to align exprLocation() with pstate 4. SampleTupleVisible() { .. else { /* No pagemode, we have to check the tuple itself. */ Snapshot snapshot = scan->rs_snapshot; Buffer buffer = scan->rs_cbuf; bool visible = HeapTupleSatisfiesVisibility(tuple, snapshot, buffer); .. } I think it is better to check if PageIsAllVisible() in above code before visibility check as that can avoid visibility checks. 5. ParseTableSample(ParseState *pstate, char *samplemethod, Node *repeatable, List *sampleargs) { .. if (con->val.type == T_Null) ereport(ERROR, (errcode(ERRCODE_INVALID_PARAMETER_VALUE), errmsg("REPEATABLE clause must be NOT NULL numeric value"), parser_errposition (pstate, con->location))); .. } InitSamplingMethod(SampleScanState *scanstate, TableSampleClause *tablesample) { .. if (fcinfo.argnull[1]) ereport(ERROR, (errcode (ERRCODE_NULL_VALUE_NOT_ALLOWED), errmsg("REPEATABLE clause cannot be NULL"))); .. } I think it would be better if we can have same error message and probably the same error code in both of the above cases. 6. extern TableSampleClause * ParseTableSample(ParseState *pstate, char *samplemethod, Node *repeatable, List *sampleargs) It is better to expose function (usage of extern) via header file. Is there a need to mention extern here? 7. ParseTableSample() { .. arg = coerce_to_specific_type(pstate, arg, INT4OID, "REPEATABLE"); .. } What is the reason for coercing value of REPEATABLE clause to INT4OID when numeric value is expected for the clause. If user gives the input value as -2.3, it will generate a seed which doesn't seem to be right. 8. +DATA(insert OID = 3295 ( tsm_system_init PGNSP PGUID 12 1 0 0 0 f f f f t f v 3 0 2278 "2281 23 700" _null_ _null_ _null_ _null_ tsm_system_init _null_ _null_ _null_ )); +DATA(insert OID = 3301 ( tsm_bernoulli_init PGNSP PGUID 12 1 0 0 0 f f f f t f v 3 0 2278 "2281 23 700" _null_ _null_ _null_ _null_ tsm_bernoulli_init _null_ _null_ _null_ )); Datatype for second argument is kept as 700 (Float4), shouldn't it be 1700 (Numeric)? 9. postgres=# explain SELECT t1.id FROM test_tablesample as t1 TABLESAMPLE SYSTEM ( 10), test_tablesample as t2 TABLESAMPLE BERNOULLI (20); QUERY PLAN ---------------------------------------------------------------------------- Nested Loop (cost=0.00..4.05 rows=100 width=4) -> Sample Scan on test_tablesample t1 (cost=0.00..0.01 rows=1 width=4) -> Sample Scan on test_tablesample t2 (cost=0.00..4.02 rows=2 width=0) (3 rows) Isn't it better to display sample method name in explain. I think it can help in case of join queries. We can use below format to display: Sample Scan (System) on test_tablesample ... 10. Bernoulli.c +/* State */ +typedef struct +{ + uint32 seed; /* random seed */ + BlockNumber startblock; /* starting block, we use ths for syncscan support */ typo. /ths/this With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com