Hi,
Thank you for looking at it .
On Sun, Oct 28, 2018 at 7:19 PM Tomas Vondra <[email protected]>
wrote:
>
> 1) I think this deserves at least some regression tests. Plenty of tests
> already use COPY, but there's no coverage for the new piece. So let's
> add a new test suite, or maybe add a couple of tests into copy2.sql.
>
>
> 2) In copy.sqml, the new item is defined like this
>
> <term><literal>WHEN Clause</literal></term>
>
> I suggest we use just <term><literal>WHEN</literal></term>, that's what
> the other items do (see ENCODING).
>
> The other thing is that this does not say what expressions are allowed
> in the WHEN clause. It seems pretty close to WHEN clause for triggers,
> which e.g. mentions that subselects are not allowed. I'm pretty sure
> that's true here too, because it fails like this (118 = T_SubLink):
>
> test=# copy t(a,b,c) from '/tmp/t.data' when ((select 1) < 10);
> ERROR: unrecognized node type: 118
>
> So, the patch needs to detect this, produce a reasonable error message
> and document the limitations in copy.sqml, just like we do for CREATE
> TRIGGER.
>
fixed
>
> 3) For COPY TO, the WHEN clause is accepted but ignored, leading to
> confusing cases like this:
>
> test=# copy t(a,b,c) to '/tmp/t.data' when ((select 100) < 10);
> COPY 151690
>
> So, it contains subselect, but unlike COPY FROM it does not fail
> (because we never execute it). The fun part is that the expression is
> logically false, so a user might expect it to filter rows, yet we copy
> everything.
>
> IMHO we need to either error-out in these cases, complaining about WHEN
> not being supported for COPY TO, or make it work (effectively treating
> it as a simpler alternative to COPY (subselect) TO).
>
English is not my first language but I chose error-out because WHEN
condition for COPY TO seems to me semantically incorrect
>
> AFAICS we could just get rid of the extra when_cluase variable and mess
> with the cstate->whenClause directly, depending on how (3) gets fixed.
>
I did it this way because CopyState structure memory allocate and
initialize in BeginCopyFrom but the analysis done before it
>
> 5) As I mentioned, the CREATE TRIGGER already has WHEN clause, but it
> requires it to be 'WHEN (expr)'. I suggest we do the same thing here,
> requiring the parentheses.
>
>
> 6) The skip logic in CopyFrom() seems to be slightly wrong. It does
> work, but the next_record label is defined after CHECK_FOR_INTERRUPTS()
> so a COPY will not respond to Ctrl-C unless it finds a row matching the
> WHEN condition. If you have a highly selective condition, that's a bit
> inconvenient.
>
> It also skips
>
> MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
>
> so I wonder what the heap_form_tuple() right after the next_record label
> will use for tuples right after a skipped one. I'd bet it'll use the
> oldcontext (essentially the long-lived context), essentially making it
> a memory leak.
>
> So I suggest to get rid of the next_record label, and use 'continue'
> instead of the 'goto next_record'.
>
> fixed
> regards
>
> --
> Tomas Vondra http://www.2ndQuadrant.com
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>
diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index 13a8b68d95..f9350afcdb 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -25,6 +25,7 @@ PostgreSQL documentation
COPY <replaceable class="parameter">table_name</replaceable> [ ( <replaceable class="parameter">column_name</replaceable> [, ...] ) ]
FROM { '<replaceable class="parameter">filename</replaceable>' | PROGRAM '<replaceable class="parameter">command</replaceable>' | STDIN }
[ [ WITH ] ( <replaceable class="parameter">option</replaceable> [, ...] ) ]
+ [ WHEN ( <replaceable class="parameter">condition</replaceable> ) ]
COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable class="parameter">column_name</replaceable> [, ...] ) ] | ( <replaceable class="parameter">query</replaceable> ) }
TO { '<replaceable class="parameter">filename</replaceable>' | PROGRAM '<replaceable class="parameter">command</replaceable>' | STDOUT }
@@ -364,6 +365,30 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
</listitem>
</varlistentry>
+ <varlistentry>
+ <term><literal>WHEN</literal></term>
+ <listitem>
+ <para>
+ The optional <literal>WHEN</literal> clause has the general form
+<synopsis>
+WHEN <replaceable class="parameter">condition</replaceable>
+</synopsis>
+ where <replaceable class="parameter">condition</replaceable> is
+ any expression that evaluates to a result of type
+ <type>boolean</type>. Any row that does not satisfy this
+ condition will not be inserted to the table. A row satisfies the
+ condition if it returns true when the actual row values are
+ substituted for any variable references.
+ </para>
+
+ <para>
+ Currently, <literal>WHEN</literal> expressions cannot contain
+ subqueries.
+ </para>
+
+ </listitem>
+ </varlistentry>
+
</variablelist>
</refsect1>
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 9bc67ce60f..e70c315561 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -38,7 +38,10 @@
#include "miscadmin.h"
#include "optimizer/clauses.h"
#include "optimizer/planner.h"
+#include "optimizer/prep.h"
#include "nodes/makefuncs.h"
+#include "parser/parse_coerce.h"
+#include "parser/parse_expr.h"
#include "parser/parse_relation.h"
#include "port/pg_bswap.h"
#include "rewrite/rewriteHandler.h"
@@ -147,6 +150,7 @@ typedef struct CopyStateData
bool convert_selectively; /* do selective binary conversion? */
List *convert_select; /* list of column names (can be NIL) */
bool *convert_select_flags; /* per-column CSV/TEXT CS flags */
+ Node *whenClause; /* WHEN condition (or NULL) */
/* these are just for error messages, see CopyFromErrorCallback */
const char *cur_relname; /* table name for error messages */
@@ -178,6 +182,7 @@ typedef struct CopyStateData
ExprState **defexprs; /* array of default att expressions */
bool volatile_defexprs; /* is any of defexprs volatile? */
List *range_table;
+ ExprState *qualexpr;
TransitionCaptureState *transition_capture;
@@ -797,6 +802,7 @@ DoCopy(ParseState *pstate, const CopyStmt *stmt,
Relation rel;
Oid relid;
RawStmt *query = NULL;
+ Node *whenClause= NULL;
/*
* Disallow COPY to/from file or program except to users with the
@@ -849,6 +855,21 @@ DoCopy(ParseState *pstate, const CopyStmt *stmt,
rte = addRangeTableEntryForRelation(pstate, rel, NULL, false, false);
rte->requiredPerms = (is_from ? ACL_INSERT : ACL_SELECT);
+ if (stmt->whenClause)
+ {
+ /* add rte to column namespace */
+ addRTEtoQuery(pstate, rte, false, false, true);
+
+ /* Transform the raw expression tree */
+ whenClause = transformExpr(pstate, stmt->whenClause, EXPR_KIND_COPY_FROM_WHEN);
+
+ /* Make sure it yields a boolean result. */
+ whenClause = coerce_to_boolean(pstate, whenClause, "WHEN");
+
+ whenClause = (Node *) canonicalize_qual((Expr *) whenClause, false);
+ whenClause = (Node *) make_ands_implicit((Expr *) whenClause);
+ }
+
tupDesc = RelationGetDescr(rel);
attnums = CopyGetAttnums(tupDesc, rel, stmt->attlist);
foreach(cur, attnums)
@@ -997,6 +1018,7 @@ DoCopy(ParseState *pstate, const CopyStmt *stmt,
cstate = BeginCopyFrom(pstate, rel, stmt->filename, stmt->is_program,
NULL, stmt->attlist, stmt->options);
+ cstate->whenClause=whenClause;
*processed = CopyFrom(cstate); /* copy from file to database */
EndCopyFrom(cstate);
}
@@ -2534,6 +2556,10 @@ CopyFrom(CopyState cstate)
ExecSetupChildParentMapForLeaf(proute);
}
+ if (cstate->whenClause)
+ cstate->qualexpr = ExecInitQual(castNode(List, cstate->whenClause),
+ &mtstate->ps);
+
/*
* It's more efficient to prepare a bunch of tuples for insertion, and
* insert them in one heap_multi_insert() call, than call heap_insert()
@@ -2686,6 +2712,13 @@ CopyFrom(CopyState cstate)
slot = myslot;
ExecStoreTuple(tuple, slot, InvalidBuffer, false);
+ if (cstate->whenClause)
+ {
+ econtext->ecxt_scantuple = myslot;
+ if (!ExecQual(cstate->qualexpr, econtext))
+ continue;
+ }
+
/* Determine the partition to heap_insert the tuple into */
if (proute)
{
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 7c8220cf65..b3fe529619 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -3312,6 +3312,7 @@ _copyCopyStmt(const CopyStmt *from)
COPY_SCALAR_FIELD(is_program);
COPY_STRING_FIELD(filename);
COPY_NODE_FIELD(options);
+ COPY_NODE_FIELD(whenClause);
return newnode;
}
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 378f2facb8..471b628157 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -1221,6 +1221,7 @@ _equalCopyStmt(const CopyStmt *a, const CopyStmt *b)
COMPARE_SCALAR_FIELD(is_program);
COMPARE_STRING_FIELD(filename);
COMPARE_NODE_FIELD(options);
+ COMPARE_NODE_FIELD(whenClause);
return true;
}
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 87f5e95827..65ffb8beb5 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -509,6 +509,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
%type <defelt> copy_generic_opt_elem
%type <list> copy_generic_opt_list copy_generic_opt_arg_list
%type <list> copy_options
+%type <node> opt_when_clause
%type <typnam> Typename SimpleTypename ConstTypename
GenericType Numeric opt_float
@@ -2970,7 +2971,7 @@ ClosePortalStmt:
*****************************************************************************/
CopyStmt: COPY opt_binary qualified_name opt_column_list opt_oids
- copy_from opt_program copy_file_name copy_delimiter opt_with copy_options
+ copy_from opt_program copy_file_name copy_delimiter opt_with copy_options opt_when_clause
{
CopyStmt *n = makeNode(CopyStmt);
n->relation = $3;
@@ -2979,6 +2980,7 @@ CopyStmt: COPY opt_binary qualified_name opt_column_list opt_oids
n->is_from = $6;
n->is_program = $7;
n->filename = $8;
+ n->whenClause = $12;
if (n->is_program && n->filename == NULL)
ereport(ERROR,
@@ -2986,6 +2988,12 @@ CopyStmt: COPY opt_binary qualified_name opt_column_list opt_oids
errmsg("STDIN/STDOUT not allowed with PROGRAM"),
parser_errposition(@8)));
+ if (!n->is_from && n->whenClause != NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("WHEN clause not allowed with COPY TO"),
+ parser_errposition(@12)));
+
n->options = NIL;
/* Concatenate user-supplied flags */
if ($2)
@@ -3178,6 +3186,11 @@ copy_generic_opt_arg_list:
}
;
+opt_when_clause:
+ WHEN '(' a_expr ')' { $$ = $3; }
+ | /*EMPTY*/ { $$ = NULL; }
+ ;
+
/* beware of emitting non-string list elements here; see commands/define.c */
copy_generic_opt_arg_list_item:
opt_boolean_or_string { $$ = (Node *) makeString($1); }
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 61727e1d71..ec6e3fa836 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -523,6 +523,14 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
break;
+ case EXPR_KIND_COPY_FROM_WHEN:
+ if (isAgg)
+ err = _("aggregate functions are not allowed in copy from WHEN conditions");
+ else
+ err = _("grouping operations are not allowed in copy from WHEN conditions");
+
+ break;
+
/*
* There is intentionally no default: case here, so that the
* compiler will warn if we add a new ParseExprKind without
@@ -902,6 +910,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
case EXPR_KIND_CALL_ARGUMENT:
err = _("window functions are not allowed in CALL arguments");
break;
+ case EXPR_KIND_COPY_FROM_WHEN:
+ err = _("window functions are not allowed in copy from WHEN conditions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 385e54a9b6..466e4c7c31 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -1849,6 +1849,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
case EXPR_KIND_CALL_ARGUMENT:
err = _("cannot use subquery in CALL argument");
break;
+ case EXPR_KIND_COPY_FROM_WHEN:
+ err = _("cannot use subquery in copy from WHEN condition");
+ break;
/*
* There is intentionally no default: case here, so that the
@@ -3475,6 +3478,8 @@ ParseExprKindName(ParseExprKind exprKind)
return "PARTITION BY";
case EXPR_KIND_CALL_ARGUMENT:
return "CALL";
+ case EXPR_KIND_COPY_FROM_WHEN:
+ return "WHEN";
/*
* There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 44257154b8..7d65b2733f 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2370,6 +2370,9 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
case EXPR_KIND_CALL_ARGUMENT:
err = _("set-returning functions are not allowed in CALL arguments");
break;
+ case EXPR_KIND_COPY_FROM_WHEN:
+ err = _("set-returning functions are not allowed in copy from WHEN conditions");
+ break;
/*
* There is intentionally no default: case here, so that the
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 07ab1a3dde..8dc700c793 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -1959,6 +1959,7 @@ typedef struct CopyStmt
bool is_program; /* is 'filename' a program to popen? */
char *filename; /* filename, or NULL for STDIN/STDOUT */
List *options; /* List of DefElem nodes */
+ Node *whenClause; /* WHEN condition (or NULL) */
} CopyStmt;
/* ----------------------
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index 0230543810..e29ee0788a 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -69,7 +69,8 @@ typedef enum ParseExprKind
EXPR_KIND_TRIGGER_WHEN, /* WHEN condition in CREATE TRIGGER */
EXPR_KIND_POLICY, /* USING or WITH CHECK expr in policy */
EXPR_KIND_PARTITION_EXPRESSION, /* PARTITION BY expression */
- EXPR_KIND_CALL_ARGUMENT /* procedure argument in CALL */
+ EXPR_KIND_CALL_ARGUMENT, /* procedure argument in CALL */
+ EXPR_KIND_COPY_FROM_WHEN /* WHEN condition in COPY FROM */
} ParseExprKind;
diff --git a/src/interfaces/ecpg/preproc/ecpg.addons b/src/interfaces/ecpg/preproc/ecpg.addons
index ca3efadc48..707d47b385 100644
--- a/src/interfaces/ecpg/preproc/ecpg.addons
+++ b/src/interfaces/ecpg/preproc/ecpg.addons
@@ -192,7 +192,7 @@ ECPG: where_or_current_clauseWHERECURRENT_POFcursor_name block
char *cursor_marker = $4[0] == ':' ? mm_strdup("$0") : $4;
$$ = cat_str(2,mm_strdup("where current of"), cursor_marker);
}
-ECPG: CopyStmtCOPYopt_binaryqualified_nameopt_column_listopt_oidscopy_fromopt_programcopy_file_namecopy_delimiteropt_withcopy_options addon
+ECPG: CopyStmtCOPYopt_binaryqualified_nameopt_column_listopt_oidscopy_fromopt_programcopy_file_namecopy_delimiteropt_withcopy_optionsopt_when_clause addon
if (strcmp($6, "from") == 0 &&
(strcmp($7, "stdin") == 0 || strcmp($7, "stdout") == 0))
mmerror(PARSE_ERROR, ET_WARNING, "COPY FROM STDIN is not implemented");
diff --git a/src/test/regress/expected/copy2.out b/src/test/regress/expected/copy2.out
index eb9e4b9774..ddb40a9c0d 100644
--- a/src/test/regress/expected/copy2.out
+++ b/src/test/regress/expected/copy2.out
@@ -49,6 +49,8 @@ CONTEXT: COPY x, line 1: "2002 232 40 50 60 70 80"
COPY x (b, c, d, e) from stdin with oids delimiter ',' null 'x';
COPY x from stdin WITH DELIMITER AS ';' NULL AS '';
COPY x from stdin WITH DELIMITER AS ':' NULL AS E'\\X' ENCODING 'sql_ascii';
+COPY x from stdin WHEN (a = 50004);
+COPY x from stdin WHEN (a > 60003);
-- check results of copy in
SELECT * FROM x;
a | b | c | d | e
@@ -73,12 +75,15 @@ SELECT * FROM x;
4006 | 6 | BackslashN | \N | before trigger fired
4007 | 7 | XX | XX | before trigger fired
4008 | 8 | Delimiter | : | before trigger fired
+ 50004 | 25 | 35 | 45 | before trigger fired
+ 60004 | 25 | 35 | 45 | before trigger fired
+ 60005 | 26 | 36 | 46 | before trigger fired
1 | 1 | stuff | test_1 | after trigger fired
2 | 2 | stuff | test_2 | after trigger fired
3 | 3 | stuff | test_3 | after trigger fired
4 | 4 | stuff | test_4 | after trigger fired
5 | 5 | stuff | test_5 | after trigger fired
-(25 rows)
+(28 rows)
-- COPY w/ oids on a table w/o oids should fail
CREATE TABLE no_oids (
@@ -114,6 +119,9 @@ COPY x TO stdout;
4006 6 BackslashN \\N before trigger fired
4007 7 XX XX before trigger fired
4008 8 Delimiter : before trigger fired
+50004 25 35 45 before trigger fired
+60004 25 35 45 before trigger fired
+60005 26 36 46 before trigger fired
1 1 stuff test_1 after trigger fired
2 2 stuff test_2 after trigger fired
3 3 stuff test_3 after trigger fired
@@ -140,6 +148,9 @@ N before trigger fired
BackslashN before trigger fired
XX before trigger fired
Delimiter before trigger fired
+35 before trigger fired
+35 before trigger fired
+36 before trigger fired
stuff after trigger fired
stuff after trigger fired
stuff after trigger fired
@@ -166,6 +177,9 @@ I'm null before trigger fired
6 before trigger fired
7 before trigger fired
8 before trigger fired
+25 before trigger fired
+25 before trigger fired
+26 before trigger fired
1 after trigger fired
2 after trigger fired
3 after trigger fired
diff --git a/src/test/regress/sql/copy2.sql b/src/test/regress/sql/copy2.sql
index f3a6d228fa..94d592b887 100644
--- a/src/test/regress/sql/copy2.sql
+++ b/src/test/regress/sql/copy2.sql
@@ -95,6 +95,20 @@ COPY x from stdin WITH DELIMITER AS ':' NULL AS E'\\X' ENCODING 'sql_ascii';
4008:8:Delimiter:\::\:
\.
+COPY x from stdin WHEN (a = 50004);
+50003 24 34 44 54
+50004 25 35 45 55
+50005 26 36 46 56
+\.
+
+COPY x from stdin WHEN (a > 60003);
+60001 22 32 42 52
+60002 23 33 43 53
+60003 24 34 44 54
+60004 25 35 45 55
+60005 26 36 46 56
+\.
+
-- check results of copy in
SELECT * FROM x;