On Tue, Oct 14, 2025 at 4:08 AM Masahiko Sawada <[email protected]> wrote: > > > > +-- Test COPY TO with a foreign table or when the foreign table is a > > > partition > > > +COPY async_p3 TO stdout; --error > > > +ERROR: cannot copy from foreign table "async_p3" > > > +HINT: Try the COPY (SELECT ...) TO variant. > > > > > > async_p3 is a foreign table so it seems not related to this patch. > > > > > I replied in > > https://postgr.es/m/CACJufxGkkMtRUJEbLczRnWp7x2YWqu4r1gEJEv9Po1UPxS6kGQ%40mail.gmail.com > > I kind of doubt anyone would submit a patch just to rewrite a coverage test > > for > > the sake of coverage itself. While we're here, adding nearby coverage tests > > should be fine? > > For me, it's perfectly fine to have patches just for improving the > test coverage and I think we have had such patches ever. Given this > patch expands the supported relation kind, I guess it makes sense to > cover other cases as well in this patch (i.e., foreign tables and > sequences) or to have a separate patch to increase the overall test > coverage of copyto.c. >
Let's have a seperate patch to handle COPY test coverage. > > > > i just found out I ignored the case when partitioned tables have RLS. > > when exporting a partitioned table, > > find_all_inheritors will sort the returned partition by oid. > > in DoCopy, we can do the same: > > make a SortBy node for SelectStmt->sortClause also mark the > > RangeVar->inh as true. > > OR > > ereport(ERRCODE_FEATURE_NOT_SUPPORTED...) for partitioned tables with RLS. > > > > please see the change I made in DoCopy. > > Good catch. However, I guess adding a SortBy node with "tableoid" > doesn't necessarily work in the same way as the 'COPY TO' using > find_all_inheritors(): > > + /* > + * To COPY data from multiple partitions, we rely on the order of > + * the partitions' tableoids, which matches the order produced by > + * find_all_inheritors. > + */ > > The table list returned by find_all_inheritors() is deterministic, but > it doesn't sort the whole list by their OIDs. If I understand > correctly, it retrieves all descendant tables in a BFS order. For > example, if I create the tables in the following sequence: > > create table p (i int) partition by list (i); > create table c12 partition of p for values in (1, 2) partition by list (i); > create table c12_1 partition of c12 for values in (1); > create table c12_2 partition of c12 for values in (2); > create table c3 partition of p for values in (3); > insert into p values (1), (2), (3); > alter table p enable row level security; > create policy policy_p on p using (i > 0); > create user test_user; > grant select on table p to test_user; > > I got the result without RLS: > > copy p to stdout; > 3 > 1 > 2 > > whereas I got the results with RLS: > > copy p to stdout; > 1 > 2 > 3 > > I think that adding SortBy doesn't help more than making the results > deterministic. Or we can re-sort the OID list returned by > find_all_inheritors() to match it. However, I'm not sure that we need > to make COPY TO results deterministic in the first place. It's not > guaranteed that the order of tuples returned from 'COPY TO rel' where > rel is not a partitioned table is sorted nor even deterministic (e.g., > due to sync scans). If 'rel' is a partitioned table without RLS, the > order of tables to scan is deterministic but returned tuples within a > single partition is not. Given that sorting the whole results is not > cost free, I'm not sure that guaranteeing this ordering also for > partitioned tables with RLS would be useful for users. > I removed the "SortBy node", and also double checked the patch again. Please check the attached v18.
From fc87e123872b60ccaa37c02b80bae6765d27f4a8 Mon Sep 17 00:00:00 2001 From: jian he <[email protected]> Date: Wed, 15 Oct 2025 09:44:00 +0800 Subject: [PATCH v18 1/1] support COPY partitioned_table TO MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit This is for implementation of ``COPY (partitioned_table) TO``. it will be faster than ``COPY (select * from partitioned_table) TO``. If the destination table is a partitioned table, COPY table TO copies the same rows as SELECT * FROM table. reivewed by: vignesh C <[email protected]> reivewed by: David Rowley <[email protected]> reivewed by: Melih Mutlu <[email protected]> reivewed by: Kirill Reshke <[email protected]> reivewed by: Atsushi Torikoshi <[email protected]> reivewed by: Álvaro Herrera <[email protected]> reivewed by: Masahiko Sawada <[email protected]> reivewed by: Chao Li <[email protected]> discussion: https://postgr.es/m/CACJufxEZt+G19Ors3bQUq-42-61__C=y5k2wk=sHEFRusu7=i...@mail.gmail.com commitfest entry: https://commitfest.postgresql.org/patch/5467 --- .../postgres_fdw/expected/postgres_fdw.out | 5 + contrib/postgres_fdw/sql/postgres_fdw.sql | 3 + doc/src/sgml/ref/copy.sgml | 9 +- src/backend/commands/copy.c | 6 + src/backend/commands/copyto.c | 153 ++++++++++++++---- src/test/regress/expected/copy.out | 18 +++ src/test/regress/expected/rowsecurity.out | 21 +++ src/test/regress/sql/copy.sql | 15 ++ src/test/regress/sql/rowsecurity.sql | 3 + 9 files changed, 200 insertions(+), 33 deletions(-) diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out index 91bbd0d8c73..cd28126049d 100644 --- a/contrib/postgres_fdw/expected/postgres_fdw.out +++ b/contrib/postgres_fdw/expected/postgres_fdw.out @@ -11599,6 +11599,11 @@ SELECT * FROM result_tbl ORDER BY a; (3 rows) DELETE FROM result_tbl; +-- Test COPY TO when foreign table is partition +COPY async_pt TO stdout; --error +ERROR: cannot copy from foreign table "async_p1" +DETAIL: Partition "async_p1" is a foreign table in partitioned table "async_pt" +HINT: Try the COPY (SELECT ...) TO variant. DROP FOREIGN TABLE async_p3; DROP TABLE base_tbl3; -- Check case where the partitioned table has local/remote partitions diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql index 3b7da128519..9a8f9e28135 100644 --- a/contrib/postgres_fdw/sql/postgres_fdw.sql +++ b/contrib/postgres_fdw/sql/postgres_fdw.sql @@ -3941,6 +3941,9 @@ INSERT INTO result_tbl SELECT * FROM async_pt WHERE b === 505; SELECT * FROM result_tbl ORDER BY a; DELETE FROM result_tbl; +-- Test COPY TO when foreign table is partition +COPY async_pt TO stdout; --error + DROP FOREIGN TABLE async_p3; DROP TABLE base_tbl3; diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml index c2d1fbc1fbe..ecd300097fc 100644 --- a/doc/src/sgml/ref/copy.sgml +++ b/doc/src/sgml/ref/copy.sgml @@ -539,13 +539,16 @@ COPY <replaceable class="parameter">count</replaceable> <para> <command>COPY TO</command> can be used with plain - tables and populated materialized views. - For example, + tables, populated materialized views and partitioned tables. + For example, if <replaceable class="parameter">table</replaceable> is a plain table, <literal>COPY <replaceable class="parameter">table</replaceable> TO</literal> copies the same rows as <literal>SELECT * FROM ONLY <replaceable class="parameter">table</replaceable></literal>. + If <replaceable class="parameter">table</replaceable> is a partitioned table or a materialized view + <literal>COPY <replaceable class="parameter">table</replaceable> TO</literal> + copies the same rows as <literal>SELECT * FROM <replaceable class="parameter">table</replaceable></literal>. However it doesn't directly support other relation types, - such as partitioned tables, inheritance child tables, or views. + such as inheritance child tables, or views. To copy all rows from such relations, use <literal>COPY (SELECT * FROM <replaceable class="parameter">table</replaceable>) TO</literal>. </para> diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c index fae9c41db65..9e12adb81a1 100644 --- a/src/backend/commands/copy.c +++ b/src/backend/commands/copy.c @@ -251,11 +251,17 @@ DoCopy(ParseState *pstate, const CopyStmt *stmt, * relation which we have opened and locked. Use "ONLY" so that * COPY retrieves rows from only the target table not any * inheritance children, the same as when RLS doesn't apply. + * + * However, when COPY data from a partitioned table, we should not + * use "ONLY", since we also need to retrieve rows from its child + * partitions too. */ from = makeRangeVar(get_namespace_name(RelationGetNamespace(rel)), pstrdup(RelationGetRelationName(rel)), -1); from->inh = false; /* apply ONLY */ + if (get_rel_relkind(relid) == RELKIND_PARTITIONED_TABLE) + from->inh = true; /* Build query */ select = makeNode(SelectStmt); diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c index e5781155cdf..74497240105 100644 --- a/src/backend/commands/copyto.c +++ b/src/backend/commands/copyto.c @@ -18,7 +18,9 @@ #include <unistd.h> #include <sys/stat.h> +#include "access/table.h" #include "access/tableam.h" +#include "catalog/pg_inherits.h" #include "commands/copyapi.h" #include "commands/progress.h" #include "executor/execdesc.h" @@ -82,6 +84,7 @@ typedef struct CopyToStateData List *attnumlist; /* integer list of attnums to copy */ char *filename; /* filename, or NULL for STDOUT */ bool is_program; /* is 'filename' a program to popen? */ + List *partitions; /* OID list of partitions to copy data from */ copy_data_dest_cb data_dest_cb; /* function for writing data */ CopyFormatOptions opts; @@ -116,6 +119,8 @@ static void CopyOneRowTo(CopyToState cstate, TupleTableSlot *slot); static void CopyAttributeOutText(CopyToState cstate, const char *string); static void CopyAttributeOutCSV(CopyToState cstate, const char *string, bool use_quote); +static void CopyRelationTo(CopyToState cstate, Relation rel, Relation root_rel, + uint64 *processed); /* built-in format-specific routines */ static void CopyToTextLikeStart(CopyToState cstate, TupleDesc tupDesc); @@ -602,6 +607,10 @@ EndCopy(CopyToState cstate) pgstat_progress_end_command(); MemoryContextDelete(cstate->copycontext); + + if (cstate->partitions) + list_free(cstate->partitions); + pfree(cstate); } @@ -643,6 +652,7 @@ BeginCopyTo(ParseState *pstate, PROGRESS_COPY_COMMAND_TO, 0 }; + List *children = NIL; if (rel != NULL && rel->rd_rel->relkind != RELKIND_RELATION) { @@ -673,11 +683,34 @@ BeginCopyTo(ParseState *pstate, errmsg("cannot copy from sequence \"%s\"", RelationGetRelationName(rel)))); else if (rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE) - ereport(ERROR, - (errcode(ERRCODE_WRONG_OBJECT_TYPE), - errmsg("cannot copy from partitioned table \"%s\"", - RelationGetRelationName(rel)), - errhint("Try the COPY (SELECT ...) TO variant."))); + { + /* + * Collect OIDs of relation containing data, so that later + * DoCopyTo can copy the data from them. + */ + children = find_all_inheritors(RelationGetRelid(rel), AccessShareLock, NULL); + + foreach_oid(child, children) + { + char relkind = get_rel_relkind(child); + + if (relkind == RELKIND_FOREIGN_TABLE) + { + char *relation_name = get_rel_name(child); + + ereport(ERROR, + errcode(ERRCODE_WRONG_OBJECT_TYPE), + errmsg("cannot copy from foreign table \"%s\"", relation_name), + errdetail("Partition \"%s\" is a foreign table in partitioned table \"%s\"", + relation_name, RelationGetRelationName(rel)), + errhint("Try the COPY (SELECT ...) TO variant.")); + } + + /* Exclude tables with no data */ + if (RELKIND_HAS_PARTITIONS(relkind)) + children = foreach_delete_current(children, child); + } + } else ereport(ERROR, (errcode(ERRCODE_WRONG_OBJECT_TYPE), @@ -713,6 +746,7 @@ BeginCopyTo(ParseState *pstate, cstate->rel = rel; tupDesc = RelationGetDescr(cstate->rel); + cstate->partitions = children; } else { @@ -722,6 +756,7 @@ BeginCopyTo(ParseState *pstate, DestReceiver *dest; cstate->rel = NULL; + cstate->partitions = NIL; /* * Run parse analysis and rewrite. Note this also acquires sufficient @@ -1030,7 +1065,7 @@ DoCopyTo(CopyToState cstate) TupleDesc tupDesc; int num_phys_attrs; ListCell *cur; - uint64 processed; + uint64 processed = 0; if (fe_copy) SendCopyBegin(cstate); @@ -1070,33 +1105,24 @@ DoCopyTo(CopyToState cstate) if (cstate->rel) { - TupleTableSlot *slot; - TableScanDesc scandesc; - - scandesc = table_beginscan(cstate->rel, GetActiveSnapshot(), 0, NULL); - slot = table_slot_create(cstate->rel, NULL); - - processed = 0; - while (table_scan_getnextslot(scandesc, ForwardScanDirection, slot)) + /* + * If COPY TO source table is a partitioned table, then open each + * partition and process each individual partition. + */ + if (cstate->rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE) { - CHECK_FOR_INTERRUPTS(); + foreach_oid(child, cstate->partitions) + { + Relation scan_rel; - /* Deconstruct the tuple ... */ - slot_getallattrs(slot); - - /* Format and send the data */ - CopyOneRowTo(cstate, slot); - - /* - * Increment the number of processed tuples, and report the - * progress. - */ - pgstat_progress_update_param(PROGRESS_COPY_TUPLES_PROCESSED, - ++processed); + /* We already got the lock in BeginCopyTo */ + scan_rel = table_open(child, NoLock); + CopyRelationTo(cstate, scan_rel, cstate->rel, &processed); + table_close(scan_rel, NoLock); + } } - - ExecDropSingleTupleTableSlot(slot); - table_endscan(scandesc); + else + CopyRelationTo(cstate, cstate->rel, NULL, &processed); } else { @@ -1115,6 +1141,73 @@ DoCopyTo(CopyToState cstate) return processed; } +/* + * Scans a single table and exports its rows to the COPY destination. + * + * root_rel can be set to the root table of rel if rel is a partition + * table so that we can send tuples in root_rel's rowtype, which might + * differ from individual partitions. +*/ +static void +CopyRelationTo(CopyToState cstate, Relation rel, Relation root_rel, uint64 *processed) +{ + TupleTableSlot *slot; + TableScanDesc scandesc; + AttrMap *map = NULL; + TupleTableSlot *root_slot = NULL; + + scandesc = table_beginscan(rel, GetActiveSnapshot(), 0, NULL); + slot = table_slot_create(rel, NULL); + + /* + * A partition's rowtype might differ from the root table's. If we are + * exporting partition data here, we must convert it back to the root + * table's rowtype. + */ + if (root_rel != NULL) + { + root_slot = table_slot_create(root_rel, NULL); + map = build_attrmap_by_name_if_req(RelationGetDescr(root_rel), + RelationGetDescr(rel), + false); + } + + while (table_scan_getnextslot(scandesc, ForwardScanDirection, slot)) + { + TupleTableSlot *copyslot; + + CHECK_FOR_INTERRUPTS(); + + if (map != NULL) + copyslot = execute_attr_map_slot(map, slot, root_slot); + else + { + /* Deconstruct the tuple ... */ + slot_getallattrs(slot); + copyslot = slot; + } + + /* Format and send the data */ + CopyOneRowTo(cstate, copyslot); + + /* + * Increment the number of processed tuples, and report the progress. + */ + pgstat_progress_update_param(PROGRESS_COPY_TUPLES_PROCESSED, + ++(*processed)); + } + + ExecDropSingleTupleTableSlot(slot); + + if (root_slot != NULL) + ExecDropSingleTupleTableSlot(root_slot); + + if (map != NULL) + free_attrmap(map); + + table_endscan(scandesc); +} + /* * Emit one row during DoCopyTo(). */ diff --git a/src/test/regress/expected/copy.out b/src/test/regress/expected/copy.out index ac66eb55aee..af01e84cea1 100644 --- a/src/test/regress/expected/copy.out +++ b/src/test/regress/expected/copy.out @@ -373,3 +373,21 @@ COPY copytest_mv(id) TO stdout WITH (header); id 1 DROP MATERIALIZED VIEW copytest_mv; +-- Tests for COPY TO with partitioned tables. +CREATE TABLE pp (id int,val int) PARTITION BY RANGE (id); +CREATE TABLE pp_1 (val int, id int) PARTITION BY RANGE (id); +CREATE TABLE pp_2 (id int, val int) PARTITION BY RANGE (id); +ALTER TABLE pp ATTACH PARTITION pp_1 FOR VALUES FROM (1) TO (5); +ALTER TABLE pp ATTACH PARTITION pp_2 FOR VALUES FROM (5) TO (10); +CREATE TABLE pp_15 PARTITION OF pp_1 FOR VALUES FROM (1) TO (5); +CREATE TABLE pp_510 PARTITION OF pp_2 FOR VALUES FROM (5) TO (10); +INSERT INTO pp SELECT g, 10 + g FROM generate_series(1,6) g; +COPY pp TO stdout(header); +id val +1 11 +2 12 +3 13 +4 14 +5 15 +6 16 +DROP TABLE PP; diff --git a/src/test/regress/expected/rowsecurity.out b/src/test/regress/expected/rowsecurity.out index 5a172c5d91c..42b78a24603 100644 --- a/src/test/regress/expected/rowsecurity.out +++ b/src/test/regress/expected/rowsecurity.out @@ -986,6 +986,11 @@ NOTICE: f_leak => my first satire 9 | 11 | 1 | regress_rls_dave | awesome science fiction (4 rows) +COPY part_document TO stdout WITH (DELIMITER ','); +1,11,1,regress_rls_bob,my first novel +6,11,1,regress_rls_carol,great science fiction +9,11,1,regress_rls_dave,awesome science fiction +4,55,1,regress_rls_bob,my first satire EXPLAIN (COSTS OFF) SELECT * FROM part_document WHERE f_leak(dtitle); QUERY PLAN ------------------------------------------------------------------------- @@ -1028,6 +1033,17 @@ NOTICE: f_leak => awesome technology book 10 | 99 | 2 | regress_rls_dave | awesome technology book (10 rows) +COPY part_document TO stdout WITH (DELIMITER ','); +1,11,1,regress_rls_bob,my first novel +2,11,2,regress_rls_bob,my second novel +6,11,1,regress_rls_carol,great science fiction +9,11,1,regress_rls_dave,awesome science fiction +4,55,1,regress_rls_bob,my first satire +8,55,2,regress_rls_carol,great satire +3,99,2,regress_rls_bob,my science textbook +5,99,2,regress_rls_bob,my history book +7,99,2,regress_rls_carol,great technology book +10,99,2,regress_rls_dave,awesome technology book EXPLAIN (COSTS OFF) SELECT * FROM part_document WHERE f_leak(dtitle); QUERY PLAN ------------------------------------------------------------------------- @@ -1058,6 +1074,11 @@ NOTICE: f_leak => awesome science fiction 9 | 11 | 1 | regress_rls_dave | awesome science fiction (4 rows) +COPY part_document TO stdout WITH (DELIMITER ','); +1,11,1,regress_rls_bob,my first novel +2,11,2,regress_rls_bob,my second novel +6,11,1,regress_rls_carol,great science fiction +9,11,1,regress_rls_dave,awesome science fiction EXPLAIN (COSTS OFF) SELECT * FROM part_document WHERE f_leak(dtitle); QUERY PLAN ---------------------------------------------------------------------------------- diff --git a/src/test/regress/sql/copy.sql b/src/test/regress/sql/copy.sql index a1316c73bac..56d506ad4c6 100644 --- a/src/test/regress/sql/copy.sql +++ b/src/test/regress/sql/copy.sql @@ -405,3 +405,18 @@ COPY copytest_mv(id) TO stdout WITH (header); REFRESH MATERIALIZED VIEW copytest_mv; COPY copytest_mv(id) TO stdout WITH (header); DROP MATERIALIZED VIEW copytest_mv; + +-- Tests for COPY TO with partitioned tables. +CREATE TABLE pp (id int,val int) PARTITION BY RANGE (id); +CREATE TABLE pp_1 (val int, id int) PARTITION BY RANGE (id); +CREATE TABLE pp_2 (id int, val int) PARTITION BY RANGE (id); +ALTER TABLE pp ATTACH PARTITION pp_1 FOR VALUES FROM (1) TO (5); +ALTER TABLE pp ATTACH PARTITION pp_2 FOR VALUES FROM (5) TO (10); + +CREATE TABLE pp_15 PARTITION OF pp_1 FOR VALUES FROM (1) TO (5); +CREATE TABLE pp_510 PARTITION OF pp_2 FOR VALUES FROM (5) TO (10); + +INSERT INTO pp SELECT g, 10 + g FROM generate_series(1,6) g; + +COPY pp TO stdout(header); +DROP TABLE PP; diff --git a/src/test/regress/sql/rowsecurity.sql b/src/test/regress/sql/rowsecurity.sql index 21ac0ca51ee..2d1be543391 100644 --- a/src/test/regress/sql/rowsecurity.sql +++ b/src/test/regress/sql/rowsecurity.sql @@ -362,16 +362,19 @@ SELECT * FROM pg_policies WHERE schemaname = 'regress_rls_schema' AND tablename SET SESSION AUTHORIZATION regress_rls_bob; SET row_security TO ON; SELECT * FROM part_document WHERE f_leak(dtitle) ORDER BY did; +COPY part_document TO stdout WITH (DELIMITER ','); EXPLAIN (COSTS OFF) SELECT * FROM part_document WHERE f_leak(dtitle); -- viewpoint from regress_rls_carol SET SESSION AUTHORIZATION regress_rls_carol; SELECT * FROM part_document WHERE f_leak(dtitle) ORDER BY did; +COPY part_document TO stdout WITH (DELIMITER ','); EXPLAIN (COSTS OFF) SELECT * FROM part_document WHERE f_leak(dtitle); -- viewpoint from regress_rls_dave SET SESSION AUTHORIZATION regress_rls_dave; SELECT * FROM part_document WHERE f_leak(dtitle) ORDER BY did; +COPY part_document TO stdout WITH (DELIMITER ','); EXPLAIN (COSTS OFF) SELECT * FROM part_document WHERE f_leak(dtitle); -- pp1 ERROR -- 2.34.1
