Re: [GENERAL] Re: [BUGS] [HACKERS] COPY TO returning empty result with parallel ALTER TABLE

2014-11-04 Thread Bernd Helmle



--On 4. November 2014 17:18:14 -0500 Tom Lane  wrote:


Yeah, and I think that it's entirely reasonable for rewriting ALTER TABLEs
to update the xmin of the rewritten tuples; after all, the output data
could be arbitrarily different from what the previous transactions put
into the table.  But that is not the question here.  If the COPY blocks
until the ALTER completes --- as it must --- why is its execution snapshot
not taken *after* the lock is acquired?


COPY waits in DoCopy() but its snapshot gets acquired in PortalRunUtility() 
earlier. SELECT has it's lock already during transform/analyse phase and 
its snapshot is taken much later.  Looks like we need something that 
analyses a utility statement to get its lock before executing.


--
Thanks

Bernd


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [GENERAL] Re: [BUGS] [HACKERS] COPY TO returning empty result with parallel ALTER TABLE

2014-11-04 Thread Tom Lane
Andres Freund  writes:
> On 2014-11-04 13:51:23 -0500, Tom Lane wrote:
>> Not sure.  The OP's point is that in a SELECT, you do get unsurprising
>> results, because a SELECT will acquire its execution snapshot after it's
>> gotten AccessShareLock on the table.  Arguably COPY should behave likewise.
>> Or to be even more concrete, COPY (SELECT * FROM tab) TO ... probably
>> already acts like he wants, so why isn't plain COPY equivalent to that?

> Even a plain SELECT essentially acts that way if I recall correctly if
> you use REPEATABLE READ+ and force a snapshot to be acquired
> beforehand. It's imo not very surprising.

"It doesn't fail in a non-default isolation mode" is hardly much of an
argument for this being okay in READ COMMITTED.

> All ALTER TABLE rewrites just disregard visibility of existing
> tuples. Only CLUSTER/VACUUM FULL do the full hangups to keep all the
> necessary tuples + ctid chains around.

Yeah, and I think that it's entirely reasonable for rewriting ALTER TABLEs
to update the xmin of the rewritten tuples; after all, the output data
could be arbitrarily different from what the previous transactions put
into the table.  But that is not the question here.  If the COPY blocks
until the ALTER completes --- as it must --- why is its execution snapshot
not taken *after* the lock is acquired?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [BUGS] [HACKERS] COPY TO returning empty result with parallel ALTER TABLE

2014-11-04 Thread Andres Freund
On 2014-11-04 13:51:23 -0500, Tom Lane wrote:
> Bernd Helmle  writes:
> > --On 3. November 2014 18:15:04 +0100 Sven Wegener 
> >  wrote:
> >> I've check git master and 9.x and all show the same behaviour. I came up
> >> with the patch below, which is against curent git master. The patch
> >> modifies the COPY TO code to create a new snapshot, after acquiring the
> >> necessary locks on the source tables, so that it sees any modification
> >> commited by other backends.
> 
> > Well, i have the feeling that there's nothing wrong with it. The ALTER 
> > TABLE command has rewritten all tuples with its own XID, thus the current 
> > snapshot does not "see" these tuples anymore. I suppose that in 
> > SERIALIZABLE or REPEATABLE READ transaction isolation your proposal still 
> > doesn't return the tuples you'd like to see.
> 
> Not sure.  The OP's point is that in a SELECT, you do get unsurprising
> results, because a SELECT will acquire its execution snapshot after it's
> gotten AccessShareLock on the table.  Arguably COPY should behave likewise.
> Or to be even more concrete, COPY (SELECT * FROM tab) TO ... probably
> already acts like he wants, so why isn't plain COPY equivalent to that?

Even a plain SELECT essentially acts that way if I recall correctly if
you use REPEATABLE READ+ and force a snapshot to be acquired
beforehand. It's imo not very surprising.

All ALTER TABLE rewrites just disregard visibility of existing
tuples. Only CLUSTER/VACUUM FULL do the full hangups to keep all the
necessary tuples + ctid chains around.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] COPY TO returning empty result with parallel ALTER TABLE

2014-11-04 Thread Sven Wegener
On Tue, 4 Nov 2014, Tom Lane wrote:

> Bernd Helmle  writes:
> > --On 3. November 2014 18:15:04 +0100 Sven Wegener 
> >  wrote:
> >> I've check git master and 9.x and all show the same behaviour. I came up
> >> with the patch below, which is against curent git master. The patch
> >> modifies the COPY TO code to create a new snapshot, after acquiring the
> >> necessary locks on the source tables, so that it sees any modification
> >> commited by other backends.
> 
> > Well, i have the feeling that there's nothing wrong with it. The ALTER 
> > TABLE command has rewritten all tuples with its own XID, thus the current 
> > snapshot does not "see" these tuples anymore. I suppose that in 
> > SERIALIZABLE or REPEATABLE READ transaction isolation your proposal still 
> > doesn't return the tuples you'd like to see.
> 
> Not sure.  The OP's point is that in a SELECT, you do get unsurprising
> results, because a SELECT will acquire its execution snapshot after it's
> gotten AccessShareLock on the table.  Arguably COPY should behave likewise.
> Or to be even more concrete, COPY (SELECT * FROM tab) TO ... probably
> already acts like he wants, so why isn't plain COPY equivalent to that?

No, both plain COPY and COPY (SELECT ...) TO are broken.

Sven


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] COPY TO returning empty result with parallel ALTER TABLE

2014-11-04 Thread Sven Wegener
On Tue, 4 Nov 2014, Bernd Helmle wrote:

> --On 3. November 2014 18:15:04 +0100 Sven Wegener 
> wrote:
> 
> > I've check git master and 9.x and all show the same behaviour. I came up
> > with the patch below, which is against curent git master. The patch
> > modifies the COPY TO code to create a new snapshot, after acquiring the
> > necessary locks on the source tables, so that it sees any modification
> > commited by other backends.
> 
> Well, i have the feeling that there's nothing wrong with it. The ALTER TABLE
> command has rewritten all tuples with its own XID, thus the current snapshot
> does not "see" these tuples anymore. I suppose that in SERIALIZABLE or
> REPEATABLE READ transaction isolation your proposal still doesn't return the
> tuples you'd like to see.

No, but with REPEATABLE READ and SERIALIZABLE the plain SELECT case is 
currently broken too. The issue I'm fixing is that COPY (SELECT ...) TO 
and SELECT behave differently.

So, how does an ALTER TABLE should behave transaction-wise? PostgreSQL 
claims transactional DDL support. As mentioned above a parallel SELECT 
with SERIALIZABLE returns an empty result, but it also sees the schema 
change, at least the change shows up in the result tuple description. The 
change doesn't show up in pg_catalog, so that bit is handled correctly. 
The schema change transaction can be rolled back the way it is now, so 
it's kind of transactional, but other transactions seeing the schema 
change in query results is broken.

The empty result might be fixed by just keeping the old XID of rewritten 
tuples, as the exclusive lock ALTER TABLE helds should be enough to make 
sure nobody is actively accessing the rewritten table. But I'm pondering 
if there is a solution for the visible schema change that doesn't involve 
keeping the old data files around or to just raise an serialization error.

Coming back on my mentioned solution of setting the XID of rewritten 
tuples to the FrozenXID. That's broken as in the REPEATABLE 
READ/SERIALIZABLE case there might be tuples that should not be seen by 
older transactions and moving the tuples to the FrozenXID would make them 
visible.

Sven


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] COPY TO returning empty result with parallel ALTER TABLE

2014-11-04 Thread Andrew Dunstan


On 11/04/2014 01:51 PM, Tom Lane wrote:

Bernd Helmle  writes:

--On 3. November 2014 18:15:04 +0100 Sven Wegener
 wrote:

I've check git master and 9.x and all show the same behaviour. I came up
with the patch below, which is against curent git master. The patch
modifies the COPY TO code to create a new snapshot, after acquiring the
necessary locks on the source tables, so that it sees any modification
commited by other backends.

Well, i have the feeling that there's nothing wrong with it. The ALTER
TABLE command has rewritten all tuples with its own XID, thus the current
snapshot does not "see" these tuples anymore. I suppose that in
SERIALIZABLE or REPEATABLE READ transaction isolation your proposal still
doesn't return the tuples you'd like to see.

Not sure.  The OP's point is that in a SELECT, you do get unsurprising
results, because a SELECT will acquire its execution snapshot after it's
gotten AccessShareLock on the table.  Arguably COPY should behave likewise.
Or to be even more concrete, COPY (SELECT * FROM tab) TO ... probably
already acts like he wants, so why isn't plain COPY equivalent to that?


Yes, that seems like an outright bug.

cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] COPY TO returning empty result with parallel ALTER TABLE

2014-11-04 Thread Tom Lane
Bernd Helmle  writes:
> --On 3. November 2014 18:15:04 +0100 Sven Wegener 
>  wrote:
>> I've check git master and 9.x and all show the same behaviour. I came up
>> with the patch below, which is against curent git master. The patch
>> modifies the COPY TO code to create a new snapshot, after acquiring the
>> necessary locks on the source tables, so that it sees any modification
>> commited by other backends.

> Well, i have the feeling that there's nothing wrong with it. The ALTER 
> TABLE command has rewritten all tuples with its own XID, thus the current 
> snapshot does not "see" these tuples anymore. I suppose that in 
> SERIALIZABLE or REPEATABLE READ transaction isolation your proposal still 
> doesn't return the tuples you'd like to see.

Not sure.  The OP's point is that in a SELECT, you do get unsurprising
results, because a SELECT will acquire its execution snapshot after it's
gotten AccessShareLock on the table.  Arguably COPY should behave likewise.
Or to be even more concrete, COPY (SELECT * FROM tab) TO ... probably
already acts like he wants, so why isn't plain COPY equivalent to that?

What concerns me more is whether there are other utility statements that
have comparable issues.  We should not fix just COPY if there are more
cases.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] COPY TO returning empty result with parallel ALTER TABLE

2014-11-04 Thread Bernd Helmle



--On 3. November 2014 18:15:04 +0100 Sven Wegener 
 wrote:



I've check git master and 9.x and all show the same behaviour. I came up
with the patch below, which is against curent git master. The patch
modifies the COPY TO code to create a new snapshot, after acquiring the
necessary locks on the source tables, so that it sees any modification
commited by other backends.


Well, i have the feeling that there's nothing wrong with it. The ALTER 
TABLE command has rewritten all tuples with its own XID, thus the current 
snapshot does not "see" these tuples anymore. I suppose that in 
SERIALIZABLE or REPEATABLE READ transaction isolation your proposal still 
doesn't return the tuples you'd like to see.


--
Thanks

Bernd


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] COPY TO returning empty result with parallel ALTER TABLE

2014-11-03 Thread Sven Wegener
Hi all,

we experienced what seems to be a bug in the COPY TO implementation. When a
table is being rewritten by an ALTER TABLE statement, a parallel COPY TO
results in an empty result.

Consider the following table data:

  CREATE TABLE test (id INTEGER NOT NULL, PRIMARY KEY (id));
  INSERT INTO test (id) SELECT generate_series(1, 1000);

One session does:

  ALTER TABLE test ADD COLUMN dummy BOOLEAN NOT NULL DEFAULT FALSE;

This acquires an exclusive lock to the table.

Another session now performs parallel:

  COPY test TO STDOUT;

This blocks on the exclusive lock held by the ALTER statement. When the ALTER
staement is finished, the COPY statement returns with an empty result. Same
goes for COPY (SELECT ...) TO, whereas the same SELECT executed without COPY
blocks and returns the correct result as expected.

This is my analysis of this issue:

The backend opens the rewritten data files, but it ignores the complete
content, which indicates the data is being ignored because of XIDs. For direct
SELECT statements the top-level query parsing acquires locks on involved tables
and creates a new snapshot, but for COPY statements the parsing and locking is
done later in COPY code. After locking the tables in COPY code, the data
is read with an old snapshot, effectively ignoring all data from the rewritten
table.

I've check git master and 9.x and all show the same behaviour. I came up with
the patch below, which is against curent git master. The patch modifies the
COPY TO code to create a new snapshot, after acquiring the necessary locks on
the source tables, so that it sees any modification commited by other backends.

Despite thinking this is the correct solution, another solution or
optimization would be to have ALTER TABLE, which holds the highest lock level,
set the XID of rewritten tuples to the FrozenXID, as no other backend should
access the table before the ALTER TABLE is committed.

Sven

diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 6b83576..fe2d157 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -1344,6 +1344,13 @@ BeginCopy(bool is_from,
(errcode(ERRCODE_UNDEFINED_COLUMN),
 errmsg("table \"%s\" does not have 
OIDs",

RelationGetRelationName(cstate->rel;
+
+   /*
+* Use a new snapshot to ensure this query sees
+* results of any previously executed queries.
+*/
+   if (!is_from)
+   PushActiveSnapshot(GetTransactionSnapshot());
}
else
{
@@ -1394,11 +1401,10 @@ BeginCopy(bool is_from,
plan = planner(query, 0, NULL);
 
/*
-* Use a snapshot with an updated command ID to ensure this 
query sees
+* Use a new snapshot to ensure this query sees
 * results of any previously executed queries.
 */
-   PushCopiedSnapshot(GetActiveSnapshot());
-   UpdateActiveSnapshotCommandId();
+   PushActiveSnapshot(GetTransactionSnapshot());
 
/* Create dest receiver for COPY OUT */
dest = CreateDestReceiver(DestCopyOut);
@@ -1741,9 +1747,11 @@ EndCopyTo(CopyState cstate)
ExecutorFinish(cstate->queryDesc);
ExecutorEnd(cstate->queryDesc);
FreeQueryDesc(cstate->queryDesc);
-   PopActiveSnapshot();
}
 
+   /* Discard snapshot */
+   PopActiveSnapshot();
+
/* Clean up storage */
EndCopy(cstate);
 }


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] COPY TO

2013-11-23 Thread Michael Paquier
Have a look at

On Sat, Nov 23, 2013 at 3:48 PM, mohsen soodkhah mohammadi
 wrote:
> in copy.c is one function that its name is CopyOneRowTO.
> in this function have one foreach loop. in this loop first if have this
> condition: !cstate->binary
> what means this condition and when true this condition?
Have a look at CopyStateData at the top of
src/backend/commands/copy.c. This flag simply means that COPY uses the
binary format, that can be used with more or less this command:
COPY ... WITH [ FORMAT ] BINARY;
You can check as well the documentation for more details:
http://www.postgresql.org/docs/9.3/static/sql-copy.html

Regards,
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] COPY TO

2013-11-22 Thread mohsen soodkhah mohammadi
hello.
in copy.c is one function that its name is CopyOneRowTO.
in this function have one foreach loop. in this loop first if have this
condition: !cstate->binary
what means this condition and when true this condition?
thank you.