Re: [HACKERS] Another logical decoding assertion failure

2014-11-13 Thread Andres Freund
On 2014-08-15 14:53:45 +0200, Antonin Houska wrote:
 http://www.postgresql.org/message-id/blu436-smtp12682d628f61ab9736099c3dc...@phx.gbl
 recalls me that I also saw an assertion failure recently. Although I
 wanted to isolate and report my issue when my vacation is over, this
 report made me curious whether I saw the same. Eventually it seems to be
 a different symptom.
 
 Following are the steps that trigger the failure in my environment
 (9.5devel), although it's not always that straightforward - sometimes I
 need to create and populate one more table. (I use the 'test_decoding'
 contrib module.)
 
 
 postgres=# SELECT pg_create_logical_replication_slot('my_slot',
 'test_decoding');
  pg_create_logical_replication_slot
 
  (my_slot,0/16F3B30)
 (1 row)
 
 postgres=# CREATE TABLE a AS SELECT * FROM
 pg_logical_slot_get_changes('my_slot', NULL, NULL) WITH NO DATA;
 SELECT 0
 postgres=# INSERT INTO a SELECT * FROM
 pg_logical_slot_get_changes('my_slot', NULL, NULL);
 INSERT 0 2
 postgres=# INSERT INTO a SELECT * FROM
 pg_logical_slot_get_changes('my_slot', NULL, NULL);
 The connection to the server was lost. Attempting reset: Failed.
 !

This was actually triggered by a bug in logical decoding. I've pushed a
fix. Thanks for the report!

I still think it's a bad idea to do something like the above, but even
after a fair amount of thinking I can't come up with a case that's an
actual problem. So I think we can just allow it.
Obviously it's not a good idea to use a output plugin without filtering,
and a logged table for this - each subsequent get_changes() will
otherwise contain the rows inserted into the table the previous
round. The attached regression tests contains pretty output demonstrating
that:
 BEGIN
 table public.changeresult: INSERT: data[text]:'BEGIN'
 table public.changeresult: INSERT: data[text]:'table public.changeresult: 
INSERT: data[text]:''BEGIN'''
 table public.changeresult: INSERT: data[text]:'table public.changeresult: 
INSERT: data[text]:''table public.somechange: INSERT: id[integer]:1'''
 table public.changeresult: INSERT: data[text]:'table public.changeresult: 
INSERT: data[text]:''COMMIT'''
 table public.changeresult: INSERT: data[text]:'COMMIT'
 table public.changeresult: INSERT: data[text]:'BEGIN'
 table public.changeresult: INSERT: data[text]:'table public.changeresult: 
INSERT: data[text]:''BEGIN'''
 table public.changeresult: INSERT: data[text]:'table public.changeresult: 
INSERT: data[text]:''table public.changeresult: INSERT: 
data[text]:BEGIN'''
 table public.changeresult: INSERT: data[text]:'table public.changeresult: 
INSERT: data[text]:''table public.changeresult: INSERT: data[text]:table 
public.somechange: INSERT: id[integer]:1'''
 table public.changeresult: INSERT: data[text]:'table public.changeresult: 
INSERT: data[text]:''table public.changeresult: INSERT: 
data[text]:COMMIT'''
 table public.changeresult: INSERT: data[text]:'table public.changeresult: 
INSERT: data[text]:''COMMIT'''
 table public.changeresult: INSERT: data[text]:'COMMIT'
 COMMIT
(14 rows)

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Another logical decoding assertion failure

2014-08-15 Thread Antonin Houska
http://www.postgresql.org/message-id/blu436-smtp12682d628f61ab9736099c3dc...@phx.gbl
recalls me that I also saw an assertion failure recently. Although I
wanted to isolate and report my issue when my vacation is over, this
report made me curious whether I saw the same. Eventually it seems to be
a different symptom.

Following are the steps that trigger the failure in my environment
(9.5devel), although it's not always that straightforward - sometimes I
need to create and populate one more table. (I use the 'test_decoding'
contrib module.)


postgres=# SELECT pg_create_logical_replication_slot('my_slot',
'test_decoding');
 pg_create_logical_replication_slot

 (my_slot,0/16F3B30)
(1 row)

postgres=# CREATE TABLE a AS SELECT * FROM
pg_logical_slot_get_changes('my_slot', NULL, NULL) WITH NO DATA;
SELECT 0
postgres=# INSERT INTO a SELECT * FROM
pg_logical_slot_get_changes('my_slot', NULL, NULL);
INSERT 0 2
postgres=# INSERT INTO a SELECT * FROM
pg_logical_slot_get_changes('my_slot', NULL, NULL);
The connection to the server was lost. Attempting reset: Failed.
!

The stack trace (should I upload / send the whole core file anywhere?):

(gdb) bt
#0  0x7f12d0640849 in raise () from /lib64/libc.so.6
#1  0x7f12d0641cd8 in abort () from /lib64/libc.so.6
#2  0x008c2d51 in ExceptionalCondition (conditionName=0xa926a0
!(((bool)((relation)-rd_refcnt == 0))),
errorType=0xa92210 FailedAssertion, fileName=0xa92104
relcache.c, lineNumber=1892) at assert.c:54
#3  0x008b2776 in RelationDestroyRelation
(relation=0x7f12c766d240, remember_tupdesc=0 '\000') at relcache.c:1892
#4  0x008b2c28 in RelationClearRelation
(relation=0x7f12c766d240, rebuild=1 '\001') at relcache.c:2106
#5  0x008b2fa3 in RelationFlushRelation
(relation=0x7f12c766d240) at relcache.c:2204
#6  0x008b30b5 in RelationCacheInvalidateEntry
(relationId=16384) at relcache.c:2256
#7  0x008abc65 in LocalExecuteInvalidationMessage
(msg=0x28c1260) at inval.c:567
#8  0x0073e1f5 in ReorderBufferExecuteInvalidations
(rb=0x28b0318, txn=0x28b0430) at reorderbuffer.c:1798
#9  0x0073ddbf in ReorderBufferForget (rb=0x28b0318, xid=1279,
lsn=24154944) at reorderbuffer.c:1645
#10 0x007383f8 in DecodeCommit (ctx=0x2894658,
buf=0x7fff3e658c30, xid=1279, dboid=12736, commit_time=461418357793855,
nsubxacts=0, sub_xids=0x28af650, ninval_msgs=69, msgs=0x28af650) at
decode.c:539
#11 0x00737c91 in DecodeXactOp (ctx=0x2894658,
buf=0x7fff3e658c30) at decode.c:207
#12 0x007379ce in LogicalDecodingProcessRecord (ctx=0x2894658,
record=0x28af610) at decode.c:103
#13 0x0073b0d6 in pg_logical_slot_get_changes_guts
(fcinfo=0x7fff3e658f50, confirm=1 '\001', binary=0 '\000') at
logicalfuncs.c:432
#14 0x0073b221 in pg_logical_slot_get_changes
(fcinfo=0x7fff3e658f50) at logicalfuncs.c:478
#15 0x0063e1a3 in ExecMakeTableFunctionResult
(funcexpr=0x288be78, econtext=0x288b758, argContext=0x28b5580,
expectedDesc=0x288ccd8, randomAccess=0 '\000') at execQual.c:2157
#16 0x0065e302 in FunctionNext (node=0x288ba18) at
nodeFunctionscan.c:95
#17 0x0064537e in ExecScanFetch (node=0x288ba18,
accessMtd=0x65e24b FunctionNext, recheckMtd=0x65e630 FunctionRecheck)
at execScan.c:82
#18 0x006453ed in ExecScan (node=0x288ba18, accessMtd=0x65e24b
FunctionNext, recheckMtd=0x65e630 FunctionRecheck)
at execScan.c:132
#19 0x0065e665 in ExecFunctionScan (node=0x288ba18) at
nodeFunctionscan.c:269
#20 0x0063a649 in ExecProcNode (node=0x288ba18) at
execProcnode.c:426
#21 0x0065ca53 in ExecModifyTable (node=0x288b8c0) at
nodeModifyTable.c:926
#22 0x0063a577 in ExecProcNode (node=0x288b8c0) at
execProcnode.c:377
#23 0x00638465 in ExecutePlan (estate=0x288b518,
planstate=0x288b8c0, operation=CMD_INSERT, sendTuples=0 '\000',
numberTuples=0,
direction=ForwardScanDirection, dest=0x2815ac0) at execMain.c:1475
#24 0x0063667a in standard_ExecutorRun (queryDesc=0x288f948,
direction=ForwardScanDirection, count=0) at execMain.c:308
#25 0x00636512 in ExecutorRun (queryDesc=0x288f948,
direction=ForwardScanDirection, count=0) at execMain.c:256
#26 0x007998ec in ProcessQuery (plan=0x28159c8,
sourceText=0x2854d18 INSERT INTO a SELECT * FROM
pg_logical_slot_get_changes('my_slot', NULL, NULL);, params=0x0,
dest=0x2815ac0,
completionTag=0x7fff3e659920 ) at pquery.c:185
#27 0x0079b16e in PortalRunMulti (portal=0x2890638, isTopLevel=1
'\001', dest=0x2815ac0, altdest=0x2815ac0,
completionTag=0x7fff3e659920 ) at pquery.c:1279
#28 0x0079a7b1 in PortalRun (portal=0x2890638,
count=9223372036854775807, isTopLevel=1 '\001', dest=0x2815ac0,
altdest=0x2815ac0,
completionTag=0x7fff3e659920 ) at pquery.c:816
#29 0x0079487b in exec_simple_query (
query_string=0x2854d18 INSERT INTO a SELECT * FROM
pg_logical_slot_get_changes('my_slot', NULL, NULL);) at 

Re: [HACKERS] Another logical decoding assertion failure

2014-08-15 Thread Andres Freund
On 2014-08-15 14:53:45 +0200, Antonin Houska wrote:
 http://www.postgresql.org/message-id/blu436-smtp12682d628f61ab9736099c3dc...@phx.gbl
 recalls me that I also saw an assertion failure recently. Although I
 wanted to isolate and report my issue when my vacation is over, this
 report made me curious whether I saw the same. Eventually it seems to be
 a different symptom.

That's something separate, yes.

 Following are the steps that trigger the failure in my environment
 (9.5devel), although it's not always that straightforward - sometimes I
 need to create and populate one more table. (I use the 'test_decoding'
 contrib module.)
 
 
 postgres=# SELECT pg_create_logical_replication_slot('my_slot',
 'test_decoding');
  pg_create_logical_replication_slot
 
  (my_slot,0/16F3B30)
 (1 row)
 
 postgres=# CREATE TABLE a AS SELECT * FROM
 pg_logical_slot_get_changes('my_slot', NULL, NULL) WITH NO DATA;
 SELECT 0
 postgres=# INSERT INTO a SELECT * FROM
 pg_logical_slot_get_changes('my_slot', NULL, NULL);
 INSERT 0 2
 postgres=# INSERT INTO a SELECT * FROM
 pg_logical_slot_get_changes('my_slot', NULL, NULL);
 The connection to the server was lost. Attempting reset: Failed.
 !

Is this just to reproduce the issue, or are you really storing the
results of logical decoding in a table again?

Why? That'll just result in circularity, no?  It's not something I
really thought about allowing when writing this. Possibly we can make it
work, but I don't really see the point. We obviously shouldn't just
crash, but that's not my point.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Another logical decoding assertion failure

2014-08-15 Thread Antonin Houska
On 08/15/2014 03:16 PM, Andres Freund wrote:
 On 2014-08-15 14:53:45 +0200, Antonin Houska wrote:
 postgres=# SELECT pg_create_logical_replication_slot('my_slot',
 'test_decoding');
  pg_create_logical_replication_slot
 
  (my_slot,0/16F3B30)
 (1 row)

 postgres=# CREATE TABLE a AS SELECT * FROM
 pg_logical_slot_get_changes('my_slot', NULL, NULL) WITH NO DATA;
 SELECT 0
 postgres=# INSERT INTO a SELECT * FROM
 pg_logical_slot_get_changes('my_slot', NULL, NULL);
 INSERT 0 2
 postgres=# INSERT INTO a SELECT * FROM
 pg_logical_slot_get_changes('my_slot', NULL, NULL);
 The connection to the server was lost. Attempting reset: Failed.
 !
 
 Is this just to reproduce the issue, or are you really storing the
 results of logical decoding in a table again?
 
 Why? That'll just result in circularity, no?  It's not something I
 really thought about allowing when writing this. Possibly we can make it
 work, but I don't really see the point. We obviously shouldn't just
 crash, but that's not my point.

It was basically an experiment. I tried to capture changes into a table.
I don't know whether it's legal (i.e. whether the current call of
pg_logical_slot_get_changes() should include changes that the current
transaction did). Unloged / temporary table may be necessary for case
like this.

The reason I report it is that (I think), if such an use case is not
legal, it should be detected somehow and handled using ereport / elog.

// Tony



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers