Re: [HACKERS] ALTER TABLE deadlock with concurrent INSERT
On 03/03/2011 11:36 PM, Noah Misch wrote: Does relation 16902 (attribute_summary) have a foreign key constraint over the sequence_number column, in either direction, with relation 16896? That would explain it: session 1: ALTER TABLE attribute_summary ... sleeps after relation_openrv in transformAlterTableStmt session 2: SELECT 1 FROM rel16896 LIMIT 0; session 2: SELECT 1 FROM attribute_summary LIMIT 0; blocks session 1: wakes up; continues ALTER TABLE: deadlock upon locking rel16896 Ah, OK -- then that would explain it as there are foreign keys on that column. Thanks, hadn't thought about that aspect. Granted, the cure may be worse than the disease. Right. I've already advised they shut down the application during the alter table, which they can do (and in fact already do -- they were restarting the application just prior to this step, which really makes no sense anyway). Thanks, Joe -- Joe Conway credativ LLC: http://www.credativ.us Linux, PostgreSQL, and general Open Source Training, Service, Consulting, 24x7 Support signature.asc Description: OpenPGP digital signature
Re: [HACKERS] ALTER TABLE deadlock with concurrent INSERT
On Mar 2, 2011, at 2:54 PM, Joe Conway wrote: On 03/02/2011 12:41 PM, Tom Lane wrote: Looks like the process trying to do the ALTER has already got some lower-level lock on the table. It evidently hasn't got AccessExclusiveLock, but nonetheless has something strong enough to block an INSERT, such as ShareLock. Hmmm, is it possible that the following might do that, whereas a simple ALTER TABLE would not? Impossible to tell without seeing what's in the script... ie: if the script was BEGIN; -- Do something to that table that blocks inserts SELECT change_column_type(...); COMMIT; You'd get a deadlock. The script also has several race conditions: - Someone could drop the table after you query pg_class - Someone could alter/drop the column after you query pg_attribute My suggestion would be to try to grab an exclusive lock on the table as the first line in the function (and then don't do anything cute in the declare section, such as use tablename::regprocedure). Speaking of which, I would recommend using the regprocedure and regtype casts instead of querying the catalog directly; that way you have working schema support and you're immune from future catalog changes. Unfortunately you'll still have to do things the hard way to find the column (unless we added regcolumn post 8.3), but you might want to use information_schema, or at least see what it's doing there. The query *technically* should include WHERE attnum 0 (maybe =) AND NOT attisdropped, though it's probably not a big deal that it isn't since ALTER TABLE will save your bacon there (though, I'd include a comment to that effect to protect anyone who decides to blindly cut and paste that query somewhere else where it does matter...). 8--- BEGIN; CREATE OR REPLACE FUNCTION change_column_type ( tablename text, columnname text, newtype text ) RETURNS text AS $$ DECLARE newtypeid oid; tableoidoid; curtypeid oid; BEGIN SELECT INTO newtypeid oid FROM pg_type WHERE oid = newtype::regtype::oid; SELECT INTO tableoid oid FROM pg_class WHERE relname = tablename; IF NOT FOUND THEN RETURN 'TABLE NOT FOUND'; END IF; SELECT INTO curtypeid atttypid FROM pg_attribute WHERE attrelid = tableoid AND attname::text = columnname; IF NOT FOUND THEN RETURN 'COLUMN NOT FOUND'; END IF; IF curtypeid != newtypeid THEN EXECUTE 'ALTER TABLE ' || tablename || ' ALTER COLUMN ' || columnname || ' SET DATA TYPE ' || newtype; RETURN 'CHANGE SUCCESSFUL'; ELSE RETURN 'CHANGE SKIPPED'; END IF; EXCEPTION WHEN undefined_object THEN RETURN 'INVALID TARGET TYPE'; END; $$ LANGUAGE plpgsql; SELECT change_column_type('attribute_summary', 'sequence_number', 'numeric'); COMMIT; 8--- This text is in a file being run from a shell script with something like: psql dbname script.sql The concurrent INSERTs are being done by the main application code (running on Tomcat). Joe -- Joe Conway credativ LLC: http://www.credativ.us Linux, PostgreSQL, and general Open Source Training, Service, Consulting, 24x7 Support -- Jim C. Nasby, Database Architect j...@nasby.net 512.569.9461 (cell) http://jim.nasby.net -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TABLE deadlock with concurrent INSERT
On 03/03/2011 03:49 PM, Jim Nasby wrote: On Mar 2, 2011, at 2:54 PM, Joe Conway wrote: On 03/02/2011 12:41 PM, Tom Lane wrote: Looks like the process trying to do the ALTER has already got some lower-level lock on the table. It evidently hasn't got AccessExclusiveLock, but nonetheless has something strong enough to block an INSERT, such as ShareLock. Hmmm, is it possible that the following might do that, whereas a simple ALTER TABLE would not? Impossible to tell without seeing what's in the script... ie: if the script was BEGIN; -- Do something to that table that blocks inserts SELECT change_column_type(...); COMMIT; You'd get a deadlock. The script was exactly the one posted, i.e. BEGIN; CREATE FUNCTION change_column_type(...); SELECT change_column_type(...); COMMIT; That's all there is to it. And the function itself has no specific reference to the table being altered. That's why I'm left scratching my head ;-) Joe -- Joe Conway credativ LLC: http://www.credativ.us Linux, PostgreSQL, and general Open Source Training, Service, Consulting, 24x7 Support signature.asc Description: OpenPGP digital signature
Re: [HACKERS] ALTER TABLE deadlock with concurrent INSERT
On Mar 3, 2011, at 6:26 PM, Joe Conway wrote: On 03/03/2011 03:49 PM, Jim Nasby wrote: On Mar 2, 2011, at 2:54 PM, Joe Conway wrote: On 03/02/2011 12:41 PM, Tom Lane wrote: Looks like the process trying to do the ALTER has already got some lower-level lock on the table. It evidently hasn't got AccessExclusiveLock, but nonetheless has something strong enough to block an INSERT, such as ShareLock. Hmmm, is it possible that the following might do that, whereas a simple ALTER TABLE would not? Impossible to tell without seeing what's in the script... ie: if the script was BEGIN; -- Do something to that table that blocks inserts SELECT change_column_type(...); COMMIT; You'd get a deadlock. The script was exactly the one posted, i.e. BEGIN; CREATE FUNCTION change_column_type(...); SELECT change_column_type(...); COMMIT; That's all there is to it. And the function itself has no specific reference to the table being altered. That's why I'm left scratching my head ;-) I suggest grabbing a snapshot of pg_locks for the connection that's creating the function, and then do the same for the insert and see what could potentially conflict... -- Jim C. Nasby, Database Architect j...@nasby.net 512.569.9461 (cell) http://jim.nasby.net -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TABLE deadlock with concurrent INSERT
On Wed, Mar 02, 2011 at 12:25:16PM -0800, Joe Conway wrote: I'm working with a client on an application upgrade script which executes a function to conditionally do an: ALTER TABLE foo ALTER COLUMN bar SET DATA TYPE baz If this is run while the application is concurrently doing inserts into foo, we are occasionally seeing deadlocks. Aside from the fact that they are better off not altering the table amid concurrent inserts, I'm trying to understand why this is even able to happen. I expect one to block the other, not a deadlock. This is 8.4.1 (I know, I know, I have advised strongly that they upgrade to 8.4.latest). We have not been able to repeat this forcibly. Here is what the log shows: -- 2011-02-25 14:38:07 PST [31686]: [1-1] ERROR: deadlock detected 2011-02-25 14:38:07 PST [31686]: [2-1] DETAIL: Process 31686 waits for AccessExclusiveLock on relation 16896 of database 16386; blocked by process 31634. Process 31634 waits for RowExclusiveLock on relation 16902 of database 16386; blocked by process 31686. Process 31686: SELECT change_column_type('attribute_summary', 'sequence_number', 'numeric'); Process 31634: insert into attribute_summary (attribute_value, sequence_number, attribute_id) values ($1, $2, $3) 2011-02-25 14:38:07 PST [31686]: [3-1] HINT: See server log for query details. 2011-02-25 14:38:07 PST [31686]: [4-1] CONTEXT: SQL statement ALTER TABLE attribute_summary ALTER COLUMN sequence_number SET DATA TYPE numeric PL/pgSQL function change_column_type line 18 at EXECUTE statement 2011-02-25 14:38:07 PST [31686]: [5-1] STATEMENT: SELECT change_column_type('attribute_summary', 'sequence_number', 'numeric'); -- Does relation 16902 (attribute_summary) have a foreign key constraint over the sequence_number column, in either direction, with relation 16896? That would explain it: session 1: ALTER TABLE attribute_summary ... sleeps after relation_openrv in transformAlterTableStmt session 2: SELECT 1 FROM rel16896 LIMIT 0; session 2: SELECT 1 FROM attribute_summary LIMIT 0; blocks session 1: wakes up; continues ALTER TABLE: deadlock upon locking rel16896 Off the cuff, I think you could make sure this never deadlocks with a PL/pgSQL recipe like this: LOOP BEGIN LOCK TABLE rel16896; LOCK TABLE attribute_summary NOWAIT; EXIT; EXCEPTION WHEN lock_not_available THEN END; END LOOP; Granted, the cure may be worse than the disease. nm -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] ALTER TABLE deadlock with concurrent INSERT
I'm working with a client on an application upgrade script which executes a function to conditionally do an: ALTER TABLE foo ALTER COLUMN bar SET DATA TYPE baz If this is run while the application is concurrently doing inserts into foo, we are occasionally seeing deadlocks. Aside from the fact that they are better off not altering the table amid concurrent inserts, I'm trying to understand why this is even able to happen. I expect one to block the other, not a deadlock. This is 8.4.1 (I know, I know, I have advised strongly that they upgrade to 8.4.latest). We have not been able to repeat this forcibly. Here is what the log shows: -- 2011-02-25 14:38:07 PST [31686]: [1-1] ERROR: deadlock detected 2011-02-25 14:38:07 PST [31686]: [2-1] DETAIL: Process 31686 waits for AccessExclusiveLock on relation 16896 of database 16386; blocked by process 31634. Process 31634 waits for RowExclusiveLock on relation 16902 of database 16386; blocked by process 31686. Process 31686: SELECT change_column_type('attribute_summary', 'sequence_number', 'numeric'); Process 31634: insert into attribute_summary (attribute_value, sequence_number, attribute_id) values ($1, $2, $3) 2011-02-25 14:38:07 PST [31686]: [3-1] HINT: See server log for query details. 2011-02-25 14:38:07 PST [31686]: [4-1] CONTEXT: SQL statement ALTER TABLE attribute_summary ALTER COLUMN sequence_number SET DATA TYPE numeric PL/pgSQL function change_column_type line 18 at EXECUTE statement 2011-02-25 14:38:07 PST [31686]: [5-1] STATEMENT: SELECT change_column_type('attribute_summary', 'sequence_number', 'numeric'); -- Reviewing the release notes, I see some marginally related commits, but nothing that jumps out to me as a specific fix. Thoughts? Thanks, Joe -- Joe Conway credativ LLC: http://www.credativ.us Linux, PostgreSQL, and general Open Source Training, Service, Consulting, 24x7 Support signature.asc Description: OpenPGP digital signature
Re: [HACKERS] ALTER TABLE deadlock with concurrent INSERT
Joe Conway m...@joeconway.com writes: I'm working with a client on an application upgrade script which executes a function to conditionally do an: ALTER TABLE foo ALTER COLUMN bar SET DATA TYPE baz If this is run while the application is concurrently doing inserts into foo, we are occasionally seeing deadlocks. Aside from the fact that they are better off not altering the table amid concurrent inserts, I'm trying to understand why this is even able to happen. I expect one to block the other, not a deadlock. Looks like the process trying to do the ALTER has already got some lower-level lock on the table. It evidently hasn't got AccessExclusiveLock, but nonetheless has something strong enough to block an INSERT, such as ShareLock. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TABLE deadlock with concurrent INSERT
On 03/02/2011 12:41 PM, Tom Lane wrote: Looks like the process trying to do the ALTER has already got some lower-level lock on the table. It evidently hasn't got AccessExclusiveLock, but nonetheless has something strong enough to block an INSERT, such as ShareLock. Hmmm, is it possible that the following might do that, whereas a simple ALTER TABLE would not? 8--- BEGIN; CREATE OR REPLACE FUNCTION change_column_type ( tablename text, columnname text, newtype text ) RETURNS text AS $$ DECLARE newtypeid oid; tableoidoid; curtypeid oid; BEGIN SELECT INTO newtypeid oid FROM pg_type WHERE oid = newtype::regtype::oid; SELECT INTO tableoid oid FROM pg_class WHERE relname = tablename; IF NOT FOUND THEN RETURN 'TABLE NOT FOUND'; END IF; SELECT INTO curtypeid atttypid FROM pg_attribute WHERE attrelid = tableoid AND attname::text = columnname; IF NOT FOUND THEN RETURN 'COLUMN NOT FOUND'; END IF; IF curtypeid != newtypeid THEN EXECUTE 'ALTER TABLE ' || tablename || ' ALTER COLUMN ' || columnname || ' SET DATA TYPE ' || newtype; RETURN 'CHANGE SUCCESSFUL'; ELSE RETURN 'CHANGE SKIPPED'; END IF; EXCEPTION WHEN undefined_object THEN RETURN 'INVALID TARGET TYPE'; END; $$ LANGUAGE plpgsql; SELECT change_column_type('attribute_summary', 'sequence_number', 'numeric'); COMMIT; 8--- This text is in a file being run from a shell script with something like: psql dbname script.sql The concurrent INSERTs are being done by the main application code (running on Tomcat). Joe -- Joe Conway credativ LLC: http://www.credativ.us Linux, PostgreSQL, and general Open Source Training, Service, Consulting, 24x7 Support signature.asc Description: OpenPGP digital signature