Re: [HACKERS] ALTER TABLE deadlock with concurrent INSERT

2011-03-04 Thread Joe Conway
On 03/03/2011 11:36 PM, Noah Misch wrote:
 Does relation 16902 (attribute_summary) have a foreign key constraint over the
 sequence_number column, in either direction, with relation 16896?  That would
 explain it:
 
 session 1: ALTER TABLE attribute_summary ... sleeps after relation_openrv in 
 transformAlterTableStmt
 session 2: SELECT 1 FROM rel16896 LIMIT 0;
 session 2: SELECT 1 FROM attribute_summary LIMIT 0; blocks
 session 1: wakes up; continues ALTER TABLE: deadlock upon locking rel16896

Ah, OK -- then that would explain it as there are foreign keys on that
column. Thanks, hadn't thought about that aspect.

 Granted, the cure may be worse than the disease.

Right. I've already advised they shut down the application during the
alter table, which they can do (and in fact already do -- they were
restarting the application just prior to this step, which really makes
no sense anyway).

Thanks,

Joe

-- 
Joe Conway
credativ LLC: http://www.credativ.us
Linux, PostgreSQL, and general Open Source
Training, Service, Consulting,  24x7 Support



signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] ALTER TABLE deadlock with concurrent INSERT

2011-03-03 Thread Jim Nasby
On Mar 2, 2011, at 2:54 PM, Joe Conway wrote:
 On 03/02/2011 12:41 PM, Tom Lane wrote:
 Looks like the process trying to do the ALTER has already got some
 lower-level lock on the table.  It evidently hasn't got
 AccessExclusiveLock, but nonetheless has something strong enough to
 block an INSERT, such as ShareLock.
 
 Hmmm, is it possible that the following might do that, whereas a simple
 ALTER TABLE would not?

Impossible to tell without seeing what's in the script... ie: if the script was

BEGIN;
-- Do something to that table that blocks inserts
SELECT change_column_type(...);
COMMIT;

You'd get a deadlock.

The script also has several race conditions:

- Someone could drop the table after you query pg_class
- Someone could alter/drop the column after you query pg_attribute

My suggestion would be to try to grab an exclusive lock on the table as the 
first line in the function (and then don't do anything cute in the declare 
section, such as use tablename::regprocedure).

Speaking of which, I would recommend using the regprocedure and regtype casts 
instead of querying the catalog directly; that way you have working schema 
support and you're immune from future catalog changes. Unfortunately you'll 
still have to do things the hard way to find the column (unless we added 
regcolumn post 8.3), but you might want to use information_schema, or at least 
see what it's doing there. The query *technically* should include WHERE attnum 
 0 (maybe =) AND NOT attisdropped, though it's probably not a big deal that 
it isn't since ALTER TABLE will save your bacon there (though, I'd include a 
comment to that effect to protect anyone who decides to blindly cut and paste 
that query somewhere else where it does matter...).

 8---
 BEGIN;
 
 CREATE OR REPLACE FUNCTION change_column_type
 (
  tablename text,
  columnname text,
  newtype text
 ) RETURNS text AS $$
  DECLARE
newtypeid   oid;
tableoidoid;
curtypeid   oid;
  BEGIN
SELECT INTO newtypeid oid FROM pg_type WHERE oid =
  newtype::regtype::oid;
SELECT INTO tableoid oid FROM pg_class WHERE relname = tablename;
IF NOT FOUND THEN
  RETURN 'TABLE NOT FOUND';
END IF;
 
SELECT INTO curtypeid atttypid FROM pg_attribute WHERE
  attrelid = tableoid AND attname::text = columnname;
IF NOT FOUND THEN
  RETURN 'COLUMN NOT FOUND';
END IF;
 
IF curtypeid != newtypeid THEN
  EXECUTE 'ALTER TABLE ' || tablename || ' ALTER COLUMN ' ||
  columnname || ' SET DATA TYPE ' || newtype;
  RETURN 'CHANGE SUCCESSFUL';
ELSE
  RETURN 'CHANGE SKIPPED';
END IF;
  EXCEPTION
WHEN undefined_object THEN
  RETURN 'INVALID TARGET TYPE';
  END;
 $$ LANGUAGE plpgsql;
 
 SELECT change_column_type('attribute_summary',
  'sequence_number',
  'numeric');
 
 COMMIT;
 8---
 
 This text is in a file being run from a shell script with something like:
 
  psql dbname  script.sql
 
 The concurrent INSERTs are being done by the main application code
 (running on Tomcat).
 
 Joe
 
 -- 
 Joe Conway
 credativ LLC: http://www.credativ.us
 Linux, PostgreSQL, and general Open Source
 Training, Service, Consulting,  24x7 Support
 

--
Jim C. Nasby, Database Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ALTER TABLE deadlock with concurrent INSERT

2011-03-03 Thread Joe Conway
On 03/03/2011 03:49 PM, Jim Nasby wrote:
 On Mar 2, 2011, at 2:54 PM, Joe Conway wrote:
 On 03/02/2011 12:41 PM, Tom Lane wrote:
 Looks like the process trying to do the ALTER has already got some
 lower-level lock on the table.  It evidently hasn't got
 AccessExclusiveLock, but nonetheless has something strong enough to
 block an INSERT, such as ShareLock.

 Hmmm, is it possible that the following might do that, whereas a simple
 ALTER TABLE would not?
 
 Impossible to tell without seeing what's in the script... ie: if the script 
 was
 
 BEGIN;
 -- Do something to that table that blocks inserts
 SELECT change_column_type(...);
 COMMIT;
 
 You'd get a deadlock.

The script was exactly the one posted, i.e.
BEGIN;
CREATE FUNCTION change_column_type(...);
SELECT change_column_type(...);
COMMIT;

That's all there is to it. And the function itself has no specific
reference to the table being altered. That's why I'm left scratching my
head ;-)

Joe


-- 
Joe Conway
credativ LLC: http://www.credativ.us
Linux, PostgreSQL, and general Open Source
Training, Service, Consulting,  24x7 Support



signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] ALTER TABLE deadlock with concurrent INSERT

2011-03-03 Thread Jim Nasby
On Mar 3, 2011, at 6:26 PM, Joe Conway wrote:
 On 03/03/2011 03:49 PM, Jim Nasby wrote:
 On Mar 2, 2011, at 2:54 PM, Joe Conway wrote:
 On 03/02/2011 12:41 PM, Tom Lane wrote:
 Looks like the process trying to do the ALTER has already got some
 lower-level lock on the table.  It evidently hasn't got
 AccessExclusiveLock, but nonetheless has something strong enough to
 block an INSERT, such as ShareLock.
 
 Hmmm, is it possible that the following might do that, whereas a simple
 ALTER TABLE would not?
 
 Impossible to tell without seeing what's in the script... ie: if the script 
 was
 
 BEGIN;
 -- Do something to that table that blocks inserts
 SELECT change_column_type(...);
 COMMIT;
 
 You'd get a deadlock.
 
 The script was exactly the one posted, i.e.
 BEGIN;
 CREATE FUNCTION change_column_type(...);
 SELECT change_column_type(...);
 COMMIT;
 
 That's all there is to it. And the function itself has no specific
 reference to the table being altered. That's why I'm left scratching my
 head ;-)

I suggest grabbing a snapshot of pg_locks for the connection that's creating 
the function, and then do the same for the insert and see what could 
potentially conflict...
--
Jim C. Nasby, Database Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ALTER TABLE deadlock with concurrent INSERT

2011-03-03 Thread Noah Misch
On Wed, Mar 02, 2011 at 12:25:16PM -0800, Joe Conway wrote:
 I'm working with a client on an application upgrade script which
 executes a function to conditionally do an:
 
   ALTER TABLE foo ALTER COLUMN bar SET DATA TYPE baz
 
 If this is run while the application is concurrently doing inserts into
 foo, we are occasionally seeing deadlocks. Aside from the fact that they
 are better off not altering the table amid concurrent inserts, I'm
 trying to understand why this is even able to happen. I expect one to
 block the other, not a deadlock.
 
 This is 8.4.1 (I know, I know, I have advised strongly that they upgrade
 to 8.4.latest).
 
 We have not been able to repeat this forcibly. Here is what the log shows:
 --
 2011-02-25 14:38:07 PST [31686]: [1-1] ERROR:  deadlock detected
 2011-02-25 14:38:07 PST [31686]: [2-1] DETAIL:  Process 31686 waits for
 AccessExclusiveLock on relation 16896 of database 16386; blocked by
 process 31634.
 Process 31634 waits for RowExclusiveLock on relation 16902 of
 database 16386; blocked by process 31686.
 Process 31686: SELECT change_column_type('attribute_summary',
 'sequence_number', 'numeric');
 Process 31634: insert into attribute_summary (attribute_value,
 sequence_number, attribute_id) values ($1, $2, $3)
 2011-02-25 14:38:07 PST [31686]: [3-1] HINT:  See server log for query
 details.
 2011-02-25 14:38:07 PST [31686]: [4-1] CONTEXT:  SQL statement ALTER
 TABLE attribute_summary ALTER COLUMN sequence_number SET DATA TYPE numeric
 PL/pgSQL function change_column_type line 18 at EXECUTE statement
 2011-02-25 14:38:07 PST [31686]: [5-1] STATEMENT:  SELECT
 change_column_type('attribute_summary', 'sequence_number', 'numeric');
 --

Does relation 16902 (attribute_summary) have a foreign key constraint over the
sequence_number column, in either direction, with relation 16896?  That would
explain it:

session 1: ALTER TABLE attribute_summary ... sleeps after relation_openrv in 
transformAlterTableStmt
session 2: SELECT 1 FROM rel16896 LIMIT 0;
session 2: SELECT 1 FROM attribute_summary LIMIT 0; blocks
session 1: wakes up; continues ALTER TABLE: deadlock upon locking rel16896

Off the cuff, I think you could make sure this never deadlocks with a PL/pgSQL
recipe like this:

LOOP
BEGIN
LOCK TABLE rel16896;
LOCK TABLE attribute_summary NOWAIT;
EXIT;
EXCEPTION WHEN lock_not_available THEN
END;
END LOOP;

Granted, the cure may be worse than the disease.

nm

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ALTER TABLE deadlock with concurrent INSERT

2011-03-02 Thread Tom Lane
Joe Conway m...@joeconway.com writes:
 I'm working with a client on an application upgrade script which
 executes a function to conditionally do an:

   ALTER TABLE foo ALTER COLUMN bar SET DATA TYPE baz

 If this is run while the application is concurrently doing inserts into
 foo, we are occasionally seeing deadlocks. Aside from the fact that they
 are better off not altering the table amid concurrent inserts, I'm
 trying to understand why this is even able to happen. I expect one to
 block the other, not a deadlock.

Looks like the process trying to do the ALTER has already got some
lower-level lock on the table.  It evidently hasn't got
AccessExclusiveLock, but nonetheless has something strong enough to
block an INSERT, such as ShareLock.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ALTER TABLE deadlock with concurrent INSERT

2011-03-02 Thread Joe Conway
On 03/02/2011 12:41 PM, Tom Lane wrote:
 Looks like the process trying to do the ALTER has already got some
 lower-level lock on the table.  It evidently hasn't got
 AccessExclusiveLock, but nonetheless has something strong enough to
 block an INSERT, such as ShareLock.

Hmmm, is it possible that the following might do that, whereas a simple
ALTER TABLE would not?

8---
BEGIN;

CREATE OR REPLACE FUNCTION change_column_type
(
  tablename text,
  columnname text,
  newtype text
) RETURNS text AS $$
  DECLARE
newtypeid   oid;
tableoidoid;
curtypeid   oid;
  BEGIN
SELECT INTO newtypeid oid FROM pg_type WHERE oid =
  newtype::regtype::oid;
SELECT INTO tableoid oid FROM pg_class WHERE relname = tablename;
IF NOT FOUND THEN
  RETURN 'TABLE NOT FOUND';
END IF;

SELECT INTO curtypeid atttypid FROM pg_attribute WHERE
  attrelid = tableoid AND attname::text = columnname;
IF NOT FOUND THEN
  RETURN 'COLUMN NOT FOUND';
END IF;

IF curtypeid != newtypeid THEN
  EXECUTE 'ALTER TABLE ' || tablename || ' ALTER COLUMN ' ||
  columnname || ' SET DATA TYPE ' || newtype;
  RETURN 'CHANGE SUCCESSFUL';
ELSE
  RETURN 'CHANGE SKIPPED';
END IF;
  EXCEPTION
WHEN undefined_object THEN
  RETURN 'INVALID TARGET TYPE';
  END;
$$ LANGUAGE plpgsql;

SELECT change_column_type('attribute_summary',
  'sequence_number',
  'numeric');

COMMIT;
8---

This text is in a file being run from a shell script with something like:

  psql dbname  script.sql

The concurrent INSERTs are being done by the main application code
(running on Tomcat).

Joe

-- 
Joe Conway
credativ LLC: http://www.credativ.us
Linux, PostgreSQL, and general Open Source
Training, Service, Consulting,  24x7 Support



signature.asc
Description: OpenPGP digital signature