On Wed, Jun 22, 2011 at 11:26 AM, Simon Riggs <si...@2ndquadrant.com> wrote:
> For people that still think there is still risk, I've added a
> parameter called "relaxed_ddl_locking" which defaults to "off". That
> way people can use this feature in an informed way without exposing us
> to support risks from its use.

I can't get excited about adding a GUC that says "you can turn this
off if you want but don't blame us if it breaks".  That just doesn't
seem like good software engineering to me.

> I think we should be clear that this adds *one* lock/unlock for each
> object for each session, ONCE only after each transaction that
> executed a DDL statement against an object. So with 200 sessions, a
> typical ALTER TABLE statement will cause 400 locks. The current lock
> manager maxes out above 50,000 locks per second, so any comparative
> analysis will show this is a minor blip on even a heavy workload.

I think that using locks to address this problem is the wrong
approach.  I think the right fix is to use something other than
SnapshotNow to do the system catalog scans.  However, if we were going
to use locking, then we'd need to ensure that:

(1) No transaction which has made changes to a system catalog may
commit while some other transaction is in the middle of scanning that
catalog.
(2) No transaction which has made changes to a set of system catalogs
may commit while some other transaction is in the midst of fetching
interrelated data from a subset of those catalogs.

It's important to note, I think, that the problem here doesn't occur
at the time that the table rows are updated, because those rows aren't
visible yet.  The problem happens at commit time.  So unless I'm
missing something, ALTER TABLE would only need to acquire the relation
definition lock after it has finished all of its other work.  In fact,
it doesn't even really need to acquire it then, either.  It could just
queue up a list of relation definition locks to acquire immediately
before commit, which would be advantageous if the transaction went on
to abort, since in that case we'd skip the work of acquiring the locks
at all.

In fact, without doing that, this patch would undo much of the point
of the original ALTER TABLE lock strength reduction:

begin;
alter table foo alter column a set storage plain;
<start a new psql session in another window>
select * from foo;

In master, the SELECT completes without blocking.  With this patch
applied, the SELECT blocks, just as it did in 9.0, because it can't
rebuild the relcache entry without getting the relation definition
lock, which the alter table will hold until commit.  In fact, the same
thing happens even if foo has been accessed previously in the same
session.  If there is already an open *transaction* that has accessed
foo, then, it seems, it can continue to do so until it commits.  But
as soon as it tries to start a new transaction, it blocks again.  I
don't quite understand why that is - I didn't think we invalidated the
entire relcache on every commit.  But that's what happens.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to