On Tue, Jun 25, 2013 at 9:07 AM, Robert Haas <robertmh...@gmail.com> wrote:

> On Fri, Jun 21, 2013 at 5:20 AM, Hitoshi Harada <umi.tan...@gmail.com>
> wrote:
> > If I don't miss something, the requirement for the CONCURRENTLY option
> is to
> > allow simple SELECT reader to read the matview concurrently while the
> view
> > is populating the new data, and INSERT/UPDATE/DELETE and SELECT FOR
> > UPDATE/SHARE are still blocked.  So, I wonder why it is not possible
> just to
> > acquire ExclusiveLock on the matview while populating the data and swap
> the
> > relfile by taking small AccessExclusiveLock.  This lock escalation is no
> > dead lock hazard, I suppose, because concurrent operation would block the
> > other at the point ExclusiveLock is acquired, and ExclusiveLock conflicts
> > AccessExclusiveLock.  Then you don't need the complicated SPI logic or
> > unique key index dependency.
>
> This is no good.  One, all lock upgrades are deadlock hazards.  In
> this case, that plays out as follows: suppose that the session running
> REFRESH MATERIALIZED VIEW CONCURRENTLY also holds a lock on something
> else.  Some other process takes an AccessShareLock on the materialized
> view and then tries to take a conflicting lock on the other object.
> Kaboom, deadlock.  Granted, the chances of that happening in practice
> are small, but it IS the reason why we typically try to having
> long-running operations perform lock upgrades.  Users get really
> annoyed when their DDL runs for an hour and then rolls back.
>
>
OK, that' not safe.  What I was thinking was something similar to
compare-and-swap, where the whole operation is atomic under an
AccessExclusiveLock.  What if we release ExclusiveLock once a new matview
was created and re-acquire AccessExclusiveLock before trying swap?  Note
matview is a little different from index which I know people are talking
about in REINDEX CONCURRENTLY thread, in that the content of matview does
not change incrementally (at least at this point), but only does change
fully in swapping operation by the same REFRESH MATERIALIZED VIEW command.
The only race condition is between releasing Exclusive lock and re-acquire
AccessExclusiveLock someone else can go ahead with the same operation and
could create another one.  If it happens, let's abort us, because I guess
that's the way our transaction system is working anyway;  in case of unique
key index insertion for example, if I find another guy is inserting the
same value in the index, I wait for the other guy to finish his work and if
his transaction commits I give up, otherwise I go ahead.  Maybe it's
annoying if an hour operation finally gets aborted, but my purpose is
actually achieved by the other guy.  If the primary goal of this feature is
let reader reads the matview concurrently it should be ok?

Hmm, but in such cases the first guy is always win and the second guy who
may come an hour later loses so we cannot get the result from the latest
command... I still wonder there should be some way.


> Two, until we get MVCC catalog scans, it's not safe to update any
> system catalog tuple without an AccessExclusiveLock on some locktag
> that will prevent concurrent catalog scans for that tuple.  Under
> SnapshotNow semantics, concurrent readers can fail to see that the
> object is present at all, leading to mysterious failures - especially
> if some of the object's catalog scans are seen and others are missed.
>
>
> So what I'm saying above is take AccessExclusiveLock on swapping relfile
in catalog.  This doesn't violate your statement, I suppose.  I'm actually
still skeptical about MVCC catalog, because even if you can make catalog
lookup MVCC, relfile on the filesystem is not MVCC.  If session 1 changes
relfilenode in pg_class and commit transaction, delete the old relfile from
the filesystem, but another concurrent session 2 that just took a snapshot
before 1 made such change keeps running and tries to open this relation,
grabbing the old relfile and open it from filesystem -- ERROR: relfile not
found.  So everyone actually needs to see up-to-date information that
synchronizes with what filesystem says and that's SnapshotNow.   In my
experimental thought above about compare-and-swap way, in compare phase he
needs to see the most recent valid information, otherwise he never thinks
someone did something new.  Since I haven't read the whole thread, maybe we
have already discussed about it, but it would help if you clarify this
concern.


Thanks,
-- 
Hitoshi Harada

Reply via email to