Re: [HACKERS] CIC and deadlocks

2007-04-11 Thread Pavan Deolasee

On 4/1/07, Tom Lane [EMAIL PROTECTED] wrote:



Good point.  I'm envisioning a procarray.c function along the
lines of
bool TransactionHasSnapshot(xid)
which returns true if the xid is currently listed in PGPROC
and has a nonzero xmin.  CIC's cleanup wait loop would check
this and ignore the xid if it returns false.  Your point means
that this function would have to take exclusive not shared lock
while scanning the procarray, which is kind of annoying, but
it seems not fatal since CIC isn't done all that frequently.



When I looked at the code, it occurred to me that possibly we are
OK with just taking shared lock on the procarray. That means that
some other transaction can concurrently set its serializable snapshot
while we are scanning the procarray. But that should not harm us:
if we see the snapshot set, we wait for the transaction. A transaction
which is setting its serializable snapshot NOW, can not see the
tuples that we did not index, isn't it ?

A patch based on the discussion is attached.

Thanks,
Pavan

--

EnterpriseDB http://www.enterprisedb.com


CIC_deadlock.patch
Description: Binary data

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] CIC and deadlocks

2007-04-11 Thread Tom Lane
Pavan Deolasee [EMAIL PROTECTED] writes:
 When I looked at the code, it occurred to me that possibly we are
 OK with just taking shared lock on the procarray. That means that
 some other transaction can concurrently set its serializable snapshot
 while we are scanning the procarray. But that should not harm us:
 if we see the snapshot set, we wait for the transaction. A transaction
 which is setting its serializable snapshot NOW, can not see the
 tuples that we did not index, isn't it ?

[ itch... ]  The problem is with time-extended execution of
GetSnapshotData; what happens if the other guy lost the CPU for a good
long time while in the middle of GetSnapshotData?  He might set his
xmin based on info you saw as long gone.

You might be correct that it's safe, but the argument would have to
hinge on the OldestXmin process being unable to commit because of
someone holding shared ProcArrayLock; a point you are definitely not
making above.  (Study the comments in GetSnapshotData for awhile,
also those in xact.c's commit-related code.)

I'm about to head to bed and am certainly in no condition to carry the
proof through.  Have at it ...

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] CIC and deadlocks

2007-04-11 Thread Pavan Deolasee

On 4/11/07, Tom Lane [EMAIL PROTECTED] wrote:



[ itch... ]  The problem is with time-extended execution of
GetSnapshotData; what happens if the other guy lost the CPU for a good
long time while in the middle of GetSnapshotData?  He might set his
xmin based on info you saw as long gone.

You might be correct that it's safe, but the argument would have to
hinge on the OldestXmin process being unable to commit because of
someone holding shared ProcArrayLock; a point you are definitely not
making above.  (Study the comments in GetSnapshotData for awhile,
also those in xact.c's commit-related code.)



My argument was based on what you said above, but I obviously did not
state it well :)

Anyways, I think its better to be safe and we agree that its not such a
bad thing to take exclusive lock on procarray because CIC is not something
that happens very often. Attached is a revised patch which takes exclusive
lock on the procarray, rest remaining the same.

Thanks,
Pavan

--

EnterpriseDB http://www.enterprisedb.com


CIC_deadlock_v2.patch
Description: Binary data

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] CIC and deadlocks

2007-04-09 Thread Pavan Deolasee

Tom Lane wrote:
 Pavan Deolasee [EMAIL PROTECTED] writes:

 Good point.  I'm envisioning a procarray.c function along the
 lines of
 bool TransactionHasSnapshot(xid)
 which returns true if the xid is currently listed in PGPROC
 and has a nonzero xmin.  CIC's cleanup wait loop would check
 this and ignore the xid if it returns false.  Your point means
 that this function would have to take exclusive not shared lock
 while scanning the procarray, which is kind of annoying, but
 it seems not fatal since CIC isn't done all that frequently.


Tom,

If you haven't finished this yet, would you like me to work
on this ? If I do it, I would mostly follow the path you
suggested above, unless I run into something else.

Thanks,
Pavan

--


EnterpriseDBhttp://www.enterprisedb.com


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] CIC and deadlocks

2007-04-09 Thread Tom Lane
Pavan Deolasee [EMAIL PROTECTED] writes:
 If you haven't finished this yet, would you like me to work
 on this ? If I do it, I would mostly follow the path you
 suggested above, unless I run into something else.

I'm not intending to work on it.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] CIC and deadlocks

2007-03-31 Thread Simon Riggs
On Sat, 2007-03-31 at 17:45 +0530, Pavan Deolasee wrote:
 
 Isn't CREATE INDEX CONCURRENTLY prone deadlock conditions ?
 I saw one with VACUUM today. But I think it can happen with other
 commands like VACUUM FULL, CLUSTER, CREATE INDEX
 CONCURRENTLY and so on. These commands conflict on the 
 ShareUpdateExclusiveLock held by CIC and hence would wait for
 CIC to release the lock. At the same time, CIC would wait for these
 transactions to complete.
 
 We know that these commands are run in a separate transaction 
 and do not do any index fetches or inserts/updates. So in principle
 CIC need not wait for these transactions to complete in any
 of its waits. May be we can skip waits on the transactions that
 are running one of these commands. 

Yes, because I proposed it already. :-)

utility transactions in - Latest plans for Utilities with HOT

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] CIC and deadlocks

2007-03-31 Thread Tom Lane
Pavan Deolasee [EMAIL PROTECTED] writes:
 Isn't CREATE INDEX CONCURRENTLY prone deadlock conditions ?

Can you give a specific example?  The deadlock code will grant locks
out-of-order in cases where the alternative is to abort somebody.
I think you may be describing a missed opportunity in that logic,
more than a reason to add still another fragile assumption for HOT.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] CIC and deadlocks

2007-03-31 Thread Pavan Deolasee

On 3/31/07, Tom Lane [EMAIL PROTECTED] wrote:


Pavan Deolasee [EMAIL PROTECTED] writes:
 Isn't CREATE INDEX CONCURRENTLY prone deadlock conditions ?

Can you give a specific example?



txn1 - CREATE INDEX CONCURRENTLY (takes ShareUpdateExclusiveLock)
txn2 - VACUUM ANALYZE (waits on ShareUpdateExclusiveLock)
tnx1 - waits for txn2 to complete in the second phase of CIC

deadlock!

Lazy VACUUM is safe because we don't include inVacuum  transactions
in the snapshot and hence don't wait for it in CIC. I haven't checked, but
VACUUM FULL would also deadlock.




I think you may be describing a missed opportunity in that logic,
more than a reason to add still another fragile assumption for HOT.



Not sure what you are referring to. But I shall keep this in mind.

Thanks,
Pavan


--

EnterpriseDB http://www.enterprisedb.com


Re: [HACKERS] CIC and deadlocks

2007-03-31 Thread Tom Lane
Pavan Deolasee [EMAIL PROTECTED] writes:
 On 3/31/07, Tom Lane [EMAIL PROTECTED] wrote:
 Can you give a specific example?

 txn1 - CREATE INDEX CONCURRENTLY (takes ShareUpdateExclusiveLock)
 txn2 - VACUUM ANALYZE (waits on ShareUpdateExclusiveLock)
 tnx1 - waits for txn2 to complete in the second phase of CIC

Oh, it's the cleanup wait you're worried about.

 Lazy VACUUM is safe because we don't include inVacuum  transactions
 in the snapshot and hence don't wait for it in CIC.

Hmm ... only if it's already set inVacuum true ... there's a window
where it has not.

I wonder whether we could change CIC so that the reference
snapshot lists only transactions that are running and have already
determined their serializable snapshot (ie, have set proc-xmin).
Xacts that haven't yet done that can be ignored because they couldn't
possibly see the dead tuples we're worried about, no?

Then we could rearrange the order of operations in vacuum_rel so
that we lock the target rel before we acquire a snapshot.  Then
a vacuum waiting for the CIC cannot cause a deadlock.

Multi-rel CLUSTER could be changed the same way.  I'm not particularly
worried about single-rel CLUSTER, only stuff that would be reasonable
to launch from background maintenance tasks.

[ thinks... ]  Actually, it seems risky to omit xids from the reference
snapshot; that could perhaps screw up the index insertions.  But we
could look in the procArray to see if the xid still exists and has set
an xmin before we actually wait for it.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] CIC and deadlocks

2007-03-31 Thread Pavan Deolasee

On 3/31/07, Tom Lane [EMAIL PROTECTED] wrote:



Hmm ... only if it's already set inVacuum true ... there's a window
where it has not.

I wonder whether we could change CIC so that the reference
snapshot lists only transactions that are running and have already
determined their serializable snapshot (ie, have set proc-xmin).
Xacts that haven't yet done that can be ignored because they couldn't
possibly see the dead tuples we're worried about, no?



Yes, it may work. Do we need to take some extra care because
proc-xmin is set while holding SHARED lock on proc array ?


Then we could rearrange the order of operations in vacuum_rel so

that we lock the target rel before we acquire a snapshot.  Then
a vacuum waiting for the CIC cannot cause a deadlock.



We may need to do the same in analyze_rel.


Thanks,
Pavan

--

EnterpriseDB http://www.enterprisedb.com


Re: [HACKERS] CIC and deadlocks

2007-03-31 Thread Tom Lane
Pavan Deolasee [EMAIL PROTECTED] writes:
 Yes, it may work. Do we need to take some extra care because
 proc-xmin is set while holding SHARED lock on proc array ?

Good point.  I'm envisioning a procarray.c function along the
lines of 
bool TransactionHasSnapshot(xid)
which returns true if the xid is currently listed in PGPROC
and has a nonzero xmin.  CIC's cleanup wait loop would check
this and ignore the xid if it returns false.  Your point means
that this function would have to take exclusive not shared lock
while scanning the procarray, which is kind of annoying, but
it seems not fatal since CIC isn't done all that frequently.

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster