[HACKERS] Foreign key quandries

2003-02-28 Thread Stephan Szabo

Going through the issues in doing dirty reads in foreign keys I've come up
with a few cases that I'm fairly uncertain about how to handle with
regards to deadlocks and figured I should ask for advice because I think
I'm missing something painfully obvious, but don't get large enough blocks
of time to think about it to figure out what it is.


I'd thought maybe it'd be enough to say which type of
thing on which constraint and use that to basically say that
we don't need to wait on a transaction that's waiting on us
due to a modification to the other table, but AFAICS
that lets through a bad case:
 T1: insert into fk values (2);
 T2: delete from pk where key=3;
 T2: delete from pk where key=2;
 T1: insert into fk values (3);
If T1 doesn't wait in this case, you can get into a case where
a bad row is inserted into fk if you then have:
 T1: delete from fk where key=2;
 T1: commit;
Now there's no row to make the second delete fail but
transaction 2 still can't commit due to the fk row with key 3.



I'd then thought of doing something based on what row/value
transaction 2 was waiting on, but that has problems.
Given a foreign key with no referential actions and a
sequence like:

 Transaction 1 inserts into the foreign key table a row
  with a referencing key of 2.
 Transaction 1 checks the foreign key
 Transaction 2 deletes the primary key rows having keys
  2 and 3
 Transaction 1 inserts another row into the foreign key
  table with a referencing key of 2.
 Transactions 1 and 2 start checking the foreign key.

AFAICS, transaction 2 needs to wait since there's already a
row it can see in the foreign key table that's not yet committed
(so it doesn't know if the delete works or not).  We can tell
transaction 1 that it doesn't need to wait on transaction 2
because transaction 1 is inserting a value that transaction 2
will see in its check, thus we're saved from the first case.

However, this has the potential to deadlock if we had for example,
inserted a foreign key table row of 3 rather than 2 as the second
insert in transaction 1 and the delete check for 2 went first.  If
we knew that it was also going to be checking the 3 rows, we'd be
safe, but then we've got to keep that information in some way that's
visible to other transactions AFAICS.  And, if the checks were
done in the order delete check for 3, delete check for 2(t2 blocks),
insert check for 3, we'd be back in the state of the first example.
:(




---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Foreign key quandries

2003-02-28 Thread Rod Taylor
Gah, hit wrong key combination and the email sent early.

Anyway, after that 'sleep' mess at the bottom is:
T1 or T2: Sleeping too long -- lets run deadlock detection code
T1 or T2: Kill off random participant of deadlock.

The other participant is then allowed to continue their work.

On Sat, 2003-03-01 at 02:03, Rod Taylor wrote:
 On Sat, 2003-03-01 at 00:44, Stephan Szabo wrote:
  On 1 Mar 2003, Rod Taylor wrote:
  
   I'm not sure I understand the question. The case as described simply has
   to deadlock because your approaching the same values with conflicting
   tasks from opposite directions.
  
  Well, the problem is that two cases (one of which I think deadlock is
  unnecessary in) are very similar.
 
 I see.  Now I see what your asking about.
 
  As I see it:
  
  T1: insert 2
  T2: delete 2
  T1: insert 2/update 2 (non-key fields)
   shouldn't need to deadlock.
 
  T1: insert 2
  T2: delete 2  3
 * delete 2's check blocks before
checking 3
  T1: insert 3
   should not need to deadlock I think
 
  T1: insert 2
  T2: delete 3
  T2: delete 2
   (or delete 2  3 where 3's check goes then 2's check blocks)
  T1: insert 3
   does need to deadlock
  
  In the second case, both deletes have happened so the row the insert wants
  to check against is marked for deletion, but since it's going to be
  checking for the 3 row in the future, and will error if T1 commits, I
  think it's safe for it to go through.
  
  I'm trying to find a way to differentiate the second and third case given
  that I'm running inside a constraint check on insert 3. It'd be easy if
  transaction 1 could see that it's going to be checking for the 3 row
  later, but I think that'd involve keeping around alot of information about
  the rows that are affected in some shared way which could get really
  large.
 
 Isn't the differentiation going to happen automatically?
 
 In case 2:
 
 T1: create fk tuple (uncommitted) - value 2
 T2: delete pk tuple value 2
 T2:   scan through fk table, find uncommitted tuple value 2 ... sleep
 T2:   scan through fk table, find uncommitted tuple value 2 ... sleep
 T2:   scan through fk table, find uncommitted tuple value 2 ... sleep
 T1: create fk tuple (uncommitted) - value 3
 T1: commit
 T2:   scan through fk table, find tuple value 2 ... its committed
 T2: run cascade procedure on tuples found in fk table for value 2
 T2: continue scan through fk table, find tuple value 3 ... its committed
 T2: run cascade procedure on tuples found in fk table for value 3
 T2: All is well -- return control to user.
 
 In case 3:
 T1: create fk tuple (uncommitted) - value 2
 T2: delete pk tuple value 3
 T2:   scan through fk table, value 3 not found
 T2: delete pk tuple value 2
 T2:   scan through fk table, find uncommitted tuple value 2 ... sleep
 T2:   scan through fk table, find uncommitted tuple value 2 ... sleep
 T2:   scan through fk table, find uncommitted tuple value 2 ... sleep
 T1: create fk value 3
 T1:   scan through pk table, find uncommitted tuple value 3 ... sleep
 T2:   scan through fk table, find uncommitted tuple value 2 ... sleep
 T1:   scan through pk table, find uncommitted tuple value 3 ... sleep
 T2:   scan through fk table, find uncommitted tuple value 2 ... sleep
-- 
Rod Taylor [EMAIL PROTECTED]

PGP Key: http://www.rbt.ca/rbtpub.asc


signature.asc
Description: This is a digitally signed message part