Re: [HACKERS] Multi-table-unique-constraint

2005-11-13 Thread Tom Lane
Christopher Kings-Lynne [EMAIL PROTECTED] writes:
 Maybe the solution is to make inherited tables actually the same table, 
 and jank it with an extra per-row attribute to differentiate them or 
 something :)

Aside from destroying the inheritance-for-partitioning stuff, this
wouldn't work for multiple inheritance, so I'm afraid it's not a very
attractive alternative.

Matt's idea about keeping the indexes separate seems that it probably
*would* work, modulo some lingering worries about when to take what kind
of lock on the index-set-as-a-whole.  It seems worth pursuing, anyway.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Multi-table-unique-constraint

2005-11-13 Thread Andrew Dunstan



Tom Lane wrote:


Matt Newell [EMAIL PROTECTED] writes:
 


BTW, i'm on the list now, so no need to cc me.
   



Common practice around here is to cc people anyway --- this has grown
out of a history of occasionally-slow list mail delivery.  If you don't
want it, best to fix it in your mail filters rather than expecting
people to change habits for you.


 



You can also change your subscriptions so that you don't get a copy from 
the list if you are in the To: or Cc: lists. I find this is better then 
having to set up filters. Visit 
http://mail.postgresql.org/mj/mj_wwwusr/domain=postgresql.org


cheers

andrew

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Multi-table-unique-constraint

2005-11-12 Thread Christopher Kings-Lynne

Most of the people who have thought about this have figured that the
right solution involves a single index spanning multiple tables (hence,
adding a table ID to the index entry headers in such indexes).  This
fixes the lookup and entry problems, but it's not any help for the
lock-against-schema-mods problem, and it leaves you with a real headache
if you want to drop just one of the tables.

'Tis a hard problem :-(


Maybe the solution is to make inherited tables actually the same table, 
and jank it with an extra per-row attribute to differentiate them or 
something :)


Might make constraint_exclusion less useful then.

Chris

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


[HACKERS] Multi-table-unique-constraint

2005-11-11 Thread Matt Newell
On Thursday 10 November 2005 15:58, you wrote:
  The multi-table-unique-constraint problem has to
  be solved before we can even think much about multi-table FKs :-(
 
  Do you have ideas about how this should be solved?

 There's some discussions in the pghackers archives --- look for
 multi-table index and similar phrases.  But if anyone had had
 a really decent plan, it would have got done by now :-(


Are multi-table indexes really required?  After reading the code some more, I 
came across this comment in nbtinsert.c:_bt_doinsert

  * NOTE: obviously, _bt_check_unique can only detect keys that are already in
  * the index; so it cannot defend against concurrent insertions of the
  * same key.  We protect against that by means of holding a write lock on
  * the target page.  Any other would-be inserter of the same key must
  * acquire a write lock on the same target page, so only one would-be
  * inserter can be making the check at one time.  Furthermore, once we are
  * past the check we hold write locks continuously until we have performed
  * our insertion, so no later inserter can fail to see our insertion.
  * (This requires some care in _bt_insertonpg.)

Would it be possible to make another routine that locates and aquires a write 
lock on the page where the key would be inserted in each index(for each table 
in the inheritance), and holds all these locks until the key is inserted into 
the correct index.  It seems this would solve the unique problem without 
changing much else.  


Matt

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Multi-table-unique-constraint

2005-11-11 Thread Tom Lane
Matt Newell [EMAIL PROTECTED] writes:
 Would it be possible to make another routine that locates and aquires
 a write lock on the page where the key would be inserted in each
 index(for each table in the inheritance), and holds all these locks
 until the key is inserted into the correct index.  It seems this would
 solve the unique problem without changing much else.

It's an idea, but you are now staring directly into the hornet's nest:

1. How do you avoid deadlock among multiple processes all doing the
   above for similar (same page anyway) keys?  It's difficult if not
   impossible to ensure that they'll try to take the page locks in
   the same order.

2. What happens when another process is adding/dropping indexes that
   should be in the index set?  In the normal scenario you don't have
   any sort of lock on any of the other tables, only the one you are
   trying to insert into; and so you have no defense against somebody
   changing their schemas, up to and including dropping the index you
   are fooling with.  Adding such locks would increase the deadlock
   hazard.

Also, for many scenarios (including FKs) it's important to be able to
*look up* a particular key, not only to prevent insertion of duplicates.
The above approach would require searching multiple indexes.

Most of the people who have thought about this have figured that the
right solution involves a single index spanning multiple tables (hence,
adding a table ID to the index entry headers in such indexes).  This
fixes the lookup and entry problems, but it's not any help for the
lock-against-schema-mods problem, and it leaves you with a real headache
if you want to drop just one of the tables.

'Tis a hard problem :-(

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Multi-table-unique-constraint

2005-11-11 Thread Matt Newell
On Friday 11 November 2005 11:07, you wrote:

 It's an idea, but you are now staring directly into the hornet's nest:

 1. How do you avoid deadlock among multiple processes all doing the
above for similar (same page anyway) keys?  It's difficult if not
impossible to ensure that they'll try to take the page locks in
the same order.

Isn't all that is required is that they iterate through the indexes in the 
same order.  This shouldn't be hard to do, because the set of indexes is the 
same no matter what table you are inserting into, because the unique 
constraint will apply to all tables both up and down the inheritance tree.  
That order just needs to be stored somewhere.

What if there was a new system relation(pg_indexset) that stores an array of 
index oids.  Each index that is part of an index set has an fkey into this 
table. 

When aquiring the locks on the index pages, you must 
 a) have a ROW SHARE lock on the pg_indexset row for this set,  this
  ensures that the schema won't change from under us.

 b) do so in the order that the index oids are in.

This solves the problem  below also, because you would hold a row exclusive 
lock on the row in this table whenever adding or removing indexes from the 
set.

Now that i think about it some more, i realize that you only need to hold read 
locks on the index pages that you don't plan on actually inserting a new key 
into, which shouldn't cause near as much lock contention as holding write 
locks on multiple indexes' pages.

 2. What happens when another process is adding/dropping indexes that
should be in the index set?  In the normal scenario you don't have
any sort of lock on any of the other tables, only the one you are
trying to insert into; and so you have no defense against somebody
changing their schemas, up to and including dropping the index you
are fooling with.  Adding such locks would increase the deadlock
hazard.

 Also, for many scenarios (including FKs) it's important to be able to
 *look up* a particular key, not only to prevent insertion of duplicates.
 The above approach would require searching multiple indexes.

Why would this be required, if it currently isn't?  I mean you can already do 
Select from parent where key=X; and the planner takes care of scanning 
multiple indexes(or sequence scans).

If it is required though, it should be no more difficult that doing what i 
described above, right?

 Most of the people who have thought about this have figured that the
 right solution involves a single index spanning multiple tables (hence,
 adding a table ID to the index entry headers in such indexes).  This
 fixes the lookup and entry problems, but it's not any help for the
 lock-against-schema-mods problem, and it leaves you with a real headache
 if you want to drop just one of the tables.

It seems that the above solution would be less work, and would keep the data 
separate, which seems to be one of the biggest advantages of the current 
inheritance design. 

 'Tis a hard problem :-(
I think that's why i'm interested:)  I hope that I can succeed so as to not 
have wasted your valuable time. 

BTW, i'm on the list now, so no need to cc me.

Matt

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Multi-table-unique-constraint

2005-11-11 Thread Tom Lane
Matt Newell [EMAIL PROTECTED] writes:
 On Friday 11 November 2005 11:07, you wrote:
 1. How do you avoid deadlock among multiple processes all doing the
 above for similar (same page anyway) keys?

 Isn't all that is required is that they iterate through the indexes in the 
 same order.

Yeah, I was thinking along the same lines.  As long as any one index is
a member of at most one index set, this'd probably work.  (Maybe you
wouldn't even need that restriction if you used a globally defined
ordering, such as always processing the indexes in order by their
pg_class OIDs.)  Some concept of shared and exclusive locks on index
sets (extending only to the membership of the set, not to operations on
the individual member indexes) might fix the schema-change problem, too,
although you still need to think about whether there's a risk of
deadlocks for that.  In the past we've figured that exclusively locking
a table is necessary and sufficient for schema alterations on that
table, but I'm not sure what to do for cross-table index sets.

 What if there was a new system relation(pg_indexset) that stores an array of 
 index oids.  Each index that is part of an index set has an fkey into this 
 table. 

I'd be inclined to think about using pg_inherits instead, ie, pretend
that the child table indexes are inheritance children of the parent
table index.  If this is too inefficient, it suggests that we need to
fix pg_inherits anyway.

 Also, for many scenarios (including FKs) it's important to be able to
 *look up* a particular key, not only to prevent insertion of duplicates.
 The above approach would require searching multiple indexes.
 
 Why would this be required, if it currently isn't?

Well, because we're trying to do something that currently isn't possible?
It might not matter that we don't have a single instant at which we can
swear that the key is not present anywhere in the hierarchy, but I'm not
convinced that this is obviously true.

Your thought about leaving read locks on index pages while searching
other indexes might fix that, though, if it needs fixed at all.

 Most of the people who have thought about this have figured that the
 right solution involves a single index spanning multiple tables (hence,
 adding a table ID to the index entry headers in such indexes).

 It seems that the above solution would be less work, and would keep the data 
 separate, which seems to be one of the biggest advantages of the current 
 inheritance design. 

Yeah, I'm getting more attracted to the idea as I think about it.  Not
so much because it keeps the data separate, as that it avoids needing to
store a table OID in index headers, which has been a principal objection
to cross-table indexes all along because of the space cost.

Probably the next thing to think about is how this would impact the
index AM API.  I'm disinclined to want to put all of this logic inside
the index AMs, so somehow the find and leave page write locked behavior
would need to be exposed in the AM API.  That ties into a larger goal of
not wanting the unique-index behavior to be totally the AM's
responsibility as it is right now --- I dislike the fact that nbtree is
responsible for reaching into the heap to test rows for liveness, for
instance.  If we could separate that out a bit, it might make it easier
to support unique-index behavior in the other AMs.

 BTW, i'm on the list now, so no need to cc me.

Common practice around here is to cc people anyway --- this has grown
out of a history of occasionally-slow list mail delivery.  If you don't
want it, best to fix it in your mail filters rather than expecting
people to change habits for you.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match