I have committed the patches to hsqldb-dev HEAD and investigated the issues
of circular references to some extent.

On of the issues that I took into consideration for ON DELETE CASCADE was to
avoid heavy use of memory. The tables can be CACHED and quite often deleting
one row in a top level table will result in a very large number of rows
deleted. Also I wrote the code before implementing foreign keys that
reference the same table and also before I added support for ALTER TABLE ADD
CONSTRAINT for 'forward referencing' foreign keys.

As a result, there are issues with circular references to rows, both for
FK's contained within the same table and those that form a chain in a ring
topology.

It think we can rework the design by keeping a stack of the FK constraints
visited in the course of recursion, together with the row that was being
examined for each constraint. When it is detected that the same constraint
as visited before is being dealt with, and the row is the same, the
recursion can stop. This will take up more memory than the present design
but as it does not keep references to the leaves in the 'tree' of visited
rows it should be OK in most circumstances. The most likely exception,
however, is the self referencing table.

Another issue that Sebastian had raised, is the fact that row updates are
handled as a delete followed by an insert. I propose to implement an
updateRow() method that can be called from checkCascadeUpdate() instead of
first deleting a set of rows and then inserting modified tuples as new rows.
This is essential if we want to avoid massive memory overheads, especially
with CACHED tables.

The next issue is the SET DEFAULT problem. In general, SET DEFAULT is not
used as often as the rest of the rules and we can probably do without it
altogether.

When SET DEFAULT is specified, the database design relies on the perpetual
existence of a parent row containing the default value.

Apart from what Sebastian mentioned, a condition may arise while performing
deletes in which a parent row exists but there are no child rows in the
referencing table. In my opinion, this row should not be deleted as doing so
would render the SET DEFAULT condition useless in the future.

The last point is regarding SET REFERENTIAL_INTEGRITY FALSE. Our design
should be such that no problems would arise if the user has inserted rows
containing invalid values. The engine should handle the compliant rows
correctly and leave the rest alone.

BTW, should we allow a database design that _requires_ disabling referential
integrity checks at some point? Sebastian's example can be populated by
inserting (1,null) first and then updating the row to (1,1) without
disabling integrity violation checks. Only ring topologies with a reference
chain such as one below need it to be disabled:

t1(x) -> t2(y) -> t3(z) -> t1(x)

Fred Toussi

------------------


fredt wrote:

I will apply these patches to the CVS soon.

I think we can avoid infinite recursion in cascading deletes fairly easily.
Regarding other actions I must see what can be done. In general, if we can
use the graph of references as opposed to the actual rows, the solution
would be more universal and have no need for marking individual rows.

------------------

Sebastian wrote:

  Hi ....

  Did some more work on the
  'ON [DELETE|UPDATE} SET [NULL|DEFAULT]' stuff.

  1)  processCreateTable and  processCreateFk now understand
      the SET [DEFAULT|NULL] option.

  2)  procressCreateTable throws
      a COLUMN_TYPE_MISMATCH exception if someone tries
      to specify an 'SET DEFAULT' for a column without an
      explicit default value

  3)  checkCascadeUpdate has been extendet to set the
      FK values to NULL or the default value if the
      contraint says so.

  4)  checkCascadeDelete merely switches over to
      checkCascadeUpdate when the constraint is a
      'SET [NULL|DEFAULT]' constraint.


  There are however a bunch of issues with the whole
  ON [UPDATE|DELETE] scheme which I believe are not easily
  solved without mayor redesigns. Most of the stuff is
  related to self referential foreign keys and the fact
  that updates are actually sequences of delete/insert
  actions.

  1) Deleting/Updating self referential records results
     in infinite recursion. i.e:

     If I have a table like

     create table a(a int primary key,
                    b int,
                    foreign key(b)
                     references a(a)
                      on update cascade
                      on delete cascade)

    And If I inserted a tuple like (1,1)
    into the table (by turning of referentiual integrity checks)
    i end up in endless recursion when trying to delete/modify
    the record. This was already true befor I've added my stuff.

    I actually use records like these in my application, where I
    build up a tree structure with self referential foreign keys
    The root node in the tree is defined as a record refering to
    itself.

  2) If I have a table like

      create table a(a int primary key,
                    b int default 99,
                    foreign key(b)
                     references a(a)
                      on update set default
                      on delete set default)

     and the tuples are (once again build up by turning of
     referential integrity for a while)

     (1,1)
     (2,1)
     (3,1)

     I may try to do an update like:

     UPDATE A set A=33 where A=3

     This is forbidden because an 'SET DEFAULT' to the value 99
     is not permitted since there is not value a=99 allowing
     the update of b to the default value 99.

     If however the tuples are.

     (1,1)
     (2,1)
     (3,1)
     (99,1)

     It would be fine to do the

     UPDATE A set A=33 where A=3

     The result would be

     (1,1)
     (2,1)
     (3,99)
     (99,1)

     BUT

     Now its perefrctly fine to do an

     DELETE FROM A where A=99

     ceckUpdateDelete checks if there are
     records refering to the (99,1) tuple and
     finds (3,99). The contraint dictates that
     b whould be set to 99. It therefor checks
     if this is permitted and finds the tuple (99,1).

    3) The only way of solving most of these issues
       within the current design would be some
       'will be deleted' flag we might add to the tuples
       while checking referebtial integrity.


   Cheers

   Sebastian


--
**********************************
Dr. Sebastian Kloska
Head of Bioinformatics
Scienion AG
Volmerstr. 7a
12489 Berlin
phone: +49-(30)-6392-1708
fax:   +49-(30)-6392-1701
http://www.scienion.de
**********************************



-------------------------------------------------------
This sf.net email is sponsored by: 
To learn the basics of securing your web site with SSL, 
click here to get a FREE TRIAL of a Thawte Server Certificate: 
http://www.gothawte.com/rd522.html
_______________________________________________
hsqldb-developers mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/hsqldb-developers

Reply via email to