Re: using database generated PK

Andrus Adamchik Wed, 09 Dec 2015 22:45:02 -0800

Assigning a PK before commit is a good feature on its own. Still not sure it 
will solve the cyclic graphs issue. Consider the following cases:


1. Cycles between entities, but not between the objects. Can be handled by 
making Ashwood smarter.

2. Optional FK, which can be handled via INSERT then UPDATE:

* insert department with null manager
* insert a person as a department member
* then update department to set the person as a manager

3. Same as #2, but department.manager_id is mandatory. There's no valid 
operation ordering.

So #1 and #2 do not require us to know the PK upfront. And #3 can't be solved 
even if we do.

Andrus

> On Dec 8, 2015, at 2:53 PM, Michael Gentry <mgen...@masslight.net> wrote:
> 
> Switched to dev...
> 
> There are two basic types of database-generated keys:
> 
> 1) Auto-increment (such as MySQL and H2)
> 2) Sequences (such as PostgreSQL and Oracle)
> 
> #1 has to be done row-by-row per insert.
> #2 can handle "batches" (you define how many PKs to cache) and is similar
> in operation to AUTO_PK_SUPPORT.
> 
> In theory, #2 should be faster since the PKs/FKs can be assigned in advance
> before contacting the DB instead of having to wait on a reply back from the
> DB.  Remember, Cayenne can't insert a record with a FK to a PK until that
> PK is known, so it has to get the auto-increment PK value from the DB and
> fill in the FK before it can do another insert.
> 
> One limitation we currently have is handling cyclic graphs:
> 
> A:FK -> B:PK
> B:FK -> C:PK
> C:FK -> A:FK
> 
> With auto-increment, there isn't a good solution to this that I can think
> of, especially if DB constraints are in play (if an FK is required, you
> can't insert a NULL FK, then do an update in the same transaction after you
> know it).  With sequences (or AUTO_PK_SUPPORT), this would actually be
> possible, because we generate the PKs prior to the inserts, therefore we
> can assign all the FKs prior to inserts, too.  However the last time I
> looked at the code, I don't believe it is smart enough to have two
> different paths of execution, so we still can't handle cyclic graphs (plus
> Ashwood fails).
> 
> Any thoughts on the value of investigating changing this?
> 
> Thanks,
> 
> mrg
> 
> 
> On Wed, Dec 2, 2015 at 10:08 PM, John Huss <johnth...@gmail.com> wrote:
> 
>> The AUTO_PK_SUPPORT table exists only to support PK generation for DBs that
>> do not have native support for it built in.  However, in this day and age
>> any database worth using has this built in.  I've run production apps with
>> Postgres, MySQL, FrontBase, and Interbase / Firebird, and all of them have
>> native PK generation.  Heck, even Derby, H2, and HSQLDB support it.
>> 
>> The main advantage of a native solution is that you know it works.  If you
>> ever used native PK generation the AUTO_PK_SUPPORT table feels very
>> kludgy.  It works (I guess?), but it's not ideal.  A native solution is
>> faster and simpler.  It handles transactions better.  Most DBs (all?) do PK
>> generation outside of transactions, so incrementing the sequence always
>> persists regardless of transaction rollbacks or commits.
>> 
>> Performance is better because a native DB implementation is able to
>> implement it with as little locking as necessary, which is certain to be
>> much faster that the equivalent set of SQL commands.
>> 
>> Another advantage is being able to use the same PK generators to insert
>> data directly with SQL, which is sometimes necessary.  Also, maintenance is
>> easier since there are simple built in functions to query or manipulate the
>> sequence value.
>> 
>> MySQL's implementation is different than the others I've used.  There's is
>> more magical.  The component parts aren't accessible to developers.  The
>> engine just magically chooses the next highest PK value when rows are
>> inserted without giving any idea where this number comes from or how it is
>> chosen.  This has some benefits since the PKs don't have to be explicitly
>> generated ahead of time.  I don't know if it's better, but it's different
>> anyhow.
>> 
>> On Wed, Dec 2, 2015 at 3:30 AM Aristedes Maniatis <amania...@apache.org>
>> wrote:
>> 
>>> I've always used the AUTO_PK_SUPPORT approach to PK generation, all the
>>> way back from WebObject EOF days many many years ago.
>>> 
>>> However a recently discovered bug in this approach [1] and how it
>>> intersects with transactions may require use to think about going over to
>>> database generated primary keys with a auto-increment column. For
>>> developers not using an ORM this is pretty common I understand.
>>> 
>>> What are the downsides to that approach? Is performance, reliability and
>>> everything else just the same? What problem was AUTO_PK_SUPPORT
>> originally
>>> designed to solve?
>>> 
>>> Cheers
>>> Ari
>>> 
>>> 
>>> [1] https://issues.apache.org/jira/browse/CAY-2040
>>> 
>>> 
>>> --
>>> -------------------------->
>>> Aristedes Maniatis
>>> GPG fingerprint CBFB 84B4 738D 4E87 5E5C  5EFA EF6A 7D2E 3E49 102A
>>> 
>>

Re: using database generated PK

Reply via email to