Yeah... I've seen stored procs like this on more than one occasion -

DECLARE @newId INT
SELECT @newId = MAX(idCol) FROM IdTable
INSERT INTO IdTable (IdTable, ... ) VALUES ( @newId +1, ... )
RETURN @newId + 1 (or SELECT @newId + 1 as OUTPUT)

*untested of course, trying to remember off the top of my head...

There are a number of things that can go wrong. First off... the
database I worked with primarily (Sybase) does not lock other
transactions from reading the data, even during the update. Even if
you wrap it in a transaction, the data is still not locked until the
UPDATE, and even then, it may not be locked from reads at all. So, two
simultaneous calls to the stored proc would execute as follows -

proc1 -> SELECT @newId = MAX(idCol) FROM IdTable
proc2 -> SELECT @newId = MAX(idCol) FROM IdTable
proc1 -> INSERT INTO IdTable (IdTable, ... ) VALUES ( @newId +1, ... )
proc2 -> INSERT INTO IdTable (IdTable, ... ) VALUES ( @newId +1, ... )

as you can see, they'll both return the same number and set the column
to the same number, which is generally not what you want to happen.
Another approach that still leads to concurrency problems is using a
separate table for generating IDs, much like the
OPENJPA_SEQUENCE_TABLE. The problem isn't in the design, but given a
table, a stored proc (or whatever you use) that does the following
will also break -

DECLARE @newId INT
SELECT @newId = MAX(idCol) FROM IdTable WHERE idColName = @procParam
UPDATE IdTable SET idCol = @newId +1 WHERE idColName = @procParam
RETURN @newId + 1 (or SELECT @newId + 1 as OUTPUT)

Given the same logic as above, duplicate ids can be returned. I don't
know for sure about other databases, but I do know that Sybase doesn't
lock the IdTable in this example from other
threads/connections/whatever from reading the data during a
transaction. Being faced with this problem, I grabbed a Sybase
employee at a Sybase training and nailed her down on this... We came
up with the following solution for this -

DECLARE @newId INT
UPDATE IdTable SET idCol = idCol +1, @newId = idCol WHERE idColName = @procParam
RETURN @newId + 1 (or SELECT @newId + 1 as OUTPUT)

Succinct, transaction/concurrency safe, etc. This doesn't solve the
problem of using MAX() though. In that case, I would suggest that
using an IDENTITY field (or whatever your database's analog is). There
are other problems you will deal with, but they tend to not be as big
as generating dupe IDs during high load.

-Wes

On Tue, May 19, 2009 at 8:04 AM, Daryl Stultz <[email protected]> wrote:
> On Sun, May 17, 2009 at 1:30 PM, Wes Wannemacher <[email protected]> wrote:
>
>> That is generally a bad idea. Select max(row) will generally initiate
>> a full index scan or, even worse, a full table scan.
>>
>
> Isn't it a greater problem that this approach is a race condition? 2 threads
> call max(id) about the same time and then try to insert max + 1 one just
> after the other.
>
> --
> Daryl Stultz
> _____________________________________
> 6 Degrees Software and Consulting, Inc.
> http://www.6degrees.com
> mailto:[email protected]
>



-- 
Wes Wannemacher
Author - Struts 2 In Practice
Includes coverage of Struts 2.1, Spring, JPA, JQuery, Sitemesh and more
http://www.manning.com/wannemacher

Reply via email to