The problem with assigning consecutive IDs is that it limits scalability
by serializing all inserts on some document that contains the next/last ID.
There is a common design pattern for this sort of operation that begins
with a unique ID for each inserted document, generated in one of the
following ways.
1. Use random IDs generated by xdmp:random().
2. Use a hash of something within the document: xdmp:hash64($s as xs:string)
The ID is then used as part of the document's URI.
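As a minimal sketch of the second option, the following builds a URI from a hash of a string taken from the document. The key value "RecA1" and the "/document-" URI prefix are only illustrative assumptions. Note that a hash always yields the same ID for the same input, so unlike a random ID it cannot simply be re-drawn if the URI is already taken.

```xquery
xquery version "1.0-ml";

(: Hypothetical input: some string from the document that identifies it. :)
let $key := "RecA1"
(: xdmp:hash64 returns an xs:unsignedLong derived from the string. :)
let $id := xdmp:hash64($key)
(: Use the ID as part of the document's URI. :)
return fn:concat("/document-", $id, ".xml")
```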
You can test for the existence of a document before creating it, which
acquires a read lock on the URI. That lock will be upgraded to a write
lock when you insert the document, which serializes the transaction
with any other insert that tries to create a document with the same ID.
If the document already exists when you check for its existence, you
need to choose a different ID. A recursive ID selection function is
useful for this.
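declare function local:choose-uri() as xs:string
{
  (: Build a candidate URI from a random ID. :)
  let $uri := fn:concat("/document-", xdmp:random(), ".xml")
  (: Checking fn:doc() takes the read lock; retry if the URI is taken. :)
  return if (fn:exists(fn:doc($uri))) then local:choose-uri() else $uri
};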
It's possible for two transactions to attempt simultaneously to insert
documents with the same ID. They'll both take the read lock, and then
both attempt to upgrade to a write lock. One will succeed; the other
will be restarted and (in the case of random IDs) choose a different ID.
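To illustrate the whole pattern in one place, here is a rough sketch that picks a URI and inserts a document in the same transaction, so the read lock taken by the existence check is upgraded to a write lock by the insert. The function is declared in the local namespace so the module stands alone, and the <record> content is a made-up placeholder.

```xquery
xquery version "1.0-ml";

(: Recursive selection of an unused URI based on a random ID. :)
declare function local:choose-uri() as xs:string
{
  let $uri := fn:concat("/document-", xdmp:random(), ".xml")
  return if (fn:exists(fn:doc($uri))) then local:choose-uri() else $uri
};

(: Choose the URI and insert in one update transaction. :)
let $uri := local:choose-uri()
return xdmp:document-insert($uri, <record><name>RecA1</name></record>)
```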
If the document URI cannot be derived from the ID, but the ID is part of
the inserted document, you can use xdmp:lock-for-update() to lock a URI
derived from the ID even if you don't actually insert a document at that
URI. That operation, paired with a search for a document containing the
chosen ID, ensures no two documents will be inserted with the same ID.
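A sketch of that variant might look like the following. It assumes the ID is stored in an <id> element, that the lock URIs live under a made-up "/id-lock/" prefix, and that the document itself goes to an unrelated, purely illustrative URI. None of those names come from a real schema.

```xquery
xquery version "1.0-ml";

declare function local:choose-id() as xs:unsignedLong
{
  let $id := xdmp:random()
  return (
    (: Lock a URI derived from the ID; no document needs to exist there. :)
    xdmp:lock-for-update(fn:concat("/id-lock/", $id)),
    (: With the lock held, check whether any document already uses the ID. :)
    if (fn:exists(cts:search(fn:doc(),
          cts:element-value-query(xs:QName("id"), xs:string($id)))))
    then local:choose-id()
    else $id
  )
};

let $id := local:choose-id()
(: The URI here is not derived from the ID. :)
return xdmp:document-insert("/records/RecA1.xml",
  <record><id>{$id}</id><name>RecA1</name></record>)
```

The lock is held until the transaction finishes, so a concurrent transaction that picks the same ID waits on the lock and then finds the newly inserted document when it searches.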
Wayne.
On 03/11/2011 09:00 PM, Tim Meagher wrote:
Hi Folks,
I would like to auto-generate unique identifiers for XML documents,
but I need to prevent them from being assigned to multiple documents.
For example, when inputting information into a form, the unique ID
must be filled in prior to saving the form, but such that other forms
being created in sequence will obtain the next available unique ID.
I’ll probably create a list of unique IDs to work with (as they will
map to other unique record identifiers) from which I can set the next
available unique ID as record IDs are added, but again I would like to
ensure no conflicts. For example:
Record ID    Unique ID
=========    =========
RecA1        UID01
RecB2        UID02
RecA2        UID03
RecC1        UID04
and so on such that adding the next Record ID will automatically have
a UID of UID05 assigned to it. I recognize that it is simple enough
to do this unless an attempt to add 2 new records occurs simultaneously.
Thank you!
Tim Meagher
--
Wayne Feick
Principal Engineer
MarkLogic Corporation
Phone +1 650 655 2378
Cell +1 408 981 4576
www.marklogic.com