The problem with assigning consecutive IDs is that it limits scalability by serializing all inserts on some document that contains the next/last ID.

There is a common design pattern for this sort of operation that begins with a unique ID for each inserted document, generated in one of the following ways.

1. Use random IDs generated by xdmp:random().
2. Use a hash of something within the document: xdmp:hash64($s as xs:string)

The ID is then used as part of the document's URI.
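
As a rough sketch of the hash-based option, the following derives a URI from a string inside the document. The <record> and <name> elements are made up for illustration; only xdmp:hash64() and fn:concat() are real functions here. Note that identical input strings always produce the same ID, so this variant suits cases where the hashed value is itself meant to be unique.

   xquery version "1.0-ml";

   (: Hypothetical document; the element names are illustrative only. :)
   let $doc := <record><name>RecA1</name></record>
   (: Hash a string taken from the document to get a 64-bit ID. :)
   let $id := xdmp:hash64(fn:string($doc/name))
   (: Use the ID as part of the document's URI. :)
   return fn:concat("/document-", $id, ".xml")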

You can test for the existence of a document before creating it; the test acquires a read lock on the URI. When you insert the document, that lock is upgraded to a write lock, which serializes the transaction with any other insert that tries to create a document with the same ID.

If the document already exists when you check for its existence, you need to choose a different ID. A recursive ID selection function is useful for this.

   (: Recursively pick a random URI until one is found that is not already in use. :)
   declare function local:choose-uri() as xs:string
   {
      let $uri := fn:concat("/document-", xdmp:random(), ".xml")
      return if (fn:exists(fn:doc($uri))) then local:choose-uri() else $uri
   };

It's possible for two transactions to attempt to insert documents with the same ID at the same time. Both take the read lock, and then both attempt to upgrade to a write lock. One will succeed; the other will be restarted and (in the case of random IDs) choose a different ID.
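
Putting the pieces together, a minimal sketch of a complete main module might look like the following. The function is written out in full with a local: prefix so the module stands alone, and the <record> content is a placeholder. It assumes the existence check and the insert run in the same update transaction; a read-only query statement runs without locks, so the existence test would not protect anything there.

   xquery version "1.0-ml";

   declare function local:choose-uri() as xs:string
   {
      let $uri := fn:concat("/document-", xdmp:random(), ".xml")
      return if (fn:exists(fn:doc($uri))) then local:choose-uri() else $uri
   };

   (: The read lock taken by fn:doc() is upgraded to a write lock by the insert. :)
   let $uri := local:choose-uri()
   (: Placeholder content; substitute the real document. :)
   return (xdmp:document-insert($uri, <record>example</record>), $uri)

The module returns the chosen URI so the caller can report it back, for example to fill in the ID field of a form.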

If the ID does not determine the document URI but is part of the inserted document, you can use xdmp:lock-for-update() to lock a URI derived from the ID even if you don't actually insert a document at that URI. That operation, paired with a search for a document containing the chosen ID, ensures no two documents will be inserted with the same ID.
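
One way this could look is sketched below. The "/id-locks/" URI prefix and the <id> element name are assumptions made up for the example, not anything MarkLogic requires; no document is ever stored at the lock URI.

   xquery version "1.0-ml";

   declare function local:choose-id() as xs:unsignedLong
   {
      let $id := xdmp:random()
      return (
         (: Write-lock a URI derived from the ID; nothing is inserted there. :)
         xdmp:lock-for-update(fn:concat("/id-locks/", $id)),
         (: Search for any existing document that already carries this ID. :)
         if (fn:exists(cts:search(fn:doc(),
               cts:element-value-query(xs:QName("id"), xs:string($id)))))
         then local:choose-id()
         else $id
      )
   };

   local:choose-id()

Because every transaction that considers a given ID must first hold the write lock on the same derived URI, the search and the later insert of a document containing that ID cannot interleave with another transaction choosing the same value.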

Wayne.


On 03/11/2011 09:00 PM, Tim Meagher wrote:

Hi Folks,

I would like to auto-generate unique identifiers for XML documents, but I need to prevent them from being assigned to multiple documents. For example, when inputting information into a form, the unique ID must be filled in prior to saving the form, but such that other forms being created in sequence will obtain the next available unique ID.

I’ll probably create a list of unique IDs to work with (as they will map to other unique record identifiers) from which I can set the next available unique ID as record IDs are added, but again I would like to ensure no conflicts. For example:

Record ID    Unique ID
=========    =========
RecA1        UID01
RecB2        UID02
RecA2        UID03
RecC1        UID04

and so on such that adding the next Record ID will automatically have a UID of UID05 assigned to it. I recognize that it is simple enough to do this unless an attempt to add 2 new records occurs simultaneously.

Thank you!

Tim Meagher


--
Wayne Feick
Principal Engineer
MarkLogic Corporation
Phone +1 650 655 2378
Cell +1 408 981 4576
www.marklogic.com
