Hi again,
To me, this is the same as locking the file, except that you may be
letting someone waste time editing a doc, only to lose their changes
if their copy is not up to date. As you say, it is rare, but just wait
till you hear from someone who spent 10 minutes editing a file only
to see all that work lost.
best,
-Rob
On Oct 15, 2008, at 9:41 AM, Eric Palmitesta wrote:
Good morning all! Sorry to cause such a stir. Upon reading your
responses, I feel you've gotten the wrong idea, which is probably
due to communication failure on my part.
My idea of sequential ids is one 'special' document, for example
/id.xml, which contains nothing but <id>42</id>, and an id() function
which exclusive-locks the file, yanks 42 out, increments it,
replaces the text node with 43, and unlocks the file. My
environment is read-heavy, write-light, so although write operations
which require a unique id would touch this file, I don't think it
would be an awful bottleneck. This guarantees unique ids without
ever having to worry about collisions.
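For illustration only, here is a minimal in-memory sketch of that
lock / read / increment / write / unlock cycle, in Python rather than
XQuery. A threading.Lock stands in for the exclusive document lock and
an integer stands in for the text node of /id.xml. The class name
SequentialIdAllocator and its next_id() method are hypothetical, and
nothing here uses a MarkLogic API.

```python
import threading


class SequentialIdAllocator:
    """In-memory stand-in for a single counter document such as /id.xml."""

    def __init__(self, start=42):
        self._value = start            # plays the role of <id>42</id>
        self._lock = threading.Lock()  # plays the role of the exclusive lock

    def next_id(self):
        # Every caller serializes on this one lock. That is both the
        # uniqueness guarantee and the potential bottleneck.
        with self._lock:
            current = self._value       # read the stored value
            self._value = current + 1   # write back the incremented value
            return current              # hand out the value that was read


if __name__ == "__main__":
    allocator = SequentialIdAllocator()
    print(allocator.next_id(), allocator.next_id())
```

Whether id() should return the value it read or the incremented value
is a design choice the description above leaves open; this sketch
returns the value it read.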
Of course, the counter-argument is that since it's a write-light
environment, the chances of using random() and lightning striking
twice, as Michael put it, are infinitesimally small. I don't truly
have a problem with using random ids, I'm just saying it's worth
noting that it is *impossible* for lightning to strike twice with
sequential ids.
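To put a rough number on "infinitesimally small", the standard
birthday approximation can be sketched in Python. It assumes ids are
drawn uniformly at random from 2**id_bits possible values; the
function name collision_probability is hypothetical, and id_bits is an
assumption you would set to whatever width your random() actually
returns.

```python
import math


def collision_probability(num_ids, id_bits):
    """Approximate chance that num_ids uniformly random ids collide."""
    space = 2 ** id_bits
    # Birthday approximation: p ~= 1 - exp(-n(n-1) / (2N)).
    exponent = -num_ids * (num_ids - 1) / (2 * space)
    # -expm1(x) equals 1 - exp(x) but stays accurate for tiny x.
    return -math.expm1(exponent)


if __name__ == "__main__":
    print(collision_probability(10**6, 64))
```

The probability grows roughly with the square of the number of ids, so
it stays tiny until the id count approaches the square root of the id
space.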
Eric
Wayne Feick wrote:
Hi Eric,
A disadvantage of sequential ids is that you can end up read
locking all of your documents in order to find the current max id.
You can address this partially by moving the next id into a
separate document, but that document can still become a bottleneck
if you have a high insertion rate. You could also address this by
creating a range index on the id and using cts:element-values() or
cts:element-attribute-values() to find the max.
By switching to random ids, you get better parallelism since our
indexes can quickly determine if the id is already in use and will
lock at most one document (or 0 if your existing id search is
unfiltered). There is still a vanishingly small probability that
two competing threads would allocate the same random id at the same
moment in time, but that is improbable enough to be ignored.
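As a rough illustration of the "pick a random id, check whether it is
in use, retry on a hit" pattern, here is a minimal in-memory sketch in
Python. A set stands in for the id index, and the class name
RandomIdAllocator, its next_id() method, and the 64-bit default are
all hypothetical; this is not how the server implements the check.
Note that the check-and-insert here is made atomic with a lock, so the
sketch does not reproduce the small race window described above.

```python
import secrets
import threading


class RandomIdAllocator:
    """In-memory stand-in for 'draw a random id, check the index, retry'."""

    def __init__(self, bits=64):
        self._bits = bits
        self._used = set()             # plays the role of the id index
        self._lock = threading.Lock()  # guards only the check-and-insert

    def next_id(self):
        while True:
            # Draw outside the lock; only the membership test and the
            # insert need to be serialized.
            candidate = secrets.randbits(self._bits)
            with self._lock:
                if candidate not in self._used:
                    self._used.add(candidate)
                    return candidate
            # The id was already taken: loop and draw again.


if __name__ == "__main__":
    allocator = RandomIdAllocator()
    print(allocator.next_id())
```

Unlike the single counter, callers here only contend briefly on the
membership check, and no shared "next id" value has to be read and
rewritten on every insert.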
Wayne.
On Tue, 2008-10-14 at 13:07 -0400, Eric Palmitesta wrote:
Wow, thanks for the reply, Michael. I'll probably be using some
variation of one of your examples.
Michael Blakeley wrote:
> Many people ask about sequential ids. It is possible to model an id
> sequence as a database document. But as with RDBMS sequences, there are
> serialization penalties. I don't see the advantage of sequential ids, so
> I rarely, if ever, use this approach.
Assuming the recursive check isn't feasible (it doesn't scale
well), the advantage of sequential ids is being able to sleep at
night knowing collisions are simply impossible, and are not
reliant on a 'good-enough' random() function. I'm nit-picking of
course, I'm sure random() is fine. :)
Cheers,
Eric
_______________________________________________
General mailing list
General@developer.marklogic.com
http://xqzone.com/mailman/listinfo/general