On Oct 15, 2008, at 10:09 AM, Eric Palmitesta wrote:
Rob,
I think so far we're talking about insertion, not editing.
OK, that wasn't what I was understanding. And sorry to keep coming
back, but I want to understand what I am missing.
Assuming you don't mean inserting in an existing document (which I
understand to be editing), and you are just inserting a new document,
how would you have an ID to compare against? And, why isn't a URI good
enough?
best,
-Rob
What you're referring to is a whole other can of worms. I've
implemented something like a lock-less editor before (java-based
website, nothing to do with xquery) which, upon saving an edited
document, would check to see if the timestamp on the document has
changed while your editing was taking place. If so, it would hold
onto the data and say "Hey, someone edited and saved the doc you're
editing and trying to save now. I've recovered your data though, we
can proceed from here". This was for a relatively low-traffic app,
though.
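
To make the save-time check above concrete, here is a minimal sketch in Python. It is not the Java code Eric describes and it uses no MarkLogic API. `DocStore`, `StoredDoc`, and `ConflictError` are hypothetical names, and an integer version counter stands in for the document timestamp.

```python
import threading
from dataclasses import dataclass


@dataclass
class StoredDoc:
    body: str
    version: int  # assumption: stands in for the document's last-modified timestamp


class ConflictError(Exception):
    """Raised when the document changed after the editor loaded it."""

    def __init__(self, recovered_body):
        super().__init__("someone else edited and saved this document")
        self.recovered_body = recovered_body  # the user's unsaved text is kept


class DocStore:
    """Hypothetical in-memory store; a real app would talk to the database."""

    def __init__(self):
        self._docs = {}
        self._lock = threading.Lock()  # makes compare-and-write atomic

    def load(self, uri):
        with self._lock:
            doc = self._docs.setdefault(uri, StoredDoc(body="", version=0))
            return doc.body, doc.version

    def save(self, uri, new_body, loaded_version):
        with self._lock:
            doc = self._docs.setdefault(uri, StoredDoc(body="", version=0))
            if doc.version != loaded_version:
                # Someone saved in the meantime: refuse to overwrite,
                # but hand the caller's text back so nothing is lost.
                raise ConflictError(new_body)
            doc.body = new_body
            doc.version += 1
            return doc.version
```

The lock is held only for the brief compare-and-write at save time, not for the whole editing session, which is what separates this from locking the file while someone edits.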
I think someone described something similar to this not too long ago
on this mailing list, although I can't find that email now.
Eric
Robert Koberg wrote:
Hi again,
To me, this is the same as locking the file, except that you are
possibly letting someone spend wasted time editing a doc only to
lose their changes if not up-to-date. As you say it is rare, but
just wait till you hear from someone who spends 10 minutes editing
a file only to see all the work lost.
best,
-Rob
On Oct 15, 2008, at 9:41 AM, Eric Palmitesta wrote:
Good morning all! Sorry to cause such a stir. Upon reading your
responses, I feel you've gotten the wrong idea, which is probably
due to communication failure on my part.
My idea of sequential ids is one 'special' document, for example
/id.xml, which contains nothing but <id>42</id>, and an id()
function which exclusive-locks the file, yanks 42 out, increments
it, replaces the text node with 43, and unlocks the file. My
environment is read-heavy, write-light, so although write
operations which require a unique id would touch this file, I
don't think it would be an awful bottleneck. This guarantees
unique ids without having to ever worry about collisions.
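
The counter idea above can be sketched as follows. This is an illustration in Python, not XQuery: the hypothetical `IdCounter` class stands in for the single counter document, and a `threading.Lock` plays the role of the exclusive document lock. It also assumes the value read (42) is the one handed out while 43 is stored, which the description above leaves open.

```python
import threading


class IdCounter:
    """In-memory stand-in for the one document holding <id>42</id>."""

    def __init__(self, start=42):
        self._value = start
        self._lock = threading.Lock()  # plays the role of the exclusive lock

    def next_id(self):
        # Every caller serializes here, which is the bottleneck
        # discussed in this thread.
        with self._lock:
            current = self._value        # "yank 42 out"
            self._value = current + 1    # "replace the text node with 43"
            return current
```

Because the read, increment, and write all happen while the lock is held, two callers can never receive the same id. The cost is that every id allocation waits on the same lock.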
Of course, the counter-argument is that since it's a write-light
environment, the chances of using random() and lightning striking
twice, as Michael put it, are infinitesimally small. I don't
truly have a problem with using random ids, I'm just saying it's
worth noting that it is *impossible* for lightning to strike twice
with sequential ids.
Eric
Wayne Feick wrote:
Hi Eric,
A disadvantage of sequential ids is that you can end up read
locking all of your documents in order to find the current max
id. You can address this partially by moving the next id into a
separate document, but that document can still become a
bottleneck if you have a high insertion rate. You could also
address this by creating a range index on the id and using
cts:element-values() or cts:element-attribute-values() to find
the max.
By switching to random ids, you get better parallelism since our
indexes can quickly determine if the id is already in use and
will lock at most one document (or 0 if your existing id search
is unfiltered). There is still a vanishingly small probability
that two competing threads would allocate the same random id at
the same moment in time, but that is improbable enough to be
ignored.
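
A rough sketch of the random-id approach, in Python rather than XQuery. The `in_use` set is an assumption standing in for the index lookup that tells you whether an id is already taken, and `allocate_random_id` is a hypothetical helper, not a MarkLogic function.

```python
import random


def allocate_random_id(in_use, rng=random, bits=64):
    """Draw random ids until one is found that is not already in use."""
    while True:
        candidate = rng.getrandbits(bits)
        if candidate not in in_use:   # stands in for the index lookup
            in_use.add(candidate)     # stands in for inserting the document
            return candidate
```

This single-threaded sketch does not model the race mentioned above, where two competing threads draw the same id at the same moment. With 64-bit ids the retry loop will almost never run more than once.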
Wayne.
On Tue, 2008-10-14 at 13:07 -0400, Eric Palmitesta wrote:
Wow, thanks for the reply, Michael. I'll probably be using some
variation of one of your examples.
Michael Blakeley wrote:
> Many people ask about sequential ids. It is possible to model an id
> sequence as a database document. But as with RDBMS sequences, there are
> serialization penalties. I don't see the advantage of sequential ids, so
> I rarely, if ever, use this approach.
Assuming the recursive check isn't feasible (it doesn't scale
well), the advantage of sequential ids is being able to sleep at
night knowing collisions are simply impossible, and are not
reliant on a 'good-enough' random() function. I'm nit-picking
of course, I'm sure random() is fine. :)
Cheers,
Eric
_______________________________________________
General mailing list
General@developer.marklogic.com
http://xqzone.com/mailman/listinfo/general