We have what I consider to be an interesting issue with an XQuery that is run
as a REST endpoint: we have identified at least two race conditions. Typically
I would fix these by enforcing something like a Critical Section in the code
through appropriate locking.
Unfortunately after lots of head scratching and re-reading of documentation I
cannot at the moment see how to solve this with the facilities provided in
MarkLogic and am looking for some guidance. I guess this is a common issue that
others must have solved before, so I am most likely missing something obvious!
Our REST endpoint effectively does the following:
1) XQuery REST Endpoint - receives an XML document over HTTP POST. Let's call
this document B.
2) Searches the database for an existing document, which has an <id> element
with the same value as that in document B. Assuming we find a document, let us
call that document A.
3) Checks the version of document B against document A. The version is
indicated in a <version> element in each document. The version of document B
should be newer than that of document A; if not, stop, otherwise continue.
4) Removes document A from the 'live' collection.
5) Inserts document B into the database and adds it to the 'live' collection.
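
To make the steps concrete, here is a minimal Python model of steps (2) to (5)
with no locking. It is only a sketch, not MarkLogic code: the Doc and Store
types, the handle_post function, the integer version and the choice to search
only the 'live' collection in step (2) are all assumptions made for
illustration.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    uri: str       # unique per stored document, as in our model
    id: str        # content of the <id> element
    version: int   # content of the <version> element (assumed numeric)


class Store:
    """Hypothetical in-memory stand-in for the database."""

    def __init__(self):
        self.docs = {}     # uri -> Doc
        self.live = set()  # URIs currently in the 'live' collection

    def find_live(self, doc_id):
        # Step 2: look for an existing live document with the same id.
        for uri in self.live:
            if self.docs[uri].id == doc_id:
                return self.docs[uri]
        return None


def handle_post(store, b):
    """Steps 2-5 for an incoming document B, with no locking at all."""
    a = store.find_live(b.id)                     # step 2
    if a is not None and b.version <= a.version:  # step 3
        return False                              # B is not newer: stop
    if a is not None:
        store.live.discard(a.uri)                 # step 4
    store.docs[b.uri] = b                         # step 5
    store.live.add(b.uri)
    return True
```

Run by one caller at a time this behaves as intended. Nothing in it prevents
two callers from both completing steps (2) and (3) before either reaches
step (4).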
Now this REST endpoint may be called by many clients in parallel, which means
that while document B is being added, the same query may be running in
parallel for documents C, D, E ... N. I think we are seeing three separate race
conditions:
i) Steps (4) and (5), where the same version of the document with the same id
can be inserted into the live collection more than once. Step (4) is meant to
ensure there is only one live version by removing the old document
(document A) from the live collection before the new document (document B) is
added to it.
ii) Steps (3) and (5), where multiple versions can be inserted into the live
collection.
iii) Steps (3) and (5), where sometimes an older version is inserted after a
newer version.
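
As an illustration of race (ii), the following Python sketch interleaves two
requests by hand to mimic one thread pre-empting another. It is a toy model
under assumed names (the live dictionary, find_live and the /docs/... URIs are
hypothetical), not a trace from our system.

```python
# URI -> document currently in the 'live' collection (toy model).
live = {"/docs/a": {"id": "42", "version": 1}}


def find_live(doc_id):
    # Step 2: URIs of live documents carrying this id.
    return [uri for uri, d in live.items() if d["id"] == doc_id]


# Request 1 (document B, version 2) and request 2 (document C, version 3)
# both run steps 2 and 3 before either reaches step 4.
seen_by_b = find_live("42")  # sees document A
seen_by_c = find_live("42")  # also sees document A

# Request 1 now runs steps 4 and 5.
for uri in seen_by_b:
    live.pop(uri, None)
live["/docs/b"] = {"id": "42", "version": 2}

# Request 2 runs steps 4 and 5 from its stale view: removing A again is a
# no-op, and it never learns that B is now live.
for uri in seen_by_c:
    live.pop(uri, None)
live["/docs/c"] = {"id": "42", "version": 3}

live_versions = sorted(d["version"] for d in live.values() if d["id"] == "42")
print(live_versions)  # [2, 3]: two live documents for one id
```

If request 2 had finished first and request 1 second, the older version 2
would have been inserted after the newer version 3, which is race (iii).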
I believe that due to the number of client requests, we are effectively seeing
threads pre-empt other threads within this query and because no explicit
locking has yet been added to the system, we have problems.
How can I make the steps (1) through (5) thread-safe?
I have tried adding xdmp:transaction-mode "update"; to my REST query, and using
an explicit xdmp:commit at the end. This has not helped at all, but I think
that is because we are never writing the same document: every document we write
in steps (4) and (5) will always have a different URI in the database. I think
that we really need to be able to lock based on an abstract URI (e.g. the
content of our id element) and not the document URI, as that varies over time
in our model.
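
To show the behaviour I am after, here is a rough Python sketch of a critical
section keyed on the id value rather than on a document URI. It is not
MarkLogic code, and all names in it (lock_for, handle_post, the live
dictionary) are made up for illustration. As a simplification, the live
collection is modelled as a dictionary keyed by id.

```python
import threading
from collections import defaultdict

_registry_guard = threading.Lock()
_locks_by_id = defaultdict(threading.Lock)  # one lock per <id> value


def lock_for(doc_id):
    # Guard the registry so two threads cannot create two different
    # locks for the same id.
    with _registry_guard:
        return _locks_by_id[doc_id]


live = {}  # id -> (uri, version) of the single live document


def handle_post(doc_id, uri, version):
    """Steps 2-5 run as one critical section per id."""
    with lock_for(doc_id):
        current = live.get(doc_id)                 # step 2
        if current is not None and version <= current[1]:
            return False                           # step 3: not newer, stop
        live[doc_id] = (uri, version)              # steps 4 and 5
        return True


# Fifty clients post versions 1..50 of the same id in parallel.
threads = [
    threading.Thread(target=handle_post, args=("42", f"/docs/{v}", v))
    for v in range(1, 51)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(live["42"])  # ('/docs/50', 50)
```

Because the check and the write for a given id happen under the same lock, the
highest version ends up live regardless of the order in which the threads run,
while requests for other ids are not blocked.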
I also looked at xdmp:lock-acquire, but it appears the locks are shared for a
single user, i.e. the documentation states: "When a user locks a URI, it is
locked to other users, but not to the user who locked it". The problem I have
here is that this is effectively a public, unauthenticated REST endpoint, so it
will always be the same user running the query as far as ML is concerned.
Does anyone have any suggestions of how we might achieve what we are looking
for?
Cheers, Adam.