We have what I consider to be an interesting issue with an XQuery module that 
runs as a REST endpoint: we have identified at least two race conditions in it. 
Typically I would fix these by enforcing something like a critical section in 
the code through appropriate locking. 
Unfortunately, after a lot of head-scratching and re-reading of the 
documentation, I cannot at the moment see how to solve this with the facilities 
provided by MarkLogic, and I am looking for some guidance. I guess this is a 
common issue that others must have solved before, so I am most likely missing 
something obvious!

Our REST endpoint effectively does the following (a rough XQuery sketch follows 
the list):

1) The XQuery REST endpoint receives an XML document over HTTP POST. Let's call 
this document B.
2) It searches the database for an existing document whose <id> element has the 
same value as the one in document B. Assuming we find such a document, let us 
call it document A.
3) It checks the version of document B against document A; each document carries 
its version in a <version> element. The version of document B should be newer 
than that of document A; if not, stop, otherwise continue.
4) It removes document A from the 'live' collection.
5) It inserts document B into the database and adds it to the 'live' collection.
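
To make this concrete, here is a stripped-down sketch of the module. The URI 
scheme, the flat <id>/<version> paths and the exact search are simplified for 
illustration and are not our production code:

    xquery version "1.0-ml";

    (: step 1: document B arrives in the HTTP POST body :)
    let $doc-b := xdmp:get-request-body("xml")/element()
    let $id := fn:string($doc-b//id[1])
    let $new-version := xs:integer($doc-b//version[1])

    (: step 2: find the current live document A with the same <id> :)
    let $doc-a := cts:search(fn:collection("live"),
                    cts:element-value-query(xs:QName("id"), $id))[1]

    (: step 3: continue only if B is newer than A :)
    where fn:empty($doc-a) or xs:integer($doc-a//version[1]) lt $new-version
    return (
      (: step 4: take A out of the 'live' collection :)
      if (fn:exists($doc-a))
      then xdmp:document-remove-collections(xdmp:node-uri($doc-a), "live")
      else (),

      (: step 5: insert B under a new URI and add it to 'live' :)
      xdmp:document-insert(
        fn:concat("/docs/", $id, "/", $new-version, ".xml"),
        $doc-b,
        xdmp:default-permissions(),
        "live")
    )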

Now, this REST endpoint may be called by many clients in parallel, which means 
not just adding the new document B, but also running the above query 
concurrently for documents C, D, E ... N. I think we are seeing three separate 
race conditions:

i) Steps (4) and (5), where the same version of a document with the same id can 
end up inserted into the live collection more than once. Step (4) is meant to 
ensure there is only one live version by removing the old document (document A) 
from the live collection before the new document (document B) is added to it.

ii) Steps (3) and (5), where multiple versions of the same document can end up 
in the live collection.

iii) Steps (3) and (5), where an older version is sometimes inserted after a 
newer version.

I believe that, due to the number of client requests, threads are effectively 
pre-empting one another within this query, and because no explicit locking has 
yet been added to the system, we see these problems.

How can I make steps (1) through (5) thread-safe?

I have tried adding xdmp:transaction-mode "update"; to my REST query and using 
an explicit xdmp:commit at the end (sketched below). This has not helped at all, 
but I think that is because we never write to the same document twice: every 
document we write in steps (4) and (5) will always have a different URI in the 
database. I think what we really need is to be able to lock on an abstract URI 
(e.g. the content of our <id> element) rather than on the document URI, since 
the document URI varies over time in our model.
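
For reference, the shape of that attempt was roughly the following (a sketch 
only, with the body of steps (1) to (5) elided):

    xquery version "1.0-ml";

    (: run the whole request as a single update transaction :)
    declare option xdmp:transaction-mode "update";

    (
      (: ... steps (1) to (5) as in the sketch above ... :)

      (: explicit commit at the end of the request :)
      xdmp:commit()
    )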

I also looked at xdmp:lock-acquire, but it appears those locks are scoped to a 
user, i.e. the documentation states: "When a user locks a URI, it is locked to 
other users, but not to the user who locked it". The problem I have here is that 
this is effectively a public, unauthenticated REST endpoint, so as far as 
MarkLogic is concerned it will always be the same user running the query.
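
For illustration, the kind of call I was considering uses a synthetic lock URI 
built from the <id> value rather than a real document URI (the lock URI, owner 
string and timeout here are just example values):

    let $id := "ABC123"   (: the <id> value taken from document B; example only :)
    return
      xdmp:lock-acquire(
        fn:concat("/locks/", $id),   (: an abstract lock URI, not a stored document :)
        "exclusive",
        "0",
        "live-collection update",
        xs:unsignedLong(30))         (: lock timeout in seconds :)

But, as the quoted passage says, that lock would not keep out a second request 
running as the same unauthenticated user.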

Does anyone have any suggestions on how we might achieve what we are looking 
for?

Cheers, Adam.
