Tim,

First, to your particular question: all updates in a single call to MarkLogic will be a single transaction (fully atomic) by default. So if you have a process (synchronous or asynchronous) that updates a document and also updates its status in another document, they will happen atomically (all or nothing).
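As a rough illustration of that point, here is a minimal sketch of one XQuery request that updates a document and a separate status document together. The URIs and element names (`/docs/example.xml`, `/status/example.xml`, `article`, `title`, `status`, `state`) are made up for the example, and both documents are assumed to exist already; only `xdmp:node-replace` and `fn:doc` are actual built-ins.

```xquery
xquery version "1.0-ml";

(: Hypothetical URIs and element names, for illustration only.
   Both documents are assumed to exist already. :)
declare variable $doc-uri    := "/docs/example.xml";
declare variable $status-uri := "/status/example.xml";

(: Both updates are issued in the same request, so by default they
   belong to one transaction and commit together when the request
   finishes. :)
(
  (: Update the content document. :)
  xdmp:node-replace(
    fn:doc($doc-uri)/article/title,
    <title>Revised title</title>),

  (: Update the matching status record in a separate document. :)
  xdmp:node-replace(
    fn:doc($status-uri)/status/state,
    <state>title-revised</state>)
)
```

If either call raises an error (for example, because one of the nodes is missing), the request fails and neither change is committed.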
You may find that CPF (Content Processing Framework) already does what you need, however. With CPF you move a document through a set of states that represent a workflow, and state transitions trigger asynchronously. CPF has triggers to ensure actions happen, restores its state automatically on system restart (since asynchronous tasks on the task server do not persist across a restart), and tracks state in a properties fragment associated with each document. It tracks any errors or problems in the properties fragment too.

As a gotcha, if you have high update volumes, be sure you do not cause lock contention or deadlock-induced retries by updating a single status record for many documents. If, OTOH, you have low transaction volumes, you may want to put the status right on the document after all, since it's simpler, though it does incur slightly more write overhead.

Yours,
Damon

--
Damon Feldman
Sr. Principal Consultant, MarkLogic

From: [email protected] [mailto:[email protected]] On Behalf Of Tim
Sent: Saturday, February 23, 2013 3:16 PM
To: 'MarkLogic Developer Discussion'
Subject: [MarkLogic Dev General] Asyncronous Status Updates

Hi Folks,

I have a question about best practices for maintaining the state of a document. In a SQL world, I track document statuses using a control table. I find it useful to likewise track status separately from documents via a status record in MarkLogic so that, for example, I don't need to update a document when performing quality control. In addition, I can maintain a set of records that track the history of a document and refer to saved instances of it at each touch point in a workflow, for cases where I really do want to retain a copy of the document whenever a change has taken place. Each record references the current state and document URI, as well as other important information such as ownership, date/time stamp, etc.

However, there are some asynchronous back-end processing actions that can be taken on the document. These can be spawned concurrently, with updates made to the status table as each completes. I want to make sure that I understand the concurrency issues related to updates to the status record. I think I can assume that there really won't be any need for a locking mechanism, that is, that each response will update the status table atomically. I plan to have separate statuses for each of the asynchronous events, as the completion of all such statuses will indicate that the record is ready for the next stage.

Thanks for any suggestions and insight into this!

Tim
_______________________________________________
General mailing list
[email protected]
http://developer.marklogic.com/mailman/listinfo/general
