Tim,

First, to your particular question: all updates in a single call to MarkLogic 
will be a single transaction (fully atomic) by default. So if you have a 
process (synchronous or asynchronous) that updates a document and also updates 
its status in another document, they will happen atomically (all or nothing).
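
To make "all or nothing" concrete, here is a minimal sketch of the idea in 
Python. It is not MarkLogic code: ToyStore and its transaction() method are 
hypothetical names for a toy in-memory store, used only to show a document 
update and its status update being committed together or not at all.

```python
from contextlib import contextmanager


class ToyStore:
    """Hypothetical in-memory stand-in for a database; not a MarkLogic API."""

    def __init__(self):
        self.docs = {}

    @contextmanager
    def transaction(self):
        # Stage writes in a scratch dict instead of touching self.docs directly.
        staged = {}
        yield staged
        # Reached only if the with-block raised nothing: apply all writes at once.
        self.docs.update(staged)


store = ToyStore()

# The document and its status record are written in one unit of work.
with store.transaction() as tx:
    tx["/docs/a.xml"] = "<doc>v2</doc>"
    tx["/status/a.xml"] = "<status>updated</status>"

# A failure anywhere in the unit of work discards both writes.
try:
    with store.transaction() as tx:
        tx["/docs/b.xml"] = "<doc>v1</doc>"
        tx["/status/b.xml"] = "<status>updated</status>"
        raise RuntimeError("simulated failure")
except RuntimeError:
    pass

print(sorted(store.docs))
```

After this runs, the store holds both "a" entries and neither "b" entry, which 
is the behaviour you can rely on for updates made in a single call.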

You may find that CPF (Content Processing Framework) already does what you 
need, however. With CPF you move a document through a set of states that 
represent a workflow, and each state transition triggers its action 
asynchronously. CPF uses triggers to ensure actions happen, restores its state 
automatically on system restart (since asynchronous tasks on the task server 
do not persist across a restart), and tracks state in a properties fragment 
associated with each document. It tracks any errors or problems in the 
properties fragment too.

As a gotcha, if you have high update volumes, be sure you do not cause lock 
contention or deadlock-induced retries by updating a single status record for 
many documents. If, OTOH, you have low transaction volumes, you may want to put 
the status right on the document after all: it's simpler, though it does incur 
slightly more write overhead.
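
As a rough illustration of the contention point, here is a small Python sketch. 
It is a model, not MarkLogic code: StatusStore, status_uri_shared and 
status_uri_per_doc are hypothetical names, and the per-record locks stand in 
for the write locks a database takes on each record it updates.

```python
import threading


class StatusStore:
    """Hypothetical status store with one lock per status record."""

    def __init__(self):
        self._guard = threading.Lock()  # protects creation of records and locks
        self._locks = {}
        self.records = {}

    def set_status(self, record_uri, key, value):
        with self._guard:
            lock = self._locks.setdefault(record_uri, threading.Lock())
            record = self.records.setdefault(record_uri, {})
        with lock:  # writers to the same record serialize here
            record[key] = value


def status_uri_shared(doc_uri):
    # One status record for every document: all writers queue on one lock.
    return "/status/all.xml"


def status_uri_per_doc(doc_uri):
    # One status record per document: writers for different documents do not collide.
    return "/status" + doc_uri


def run(uri_for, doc_uris):
    store = StatusStore()
    threads = [
        threading.Thread(target=store.set_status, args=(uri_for(d), d, "done"))
        for d in doc_uris
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return store


docs = [f"/docs/{i}.xml" for i in range(8)]
shared = run(status_uri_shared, docs)
per_doc = run(status_uri_per_doc, docs)
print(len(shared.records), len(per_doc.records))
```

In the shared layout all eight writers funnel through one record, while in the 
per-document layout each writer touches its own record, so the updates can 
proceed independently.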

Yours,
Damon

--
Damon Feldman
Sr. Principal Consultant, MarkLogic


From: [email protected] 
[mailto:[email protected]] On Behalf Of Tim
Sent: Saturday, February 23, 2013 3:16 PM
To: 'MarkLogic Developer Discussion'
Subject: [MarkLogic Dev General] Asyncronous Status Updates

Hi Folks,

I have a question about best practices for maintaining the state of a document. 
 In a SQL world, I track document statuses using a control table.  I find it 
useful to likewise track status separately from documents via a status record 
in MarkLogic so that for example, I don't need to update a document when 
performing quality control.  In addition, I can maintain a set of records to 
track the history of a document and refer to saved instances of the document at 
each touch point in a workflow where I really do want to retain a copy of the 
document whenever a change has taken place as referenced by the current state 
and document URI as well as other important information such as ownership, 
date/time stamp, etc.

However, there are some asynchronous back-end processing actions that can be 
taken on the document which can be spawned concurrently with updates made to 
the status table when each completes.  I want to make sure that I understand 
the concurrency issues related to updates to the status record.  I think I can 
assume that there really won't be any need for a locking mechanism, that is 
that each response will update the status table atomically.   I plan to have 
separate statuses for each of the asynchronous events as the completion of all 
such statuses will indicate that the record is ready for the next stage.

Thanks for any suggestions and insight into this!

Tim

