Thanks Dan, this is what I was looking for.  Presuming properties are managed 
internally much like a separate XML document, it makes sense to make use of 
them.  The only drawback is that if I delete the document I delete the 
properties.  But for this project there is no requirement to maintain the 
update history of deleted documents.
I didn't think of using a range index; that's a nice optimization.


-----Original Message-----
From: [email protected] 
[mailto:[email protected]] On Behalf Of Danny Sokolsky
Sent: Thursday, April 15, 2010 8:34 PM
To: General Mark Logic Developer Discussion
Subject: [MarkLogic Dev General] RE: Update Logs ... 

Hi David,

When you say you want to store the update history, do you mean you want to 
store just when the document was updated or do you mean you actually want to 
store the change deltas?

I think properties will work either way, but if you just need to store when the 
doc was updated for each update, that is probably pretty easy to do (and if you 
just needed to answer "all documents updated since XXX" then the last-modified 
system-updated property may even be enough).

If you store the updates in some date or dateTime format and then put a range 
index on that property (or put a range index on the prop:last-modified element 
if last-modified is good enough), then a simple range-query search against 
properties will give you what you want, I think.

Off the top of my head, you can put some xml like this in a property (using 
xdmp:document-add-properties):

<updated>2010-04-15-07:00</updated>

Have an xs:date range index on the "updated" element.
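
A minimal sketch of recording an update this way. The URI is hypothetical, and 
fn:current-date() stands in for whatever date the update should carry:

```xquery
xquery version "1.0-ml";

(: Hypothetical URI; substitute the document that was just updated. :)
let $uri := "/example/doc.xml"
return
  (: Appends a new <updated> property and keeps the existing ones,
     so repeated calls accumulate an update history. :)
  xdmp:document-add-properties(
    $uri,
    <updated>{fn:current-date()}</updated>)
```

If only the most recent update matters, xdmp:document-set-property replaces any 
existing property of the same name instead of appending another one.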

Then you can search it with something like the following, which returns the 
URIs of every document updated after April 15, 2009:

for $x in cts:search(xdmp:document-properties(),
 cts:element-range-query(
     xs:QName("updated"), ">", 
     xs:date("2009-04-15-07:00") ) ) 
return
xdmp:node-uri($x)

Now I have not tried this so I am not sure how well it would work, but it seems 
reasonable to me.
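
If the system-maintained timestamp is enough, the same search can be sketched 
against prop:last-modified instead. This assumes the database has "maintain 
last modified" enabled and a dateTime element range index on 
prop:last-modified; it is untested, like the query above:

```xquery
xquery version "1.0-ml";

(: Assumes a dateTime range index on prop:last-modified. The "prop"
   prefix is predeclared in the 1.0-ml dialect. :)
for $x in cts:search(xdmp:document-properties(),
  cts:element-range-query(
    xs:QName("prop:last-modified"), ">",
    xs:dateTime("2009-04-15T00:00:00-07:00")))
return
  xdmp:node-uri($x)
```

Note that last-modified holds only the latest update time, so this answers 
"updated since XXX" but keeps no history of earlier updates.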

-Danny

From: [email protected] 
[mailto:[email protected]] On Behalf Of Lee, David
Sent: Thursday, April 15, 2010 9:17 AM
To: General Mark Logic Developer Discussion
Subject: [MarkLogic Dev General] Update Logs ... 

For an upcoming project I need to maintain a log of all updates to documents.  
The MarkLogic DB is being used to 'mirror' a dataset with changes occurring 
daily.  There are about 20,000 documents, but only about 10 changes are 
expected daily.  For this we're not using the Library API (we may at a future 
point, but not now).
I was planning on using document properties to store update history.   I still 
think this is a good idea, but there's the need to produce a global report of 
"all documents updated since XXX".   Would this work well by simply querying 
the document properties ?
Another idea I had is keeping a directory and adding small 'log files' to it 
for every update that contain the URI and update date.
Much like I'd do in an RDB (add a new record for every activity).
I suspect this would be in a directory and could grow large over time (but not 
that large as the change rate is low).   However, to do this I need to create 
unique IDs.    Is this any better than using the document properties?
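
A rough sketch of the log-file idea, for comparison. The /update-log/ 
directory, the element names, and the document URI are all hypothetical, and 
xdmp:random() is used here as one possible source of IDs; collisions are very 
unlikely with a 64-bit value but are not ruled out:

```xquery
xquery version "1.0-ml";

(: Hypothetical URI of the document that was just updated. :)
let $doc-uri := "/example/doc.xml"
(: Build a log URI from a random 64-bit number. :)
let $log-uri := fn:concat("/update-log/", xdmp:random(), ".xml")
return
  (: One small log document per update, holding the URI and the time. :)
  xdmp:document-insert(
    $log-uri,
    <update>
      <uri>{$doc-uri}</uri>
      <updated>{fn:current-dateTime()}</updated>
    </update>)
```

Unlike properties, these log documents would survive deletion of the document 
they describe.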

Alternatively, I could simply append elements to a single audit XML document ... 
but I suspect that requires loading, inserting, then storing the document 
every time, and it would grow unbounded.

I'm sure this is a common problem,  any suggestions ?



----------------------------------------
David A. Lee
Senior Principal Software Engineer
Epocrates, Inc.
[email protected]
812-482-5224

_______________________________________________
General mailing list
[email protected]
http://xqzone.com/mailman/listinfo/general