Attached is the script I've used in the past.
YMMV.

Note that this was intended to put a directory tree from the filesystem into ML.
It would have to be modified to pull from one ML instance to another.

The MD5 and length properties are not built-in properties of ML.  This script
ensures they are written to every document, and writes them the first time if
they do not exist yet.  That may or may not be optimal for your case.  Note that
an MD5 and length are meaningless for an in-database document by itself, because
the serialized form is not stored.  They are meaningful if the document existed
beforehand in a serialized form (here, as a file on the filesystem).
Thus, to compare 2 documents directly you have to serialize them and then
compare their MD5s.
But once the values are stored with the documents as properties, they can be
queried without serializing or even fetching either document.  This script
handles updates, inserts, and deletes to make the target match the source.
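
To make the comparison idea concrete, here is a rough sketch in Python (not
the attached xmlsh script, and not how put_sync is necessarily written).  It
fingerprints every file under a directory with its MD5 and byte length, then
compares that map against a second map of the same shape.  The function names
file_fingerprints and plan_sync are made up for illustration, and the "target"
map is assumed to have been filled in by querying the stored properties on the
server, which is not shown here.

```python
import hashlib
import os


def file_fingerprints(root):
    """Map each file path (relative to root, '/'-separated) to (md5 hex, byte length)."""
    result = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as fh:
                data = fh.read()  # the serialized form is what gets hashed
            rel = os.path.relpath(path, root).replace(os.sep, "/")
            result[rel] = (hashlib.md5(data).hexdigest(), len(data))
    return result


def plan_sync(source, target):
    """Return (inserts, updates, deletes) needed to make target match source.

    Both arguments map a relative URI to an (md5, length) tuple.  A target
    entry of None stands for a document that has no stored MD5/length yet;
    it never equals a source tuple, so it is treated as an update.
    """
    inserts = sorted(uri for uri in source if uri not in target)
    updates = sorted(
        uri for uri in source if uri in target and source[uri] != target[uri]
    )
    deletes = sorted(uri for uri in target if uri not in source)
    return inserts, updates, deletes


# Hypothetical in-memory example; real values would come from the filesystem
# (source) and from properties stored with the documents (target).
source = {"a.xml": ("aaa", 10), "b.xml": ("bbb", 20)}
target = {"b.xml": ("old", 18), "c.xml": ("ccc", 5)}
print(plan_sync(source, target))
```

Only the (MD5, length) pairs are compared, so neither side has to be fetched or
re-serialized once the properties exist.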


It is called like this:

xmlsh put_sync directory uri

It assumes you have xmlsh and the MarkLogic extension module installed, and the
MLCONNECT variable set.

<disclaimer>
Personal code with no assumed warranty and not affiliated with my employer.
Use as an example only, blah blah blah.

-----------------------------------------------------------------------------
David Lee
Lead Engineer
MarkLogic Corporation
[email protected]
Phone: +1 650-287-2531
Cell:  +1 812-630-7622
www.marklogic.com

This e-mail and any accompanying attachments are confidential. The information 
is intended solely for the use of the individual to whom it is addressed. Any 
review, disclosure, copying, distribution, or use of this e-mail communication 
by others is strictly prohibited. If you are not the intended recipient, please 
notify us immediately by returning this message to the sender and delete all 
copies. Thank you for your cooperation.

-----Original Message-----
From: [email protected] 
[mailto:[email protected]] On Behalf Of Murray, Gregory
Sent: Friday, February 10, 2012 11:05 AM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] Syncing only documents that have changed

David,

Would you be willing to share your xmlsh script? If you can't, at least that 
points me in the right direction -- I hadn't thought to use xmlsh. I also 
didn't know there was an MD5 property.

Thanks!
Greg


On Feb 10, 2012, at 10:20 AM, David Lee wrote:

> If you start outside ML you can connect to 2 different servers via XCC easily.
> I've implemented a "sync" script in xmlsh which does datetime and checksum 
> comparisons to sync between filesystem and an ML server.  It could easily be 
> adapted to sync between 2 servers.
> 
> If you are inside ML then one idea would be to expose an HTTP service on the 
> other server to do what you ask.
> 
> IMHO timestamps are not quite good enough unless your system clocks are 
> synced.  The xmlsh sync script uses an MD5 property stored with the document.
> 
> -----Original Message-----
> From: [email protected] 
> [mailto:[email protected]] On Behalf Of Murray, Gregory
> Sent: Friday, February 10, 2012 9:56 AM
> To: General MarkLogic Developer Discussion
> Subject: [MarkLogic Dev General] Syncing only documents that have changed
> 
> I need to copy documents from one server (development) to another 
> (production) but copy only documents that have changed, that is, each 
> document on development that has a more recent last-modified property than 
> the corresponding document on production.
> 
> Does xqsync have an option for this? I'm not seeing one.
> 
> If not, can Information Studio do this?
> 
> If not, is it possible to run an XQuery query that connects to an XDBC server 
> on a different machine? If so, I could easily take the last-modified property 
> of the document in the database against which I run the query (development) 
> and compare it against the same property of the corresponding document on the 
> production machine. In the past I've used the <database> option of 
> xdmp:eval() to grab documents from a different database on the same machine, 
> but in this case I need to connect to a different machine altogether.
> 
> Many thanks,
> Greg
> 
> Gregory Murray
> Digital Library Application Developer
> Princeton Theological Seminary
> 
> _______________________________________________
> General mailing list
> [email protected]
> http://developer.marklogic.com/mailman/listinfo/general

Attachment: put_sync.xsh
Description: put_sync.xsh
