Yes, I would strongly suggest that you use cts:uris() rather than retrieving all of the nodes and calling xdmp:node-uri() on each. You will need to enable the "uri lexicon" option for your database before you can use the cts:uris() function. I almost always enable the URI lexicon on my databases; it comes in handy in many different ways.
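The lexicon can be switched on from the database configuration page in the Admin UI, or scripted with the Admin API. A rough sketch, assuming the database is named "Documents" (verify the function names against the Admin API docs for your release; note that changing the URI lexicon setting triggers reindexing):

    xquery version "1.0-ml";
    import module namespace admin = "http://marklogic.com/xdmp/admin"
      at "/MarkLogic/admin.xqy";

    let $config := admin:get-configuration()
    let $config := admin:database-set-uri-lexicon(
                     $config, xdmp:database("Documents"), fn:true())
    return admin:save-configuration($config)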
The cts:uris() query would be something like this:

    declare namespace atom='http://www.w3.org/2005/Atom';
    cts:uris('', (), cts:element-attribute-value-query(xs:QName('atom:category'), xs:QName('term'), 'extra'))

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Declan Newman
Sent: Friday, July 02, 2010 10:09 AM
To: General Mark Logic Developer Discussion
Subject: Re: [MarkLogic Dev General] Database sync

Thanks for coming back to me Mark. The batch script that I tested with, which timed out, is the following:

java -Dfile.encoding=UTF-8 -cp "xqsync.jar;xcc-3.2.11.jar;xstream-1.3.1.jar;xpp3-1.1.4c.jar" -Xmx512m -DCOPY_PERMISSIONS=false -DINPUT_CONNECTION_STRING="xcc://username:[email protected]:8000/Documents" -DINPUT_QUERY="xquery version '1.0-ml'; declare namespace meta='http://namespace.com/meta'; declare namespace atom='http://www.w3.org/2005/Atom'; for $var1 in (//metadata/atom:entry/atom:catego...@term='extra']) return xdmp:node-uri($var1)" -DOUTPUT_CONNECTION_STRING="xcc://username:[email protected]:8000/Documents" com.marklogic.ps.xqsync.XQSync

Could I use cts:uris() to do this? I'm not familiar with it.

Thanks again for your help,

Dec

On 02/07/2010 14:40, Mark Helmstetter wrote:
> Dec,
>
> XQSync should do the trick; I've xqsync'd many, many more documents than
> 400k. It sounds like you're using the INPUT_QUERY option? Are you using
> cts:uris() for that INPUT_QUERY? Can you share that query?
>
> --Mark
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Declan Newman
> Sent: Friday, July 02, 2010 6:42 AM
> To: [email protected]
> Subject: [MarkLogic Dev General] Database sync
>
> Hello all,
>
> I need to copy approx 400,000 documents across from one database
> (staging) to another (live) dependent on a query.
> I have looked into XQSync, which seems to meet most of the requirements,
> other than the fact that it times out given the volume (even when
> splitting the query into several, using ";;"). I have written a simple
> extension to XQSync which will do the job, but in the event of a failure,
> it will start back at the beginning. I would rather not start inserting
> documents into collections if I can avoid it.
>
> Has anyone done a similar thing, and found a nicer solution? Thanks for
> any help.
>
> Cheers,
>
> Dec
>
--
Declan Newman, Senior Software Engineer, Semantico,
Floor 1, 21-23 Dyke Road, Brighton BN1 3FE
<http://www.semantico.com/> <mailto:[email protected]>
<tel:+44-1273-358247> <fax:+44-1273-723232>
Check out all our latest news and thinking on the Discovery blog - http://blogs.semantico.com/discovery-blog/
Follow Semantico on Twitter - http://twitter.com/semantico
_______________________________________________
General mailing list
[email protected]
http://developer.marklogic.com/mailman/listinfo/general
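[Editor's note on the restart-from-the-beginning problem raised above: because cts:uris() takes a starting value and the lexicon APIs accept a "limit=N" option, an INPUT_QUERY can page through the matching URIs in batches, and a failed sync can be resumed from the last URI copied rather than from scratch. A sketch along those lines — the batch size and the external variable name are assumptions, not part of the original thread:]

    xquery version "1.0-ml";
    declare namespace atom = "http://www.w3.org/2005/Atom";

    (: last URI copied in the previous batch; empty string on the first run :)
    declare variable $LAST-URI as xs:string external := "";

    (: the start value is inclusive, so filter out $LAST-URI if it reappears :)
    cts:uris($LAST-URI, ("limit=1000"),
      cts:element-attribute-value-query(
        xs:QName("atom:category"), xs:QName("term"), "extra"))
      [. ne $LAST-URI]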
