Dominic,

There are two key concepts here. First, every document your query accesses may need to be fetched from disk. Disks are only so fast, so finishing 20,000 I/O operations may take some time. Second, every document accessed by your query must fit into the expanded-tree cache.

If you simply want to list a large number of URIs, you can avoid document I/O entirely by using cts:uris(), which reads from the URI lexicon rather than fetching the documents themselves (the lexicon must be enabled on the database).

http://developer.marklogic.com/pubs/3.2/apidocs/SearchBuiltins.html#uris
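For example, a minimal sketch (the "/docs/" directory is just an illustration, not something from your application):

    (: List all document URIs under /docs/ without fetching the documents.
       Requires the URI lexicon to be enabled on the database. :)
    cts:uris((), (), cts:directory-query("/docs/", "infinity"))

Because no documents are loaded, nothing here touches the expanded-tree cache.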

But it sounds like you might want to do something with these URIs. Now, you can always increase the size of the expanded-tree cache, and you can always increase the request time limit. But there will always be limits to what you can do in a single query (RAM, mostly: the expanded tree cache is just a pool of RAM). So you might be interested in XQSync and Corb.

http://developer.marklogic.com/howto/tutorials/2006-08-xqsync.xqy

http://developer.marklogic.com/svn/corb/trunk/README.html
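If you do end up processing documents from within XQuery, one common pattern is to page through the URI lexicon in batches, resuming from the last URI you saw. A rough sketch (treat the "limit=N" option and the batch size as assumptions to verify against the cts:uris docs for your release):

    (: Fetch the next batch of up to 1000 URIs, starting after the
       last URI processed in the previous batch ("" for the first run). :)
    declare variable $last as xs:string external;
    cts:uris($last, ("limit=1000"))

Each batch runs as its own request, so no single query has to hold all 20,000 documents in the expanded-tree cache at once. Corb automates exactly this kind of URI-driven batch processing for you.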

-- Mike

Dominic Beesley wrote:
Hello,

This is my first post to the list so please be gentle!

I've built an ML application that works well up until a point (about 20,000
smallish documents + 100 or so bigger ones) and then I've started hitting a
lot of long running queries and the dreaded Expanded Tree Cache full
message.

_______________________________________________
General mailing list
[email protected]
http://xqzone.com/mailman/listinfo/general
