Ranjan, I believe code like this:
    let $csv-lines :=
      for $u in cts:uri-match(...)
      return myfunc:get-csv(doc($u))
    return string-join($csv-lines, "&#10;")  (: newline separator :)

will not hold all the documents in the expanded tree cache, because you are not gathering the documents themselves into one sequence; it only gathers the CSV text output. It will still hold the combined text strings in memory, though, perhaps twice over while it joins them into one string.

However, 20,000 documents is a lot, so consider using xqsync with the INPUT_MODULE_URI and INPUT_QUERY parameters. Xqsync will act as an external controller for the overall batch job. You could similarly write your own program in Python (or another language) that first fetches the URIs, then fetches each CSV line and appends it to a file.

Lastly, if you have adequate RAM, you can define range indexes on all the elements you want to export, build a SQL view, and use any relational tool (or Excel) to convert the query result to CSV.

Does this help in your case?

Yours,
Damon

--
Damon Feldman
Sr. Principal Consultant, MarkLogic

From: [email protected] [mailto:[email protected]] On Behalf Of ranjan sarma
Sent: Monday, February 04, 2013 4:42 AM
To: [email protected]
Subject: [MarkLogic Dev General] regarding error: XDMP-EXPNTREECACHEFULL

Hi,

I have a database with over 20,000 documents, each around 2-10 kilobytes. I retrieve the 20,000 documents with the following query:

    let $uri := cts:uri-match('products/documents/*.xml')
    let $doc := fn:doc($uri)

products/documents contains all the XML documents. We need to build a CSV file from this record set, with each XML document becoming one row in the CSV file. But since the combined result is too large, we are receiving the error 'XDMP-EXPNTREECACHEFULL' (I think the documents were being held in main memory, which the system did not allow). What other workarounds are there? Can we stream the result? If yes, please provide an example so that I can pick it up.

Otherwise, can we convert the documents to CSV part by part and write them to the HTTP output stream? Increasing the cache size from the Admin console is not a solution, because the document set may grow in the future.

Thanks,
ranjan
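
[Editor's note] Damon's "write your own program" suggestion can be sketched in Python. This is a minimal sketch under stated assumptions, not real MarkLogic client code: run_xquery stands in for however you execute XQuery against the server (for example via the REST API or XCC), and myfunc:get-csv is the hypothetical per-document CSV function from the snippet above. The point is the shape of the batching: no single server request expands more than batch_size documents, so the expanded tree cache stays small and the CSV accumulates on disk rather than in memory.

```python
def export_csv(run_xquery, out_path, batch_size=500):
    """Stream a CSV export in batches via an external controller.

    run_xquery: caller-supplied function that executes an XQuery string
    on the MarkLogic server and returns the result as a list of strings
    (how it does so -- REST, XCC, etc. -- is outside this sketch).
    """
    # 1. Fetch just the URIs: small strings, cheap to hold in memory.
    uris = run_xquery("cts:uri-match('products/documents/*.xml')")

    # 2. Convert documents in small batches and append rows to the file,
    #    so neither the server nor this program holds the whole result.
    with open(out_path, "w", encoding="utf-8") as out:
        for start in range(0, len(uris), batch_size):
            batch = uris[start:start + batch_size]
            # Naive quoting; real code must escape quotes in URIs.
            uri_seq = ", ".join('"%s"' % u for u in batch)
            xq = ("for $u in (%s) "
                  "return myfunc:get-csv(fn:doc($u))" % uri_seq)
            for line in run_xquery(xq):
                out.write(line + "\n")
```

The same division of labor is what xqsync's INPUT_MODULE_URI/INPUT_QUERY parameters give you for free: one cheap query to enumerate the work, then many small queries to do it.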
_______________________________________________ General mailing list [email protected] http://developer.marklogic.com/mailman/listinfo/general
