Ranjan,

I believe code like this:

let $csv-lines :=
  for $u in cts:uri-match(...)
  return myfunc:get-csv(doc($u))
return string-join($csv-lines, "&#10;")  (: new line :)

will not hold all the documents in the expanded tree cache, because you are not 
gathering them up into one sequence - it gathers only the .csv text output. It 
will still hold all the text strings in memory, though - perhaps twice over 
while it joins them into one string.

However, 20,000 docs is a lot, so consider using xqsync with the 
INPUT_MODULE_URI and INPUT_QUERY parameters. Xqsync will act as an external 
controller for the overall batch job. You could similarly write your own 
program in Python or another language that first gets the URIs, then gets each 
.csv line and appends it to a file.
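
A minimal sketch of such an external controller in Python follows. It is only 
an outline under stated assumptions: export_csv and fetch_csv_line are 
hypothetical names, and fetch_csv_line stands in for whatever request you use 
to have the server run myfunc:get-csv on a single document. The list of URIs 
is assumed to come from a first request that runs your cts:uri-match query.

```python
from typing import Callable, Iterable


def export_csv(
    uris: Iterable[str],
    fetch_csv_line: Callable[[str], str],
    out_path: str,
) -> int:
    """Write one CSV line per URI to out_path and return the line count.

    fetch_csv_line is a hypothetical callable supplied by the caller; it
    should return the CSV text for exactly one document. Each line is
    written as soon as it arrives, so the client never holds more than
    one row in memory.
    """
    count = 0
    # newline="" stops Python from translating the "\n" we write ourselves.
    with open(out_path, "w", encoding="utf-8", newline="") as out:
        for uri in uris:
            line = fetch_csv_line(uri)
            # Normalize the row so it ends with exactly one newline.
            out.write(line.rstrip("\r\n") + "\n")
            count += 1
    return count
```

Because each call fetches a single document, the server handles one small 
request at a time instead of one request that touches all 20,000 documents.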

Lastly, if you have adequate RAM you can define Range Indexes on all the 
elements you want to export and build a SQL view, and use any relational tool, 
or Excel, to convert the query result to .csv.

Does this help in your case?

Yours,
Damon
--
Damon Feldman
Sr. Principal Consultant, MarkLogic



From: [email protected] 
[mailto:[email protected]] On Behalf Of ranjan sarma
Sent: Monday, February 04, 2013 4:42 AM
To: [email protected]
Subject: [MarkLogic Dev General] regarding error: XDMP-EXPNTREECACHEFULL

Hi

I have a database which holds more than 20,000 documents, each of around 2-10 
kilobytes. I get the 20,000 documents with the following query:
    let $uri := cts:uri-match('products/documents/*.xml')
    let $doc := fn:doc ($uri)

products/documents contains all the XML documents. We need to build a CSV file 
from this record set, with each XML document being one row in the CSV file. 
But since the document set is too large, we are receiving the error message 
'XDMP-EXPNTREECACHEFULL' (I think the system tried to hold the documents in 
main memory, which it did not allow).

What other workarounds are there? Can we stream the result? If yes, please 
provide an example that I can work from.

Otherwise, can we convert the documents to CSV part by part and then write 
them to the HTTP output stream?

Increasing the cache size from the Admin Console is not a solution, because 
the document size may grow in the future.

thanks,
ranjan.
_______________________________________________
General mailing list
[email protected]
http://developer.marklogic.com/mailman/listinfo/general
