Hello,
I am receiving an exception reading "Expanded tree cache full on host"
(XDMP-EXPNTREECACHEFULL). Thanks to previous posts and ML documentation, I
employed paging (75,000 / page) and temporarily tripled the expanded tree cache
size (from 2048 to 6144).
I find it odd that doubling the cache size allowed the script to get only a
little further, far less than proportionally. It makes me wonder whether the
cache is retaining results from earlier expressions in the same script, and
whether there is a way to flush the cache midstream.
I am using ML 3.2-5 and attempting to pull stats on files and users via an
on-demand script:
1. Create a list of URIs representing all files: <uri ext="{$ext}">{$uri}</uri>
2. For 9 different types of files we're reporting on (images, audio/video, XML,
etc.), iterate through the file URI list to create a sub-list. These lists are
created using the file extension portion of the URI.
3. After each sub-list is created, calculate the total file size using fn:sum()
and a metadata value from the file's properties. This is where I encountered my
first instance of the EXPNTREECACHEFULL exception. Paging kept cache usage
below the original setting.
4. Create a distinct list of file extensions via fn:distinct-values().
5. Lastly, create a distinct list of users that modified one or more files.
This is where I hit the second instance of the EXPNTREECACHEFULL exception. I
added paging, same as in step 3, and the exception was then thrown in the 3rd
set. I doubled the expanded tree cache size, and the exception was then thrown
in the 4th set. After tripling the cache size the script completed, but I had
only scripted 6 of the 14 pages. Given that I can only increase the cache size
so much, I'm curious what my alternatives are for these large jobs. I'll try a
few more tests after hitting the send button, namely a) run all 14 pages using
the tripled cache size, b) reverse the processing order or split the script
into two, and c) reduce the page size. The purpose of test b) is to identify
whether I'm just doing too much in one script or whether this part of the
script alone is asking too much.
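For reference, the page boundaries used in steps 3 and 5 can be derived rather
than hand-scripted per page. The following is only a sketch in plain XQuery: the
ten-item $uris-all and the page size of 4 are stand-ins for illustration (the
real script uses 75,000 per page, so 14 pages cover at most 1,050,000 URIs), and
$page-size, $page-count, $first and $last are names invented here, not part of
the original script.

```xquery
(: Stand-in data; in the real script $uris-all comes from step 1. :)
let $uris-all :=
  for $i in (1 to 10)
  return fn:concat("/files/doc-", fn:string($i), ".xml")
let $cnt-all := fn:count($uris-all)
(: Hypothetical page size for the example; the post uses 75,000. :)
let $page-size := 4
(: Round up so that a final, partial page is included. :)
let $page-count := xs:integer(fn:ceiling($cnt-all div $page-size))
return
  for $page in (1 to $page-count)
  let $first := ($page - 1) * $page-size + 1
  (: Clamp the last page to the end of the list, as fn:min() does in my snippet. :)
  let $last := fn:min(($cnt-all, $page * $page-size))
  return
    fn:concat("page ", fn:string($page), ": ",
              fn:string($first), " to ", fn:string($last))
```

With these stand-in values the pages come out as 1 to 4, 5 to 8 and 9 to 10.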
Below is one of the exceptions and associated snippet.
com.marklogic.xcc.exceptions.XQueryException: XDMP-EXPNTREECACHEFULL: for $uri
as item()* in $uris-all[$level-2 + 1 to min(($cnt-all, $level-3))] -- Expanded
tree cache full on host ...
let $users-3 :=
  if ($cnt-all > $level-2) then
    fn:distinct-values(
      for $uri in $uris-all[($level-2 + 1) to fn:min(($cnt-all, $level-3))]
      return xdmp:document-properties($uri)/prop:properties/meta:r_modifier)
  else ()
Note that fn:distinct-values() is used above in the belief that it would
lighten the load on a subsequent call to fn:distinct-values(($users-1,
$users-2, $users-3, ...)). This is probably unnecessary, as the combined size
of the $users-n lists will be significantly smaller than $uris-all.
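To illustrate the two-level de-duplication described above, here is a small
sketch in plain XQuery. It is not the real script: $modifiers is a made-up list
standing in for the meta:r_modifier values that would come from
xdmp:document-properties(), the page size of 4 is arbitrary, and
fn:subsequence() is used as a portable equivalent of the range predicate in my
snippet.

```xquery
(: Made-up modifier names, one per file, standing in for the property lookups. :)
let $modifiers :=
  ("ann", "bob", "ann", "cy", "bob", "bob", "dee", "ann", "cy", "ann")
let $cnt-all := fn:count($modifiers)
(: Hypothetical page size for the example. :)
let $page-size := 4
let $page-count := xs:integer(fn:ceiling($cnt-all div $page-size))
(: First level: distinct values within each page (the $users-n lists). :)
let $per-page :=
  for $page in (1 to $page-count)
  let $first := ($page - 1) * $page-size + 1
  let $last := fn:min(($cnt-all, $page * $page-size))
  return
    fn:distinct-values(
      fn:subsequence($modifiers, $first, $last - $first + 1))
(: Second level: distinct values across the concatenated per-page lists. :)
return (fn:count($per-page), fn:count(fn:distinct-values($per-page)))
```

With this stand-in data the per-page lists hold 8 values in total and the final
distinct list holds 4, so the second call only ever sees the small per-page
lists rather than one entry per file.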
Thank you in advance for your time and thoughts.
-Brent
_______________________________________________
General mailing list
[email protected]
http://xqzone.com/mailman/listinfo/general