Yes, any doc() call will use space in the expanded-tree cache. So you might end 
up with X in the cache, plus Y for the deserialized map.

I would also worry about how long it might take to deserialize a 400-MB map, 
even if the XML is already in cache. My guess is around 30-sec to construct the 
map. If the cache is cold that might double because the fragment has to be read 
from disk and decoded. But those are just guesses.
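
Rather than rely on my guesses, you could measure it. Here is a minimal
sketch that follows the same pattern as my earlier timing test, using the
placeholder URI from your example. Run it once right after a restart for a
cold-cache number, then again for the warm one.

```xquery
xquery version "1.0-ml";
(: Sketch only: '/path/to/map.xml' is the placeholder URI from the
   example earlier in this thread, not a real document. :)
let $m := map:map(doc('/path/to/map.xml')/map:map)
return (
  (: map:count forces the map to be built before the time is reported :)
  map:count($m),
  xdmp:elapsed-time()
)
```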

There are a couple of approaches that might avoid that cost. One is to break up 
the map into multiple small documents. You could query a special directory or 
collection for documents that have the key(s) you need, and let the 
expanded-tree cache handle the memory management. Each map would be relatively 
small, so deserialization wouldn't be as expensive.
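
A rough sketch of that lookup follows. It assumes the parts are serialized
map:map documents in a collection I am calling 'my-map-parts' (a made-up
name), and that the serialized form stores each key in a key attribute on
map:entry. Check a document saved from your own map before relying on that.
local:lookup is a hypothetical helper, not a built-in.

```xquery
xquery version "1.0-ml";

(: Sketch only. The collection name is hypothetical. :)
declare variable $COLLECTION := 'my-map-parts';

(: Return the value(s) stored under $key, or () if no part has it. :)
declare function local:lookup($key as xs:string) as item()*
{
  (: Find the first small map document that has an entry for this key.
     Assumes entries serialize as <map:entry key="...">. :)
  let $part := cts:search(
    collection($COLLECTION)/map:map,
    cts:element-attribute-value-query(
      xs:QName('map:entry'), xs:QName('key'), $key, 'exact'))[1]
  return
    if (empty($part)) then ()
    (: Only this one small part is deserialized, not the whole map. :)
    else map:get(map:map($part), $key)
};

local:lookup('some-key')
```

Only the matching part goes through the expanded-tree cache and gets
deserialized, so the per-lookup cost depends on the part size and not on
the full 400-MB.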

Another approach is to keep the map in a server field. That would be both 
powerful and dangerous, because the memory for a server field is persistent. We 
are used to working with query allocations, which disappear when the query 
ends. So a single query is limited in its scope for damage. But a 400-MB server 
field allocates 400-MB per eval host, for the lifetime of the host process.

So you'd want to be very careful to ensure that each host has exactly one of 
these huge server fields. You'd also have to be very careful about updating the 
map, partly because of the size and also because server fields do not offer 
much in the way of memory protection. Depending on your needs you might be able 
to do some sort of A-B switching when you need to update the map, or develop a 
locking strategy, or both.
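
To make the A-B idea concrete, here is one possible sketch, not a tested
recipe. The field names 'my-map-a', 'my-map-b' and 'my-map-active' are made
up, and local:active-map and local:swap are hypothetical helpers. There is
no locking here, so two concurrent swaps could still race.

```xquery
xquery version "1.0-ml";

(: Sketch only. All server field names are hypothetical. :)

(: Readers call this to get whichever map is currently live. :)
declare function local:active-map() as map:map?
{
  let $slot := xdmp:get-server-field('my-map-active')
  return
    if (empty($slot)) then ()
    else xdmp:get-server-field($slot)
};

(: Put $new in the idle slot, flip the pointer, then clear the old slot. :)
declare function local:swap($new as map:map) as empty-sequence()
{
  let $old := xdmp:get-server-field('my-map-active')
  let $next := if ($old eq 'my-map-a') then 'my-map-b' else 'my-map-a'
  let $_ := xdmp:set-server-field($next, $new)
  let $_ := xdmp:set-server-field('my-map-active', $next)
  let $_ :=
    if (empty($old)) then ()
    (: Set the old slot to the empty sequence so the old map can go away. :)
    else xdmp:set-server-field($old, ())
  return ()
};

local:swap(map:map(doc('/path/to/map.xml')/map:map))
```

Note that both maps are in memory while the swap runs, so budget for
roughly twice the map size on each host. Since server fields are per host,
the swap also has to run on every eval host.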

-- Mike

On 6 May 2013, at 16:29 , Will Thompson <[email protected]> wrote:

> Mike - I should have been a little more specific about the use case. What
> if that map is serialized to the db; would calling doc() on that
> potentially overload the expanded tree cache?
> 
> let $m := map:map(doc('/path/to/map.xml')/map:map)
> return xdmp:set-server-field('my-map', $m)
> 
> Best guess on the QA server is that ML was installed when its VM was
> allocated fewer resources. But that's a good point about catching bad
> queries.
> 
> -Will
> 
> 
> On 5/6/13 4:05 PM, "Michael Blakeley" <[email protected]> wrote:
> 
>> No, maps don't use expanded tree cache space. A really large map might
>> hit some per-eval limits, but I didn't find them when I created a map of
>> around 800-MiB on my laptop, with 6.0-3. I used an xdmp:quote to try to
>> make sure the map would really allocate more space for each entry. This
>> was fine at 80-MiB and took about 5-sec. For 800-MiB it took a little
>> longer, and the OS swapped some pages out. So I conclude that it was
>> working hard to allocate all the memory.
>> 
>> let $m := map:map()
>> let $n := doc()[1]
>> let $_ := (1 to 1000000) ! (
>> map:put($m, xdmp:integer-to-hex(xdmp:random()), xdmp:quote($n)))
>> return map:count($m) * string-length(xdmp:quote($n)) div (1024 * 1024)
>> , xdmp:elapsed-time()
>> =>
>> 802.04010009765625
>> PT1M6.429219S
>> 
>> On that QA system, you might have set the expanded tree cache size to a
>> smaller value on purpose. That can be a good way to catch
>> poorly-optimized queries.
>> 
>> -- Mike
>> 
>> On 6 May 2013, at 14:44 , Will Thompson <[email protected]> wrote:
>> 
>>> Here's another one related to the Expanded Tree Cache: Say I want to load
>>> a giant map: 400MB or more. Will this always be dependent on the size of
>>> the Expanded Tree Cache? Most of our dev machines have an Expanded Tree
>>> Cache big enough to handle a map like this, but some don't, and for some
>>> reason our QA server is set to an inexplicably small value. Is it
>>> advisable to just manually increase that value so everything fits? Are
>>> there any other general rules when adjusting server spec values? I have
>>> mostly heard "look don't touch" with regard to these settings.
>>> 
>>> -Will
>>> 
>>> _______________________________________________
>>> General mailing list
>>> [email protected]
>>> http://developer.marklogic.com/mailman/listinfo/general
>>> 
