Josh,
That error message suggests that your query is trying to insert
items.xml after all (and probably not getting far enough to see if the
EXPNTREECACHEFULL will happen or not).
What does your full query look like now?
As Tim mentioned, you could also set a fragment root on "item". However,
there are some advantages to storing multiple documents when the content
is fully separable.
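If you do go the fragment-root route, it can be set through the Admin API as well as the Admin UI. A minimal sketch, assuming a database named "Documents" and an "item" element in no namespace (verify the calls against the Admin API docs before relying on this):

```xquery
import module namespace admin = "http://marklogic.com/xdmp/admin"
  at "/MarkLogic/admin.xqy";

let $config := admin:get-configuration()
let $db := admin:database-get-id($config, "Documents")
(: empty string here means "no namespace" for the item element :)
let $root := admin:database-fragment-root("", "item")
return admin:save-configuration(
  admin:database-add-fragment-root($config, $db, $root))
```

Keep in mind that changing fragmentation only affects documents loaded afterward; existing documents keep their old fragmentation until a reindex or reload.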
-- Mike
On 2009-09-24 18:54, Josh Daymont wrote:
Hi Mike,
Thanks! Your idea of not loading items.xml into the database seems to have sorted out
the cache issue. I also reset the ingestion and other buffers back to their
default sizes, as you suggested.
At this point I am getting this error:
XDMP-FRAGTOOLARGE: Fragment of items.xml too large for in-memory storage:
XDMP-INMMTREEFULL: In-memory tree storage full; list: table=0%, wordsused=0%,
wordsfree=100%, overhead=0%; tree: table=0%, wordsused=0%, wordsfree=100%,
overhead=0%
[1.0-ml]
I am not 100% sure what this means, though.
Josh
On Thu, Sep 24, 2009 at 6:18 PM, Michael Blakeley
<michael.blake...@marklogic.com> wrote:
Josh,
Your immediate problem is that you're changing the wrong settings. You have
changed the ingestion buffer size, not the cache size. That's a common mistake,
but it's important that you reset those in-memory list and tree sizes to their
original values. Ignoring this advice may cause the OS to page heavily, which
is extremely bad for performance.
Done that? Good. Now let's see if we can avoid doing any server tuning at all.
I'm always happier when I can get the result I want by tuning the workload,
instead of the server.
First, consider using RecordLoader for this job. With ID_NAME=#AUTO it will do
something pretty close to what you're trying to do. With some added XQuery code
you can make it do exactly what you want (see the example in the readme, under
CONTENT_FACTORY_CLASSNAME). But the built-in behavior might be good enough.
http://developer.marklogic.com/svn/recordloader/trunk/README.html
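For what it's worth, a minimal RecordLoader properties file might look something like the following. Only ID_NAME=#AUTO comes up in this thread; the other property names and the connection-string format are recalled from the README and should be checked there before use:

```
ID_NAME=#AUTO
RECORD_NAME=item
INPUT_PATH=/tmp/items.xml
CONNECTION_STRING=xcc://user:password@localhost:9000
```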
If that isn't appropriate for some reason, let's start by tuning your query to
avoid unnecessary database updates. I don't think there's any reason to insert
items.xml into the database at all.
(: new query :)
for $item in xdmp:document-get("/tmp/items.xml")//item
let $newitem :=
  element { "item" } {
    attribute { "owner" } { $item/@owner },
    element { "timestamp" } { current-dateTime() },
    (: copy each price's content, rather than nesting a price
       element inside another price element :)
    for $price in $item/price
    return <price>{ $price/node() }</price>
  }
return xdmp:document-insert(
  (: the timestamp belongs inside the URI, so each document gets a
     unique URI; and it is $newitem we want to insert, not $item :)
  fn:concat("/item/", $newitem/@owner, "/", current-dateTime()),
  $newitem)
That might be enough to allow the query to succeed. If it isn't, you can very
carefully increase the size of your expanded tree cache (*not* your database
tree-size limit). Again, if you increase it too much the OS may start to page
heavily, which will reduce performance catastrophically. Keep an eye on your
system's memory and swap utilization.
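To see where the expanded tree cache currently stands before touching it, something along these lines should work (a sketch; it assumes the default group name "Default", and the getter name is from memory, so check it against the Admin API docs):

```xquery
import module namespace admin = "http://marklogic.com/xdmp/admin"
  at "/MarkLogic/admin.xqy";

let $config := admin:get-configuration()
let $group := admin:group-get-id($config, "Default")
(: the expanded tree cache is a group-level setting, not a database one :)
return admin:group-get-expanded-tree-cache-size($config, $group)
```

If memory serves, the returned value is in megabytes.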
-- Mike
On 2009-09-24 17:05, Josh Daymont wrote:
I am attempting to load a large XML file in order to process and flatten it in
memory. The document itself is around 80 MB, and the code I have written uses a
FLWOR expression to break it into about 30k new documents, which are to be loaded
into a collection. Before my FLWOR finishes I get the following error:
XDMP-EXPNTREECACHEFULL: Expanded tree cache full on host temp.marklogic
[1.0-ml]
I have set the db's in-memory list size and in-memory tree size to 256 MB. I
have read on other threads here that this is caused by large numbers of
documents remaining in scope during the execution of a single query. However,
I am not sure how the newly created documents would remain in scope. Can anyone
advise on best practices for performing this sort of action?
The code looks like this:
xdmp:load("/tmp/items.xml", "items.xml");
for $item in doc("items.xml")//item
let $newitem :=
  element { "item" } {
    attribute { "owner" } { $item/@owner },
    element { "timestamp" } { current-dateTime() },
    for $price in $item/price
    return <price>{ $price }</price>
  }
return xdmp:document-insert(
  fn:concat(
    fn:concat("/item/", xs:string($newitem/@owner)),
    current-dateTime()),
  $item)
_______________________________________________
General mailing list
General@developer.marklogic.com
http://xqzone.com/mailman/listinfo/general