Deshbir,

There isn't an exact answer to (1), but you can approximate one from the forest status page in the admin UI: divide the total on-disk size of the forest by the total number of fragments. This will always be an approximation, because there are several on-disk data structures which have different overheads, grow at different rates, and interact in complex ways.
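
As a rough illustration of that division, here is a minimal Python sketch using the before and after figures quoted later in this thread. The function name is hypothetical, and it assumes the reported database size is the on-disk size of a single forest and that 1 MB is 1024 kB; the result is only as good as those assumptions.

```python
def approx_fragment_size_kb(forest_size_mb: float, fragment_count: int) -> float:
    """Average on-disk size per fragment in kB (hypothetical helper)."""
    if fragment_count <= 0:
        raise ValueError("fragment_count must be positive")
    # Assumption: 1 MB = 1024 kB.
    return forest_size_mb * 1024 / fragment_count


# Figures from this thread. Assumption: the database size reported there
# is the on-disk size of its single forest.
before_kb = approx_fragment_size_kb(25, 101)  # default settings
after_kb = approx_fragment_size_kb(29, 811)   # after the fragment rule

print(f"before: ~{before_kb:.0f} kB per fragment")
print(f"after:  ~{after_kb:.0f} kB per fragment")
```

These averages include index and other on-disk overhead, so they describe storage cost per fragment rather than the size of the XML itself.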

The fragment rule has split your documents into an average of 20 fragments per document, so your average expanded fragment size is probably about 180 kB. That's a little large, but might be perfectly OK for your application. Ultimately, fragmentation is all about aligning the physical storage and indexing of your XML with your application needs.
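
The arithmetic behind those two numbers can be sketched as follows. This is one plausible reading, not something stated in the thread: it assumes the 180 kB estimate comes from dividing the 3.5 MB maximum exported document size by the average fragment count, with 1 MB taken as 1024 kB.

```python
# Counts reported in this thread after the fragment rule was applied.
FRAGMENTS = 811
DOCUMENTS = 41

# Assumption: the 3.5 MB maximum exported size stands in for the
# expanded size of a typical large document.
LARGEST_DOCUMENT_MB = 3.5

fragments_per_document = FRAGMENTS / DOCUMENTS
expanded_fragment_kb = LARGEST_DOCUMENT_MB * 1024 / round(fragments_per_document)

print(f"fragments per document: ~{fragments_per_document:.1f}")
print(f"expanded fragment size: ~{expanded_fragment_kb:.0f} kB")
```

Smaller documents in the same database would have proportionally smaller fragments, so this is closer to an upper estimate than a true average.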

thanks,
-- Mike

On 2009-03-18 04:27, Deshbir wrote:
Mike,

Thanks a lot for your input.

You were right! The slow query performance (for xdmp update functions) was not
related to the merges. We verified this by temporarily disabling merging, and
found it had little or no impact on the query performance.

Your suggestion on fragmenting the document (we used Fragment Parents) also
worked very well. After applying a fragmentation rule (and re-indexing the
database) the query performance (for xdmp update functions) improved
significantly. We did not notice any negative impact on the data-loading
queries.

Before Fragmentation (i.e. default MarkLogic settings)
- Total size of database: 25 MB
- No. of documents: 41
- No. of fragments: 101

After Fragmentation
- Total size of database: 29 MB (not sure why this changed)
- No. of documents: 41
- No. of fragments: 811

Some more questions for you:

1. How does one accurately determine the size of a document (or an element) in
MarkLogic? I presume that the size of an exported XML file on disk is not the
same as the size of the same document in the MarkLogic database? In our
application the maximum size of a document on disk (i.e. exported XML file) is
3.5 MB.

2. Do we still need to consider breaking up our documents (currently 3.5 MB on
disk) into smaller pieces? Or do fragment roots/parents have the same effect?
Note, in our case the documents are dynamic, i.e. the application regularly
creates/modifies documents in the MarkLogic database.

Once again, thanks for all the help.

Regards,
Deshbir

-----Original Message-----
From: [email protected] 
[mailto:[email protected]] On Behalf Of Michael Blakeley
Sent: Monday, March 16, 2009 10:31 PM
To: General Mark Logic Developer Discussion
Subject: Re: [MarkLogic Dev General] Peformance issue (Merging) - 
XDMP:node-replace function

Deshbir,

You can learn more about merges by reading our admin guide, available
via http://developer.marklogic.com/pubs

Merges are asynchronous with respect to queries, but they can compete
with queries for system resources. I suspect that's a false trail, though.

How large is the document on which node-replace is running? If you do
see that ErrorLog extract after every node-replace, that suggests a
document size of 5-20 MB. If so, you should consider breaking up your
documents into smaller ones, or possibly use a fragment root or fragment
parent (fragments are also discussed in the admin guide).

-- Mike

On 2009-03-16 04:06, [email protected] wrote:
Hello,

We are experiencing extremely slow XQuery performance for the xdmp:node-replace
function. The following XQuery snippet consistently takes more than 5
seconds on one of our servers (Mark Logic 3.2).
============================================
let $docbookNode :=<p>hello</p>
let $path := doc(".....")/../../
return
        xdmp:node-replace($path,$docbookNode)
============================================
On another Mark Logic installation (3.2), the same code consistently takes
less than 300 milliseconds!

We've compared the server settings and they appear to be the same across both
servers (they are probably the default Mark Logic installation settings).

On examining the log folder, we found that every time an "xdmp:node-replace"
is executed, the following lines are added to the error log file:
============================================
2009-03-16 06:45:32.114 Info: Saving C:\Program Files\MarkLogic\Data\Forests\Documents\00000470
2009-03-16 06:45:32.880 Info: Saved 15 MB in 1 sec at 15 MB/sec to C:\Program Files\MarkLogic\Data\Forests\Documents\00000470
2009-03-16 06:45:33.036 Info: Merging 62 MB from C:\Program Files\MarkLogic\Data\Forests\Documents\0000046f and C:\Program Files\MarkLogic\Data\Forests\Documents\00000470 to C:\Program Files\MarkLogic\Data\Forests\Documents\00000471
2009-03-16 06:45:37.661 Info: Merged 55 MB in 5 sec at 11 MB/sec to C:\Program Files\MarkLogic\Data\Forests\Documents\00000471
2009-03-16 06:45:37.958 Info: Deleted C:\Program Files\MarkLogic\Data\Forests\Documents\0000046f
2009-03-16 06:45:38.098 Info: Deleted C:\Program Files\MarkLogic\Data\Forests\Documents\00000470
============================================

What is going wrong here? What could be causing all the "merging" activity?

Thanks in advance.

Regards,
Deshbir



------------------------------------------------------------------------

_______________________________________________
General mailing list
[email protected]
http://xqzone.com/mailman/listinfo/general
