Deshbir,
There isn't an exact answer to (1), but you can approximate by visiting
the forest status page in the admin UI. Divide the total on-disk size of
the forest by the total number of fragments. But this will always be an
approximation because there are several on-disk data structures which
have different overheads, grow at different rates, and interact in
complex ways.
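As a rough illustration of that division, here is a minimal Python sketch using the database totals quoted later in this thread (25 MB and 101 fragments before the fragment rule, 29 MB and 811 after). It assumes 1 MB = 1024 kB and treats the reported database size as the forest's on-disk size; the function name is hypothetical, and the result is only a ballpark for the reasons given above.

```python
def avg_fragment_size_kb(forest_size_mb: float, fragment_count: int) -> float:
    """Rough average on-disk size per fragment, in kB (assumes 1 MB = 1024 kB)."""
    if fragment_count <= 0:
        raise ValueError("fragment_count must be positive")
    return forest_size_mb * 1024 / fragment_count


# Totals as reported in the quoted message; real values would come from
# the forest status page in the admin UI.
before = avg_fragment_size_kb(25, 101)
after = avg_fragment_size_kb(29, 811)

print(f"before: {before:.1f} kB per fragment")
print(f"after:  {after:.1f} kB per fragment")
```

These are on-disk averages that include index overhead, so they are not directly comparable to the size of an exported XML file.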
The fragment rule has split your documents into an average of 20
fragments per document, so your average expanded fragment size is
probably about 180 kB. That's a little large, but might be perfectly OK
for your application. Ultimately, fragmentation is all about aligning
the physical storage and indexing of your XML with your application needs.
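The arithmetic behind those two figures can be sketched as follows. This is one plausible reading, not a measurement: it takes the document and fragment counts from the quoted message, and assumes the 3.5 MB exported file size stands in for the expanded size of a typical document.

```python
documents = 41          # document count from the quoted message
fragments = 811         # fragment count after the fragment rule was applied
exported_doc_mb = 3.5   # largest exported XML file; used here as a stand-in

# Average number of fragments each document was split into.
fragments_per_doc = fragments / documents

# Estimated fragment size if a 3.5 MB document is split evenly (1 MB = 1024 kB).
fragment_kb = exported_doc_mb * 1024 / round(fragments_per_doc)

print(f"fragments per document: {fragments_per_doc:.1f}")
print(f"estimated fragment size: {fragment_kb:.1f} kB")
```

Since 3.5 MB is the maximum document size reported, smaller documents would give proportionally smaller fragments.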
thanks,
-- Mike
On 2009-03-18 04:27, Deshbir wrote:
Mike,
Thanks a lot for your input.
You were right! The slow query performance (for xdmp update functions) was not
related to the merges. We verified this by temporarily disabling merging, and found
that it had little or no impact on the query performance.
Your suggestion on fragmenting the document (we used Fragment Parents) also
worked very well. After applying a fragmentation rule (and re-indexing the
database) the query performance (for xdmp update functions) improved
significantly. We did not notice any negative impact on the data-loading
queries.
Before Fragmentation (i.e. default Mark Logic settings)
- Total Size of Database: 25 MB
- No. of documents: 41
- No. of fragments: 101
After Fragmentation
- Total Size of Database: 29 MB (Not sure why this changed)
- No. of documents: 41
- No. of fragments: 811
Some more questions for you:
1. How does one accurately determine the size of a document (or an element) in
MarkLogic? I presume that the size of an exported XML file on disk is not the same
as the size of the same document in the MarkLogic database? In our application the
maximum size of a document on disk (i.e. exported XML file) is 3.5 MB.
2. Do we still need to consider breaking up our documents (currently 3.5 MB on
disk) into smaller pieces? Or do fragment roots/parents have the same effect?
Note, in our case the documents are dynamic, i.e. the application regularly
creates/modifies documents in the MarkLogic database.
Once again, thanks for all the help.
Regards,
Deshbir
-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Michael Blakeley
Sent: Monday, March 16, 2009 10:31 PM
To: General Mark Logic Developer Discussion
Subject: Re: [MarkLogic Dev General] Peformance issue (Merging) -
XDMP:node-replace function
Deshbir,
You can learn more about merges by reading our admin guide, available
via http://developer.marklogic.com/pubs
Merges are asynchronous with respect to queries, but they can compete
with queries for system resources. I suspect that's a false trail, though.
How large is the document on which node-replace is running? If you do
see that ErrorLog extract after every node-replace, that suggests a
document size of 5-20 MB. If so, you should consider breaking up your
documents into smaller ones, or possibly use a fragment root or fragment
parent (fragments are also discussed in the admin guide).
-- Mike
On 2009-03-16 04:06, [email protected] wrote:
Hello,
We are experiencing extremely slow XQuery performance for the xdmp:node-replace
function. The following XQuery snippet consistently takes more than 5
secs on one of our servers (Mark Logic 3.2).
============================================
let $docbookNode :=<p>hello</p>
let $path := doc(".....")/../../
return
xdmp:node-replace($path,$docbookNode)
============================================
On another (different) Mark Logic installation (3.2), the same code
consistently takes less than 300 milliseconds!
We've compared the server settings and they appear to be the same across both
servers (they are probably the default Mark Logic installation settings).
On examining the log folder, we found that every time an "xdmp:node-replace"
was executed, the following lines were added to the error log file:
============================================
2009-03-16 06:45:32.114 Info: Saving C:\Program Files\MarkLogic\Data\Forests\Documents\00000470
2009-03-16 06:45:32.880 Info: Saved 15 MB in 1 sec at 15 MB/sec to C:\Program Files\MarkLogic\Data\Forests\Documents\00000470
2009-03-16 06:45:33.036 Info: Merging 62 MB from C:\Program Files\MarkLogic\Data\Forests\Documents\0000046f and C:\Program Files\MarkLogic\Data\Forests\Documents\00000470 to C:\Program Files\MarkLogic\Data\Forests\Documents\00000471
2009-03-16 06:45:37.661 Info: Merged 55 MB in 5 sec at 11 MB/sec to C:\Program Files\MarkLogic\Data\Forests\Documents\00000471
2009-03-16 06:45:37.958 Info: Deleted C:\Program Files\MarkLogic\Data\Forests\Documents\0000046f
2009-03-16 06:45:38.098 Info: Deleted C:\Program Files\MarkLogic\Data\Forests\Documents\00000470
============================================
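For anyone who wants to read the timings off an extract like this mechanically, here is a minimal Python sketch. It assumes the line layout shown above (timestamp, "Info:", an action word, and an optional size in MB); the regular expression and variable names are hypothetical, not anything provided by Mark Logic. It only measures how long the save/merge/delete cycle took and does not show whether that cycle is what slows the query down.

```python
import re
from datetime import datetime

# Log lines copied from the extract above, with wrapped lines rejoined.
LOG = r"""
2009-03-16 06:45:32.114 Info: Saving C:\Program Files\MarkLogic\Data\Forests\Documents\00000470
2009-03-16 06:45:32.880 Info: Saved 15 MB in 1 sec at 15 MB/sec to C:\Program Files\MarkLogic\Data\Forests\Documents\00000470
2009-03-16 06:45:33.036 Info: Merging 62 MB from C:\Program Files\MarkLogic\Data\Forests\Documents\0000046f and C:\Program Files\MarkLogic\Data\Forests\Documents\00000470 to C:\Program Files\MarkLogic\Data\Forests\Documents\00000471
2009-03-16 06:45:37.661 Info: Merged 55 MB in 5 sec at 11 MB/sec to C:\Program Files\MarkLogic\Data\Forests\Documents\00000471
2009-03-16 06:45:37.958 Info: Deleted C:\Program Files\MarkLogic\Data\Forests\Documents\0000046f
2009-03-16 06:45:38.098 Info: Deleted C:\Program Files\MarkLogic\Data\Forests\Documents\00000470
"""

# Timestamp, action word, and an optional "<n> MB" right after the action.
LINE = re.compile(
    r"^(\d{4}-\d\d-\d\d \d\d:\d\d:\d\d\.\d{3}) Info: (\w+)(?: (\d+) MB)?"
)

events = []
for line in LOG.strip().splitlines():
    m = LINE.match(line)
    if m:
        ts = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S.%f")
        size_mb = int(m.group(3)) if m.group(3) else None
        events.append((ts, m.group(2), size_mb))

# Wall-clock time from the first to the last line of the cycle.
elapsed = (events[-1][0] - events[0][0]).total_seconds()

for ts, action, size_mb in events:
    print(ts.time(), action, "" if size_mb is None else f"{size_mb} MB")
print(f"elapsed: {elapsed:.3f} s")
```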
What is going wrong here? What could be causing all the "merging" activity?
Thanks in advance.
Regards,
Deshbir
------------------------------------------------------------------------
_______________________________________________
General mailing list
[email protected]
http://xqzone.com/mailman/listinfo/general