Paul,

It may be wise to make yourself aware of the implications of using 
fragmentation, some of which may be less obvious.

An important one is that it makes horizontal scaling more difficult. If you 
split one document with fragmentation, all the fragments related to that 
document end up in the same forest. Imagine the case of one table-like XML 
document with many fragments. Regardless of the number of forests, all of those 
fragments would go into just one. You wouldn’t benefit from adding forests, 
which is normally a very good way of improving search and facet response 
times. In relation to this, it is also wise to keep the number of fragments in 
one forest below, say, 50 million. If you go beyond that, MarkLogic will 
start to log messages about it.
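
To make that concrete, here is a rough Python sketch. It is not MarkLogic code, only a toy model. It assumes documents are spread over forests by hashing the document URI and that every fragment follows its parent document. The names forest_for and fragments_per_forest, the URIs, and the MD5 hash are all made up for illustration; MarkLogic's real assignment policy works differently in detail.

```python
import hashlib
from collections import Counter


def forest_for(uri: str, n_forests: int) -> int:
    """Pick a forest index from the document URI.

    Hypothetical stand-in for URI-based assignment, NOT MarkLogic's algorithm.
    """
    digest = hashlib.md5(uri.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_forests


def fragments_per_forest(docs: dict, n_forests: int) -> Counter:
    """docs maps a document URI to the number of fragments it is split into.

    All fragments of a document are counted in the forest of that document.
    """
    counts = Counter({forest: 0 for forest in range(n_forests)})
    for uri, n_fragments in docs.items():
        counts[forest_for(uri, n_forests)] += n_fragments
    return counts


# One table-like document split into 100,000 fragments (made-up URI).
one_big = {"/manuals/big-table.xml": 100_000}
# The same rows loaded as 100,000 separate one-fragment documents.
many_small = {f"/manuals/rows/{i}.xml": 1 for i in range(100_000)}

for n in (1, 4, 8):
    busiest_big = max(fragments_per_forest(one_big, n).values())
    busiest_small = max(fragments_per_forest(many_small, n).values())
    print(n, "forests:", busiest_big, "vs", busiest_small, "in the busiest forest")
```

In this model the busiest forest always holds all 100,000 fragments of the single large document, however many forests you add, while the separate documents spread out roughly evenly.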

You can keep this in mind when there is a need to scale up.

Cheers,
Geert

From: <general-boun...@developer.marklogic.com> on behalf of Paul Vanderveen 
<pvanderv...@terraxml.com>
Reply-To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Date: Thursday, September 8, 2016 at 5:30 PM
To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Subject: Re: [MarkLogic Dev General] General Question about Documents and 
Fragments

Charles, thanks for the response. It makes sense that we’re looking at 
fragments and properties.

I completely understand your comments about fragmenting and why MarkLogic is 
taking the stance that you should not use it. If we were starting from 
scratch I would absolutely load separate documents. In our environment, 
however, that meant spending months to refactor a large number of queries and 
a lot of backend code, test, then convert and update a large amount of legacy 
data at our customer site. Our data is very flat and lends itself well to 
fragmenting, so we gave it a try. Adding fragmentation was a huge instant win, 
as it improved performance at least 10x on larger manuals with virtually no 
changes to code or existing data. It also virtually eliminated the expanded 
tree cache errors that were becoming frequent. In our particular case it was 
the best option, and so far we’ve found very little downside to it. I’d love 
to see us be able to refactor to smaller docs in the future, but our customers 
are very happy to see blisteringly fast performance today.

Guidelines are good, but sometimes you just gotta cross the beams.