Paul,

It may be wise to make yourself aware of the implications of using fragmentation, some of which may be less obvious.
An important one is that it makes horizontal scaling more difficult. If you split one document with fragmentation, all the fragments of that document end up in the same forest. Imagine a single table-like XML document with many fragments: regardless of the number of forests, all of those fragments would go into only one. You wouldn’t benefit from adding forests, which is normally a very good way of improving search and facet response times.

In relation to this, it is also wise to keep the number of fragments in one forest below, say, 50 million. If you go beyond that, MarkLogic will start to log messages about it. You can keep this in mind when there is a need to scale up.

Cheers,
Geert

From: <general-boun...@developer.marklogic.com> on behalf of Paul Vanderveen <pvanderv...@terraxml.com>
Reply-To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Date: Thursday, September 8, 2016 at 5:30 PM
To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Subject: Re: [MarkLogic Dev General] General Question about Documents and Fragments

Charles, thanks for the response. It makes sense that we’re looking at fragments and properties. I completely understand your comments about fragmenting and why MarkLogic takes the stance that you should not use it. If we were starting from scratch I would absolutely load separate documents. In our environment, however, that would have meant spending months refactoring a large number of queries and backend code, testing, then converting and updating a large amount of legacy data at our customer site. Our data is very flat and lends itself well to fragmenting, so we gave it a try. Adding fragmentation was a huge instant win: it improved performance at least 10x on larger manuals with virtually no changes to code or existing data.
It also virtually eliminated the expanded tree cache errors that were becoming frequent. In our particular case it was the best option, and so far we’ve found very little downside to it. I’d love to see us be able to refactor in the future to smaller docs, but our customers are very happy to see blisteringly fast performance today. Guidelines are good, but sometimes you just gotta cross the beams.