Make all the changes at the same time, but don't force reindexing. So follow steps 1-3 and 4, but skip 3a. Or make all the changes at once, using package XML or the admin API.
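
For the admin API route, here is a minimal sketch of what "all the changes at once" might look like. It assumes the standard admin library module; the database name and element names are placeholders rather than anything from the actual configuration, so substitute your own and try it on a test database first.

```xquery
xquery version "1.0-ml";

import module namespace admin = "http://marklogic.com/xdmp/admin"
  at "/MarkLogic/admin.xqy";

(: Hypothetical names: substitute your own database and elements. :)
let $db := xdmp:database("my-database")
let $config := admin:get-configuration()

(: Fragmentation: drop the old fragment parent, add the new fragment root. :)
let $config := admin:database-delete-fragment-parent($config, $db,
  admin:database-fragment-parent("", "old-parent"))
let $config := admin:database-add-fragment-root($config, $db,
  admin:database-fragment-root("", "new-root"))

(: Index settings: one new range index, plus word positions. :)
let $config := admin:database-add-range-element-index($config, $db,
  admin:database-range-element-index(
    "string", "", "my-element",
    "http://marklogic.com/collation/", fn:false()))
let $config := admin:database-set-word-positions($config, $db, fn:true())

(: Nothing takes effect until this single save. :)
return admin:save-configuration($config)
```

Because every setter only modifies the in-memory configuration, the reindexer should see the final settings in one step when the configuration is saved.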
There's no need for a second-pass forced reindex (3a) because the refragmenter will detect the fragment policy change and do all the necessary work. The new fragments will have the right fragmentation and the right indexing. All the indexing is applied at the same time.

Technically steps 1 & 3 aren't critical: you could leave the reindexer enabled while you make the changes. The system would refragment a few old docs in the time it takes you to make the various changes, and probably those fragments would have to be done over with the final configuration. But if you are reasonably quick about making the changes, there won't be very much extra work: maybe a few hundred or a few thousand fragments. Still, disabling and re-enabling the reindexer is easy enough to do - or just use a database package update, or the admin API.

With ML5, I would leave merge policy alone. Disabling merges risks hitting the stand limit, which would put the database in an error state. It will also use more disk space, because the forests will retain all deleted fragments until they can merge.

Imposing a max-merge-size has much the same risks. You may hit the stand limit, and will use more disk space because the system will be reluctant to merge out the deleted fragments in existing stands larger than max-merge-size. There are ways to benefit from max-merge-size, but not with ML5 and reindexing.

If you are short on disk space during reindexing, you might take a look at https://github.com/mblakele/threx which automatically pauses reindexing and forces merges when disk space is low. This takes longer than normal reindexing - sometimes much longer - but can be handy when available disk space is insufficient for normal reindexing.

-- Mike

On 7 Aug 2013, at 05:31, Ron Hitchens <[email protected]> wrote:

>
> I have a need to make a bunch of changes to index
> settings on a moderately sized database (under a million
> documents), as well as change the fragmentation scheme.
> What is the best way to apply all these changes so that
> the least amount of work is done (this will ultimately
> need to be done on the production system)?
>
> The documents currently have a fragment parent that
> results in up to seven fragments per document. I want
> to change to a fragment root that will result in up to
> three fragments. Index changes are a couple of new
> range indexes and lots of word-query related things like
> adding word positions and setting inclusions.
>
> If I start making these changes on the various Admin
> screens, the reindexer will kick off and everything will
> eventually get done. But is there a way to combine the
> work into a single pass (or fewer passes)?
>
> For example, will this result in less work than just
> banging in the changes and letting it go:
>
> 1) Disable auto-reindexer
> 2) Make changes to index settings and fragmentation scheme
> 3) Re-enable indexer
> 3a) Would a forced re-index be needed?
> 4) Wait
>
> Will this result in ML doing less overall work by
> combining multiple index updates in the same pass?
>
> Also, would I be guaranteed that all existing fragments
> will have been re-fragmented when all is said and done?
>
> Can managing merge policy help here? Reducing the size
> of merges during re-index? Stopping merges until reindexing
> is finished?
>
> Thanks.
>
> ---
> Ron Hitchens {[email protected]} +44 7879 358212
>
> _______________________________________________
> General mailing list
> [email protected]
> http://developer.marklogic.com/mailman/listinfo/general
