Thanks Mike. That's pretty much what I understood, but I wanted to know if there might be an optimal way to go about it, since I don't make changes this big very often.
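
A minimal sketch of the "all the changes at once, using the admin API" route described in the quoted reply below. It accumulates every change in one configuration and saves it once, so the reindexer only ever sees the final settings. The database name, namespace, and element names are hypothetical placeholders, and the specific index settings are only examples; check each function against the admin API documentation for your release before relying on it.

```xquery
xquery version "1.0-ml";

import module namespace admin = "http://marklogic.com/xdmp/admin"
  at "/MarkLogic/admin.xqy";

(: Hypothetical placeholders: substitute your own database and element names. :)
declare variable $DB-NAME := "my-database";
declare variable $NS := "http://example.com/ns";

let $db := xdmp:database($DB-NAME)
let $config := admin:get-configuration()

(: Fragmentation: drop the old fragment parent, add the new fragment root. :)
let $config := admin:database-delete-fragment-parent($config, $db,
  admin:database-fragment-parent($NS, "old-parent"))
let $config := admin:database-add-fragment-root($config, $db,
  admin:database-fragment-root($NS, "new-root"))

(: Word-query settings: word positions shown as one example. :)
let $config := admin:database-set-word-positions($config, $db, fn:true())

(: One new element range index:
   scalar type, namespace, local name, collation, range value positions. :)
let $config := admin:database-add-range-element-index($config, $db,
  admin:database-range-element-index("string", $NS, "title",
    "http://marklogic.com/collation/", fn:false()))

(: A single save applies all of the above together. :)
return admin:save-configuration($config)
```

Run it as a user with admin privileges. Nothing here forces a reindex: per the quoted reply, the refragmenter detects the fragment policy change and applies the new indexing in the same pass.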
On Aug 7, 2013, at 3:54 PM, Michael Blakeley <[email protected]> wrote:
> Make all the changes at the same time, but don't force reindexing. So follow
> steps 1-3 and 4, but skip 3a. Or make all the changes at once, using package
> XML or the admin API.
>
> There's no need for a second-pass forced reindex (3a) because the
> refragmenter will detect the fragment policy change and do all the necessary
> work. The new fragments will have the right fragmentation and the right
> indexing. All the indexing is applied at the same time.
>
> Technically steps 1 & 3 aren't critical: you could leave the reindexer
> enabled while you make the changes. The system would refragment a few old
> docs in the time it takes you to make the various changes, and probably those
> fragments would have to be done over with the final configuration. But if you
> are reasonably quick about making the changes, there won't be very much extra
> work: maybe a few hundred or a few thousand fragments. Still, it's easy
> enough to do - or just use a database package update, or the admin API.
>
> With ML5, I would leave merge policy alone. Disabling merges risks hitting
> the stand limit, which would put the database in an error state. It will also
> use more disk space, because the forests will retain all deleted fragments
> until they can merge. Imposing a max-merge-size has much the same risks. You
> may hit the stand limit, and will use more disk space because the system will
> be reluctant to merge out the deleted fragments in existing stands larger
> than max-merge-size. There are ways to benefit from max-merge-size, but not
> with ML5 and reindexing.
>
> If you are short on disk space during reindexing, you might take a look at
> https://github.com/mblakele/threx which automatically pauses reindexing and
> forces merges when disk space is low. This takes longer than normal
> reindexing - sometimes much longer - but can be handy when available disk
> space is insufficient for normal reindexing.
>
> -- Mike
>
> On 7 Aug 2013, at 05:31 , Ron Hitchens <[email protected]> wrote:
>
>>
>> I have a need to make a bunch of changes to index
>> settings on a moderately sized database (under a million
>> documents), as well as change the fragmentation scheme.
>> What is the best way to apply all these changes so that
>> the least amount of work is done (this will ultimately
>> need to be done on the production system)?
>>
>> The documents currently have a fragment parent that
>> results in up to seven fragments per document. I want
>> to change to a fragment root that will result in up to
>> three fragments. Index changes are a couple of new
>> range indexes and lots of word-query related things like
>> adding word positions and setting inclusions.
>>
>> If I start making these changes on the various Admin
>> screens, the reindexer will kick off and everything will
>> eventually get done. But is there a way to combine the
>> work into a single pass (or fewer passes)?
>>
>> For example, will this result in less work than just
>> banging in the changes and letting it go:
>>
>> 1) Disable auto-reindexer
>> 2) Make changes to index settings and fragmentation scheme
>> 3) Re-enable indexer
>> 3a) Would a forced re-index be needed?
>> 4) Wait
>>
>> Will this result in ML doing less overall work by
>> combining multiple index updates in the same pass?
>>
>> Also, would I be guaranteed that all existing fragments
>> will have been re-fragmented when all is said and done?
>>
>> Can managing merge policy help here? Reducing the size
>> of merges during re-index? Stopping merges until reindexing
>> is finished?
>>
>> Thanks.
>>
>> ---
>> Ron Hitchens {[email protected]} +44 7879 358212
>>
>> _______________________________________________
>> General mailing list
>> [email protected]
>> http://developer.marklogic.com/mailman/listinfo/general
