The performance of large updates is improved in both ML6 (I forget exactly which maintenance release) and ML7. We pass around much less information for distributed deadlock detection than in earlier releases.
Wayne.

On 12/11/2013 11:37 AM, Christopher Cieslinski wrote:
> We do have some experience with updating millions of documents,
> though, we have had to do that numerous times. We don’t have
> super-huge data, so the biggest updates we have run have only been on
> around 60 million documents. We had to do the updates in batches of
> around 1500 documents each (we just spawned tasks to the task server).
> We had 16 threads configured that were processing this on an e-node (8
> cores, 32GB of RAM, I think). This process took around 9 hours to
> complete. We could have potentially had another e-node or two also
> processing, as long as our d-nodes could have kept up, and that could
> have cut the time down.

-- 
Wayne Feick
Principal Engineer
MarkLogic Corporation
Phone: +1 650 655 2378
www.marklogic.com
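
For anyone who wants to see the shape of the batching approach described in the quoted message (batches of around 1500 documents, 16 worker threads), here is a rough sketch in Python. It is not MarkLogic code and not what was actually run: `update_batch` is a hypothetical placeholder for whatever performs the real update, the URIs are made up, and a local thread pool merely stands in for tasks spawned to the task server.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice

BATCH_SIZE = 1500  # batch size quoted in the message
WORKERS = 16       # thread count quoted in the message


def batches(items, size):
    """Yield successive lists of at most `size` items."""
    it = iter(items)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def update_batch(uris):
    """Hypothetical placeholder: apply the update to one batch.

    A real version would send the batch to the database as one unit of
    work; this one only reports how many documents it was given.
    """
    return len(uris)


def run_updates(uris, batch_size=BATCH_SIZE, workers=WORKERS):
    """Run every batch on a fixed-size thread pool; return the total count."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Note: pool.map submits all batches up front, so a very large
        # input would need a bounded queue rather than this simple form.
        return sum(pool.map(update_batch, batches(uris, batch_size)))


if __name__ == "__main__":
    sample = [f"/doc/{i}.xml" for i in range(10_000)]  # made-up URIs
    print(run_updates(sample))
```

At the quoted figures, roughly 60 million documents in batches of 1500 is about 40,000 batches, and 9 hours works out to somewhere around 1,850 documents per second overall.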
