On Tue, May 30, 2023 at 11:46:50AM -0400, Peter Xu wrote:
> Hi, Andrei,
>
> On Thu, Apr 27, 2023 at 03:42:56PM +0300, Andrei Gudkov via wrote:
> > Afterwards we tried to migrate VM after randomly selecting max downtime
> > and bandwidth limit. Typical prediction error is 6-7%, with only 180 out
> > of 5779 experiments failing badly: prediction error >=25% or incorrectly
> > predicting migration success when in fact it didn't converge.
>
> What's the normal size of the VMs when you did the measurements?

VM size in all experiments was 32GiB. However, since some of the pages
are zero, the effective VM size was smaller. I checked the value of the
precopy-bytes counter after the first migration iteration. The median
value among all experiments is 24.3GiB.

> A major challenge of convergence issues come from huge VMs and I'm
> wondering whether those are covered in the prediction verifications.

Hmmm... My understanding is that convergence primarily depends on how
aggressively the VM dirties pages and not on VM size. A small VM with
aggressive writes would be impossible to migrate without throttling.
Conversely, migration of a huge dormant VM will converge in just a
single iteration (although a long one). The only reason I can imagine
why a large VM size could negatively affect convergence is the
following chain: larger VM size => bigger number of vCPUs => more memory
writes per second. Or do you mean that during each iteration we perform
KVM_CLEAR_DIRTY_LOG, which (I suspect) takes time linear in VM size and
can become a bottleneck for large VMs? Anyway, I will conduct
experiments with large VMs.

I think that the easiest way to predict whether VM migration will
converge is the following. Run calc-dirty-rate with calc-time equal to
the desired downtime. If it reports that the volume of memory dirtied
over the calc-time period is larger than what you can copy over the
network in the same time, then you are out of luck. Alas, at the moment
calc-time accepts values only in units of seconds, while reasonable
downtime lies in the range 50-300ms. I am preparing a separate patch
that will allow specifying calc-time in milliseconds. I hope that this
approach will be cleaner than the array of hardcoded values I introduced
in my original patch.

> Thanks,
>
> --
> Peter Xu
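
For illustration, the convergence check described above boils down to a
single comparison. Below is a minimal sketch in Python, assuming the
dirtied volume has already been measured over a window equal to the
desired downtime. The function name, parameter names and example numbers
are hypothetical, and nothing here talks to QEMU.

```python
MiB = 1024 * 1024


def can_converge(dirtied_bytes: int,
                 bandwidth_bytes_per_sec: float,
                 downtime_sec: float) -> bool:
    """Return True if the memory dirtied during one downtime-long window
    can be copied over the network within that same window.

    dirtied_bytes is assumed to come from a dirty-rate measurement whose
    sampling period equals downtime_sec (hypothetical input).
    """
    transferable_bytes = bandwidth_bytes_per_sec * downtime_sec
    return dirtied_bytes <= transferable_bytes


if __name__ == "__main__":
    downtime = 0.3            # 300 ms, made-up value
    bandwidth = 1000 * MiB    # 1000 MiB/s, made-up value
    # 200 MiB dirtied vs. 300 MiB transferable in the window
    print(can_converge(200 * MiB, bandwidth, downtime))
    # 400 MiB dirtied vs. 300 MiB transferable in the window
    print(can_converge(400 * MiB, bandwidth, downtime))
```

This only captures the final-iteration condition. It says nothing about
how long the earlier iterations take.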