Sorry, I was mistaken in my previous email. The log lines that show data
remaining are for other VMs. For the hanging one, the migration job seems
to always report Data Total and Data Remaining as 0.
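
For reference, the monitor logic described in the quoted message can be
sketched roughly as below. This is only a minimal illustration, not the
actual monitor code: the names JobSample and next_max_downtime, the
1000 ms step, and the choice to leave max downtime unchanged on a 0/0
sample are all assumptions. Only the 20 second cap comes from the
description. With libvirt-python, the samples would presumably come from
virDomain.jobStats() and the result would be applied with
virDomain.migrateSetMaxDowntime().

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class JobSample:
    """One poll of the migration job (hypothetical container type)."""
    data_total: int      # bytes, as reported by the job statistics
    data_remaining: int  # bytes, as reported by the job statistics


def next_max_downtime(prev: Optional[JobSample], cur: JobSample,
                      downtime_ms: int, step_ms: int = 1000,
                      cap_ms: int = 20000) -> int:
    """Return the max downtime (ms) to request after this poll.

    step_ms is an assumed increment; cap_ms mirrors the 20 second
    limit mentioned in the quoted message.
    """
    # Assumption: a 0/0 sample carries no progress information,
    # so the current max downtime is left unchanged.
    if cur.data_total == 0 and cur.data_remaining == 0:
        return downtime_ms
    # Raise max downtime only when the remaining data grew between polls.
    if prev is not None and cur.data_remaining > prev.data_remaining:
        return min(downtime_ms + step_ms, cap_ms)
    return downtime_ms
```

For example, with Data Remaining rising from 43801825280 to 52382912512
at a max downtime of 500, next_max_downtime() returns 1500 under the
assumed step. A 0/0 sample returns the current value unchanged.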

Best
Yangchen Ye

On Mon, Jul 21, 2025 at 12:49 PM Yangchen Ye <eikasi...@gmail.com> wrote:

> Hi,
>
> We are running libvirt 8.0.0, and sometimes a live migration cannot
> finish because the guest is dirtying memory too fast. We implemented a
> monitor that increases max downtime when it observes that "Data
> Remaining" goes up. But we found a strange sequence of events in the
> monitor logs, which leads to a paused domain on the destination hypervisor:
>
> The monitor sees Data Remaining going up and increases max downtime, up
> to 20 seconds. The weird thing is that after a period of time it starts
> reporting that "Data Remaining" and "Data Total" are both 0, while the
> migration job is still unfinished:
>
> "Migration in progress - DataTotal:
> 85904728064, DataRemaining: 22201458688, TimeElapsed: 20005, MaxDowntime:
> 500, DirtyRate: 0"
>
> "Migration in progress - DataTotal:
> 85904728064, DataRemaining: 43801825280, TimeElapsed: 10005, MaxDowntime:
> 500, DirtyRate: 0"
>
> "Migration in progress - DataTotal:
>  85904728064, DataRemaining: 52382912512, TimeElapsed: 10004, MaxDowntime:
> 500, DirtyRate: 0" (DataRemaining bumps up, we start increasing max
> downtime)
>
> "Migration in progress - DataTotal:
>  85904728064, DataRemaining: 4219596800, TimeElapsed: 40004, MaxDowntime:
> 1500, DirtyRate: 0" (Last poll where we see the job info)
>
> After that, the monitor logs:
>
> "Migration in progress - DataTotal:
>  0, DataRemaining: 0, TimeElapsed: 40004, MaxDowntime: 13500, DirtyRate: 0"
>
> The domain keeps running on the source hypervisor, but there is a domain
> on the destination hypervisor that has been paused since it was started.
>
> Trying to understand what might have happened:
> - Is this a known issue for live migrating guests with high memory
> activity and the way we interact with libvirt?
> - What is the recommended way to ensure that a started live migration
> always runs to completion if we don't care about downtime?
>
> Appreciate any help here
>
> Yangchen Ye
>
>
>
