On 2017/12/7 0:34, Michael S. Tsirkin wrote:
> On Wed, Dec 06, 2017 at 09:30:27PM +0800, Ying Fang wrote:
>>
>> On 2017/12/1 22:39, Michael S. Tsirkin wrote:
>>> On Fri, Dec 01, 2017 at 01:58:32PM +0800, fangying wrote:
>>>> QEMU will abort when vhost-user process is restarted during migration
>>>> when vhost_log_global_start/stop is called. The reason is clear that
>>>> vhost_dev_set_log returns -1 because network connection is lost.
>>>>
>>>> To handle this situation, let's cancel migration by setting migrate
>>>> state to failure and report it to user.
>>>
>>> In fact I don't see this as the right way to fix it. Backend is dead so why
>>> not just proceed with migration? We just need to make sure we re-send
>>> migration data on re-connect.
>>
>> This is where vhost starts/stops the migration dirty log. The original
>> code aborts qemu here because the vhost data stream may break down if we
>> fail to start/stop the vhost dirty log during migration. The backend may
>> be active after vhost_log_global_start.
>>
>> dirty log start ----------------- dirty log stop
>>       ^                  ^
>>       |                  |
>>       ----- backend dead ----- backend active
>
> I'm sorry, I don't understand yet. Backend is active after logging started -
> why is this a problem?

Sorry, I did not explain it well. If the backend is dead when dirty log start
is called, vhost_dev_set_log/vhost_dev_set_features may fail because the
connection is temporarily lost. So even if migration is in progress and the
vhost-user backend becomes active again later, vhost-user dirty memory is not
logged.
>
>> Currently we don't re-send migration data on re-connect in this situation.
>> Maybe we should work that out.
>
> So basically backend connects after logging started, and we
> do not tell it to start logging and where - is that the issue?
> I agree, that would be a bug then.
>
Yes, this is exactly the issue.
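
To make the failure and one possible direction concrete, here is a minimal
toy model in C. It is only a sketch under stated assumptions, not QEMU code:
every name in it (toy_backend, toy_dev, toy_set_log, toy_on_reconnect) is
hypothetical, and it assumes a restarted backend comes back with logging
turned off. The idea is to record the logging state that migration asked for
even when the request cannot be delivered, and to replay it when the backend
reconnects.

```c
#include <stdbool.h>

/* Toy model only: none of these names are QEMU or vhost-user APIs. */
struct toy_backend {
    bool connected;   /* is the connection to the backend up? */
    bool logging;     /* has the backend been told to log dirty memory? */
};

struct toy_dev {
    struct toy_backend *be;
    bool log_wanted;  /* state requested by migration, kept across disconnects */
};

/* Deliver the "enable/disable logging" request; fails if the link is down. */
int toy_send_set_log(struct toy_backend *be, bool enable)
{
    if (!be->connected) {
        return -1;
    }
    be->logging = enable;
    return 0;
}

/*
 * Called when migration starts or stops dirty logging.
 * The intent is recorded before sending, so it survives a failed send.
 */
int toy_set_log(struct toy_dev *dev, bool enable)
{
    dev->log_wanted = enable;
    return toy_send_set_log(dev->be, enable);
}

/*
 * Called when the backend reconnects.
 * Assumption: a restarted backend starts with logging off, so the
 * recorded state has to be replayed to it.
 */
int toy_on_reconnect(struct toy_dev *dev)
{
    dev->be->connected = true;
    dev->be->logging = false;
    return toy_send_set_log(dev->be, dev->log_wanted);
}
```

In this model, calling toy_set_log(&dev, true) while the backend is
disconnected returns -1 but still records the request, and a later
toy_on_reconnect(&dev) turns logging on in the backend. The sketch does not
cover re-sending memory that may have been dirtied while nothing was being
logged; that part would still need to be worked out.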