Peter Xu <pet...@redhat.com> writes:

> On Thu, Sep 18, 2025 at 06:17:37PM -0300, Fabiano Rosas wrote:
>> > ============= ABOUT OLD PATCH 2 ===================
>> >
>> > I dropped it for now to unblock almost patch 1, because patch 1 will fix a
>> > real warning that can be triggered for not only qtest but also normal tls
>> > postcopy migration.
>> >
>> > While I was looking at temporary settings for multifd send iochannels to be
>> > blocking always, I found I cannot explain how migration_tls_channel_end()
>> > currently works, because it writes to the multifd iochannels while the
>> > channels should still be owned (and can be written at the same time?) by
>> > the sender threads. It sounds like a thread-safety issue, or is it not?
>> >
>>
>> IIUC, the multifd channels will be stuck at p->sem because this is the
>> success path so migration will have already finished when we reach
>> migration_cleanup(). The ram/device state migration will hold the main
>> thread until the multifd channels finish transferring.
>
> For success cases, indeed. However this is not the success path? After
> all, we check migration_has_failed().
>

My point is that when we reach here, if migration has succeeded, then it
should be ok. If not, then thread-safety doesn't matter because things
have already gone bad; we'll lose the destination anyway.

> Should I then send a patch to only send bye() when succeeded? Then I can
> also add some comment. I wished we could assert. Then the "temporarily
> changing nonblock mode" will also rely on this one, because ideally we
> shouldn't touch the fd nonblocking mode if some other thread is operating
> on it.
>

I don't know if it changes much. Currently we basically always ignore
the error from bye().

> The other thing is I also think we shouldn't rely on checking
> "p->tls_thread_created && p->thread_created" but only rely on channel type,
> which might be more straightforward (I almost did it in v1, but v2 rewrote
> things so it was lost).

Ok, but we may need to ensure bye() is not called before the session is
initiated. So thread_created may still be needed?
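
To make the two conditions above concrete, here is a rough standalone
sketch of the gating being discussed: send bye() only when migration
succeeded, and only on a TLS channel whose session was actually
initiated. This is not the QEMU code. MockSendChannel, its fields, and
both mock_* functions are hypothetical names made up for illustration,
and the real check may well still need thread_created.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-channel multifd send state. */
typedef struct {
    bool is_tls;           /* assumed flag: the channel type is TLS */
    bool session_started;  /* assumed flag: the TLS session was initiated */
    bool bye_sent;         /* set once the mock bye() has been issued */
} MockSendChannel;

/*
 * Decide whether bye() may be sent. On failure we skip it, since the
 * sender threads may still own the channel; on success they are idle.
 */
static bool mock_should_send_bye(const MockSendChannel *p,
                                 bool migration_failed)
{
    if (migration_failed) {
        return false;
    }
    /* Rely on the channel type, plus a session actually existing. */
    return p->is_tls && p->session_started;
}

/* Mock of the cleanup step: records bye() instead of doing real I/O. */
static void mock_channel_end(MockSendChannel *p, bool migration_failed)
{
    if (!mock_should_send_bye(p, migration_failed)) {
        return;
    }
    /* The kind of assert wished for above: bye() goes out only once. */
    assert(!p->bye_sent);
    p->bye_sent = true;
}
```

With this shape the failure path never touches the channel, which is
also what the "temporarily changing nonblock mode" idea would want.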