Hi Clément, On Tue, Sep 6, 2022 at 3:41 PM Clément Chigot <chi...@adacore.com> wrote: > > Hi Bin, > > > On Mon, Sep 5, 2022 at 4:10 PM Clément Chigot <chi...@adacore.com> wrote: > > > > > > Hi all, > > > > > > I did reach the same issue while trying to connect a gdb to qemu on > > > Windows hosts. Some inputs send by gdb aren't getting correctly pulled, > > > they will be retrieved only once g_poll times out. > > > > > > As you explained earlier, the issue, AFAICT, is that WSAEnumNetworkEvents > > > will reset the internal events before select can detect them. > > > > No, I don't think WSAEnumNetworkEvents and select cannot be used together. > > > > MSDN says: > > https://docs.microsoft.com/en-us/windows/win32/api/winsock2/nf-winsock2-select > > > > "The select function has no effect on the persistence of socket events > > registered with WSAAsyncSelect or WSAEventSelect." > > > > That sounds to me like current usage in QEMU codes does not have a problem. > > Yes, but WSAEnumNetworkEvents has effects. Thus, it might reset the > events before > select detects them no ?
Even if the event *handle* is reset, select can still detect the network event has happened. I think the above MSDN statement is trying to explain this. > Moreover, I'm not sure what they mean by "persistence of WSAEventSelect". > Does that mean that select doesn't change the events wanted to be > notified as set > by WSAEventSelect or that select isn't resetting the events once > retrieved, as you > imply ? My understanding is that select just determines the socket status neither from the event handle associated with the socket, nor does it change anything on the event handle. > Moreover, the current code is creating the event object with > CreateEvent(NULL, FALSE, FALSE, NULL). The first FALSE implies that we want > an auto-reset. The MSDN doc says: > > "Boolean that specifies whether a manual-reset or auto-reset event > object is created. > If TRUE, then you must use the ResetEvent function to manually reset > the state to > nonsignaled. If FALSE, the system automatically resets the state to > nonsignaled > after a single waiting thread has been released." Yes, I think this is a bug however when I changed to CreateEvent(NULL, TRUE, FALSE, NULL) or WSACreateEvent() there are still "Broken pipe" errors ... > However, I'm not sure if I understand correctly what "after a single > waiting thread > has been released." signified. Is the reset occuring after recv() or > after select or > WSAEnumNetworkEvents calls ? AFAIR, I've tried to move to manual-reset with > the current qemu but it doesn't change anything. The same here :( > > > > Sadly, I didn't find any way to adjust the current code using select to > > > avoid that. As select and any cleaner (ever WSAEnumNetworkEvents or > > > WSEResetEvent) cannot be called in an atomic way, it seems that events > > > can always be received between the reset and the select. At least, all > > > my attempts failed. > > > > According to MSDN: > > https://docs.microsoft.com/en-us/windows/win32/api/winsock2/nf-winsock2-wsaeventselect > > > > "Having successfully recorded the occurrence of the network event (by > > setting the corresponding bit in the internal network event record) > > and signaled the associated event object, no further actions are taken > > for that network event until the application makes the function call > > that implicitly reenables the setting of that network event and > > signaling of the associated event object." > > > > So events will be kept unsignaled after they are signaled, until the > > reenable routine is called. For example, recv() for the FD_READ event. > > In my understanding, WSAEnumNetworkEvents states the opposite: > https://docs.microsoft.com/en-us/windows/win32/api/winsock2/nf-winsock2-wsaenumnetworkevents > > "The Windows Sockets provider guarantees that the operations of copying > the network event record, clearing it and resetting any associated event > object are atomic, such that the next occurrence of a nominated network > event will cause the event object to become set." My interpretation of this does not conflict with the WSAEventSelect(). I think they are just two things. > > > > The only solution I've found is to move completely to the Windows API > > > and stop calling select. This, however, needs some tricks. In Particular, > > > we > > > need to "remove" the support of GSources having only G_IO_HUP. > > > This is already kind of done as we currently don't detect them in > > > qio_channel_socket_source_check anyway. > > > This is mandatory because of the two GSources created on the same socket. > > > IIRC, the first one (I'll call it the master GSource) is created during > > > the initialization of the channel-socket by update_ioc_handlers and will > > > have only G_IO_HUP to catch up. > > > The second one is created during the "prepare" phase of the first one, > > > in io_watch_poll_prepare. This one will have all the events we want > > > to pull (G_IO_IN here). > > > As they are referring to the same socket, the Windows API cannot know > > > on which GSources we want to catch which events. The solution is then > > > > I think Windows knows which events to catch. As WSAEventSelect() in > > channel-watch.c::qio_channel_create_socket_watch() tells Windows which > > event it should monitor. > > > > Both the "master" and "child" GSources are watching on the same socket > > with G_IO_IN condition (see qio_channel_create_socket_watch), and GLib > > is smart enough to merge these two resources into one in the event > > handle array which is passed to WaitForMultipleObjectsEx() in > > g_poll(). > > I've forgotten to mention it. But the current code only fails with newer glib > versions. I wasn't able to test all of them but it's working with 2.54 and > starts failing with versions after 2.60. > The qemu master isn't supporting 2.54 anymore. Thus I've tested that with > our internal qemu-6 which runs with 2.54. When upgrading glib, it starts > behaving like our issue. > Sadly, I wasn't able to test with glib-2.56/2.58 nor to find anything relevant > in glib code which would explain our issue. (But I wasn't aware of the > two GSources at that time so maybe it's worth investigating again). > Ah, good to know glib version matters. > > I checked your patch, what you did seems to be something one would > > naturally write, but what is currently in the QEMU sources seems to be > > written intentionally. > > > > +Paolo Bonzini , you are the one who implemented the socket watch on > > Windows. Could you please help analyze this issue? > > > > > to avoid WSAEnumNetworkEvents for the master GSource which only has > > > G_IO_HUP (or for any GSource having only that). > > > As I said above, the current code doesn't do anything with it anyway. > > > So, IMO, it's safe to do so. > > > > > > I'll send you my patch attached. I was planning to send it in the > > > following > > > weeks anyway. I was just waiting to be sure everything looks fine on our > > > CI. Feel free to test and modify it if needed. > > > > I tested your patch. Unfortunately there is still one test case > > (migration-test.exe) throwing up the "Broken pipe" message. > > I must say I didn't fully test it against qemu testsuite yet. Maybe there are > some refinements to be done. "Broken pipe" might be linked to the missing > G_IO_HUP support. > > > Can you test my patch instead to see if your gdb issue can be fixed? > > Yeah sure. I'll try to do it this afternoon. Thanks! Regards, Bin