Re: [HelenOS-devel] Deadlock while booting
Hi,

On 05.11., Jakub Jermář wrote:
> On 10/27/2017 12:08 AM, Ondřej Hlavatý wrote:
> > I started to experience deadlock while booting HelenOS. It does not
> > happen every time, and when I add some debug prints, the deadlock
> > disappears completely.
> >
> > The issue starts at ps2mouse driver, which adds mouse function from its
> > device_add operation. This remote call goes all the way to fun_online,
> > in which it is holding the writelock (blocking other drivers) and,
> > because the function is exposed, probably waiting inside
> > loc_register_tree_function, respectively in loc_service_register.
> >
> > Looking at this function, it seems to be very similar to what Jakub
> > Jermar describes at:
> >
> > http://jakubsuniversalblog.blogspot.cz/2011/09/debugging-file-system-hang-using.html?q=deadlock
> >
> > As far as I understand the issue, this shall not be the case - this is
> > the sender, not the receiver, and there is no cycle of messages waiting
> > for themselves. But after swapping the order of exch release and waiting
> > for answer, the deadlock no longer occurs.
> >
> > Can someone please confirm, that the order there is correct?
>
> I don't think that changing the mutual ordering of loc_exchange_end()
> and async_wait_for() will fix this on its own. See #700 for details.

Everything you wrote there makes sense to me.

> Btw, when you were testing your fix, did you change the order only in
> loc_service_register() or also in other places? I would be actually very
> surprised if this changed anything because in all the deadlocks for
> which we have some data in #700 (i.e. your .svg and my log files), the
> LOC_SERVICE_REGISTER was the second request, not the first. The first
> must have been LOC_SERVICE_ADD_TO_CAT. locsrv did not even start
> processing the second one.

Well, the deadlock occurred quite randomly, so it is possible that it
just didn't show up because of some timing issues of previous requests.
It often happened that the deadlock disappeared after adding a debug
print somewhere completely unrelated. Also, it wasn't clear to me at all
why the swap should fix the deadlock; that's why I asked instead of
sending a patch.

OH
___
HelenOS-devel mailing list
HelenOS-devel@lists.modry.cz
http://lists.modry.cz/listinfo/helenos-devel
Re: [HelenOS-devel] Deadlock while booting
On 10/27/2017 12:08 AM, Ondřej Hlavatý wrote:
> I started to experience deadlock while booting HelenOS. It does not
> happen every time, and when I add some debug prints, the deadlock
> disappears completely.
>
> The issue starts at ps2mouse driver, which adds mouse function from its
> device_add operation. This remote call goes all the way to fun_online,
> in which it is holding the writelock (blocking other drivers) and,
> because the function is exposed, probably waiting inside
> loc_register_tree_function, respectively in loc_service_register.
>
> Looking at this function, it seems to be very similar to what Jakub
> Jermar describes at:
>
> http://jakubsuniversalblog.blogspot.cz/2011/09/debugging-file-system-hang-using.html?q=deadlock
>
> As far as I understand the issue, this shall not be the case - this is
> the sender, not the receiver, and there is no cycle of messages waiting
> for themselves. But after swapping the order of exch release and waiting
> for answer, the deadlock no longer occurs.
>
> Can someone please confirm, that the order there is correct?

I don't think that changing the mutual ordering of loc_exchange_end()
and async_wait_for() will fix this on its own. See #700 for details.

Btw, when you were testing your fix, did you change the order only in
loc_service_register() or also in other places? I would be actually very
surprised if this changed anything because in all the deadlocks for
which we have some data in #700 (i.e. your .svg and my log files), the
LOC_SERVICE_REGISTER was the second request, not the first. The first
must have been LOC_SERVICE_ADD_TO_CAT. locsrv did not even start
processing the second one.

Jakub
Re: [HelenOS-devel] Deadlock while booting
On 10/27/2017 12:08 AM, Ondřej Hlavatý wrote:
> I started to experience deadlock while booting HelenOS. It does not
> happen every time, and when I add some debug prints, the deadlock
> disappears completely.
>
> The issue starts at ps2mouse driver, which adds mouse function from its
> device_add operation. This remote call goes all the way to fun_online,
> in which it is holding the writelock (blocking other drivers) and,
> because the function is exposed, probably waiting inside
> loc_register_tree_function, respectively in loc_service_register.
>
> Looking at this function, it seems to be very similar to what Jakub
> Jermar describes at:
>
> http://jakubsuniversalblog.blogspot.cz/2011/09/debugging-file-system-hang-using.html?q=deadlock
>
> As far as I understand the issue, this shall not be the case - this is
> the sender, not the receiver, and there is no cycle of messages waiting
> for themselves. But after swapping the order of exch release and waiting
> for answer, the deadlock no longer occurs.
>
> Can someone please confirm, that the order there is correct?

For the record, here is my and Ondrej's conversation from IRC:

can you see some active calls from locsrv to devman?
tbf i cannot reproduce it anymore
but i think the only active calls were 4 to ethip from locsrv?
there were some chain of stuck messages, ending at ns
but ns wasn't sending anything that might be important
ns was recently rewritten to use async framework
another interesting thing is that unlike in my blog, the connection between devman and locsrv uses only one phone
but I still fail to see anything that would prevent receiving the answer to LOC_SERVICE_REGISTER forever
it would be good if you could collect the ipc for all interesting task ID's

I also make an observation that until the LOC_SERVICE_REGISTER call is
answered, locsrv cannot start processing another call because there is
only one fibril and it is currently busy processing LOC_SERVICE_REGISTER.

J.