On 10/27/2017 12:08 AM, Ondřej Hlavatý wrote:
> I started to experience deadlock while booting HelenOS. It does not
> happen every time, and when I add some debug prints, the deadlock
> disappears completely.
>
> The issue starts at ps2mouse driver, which adds mouse function from its
> device_add operation. This remote call goes all the way to fun_online,
> in which it is holding the writelock (blocking other drivers) and,
> because the function is exposed, probably waiting inside
> loc_register_tree_function, respectively in loc_service_register.
>
> Looking at this function, it seems to be very similar to what Jakub
> Jermar describes at:
>
> http://jakubsuniversalblog.blogspot.cz/2011/09/debugging-file-system-hang-using.html?q=deadlock
>
> As far as I understand the issue, this shall not be the case - this is
> the sender, not the receiver, and there is no cycle of messages waiting
> for themselves. But after swapping the order of exch release and waiting
> for answer, the deadlock no longer occurs.
>
> Can someone please confirm, that the order there is correct?

I don't think that changing the mutual ordering of loc_exchange_end()
and async_wait_for() will fix this on its own. See #700 for details.

Btw, when you were testing your fix, did you change the order only in
loc_service_register() or also in other places?

I would actually be very surprised if this changed anything, because in
all the deadlocks for which we have some data in #700 (i.e. your .svg
and my log files), LOC_SERVICE_REGISTER was the second request, not the
first. The first must have been LOC_SERVICE_ADD_TO_CAT. locsrv did not
even start processing the second one.

Jakub