Hi Piotr,

Though it's not clear what the root cause of this problem is, a workaround has been provided. Great to see it works.
-- Tim

On Sun, 2008-12-14 at 14:00 +0100, Piotr Jasiukajtis wrote:
> Hi,
>
> Your putback rev. 8349 fixes this issue on my laptop.
>
> Tim Chen wrote:
> > Hi Piotr,
> > Thanks for your help. I have also done some investigation on this
> > problem, and found that it appears right after the putback of
> > changeset 8048. The nightly build of 11/09 is OK; that of 11/10 is broken.
> >
> > The putback log of changeset 8048 is here:
> > http://dlc.sun.com/osol/on/downloads/20081117/on-changelog-20081110.html
> > (this page is long; search for '6565503' and '6311743' to locate the right
> > changeset).
> >
> > So, I am forwarding this problem to people who have inside knowledge
> > of this area to get help.
> > This problem may not have been introduced by the putback of 8048; it may be a
> > problem between doorfs and checkpoint resume. The putback of 8048 may just
> > have uncovered it (please correct me if I am wrong).
> >
> > The problem is like this:
> > On resume, the iwk driver initiates a state change (RUN -> INIT) to notify
> > the upper-layer kernel module (net80211) of a link-down event:
> > http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/io/iwk/iwk2.c#477
> >
> > In turn, net80211 issues a door upcall into a user-land daemon
> > (wpad). net80211 starts a timer with a zero expiry interval through
> > timeout() to do this:
> > http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/io/net80211/net80211.c#ieee80211_notify
> > http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/io/net80211/net80211.c#ieee80211_event_thread
> >
> > The timer handler issues the door upcall. The door upcall may hold some
> > dispatch/thread locks and manipulate thread structures:
> > http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/io/net80211/net80211.c#168
> > http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/fs/doorfs/door_sys.c#door_upcall
> >
> > At the same time, the checkpoint resume process goes on (to resume
> > other devices, enable more CPUs, and start user threads):
> > http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/cpr/cpr_main.c#1110
> > http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/cpr/cpr_main.c#1216
> >
> > Finally, the system fails to resume. No messages are output (even those
> > produced by prom_printf), the serial console does not respond, and kmdb
> > can't be activated.
> > <<
> >
> > Hi door and CPR experts,
> > Is there any possibility that door upcalls made while the system is
> > resuming end up in a deadlock?
> >
> > PS:
> > Two workarounds are available:
> > 1. Change iwk to notify net80211 of the link-down event while the system
> >    is suspending (rather than resuming).
> > 2. Add a delay of 1 tick (or more) right after here:
> >    http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/io/iwk/iwk2.c#477
> >    Delays added elsewhere in iwk didn't help.
> >
> > Thanks,
> > Tim
> >
> > On Tue, 2008-11-18 at 01:06 +0100, Piotr Jasiukajtis wrote:
> >> Hi,
> >>
> >> I have a build at revision 8127.
> >> Suspend/resume works only when I don't use wireless.
> >> It seems to be broken when iwk0 is plumbed. It doesn't resume.
> >>
> >> Is this a known issue? I can't find any related bugs.