Hi Piotr,

Though the root cause of this problem is still not clear, a workaround is
provided. Great to see it works.

--
Tim

On Sun, 2008-12-14 at 14:00 +0100, Piotr Jasiukajtis wrote:
> Hi,
> 
> Your putback rev. 8349 fixes this issue on my laptop.
> 
> Tim Chen wrote:
> > Hi Piotr,
> >   Thanks for your help. I have also investigated this problem and found
> > that it appears right after the putback of changeset 8048. The nightly
> > build of 11/09 is OK; that of 11/10 is broken.
> > 
> >   The putback log of changeset 8048 is here:
> > http://dlc.sun.com/osol/on/downloads/20081117/on-changelog-20081110.html
> > (this page is long; search for '6565503' and '6311743' to locate the right
> > changeset).
> > 
> >   So, I am forwarding this problem to people who have inside knowledge
> > of this area to get help.
> >   This problem may not have been introduced by the putback of 8048; it may
> > instead be a problem between doorfs and check-point resume that the
> > putback of 8048 merely uncovered (please correct me if I am wrong).
> > 
> >   The problem is like this:
> >   On resume, the iwk driver initiates a state change (RUN -> INIT) to
> > notify the upper-layer kernel module (net80211) of a link-down event:
> > http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/io/iwk/iwk2.c#477
> > 
> >   In turn, net80211 issues a door upcall into a user-land daemon
> > (wpad). To do this, net80211 starts a timer with a zero expiry interval
> > through timeout():
> > http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/io/net80211/net80211.c#ieee80211_notify
> > http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/io/net80211/net80211.c#ieee80211_event_thread
> > 
> >   The timer handler issues the door upcall. The door upcall may hold some
> > dispatch/thread locks and manipulate thread structures:
> > http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/io/net80211/net80211.c#168
> > http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/fs/doorfs/door_sys.c#door_upcall
> > 
> >   At the same time, the check-point resume process continues (resuming
> > other devices, enabling more CPUs, and starting user threads):
> > http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/cpr/cpr_main.c#1110
> > http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/cpr/cpr_main.c#1216
> > 
> >   Finally, the system fails to resume. There is no message output (not even
> > messages produced by prom_printf), the serial console does not respond, and
> > kmdb can't be activated.
> > <<
> > 
> > Hi door and CPR experts,
> >   Is there any possibility that door upcalls made while the system is
> > resuming end up in a deadlock?
> > 
> > PS:
> >   Two workarounds are available:
> >   1. Change iwk to notify net80211 of the link-down event while the system
> > is suspending (rather than resuming).
> >   2. Add a delay of 1 tick (or more) right after this point:
> > http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/io/iwk/iwk2.c#477
> >   Delays added elsewhere in iwk didn't help.
> > 
> > Thanks,
> > Tim
> > 
> > On Tue, 2008-11-18 at 01:06 +0100, Piotr Jasiukajtis wrote:
> >> Hi,
> >>
> >> I am running build revision 8127.
> >> Suspend/resume works only when I don't use wireless.
> >> It seems to be broken when iwk0 is plumbed: the system doesn't resume.
> >>
> >> Is this a known issue? I can't find any related bugs.
> >>
> > 
> 
> 

