Re: [E1000-devel] Need help with igb driver suspend crash issue

Keller, Jacob E Wed, 11 May 2016 10:54:12 -0700


From: vidya sagar [mailto:[email protected]]
Sent: Wednesday, May 11, 2016 10:44 AM
To: Keller, Jacob E <[email protected]>
Cc: [email protected]; [email protected]
Subject: Re: [E1000-devel] Need help with igb driver suspend crash issue


I added the dump_stack() myself to see the full call flow, because the error 
print "PCIe link lost, device now detached" comes from igb_rd32(), which is 
called from many places. Aren't we supposed to cancel the delayed work for 
igb_ptp_overflow_check() when the system goes into suspend (and reschedule it 
when the system resumes)?
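
To illustrate the ordering being asked about, here is a minimal user-space 
sketch. It is only a model: every name in it (model_nic, model_rd32, 
model_suspend and so on) is hypothetical and not an igb symbol, and it assumes 
that a register read from a powered-down device returns all F's. In a real 
driver the cancel step would presumably use the kernel workqueue call 
cancel_delayed_work_sync(), but whether and where igb should do that is 
exactly the open question here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space model. None of these names are igb symbols. */
struct model_nic {
    bool powered;        /* false once the device has entered D3hot */
    bool work_pending;   /* stands in for the delayed overflow-check work */
    bool detached;       /* set when a register read returns all F's */
    uint32_t systim;     /* stands in for the timer register value */
};

/* Register read: an unpowered device answers with all F's (assumption). */
uint32_t model_rd32(struct model_nic *nic)
{
    assert(nic != NULL);
    uint32_t val = nic->powered ? nic->systim : UINT32_MAX;
    if (val == UINT32_MAX)
        nic->detached = true;   /* the model's "link lost" condition */
    return val;
}

/* Periodic work item: reads the timer only if it is still pending. */
void model_overflow_check(struct model_nic *nic)
{
    if (nic->work_pending)
        (void)model_rd32(nic);
}

/* Suspend: optionally cancel the work before powering the device down. */
void model_suspend(struct model_nic *nic, bool cancel_work)
{
    if (cancel_work)
        nic->work_pending = false;
    nic->powered = false;
}

/* Resume: power the device back up and re-arm the periodic work. */
void model_resume(struct model_nic *nic)
{
    nic->powered = true;
    nic->work_pending = true;
}

/* Returns true if the work item read a powered-down device after suspend. */
bool model_detached_after_suspend(bool cancel_work)
{
    struct model_nic nic = { .powered = true, .work_pending = true,
                             .detached = false, .systim = 1234 };
    model_suspend(&nic, cancel_work);
    model_overflow_check(&nic);   /* the work fires after suspend */
    model_resume(&nic);
    return nic.detached;
}
```

In this model, model_detached_after_suspend(false) reports the detach 
condition because the work item still runs after power-down, while 
model_detached_after_suspend(true) does not, since the work was cancelled 
first.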

On Wed, May 11, 2016 at 9:54 PM, Keller, Jacob E <[email protected]> wrote:
Hi,

> -----Original Message-----
> From: vidya sagar [mailto:[email protected]]
> Sent: Wednesday, May 11, 2016 3:25 AM
> To: [email protected]; linuxptp-[email protected]
> Subject: Re: [E1000-devel] Need help with igb driver suspend crash issue
>
> <<< Including [email protected] >>>
>
> On Wed, May 11, 2016 at 3:51 PM, vidya sagar <[email protected]> wrote:
>
> > Hi,
> > I'm using an Intel I350 NIC (igb driver) on one of our ARM-based platforms.
> > While suspending the system, we sometimes see "igb 0000:01:00.0 eth1: PCIe
> > link lost, device now detached" in the log, and the subsequent resume
> > causes the system to crash. After digging through the code (BTW, I'm using
> > the kernel-3.18 release), it looks like this message comes from the
> > following call flow, which executes after igb_suspend() is called (I
> > confirmed this with debug prints):
> >
> > [10846.434381] [<ffffffc000089ce4>] dump_backtrace+0x0/0xf8
> > [10846.434386] [<ffffffc000089ea0>] show_stack+0x10/0x1c
> > [10846.434393] [<ffffffc000bc3b70>] dump_stack+0x80/0xc4
> > [10846.434397] [<ffffffc000613d3c>] igb_rd32+0xb0/0x1a8
> > [10846.434400] [<ffffffc00062eb0c>] igb_ptp_read_82580+0x18/0x48
> > [10846.434407] [<ffffffc000106e6c>] timecounter_read+0x1c/0x60
> > [10846.434410] [<ffffffc00062f338>] igb_ptp_gettime_82576+0x2c/0x88
> > [10846.434413] [<ffffffc00062f41c>] igb_ptp_overflow_check+0x1c/0x58
> > [10846.434419] [<ffffffc0000ba584>] process_one_work+0x154/0x414
> > [10846.434424] [<ffffffc0000bb338>] worker_thread+0x13c/0x4e4
> > [10846.434428] [<ffffffc0000bfc4c>] kthread+0xf8/0x110
> >
> > It looks like reading the timer registers returned all F's because the
> > device is already in the D3hot state.
> > Is my understanding correct? Is there a patch available to fix this
> > issue?
> > Let me know if more information is needed.
> >
This may be an ordering bug in the suspend path, where we try to read 
registers too late. Is that stack trace from the actual crash, or did you add 
the dump_stack() yourself?

Thanks,
Jake

> > Thanks,
> > Vidya Sagar
> >
