On 05/07/2016 15:33, Mason wrote: > I was testing suspend/resume sequences where the suspend operation > fails and returns without having suspended the platform. > > # echo mem > /sys/power/state > [ 90.322264] PM: Syncing filesystems ... done. > [ 90.328758] Freezing user space processes ... (elapsed 0.001 seconds) done. > [ 90.337092] Double checking all user space processes after OOM killer > disable... (elapsed 0.000 seconds) > [ 90.346765] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) > done. > [ 90.355357] Suspending console(s) (use no_console_suspend to debug) > [ 90.364590] PM: suspend of devices complete after 2.068 msecs > [ 90.365554] PM: late suspend of devices complete after 0.954 msecs > [ 90.366223] PM: noirq suspend of devices complete after 0.662 msecs > [ 90.366227] Disabling non-boot CPUs ... > [ 90.379004] CPU1: shutdown > [ 90.412661] Enabling non-boot CPUs ... > [ 90.450385] CPU1 is up > [ 90.450979] PM: noirq resume of devices complete after 0.584 msecs > [ 90.451672] PM: early resume of devices complete after 0.667 msecs > [ 90.453149] nb8800 26000.ethernet eth0: Link is Down > [ 90.453264] PM: resume of devices complete after 1.583 msecs > [ 90.508180] Restarting tasks ... done. > -sh: echo: write error: Input/output error > [ 93.860411] nb8800 26000.ethernet eth0: Link is Up - 1Gbps/Full - flow > control rx/tx > > (The error message is expected, as my suspend routine returns -EIO > on failure.) > > I left the system to idle at the prompt; then 5 minutes later, > the system printed the following trace. > > [ 400.718491] ------------[ cut here ]------------ > [ 400.723175] WARNING: CPU: 0 PID: 0 at net/ipv4/af_inet.c:155 > inet_sock_destruct+0x1c4/0x1dc > [ 400.731582] Modules linked in: > [ 400.734689] CPU: 0 PID: 0 Comm: swapper/0 Not tainted > 4.7.0-rc6-00010-gd07031bdc433 #1 > [ 400.742646] Hardware name: Sigma Tango DT > [ 400.746671] Backtrace: > [ 400.749141] [<c010b974>] (dump_backtrace) from [<c010bb70>] > (show_stack+0x18/0x1c) > [ 400.756747] r7:60000113 r6:c080ea84 r5:00000000 r4:c080ea84 > [ 400.762454] [<c010bb58>] (show_stack) from [<c02e9fe4>] > (dump_stack+0x80/0x94) > [ 400.769722] [<c02e9f64>] (dump_stack) from [<c011bfc8>] (__warn+0xec/0x104) > [ 400.776717] r7:00000009 r6:c05e3fbc r5:00000000 r4:00000000 > [ 400.782417] [<c011bedc>] (__warn) from [<c011c098>] > (warn_slowpath_null+0x28/0x30) > [ 400.790022] r9:dfbdd4e0 r8:0000000a r7:c0801de8 r6:df6f9514 r5:df5df144 > r4:df5df040 > [ 400.797825] [<c011c070>] (warn_slowpath_null) from [<c0463c04>] > (inet_sock_destruct+0x1c4/0x1dc) > [ 400.806661] [<c0463a40>] (inet_sock_destruct) from [<c03e9c60>] > (__sk_destruct+0x28/0xe0) > [ 400.814878] r7:c0801de8 r6:df6f9514 r5:df5df040 r4:df5df1ec > [ 400.820584] [<c03e9c38>] (__sk_destruct) from [<c016f230>] > (rcu_process_callbacks+0x488/0x59c) > [ 400.829237] r5:00000000 r4:00000000 > [ 400.832836] [<c016eda8>] (rcu_process_callbacks) from [<c01207e4>] > (__do_softirq+0x138/0x264) > [ 400.841402] r10:c08020a0 r9:40000001 r8:00000101 r7:c0800000 r6:c08020a4 > r5:00000009 > [ 400.849285] r4:00000000 > [ 400.851829] [<c01206ac>] (__do_softirq) from [<c0120c04>] > (irq_exit+0xc8/0x104) > [ 400.859172] r10:c0801f10 r9:df402400 r8:00000001 r7:00000000 r6:00000013 > r5:00000000 > [ 400.867053] r4:c0735428 > [ 400.869601] [<c0120b3c>] (irq_exit) from [<c0162610>] > (__handle_domain_irq+0x88/0xf4) > [ 400.877473] [<c0162588>] (__handle_domain_irq) from [<c01014ac>] > (gic_handle_irq+0x50/0x94) > [ 400.885865] r10:dfffcdc0 r9:e0803100 r8:e0802100 r7:c0801f10 r6:e080210c > r5:c080277c > [ 400.893747] r4:c080eca0 r3:c0801f10 > [ 400.897342] [<c010145c>] (gic_handle_irq) from [<c010c694>] > (__irq_svc+0x54/0x90) > [ 400.904861] Exception stack(0xc0801f10 to 0xc0801f58) > [ 400.909936] 1f00: 00000000 00000000 > 0000826a c0117c80 > [ 400.918156] 1f20: c0800000 c08024f8 c0802494 c081e2d6 c05b954c c07268c0 > dfffcdc0 c0801f6c > [ 400.926376] 1f40: c0801f70 c0801f60 c01086b0 c01086b4 60000013 ffffffff > [ 400.933020] r9:c07268c0 r8:c05b954c r7:c0801f44 r6:ffffffff r5:60000013 > r4:c01086b4 > [ 400.940826] [<c0108674>] (arch_cpu_idle) from [<c0155f54>] > (default_idle_call+0x28/0x34) > [ 400.948960] [<c0155f2c>] (default_idle_call) from [<c0156088>] > (cpu_startup_entry+0x128/0x17c) > [ 400.957620] [<c0155f60>] (cpu_startup_entry) from [<c04a3f54>] > (rest_init+0x8c/0x90) > [ 400.965400] r7:ffffffff r4:00000002 > [ 400.969005] [<c04a3ec8>] (rest_init) from [<c0700cb4>] > (start_kernel+0x310/0x31c) > [ 400.976522] r5:c081e4c0 r4:00000001 > [ 400.980121] [<c07009a4>] (start_kernel) from [<8000807c>] (0x8000807c) > [ 400.986716] ---[ end trace f8deb50d1b3d3c7a ]---
NB: The warning shows up 310 seconds after the suspend attempt. I rebooted, tried the same operation, and hit the same warning still 310 seconds later: # echo mem > /sys/power/state [ 25.665905] PM: Syncing filesystems ... done. [ 25.672102] Freezing user space processes ... (elapsed 0.001 seconds) done. [ 25.680807] Double checking all user space processes after OOM killer disable... (elapsed 0.000 seconds) [ 25.690494] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done. [ 25.699118] Suspending console(s) (use no_console_suspend to debug) [ 25.707899] PM: suspend of devices complete after 1.639 msecs [ 25.708796] PM: late suspend of devices complete after 0.887 msecs [ 25.709460] PM: noirq suspend of devices complete after 0.657 msecs [ 25.709465] Disabling non-boot CPUs ... [ 25.729045] CPU1: shutdown [ 25.762704] Enabling non-boot CPUs ... [ 25.800416] CPU1 is up [ 25.801024] PM: noirq resume of devices complete after 0.595 msecs [ 25.801730] PM: early resume of devices complete after 0.678 msecs [ 25.803194] nb8800 26000.ethernet eth0: Link is Down [ 25.803311] PM: resume of devices complete after 1.571 msecs [ 25.858245] Restarting tasks ... done. -sh: echo: write error: Input/output error [ 29.186902] nb8800 26000.ethernet eth0: Link is Up - 1Gbps/Full - flow control rx/tx [ 335.865192] ------------[ cut here ]------------ [ 335.869875] WARNING: CPU: 0 PID: 0 at net/ipv4/af_inet.c:155 inet_sock_destruct+0x1c4/0x1dc [ 335.878284] Modules linked in: [ 335.881366] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.7.0-rc6-00010-gd07031bdc433 #1 [ 335.889321] Hardware name: Sigma Tango DT [ 335.893346] Backtrace: [ 335.895815] [<c010b974>] (dump_backtrace) from [<c010bb70>] (show_stack+0x18/0x1c) [ 335.903420] r7:60000113 r6:c080ea84 r5:00000000 r4:c080ea84 [ 335.909125] [<c010bb58>] (show_stack) from [<c02e9fe4>] (dump_stack+0x80/0x94) [ 335.916391] [<c02e9f64>] (dump_stack) from [<c011bfc8>] (__warn+0xec/0x104) [ 335.923386] r7:00000009 r6:c05e3fbc r5:00000000 r4:00000000 [ 335.929086] [<c011bedc>] (__warn) from [<c011c098>] (warn_slowpath_null+0x28/0x30) [ 335.936691] r9:dfbdd4e0 r8:0000000a r7:c0801de8 r6:df75ec54 r5:df5af3c4 r4:df5af2c0 [ 335.944491] [<c011c070>] (warn_slowpath_null) from [<c0463c04>] (inet_sock_destruct+0x1c4/0x1dc) [ 335.953326] [<c0463a40>] (inet_sock_destruct) from [<c03e9c60>] (__sk_destruct+0x28/0xe0) [ 335.961542] r7:c0801de8 r6:df75ec54 r5:df5af2c0 r4:df5af46c [ 335.967248] [<c03e9c38>] (__sk_destruct) from [<c016f230>] (rcu_process_callbacks+0x488/0x59c) [ 335.975901] r5:00000000 r4:00000000 [ 335.979499] [<c016eda8>] (rcu_process_callbacks) from [<c01207e4>] (__do_softirq+0x138/0x264) [ 335.988065] r10:c08020a0 r9:40000001 r8:00000101 r7:c0800000 r6:c08020a4 r5:00000009 [ 335.995947] r4:00000000 [ 335.998492] [<c01206ac>] (__do_softirq) from [<c0120c04>] (irq_exit+0xc8/0x104) [ 336.005835] r10:c0801f10 r9:df402400 r8:00000001 r7:00000000 r6:00000013 r5:00000000 [ 336.013716] r4:c0735428 [ 336.016264] [<c0120b3c>] (irq_exit) from [<c0162610>] (__handle_domain_irq+0x88/0xf4) [ 336.024136] [<c0162588>] (__handle_domain_irq) from [<c01014ac>] (gic_handle_irq+0x50/0x94) [ 336.032526] r10:dfffcdc0 r9:e0803100 r8:e0802100 r7:c0801f10 r6:e080210c r5:c080277c [ 336.040406] r4:c080eca0 r3:c0801f10 [ 336.044001] [<c010145c>] (gic_handle_irq) from [<c010c694>] (__irq_svc+0x54/0x90) [ 336.051520] Exception stack(0xc0801f10 to 0xc0801f58) [ 336.056594] 1f00: 00000000 00000000 000079b2 c0117c80 [ 336.064815] 1f20: c0800000 c08024f8 c0802494 c081e2d6 c05b954c c07268c0 dfffcdc0 c0801f6c [ 336.073034] 1f40: c0801f70 c0801f60 c01086b0 c01086b4 60000013 ffffffff [ 336.079678] r9:c07268c0 r8:c05b954c r7:c0801f44 r6:ffffffff r5:60000013 r4:c01086b4 [ 336.087483] [<c0108674>] (arch_cpu_idle) from [<c0155f54>] (default_idle_call+0x28/0x34) [ 336.095616] [<c0155f2c>] (default_idle_call) from [<c0156088>] (cpu_startup_entry+0x128/0x17c) [ 336.104277] [<c0155f60>] (cpu_startup_entry) from [<c04a3f54>] (rest_init+0x8c/0x90) [ 336.112057] r7:ffffffff r4:00000002 [ 336.115662] [<c04a3ec8>] (rest_init) from [<c0700cb4>] (start_kernel+0x310/0x31c) [ 336.123180] r5:c081e4c0 r4:00000001 [ 336.126777] [<c07009a4>] (start_kernel) from [<8000807c>] (0x8000807c) [ 336.133349] ---[ end trace d6b09977089e89b4 ]--- Does this 310 second lag ring a bell for anyone? Regards.