Re: [PATCH v5] power: suspend: Move dpm_watchdog to suspend.c and enhance it

2021-01-27 Thread Greg Kroah-Hartman
On Fri, Jan 08, 2021 at 01:51:11PM +0800, Joseph Jang wrote:
> Since dpm_watchdog just cover two functions __device_suspend() and
> device_resume(), we proposed to move it to core power suspend.c to extend
> its coverage and monitor more devices suspend hand issues.
> 
> We propose to use new name suspend watchdog and new timeout handler to
> cover more sleep hang issues. The new timeout handler will dump disk
> sleep task call trace at first round timeout and trigger kernel panic
> at second round timeout.
> The default timer for each round is defined in
> CONFIG_PM_SUSPEND_WATCHDOG_TIMEOUT.
> 
> Signed-off-by: Joseph Jang 
> ---
> Changes since v4:
>  - Change #define PM_SUSPEND_WATCHDOG to CONFIG_PM_SUSPEND_WATCHDOG in 
> suspend_watchdog.c
>to make sure we compile all suspend watchdog functions.
>  - Add suspend watchdog functions declared in suspend_watchdog.h to prevent 
> compile errors.
> 
> Changes since v3:
>  - Change the naming from sleep timer to suspend watchdog.
>  - Remove console_is_suspended() from console.h and printk.c
>  - Add static declaration for suspend_watchdog struct since we use it in 
> suspend.c only.
>  - Move suspend watchdog real logic to suspend_watchdog.c.
>  - Put the function prototypes in suspend_watchdog.h
>  - Use inline functions for function prototypes definition.
> 
> Changes since v2:
>  - Add kernel/power/suspend_timer.h in MAINTAINERS
>  - Share some corner cases that dpm_watchdog cannot cover.
>   Case#A: dpm_watchdog cannot cover suspend hang in suspend_enter().
>   File: kernel/power/suspend.c
>   int suspend_devices_and_enter(suspend_state_t state)
>   {
>... 
> 
>  suspend_test_start();
>  error = dpm_suspend_start(PMSG_SUSPEND);  <== dpm_watchdog will be 
> enabled/disabled in this fucntion ...
>  if (error) {
>  pr_err("Some devices failed to suspend, or early wake event 
> detected\n");
>  goto Recover_platform;
>  }
>  suspend_test_finish("suspend devices");
>  if (suspend_test(TEST_DEVICES))
>  goto Recover_platform;
> 
>  do {
>  error = suspend_enter(state, );  <== but we encounter 
> hang inside function suspend_enter() ...
>  } while (!error && !wakeup && platform_suspend_again(state));
> 
>   Case#B: dpm_watchdog cannot cover resume hang in dpm_resume_early() because 
> it enable/disable in device_resume().
>   Call trace:
>__switch_to+0x174/0x194
>__schedule+0xa60/0x1464
>__cancel_work_timer+0x120/0x234
>chg_pm_resume+0x2c/0xd8
>dpm_run_callback+0x27c/0x624
>device_resume_early+0x1e4/0x1f8
>dpm_resume_early+0x350/0x8f4
>suspend_devices_and_enter+0xffc/0x168c
>pm_suspend+0xb54/0xdac
> 
>   File: drivers/base/power/main.c.
>   static int device_resume(struct device *dev, pm_message_t state, bool async)
>   if (!dpm_wait_for_superior(dev, async))
>   goto Complete;
>   
>   dpm_watchdog_set(, dev);
>   device_lock(dev);
>   
>   /*
>... 
>  
>   Unlock:
>   device_unlock(dev);
>   dpm_watchdog_clear();
> 
>   Case#C: dpm_watchdog cannot cover suspend hang in ksys_sync().
>Call trace:
> __switch_to+0x1b0/0x1cc
> __schedule+0xac8/0x1314
> io_schedule+0x94/0xc8
> wait_on_page_bit+0x1f8/0x3a4
> __filemap_fdatawait_range+0x134/0x150
> sync_inodes_sb+0x368/0x584
> sync_inodes_one_sb+0x18/0x24
> iterate_supers+0x12c/0x2b8
> ksys_sync+0x48/0x12c
> enter_state+0x294/0x7bc
> pm_suspend+0x164/0x2a8
> 
>  - Explain some enhancement by following.
>   Q1: Why not use the existing one?
>   struct dpm_watchdog {
>   struct device   *dev;
>   struct task_struct  *tsk;
>   struct timer_list   timer;
>   };
>   
>   A1: In kernel/power/suspend.c, we don't have "struct device " because 
> suspend.c is for core PM instead of device PM.
>   So we propose to use sleep_timer struct.
>   
>   struct sleep_timer {
>   struct task_struct  *tsk;
>   struct timer_list   timer;
>   };
>   
>   
>   Q2: Why not use stack memory for timer struct?
>   static void dpm_watchdog_set(struct dpm_watchdog *wd, struct device *dev)
>   {
>   ...   
>   timer_setup_on_stack(timer, dpm_watchdog_handler, 0);  <== dpm_watchdog 
> use stack memory for timer struct.
>   /* use same timeout value for both suspend and resume */
>   timer->expires = jiffies + HZ * CONFIG_DPM_WATCHDOG_TIMEOUT;
>   add_timer(timer);
>   }
>   
>   A2: dpm_watchdog use stack memory has limitation.
>   It cannot support starting watchdog at end of function like s2idle_enter().
>   We cannot use stack memory for this case because the timer struct will be 
> free when go back to caller.
>   
>   File: kernel/power/suspend.c
>   static void s2idle_enter(void)
> {
>   trace_suspend_resume(TPS("machine_suspend"), PM_SUSPEND_TO_IDLE, 
> true);
>   
>   +   stop_sleep_timer();
>   +

[PATCH v5] power: suspend: Move dpm_watchdog to suspend.c and enhance it

2021-01-07 Thread Joseph Jang
Since dpm_watchdog just cover two functions __device_suspend() and
device_resume(), we proposed to move it to core power suspend.c to extend
its coverage and monitor more devices suspend hand issues.

We propose to use new name suspend watchdog and new timeout handler to
cover more sleep hang issues. The new timeout handler will dump disk
sleep task call trace at first round timeout and trigger kernel panic
at second round timeout.
The default timer for each round is defined in
CONFIG_PM_SUSPEND_WATCHDOG_TIMEOUT.

Signed-off-by: Joseph Jang 
---
Changes since v4:
 - Change #define PM_SUSPEND_WATCHDOG to CONFIG_PM_SUSPEND_WATCHDOG in 
suspend_watchdog.c
   to make sure we compile all suspend watchdog functions.
 - Add suspend watchdog functions declared in suspend_watchdog.h to prevent 
compile errors.

Changes since v3:
 - Change the naming from sleep timer to suspend watchdog.
 - Remove console_is_suspended() from console.h and printk.c
 - Add static declaration for suspend_watchdog struct since we use it in 
suspend.c only.
 - Move suspend watchdog real logic to suspend_watchdog.c.
 - Put the function prototypes in suspend_watchdog.h
 - Use inline functions for function prototypes definition.

Changes since v2:
 - Add kernel/power/suspend_timer.h in MAINTAINERS
 - Share some corner cases that dpm_watchdog cannot cover.
  Case#A: dpm_watchdog cannot cover suspend hang in suspend_enter().
  File: kernel/power/suspend.c
  int suspend_devices_and_enter(suspend_state_t state)
  {
   ... 

 suspend_test_start();
 error = dpm_suspend_start(PMSG_SUSPEND);  <== dpm_watchdog will be 
enabled/disabled in this fucntion ...
 if (error) {
 pr_err("Some devices failed to suspend, or early wake event 
detected\n");
 goto Recover_platform;
 }
 suspend_test_finish("suspend devices");
 if (suspend_test(TEST_DEVICES))
 goto Recover_platform;

 do {
 error = suspend_enter(state, );  <== but we encounter 
hang inside function suspend_enter() ...
 } while (!error && !wakeup && platform_suspend_again(state));

  Case#B: dpm_watchdog cannot cover resume hang in dpm_resume_early() because 
it enable/disable in device_resume().
  Call trace:
   __switch_to+0x174/0x194
   __schedule+0xa60/0x1464
   __cancel_work_timer+0x120/0x234
   chg_pm_resume+0x2c/0xd8
   dpm_run_callback+0x27c/0x624
   device_resume_early+0x1e4/0x1f8
   dpm_resume_early+0x350/0x8f4
   suspend_devices_and_enter+0xffc/0x168c
   pm_suspend+0xb54/0xdac

  File: drivers/base/power/main.c.
  static int device_resume(struct device *dev, pm_message_t state, bool async)
  if (!dpm_wait_for_superior(dev, async))
  goto Complete;
  
  dpm_watchdog_set(, dev);
  device_lock(dev);
  
  /*
   ... 
 
  Unlock:
  device_unlock(dev);
  dpm_watchdog_clear();

  Case#C: dpm_watchdog cannot cover suspend hang in ksys_sync().
   Call trace:
__switch_to+0x1b0/0x1cc
__schedule+0xac8/0x1314
io_schedule+0x94/0xc8
wait_on_page_bit+0x1f8/0x3a4
__filemap_fdatawait_range+0x134/0x150
sync_inodes_sb+0x368/0x584
sync_inodes_one_sb+0x18/0x24
iterate_supers+0x12c/0x2b8
ksys_sync+0x48/0x12c
enter_state+0x294/0x7bc
pm_suspend+0x164/0x2a8

 - Explain some enhancement by following.
  Q1: Why not use the existing one?
  struct dpm_watchdog {
  struct device   *dev;
  struct task_struct  *tsk;
  struct timer_list   timer;
  };
  
  A1: In kernel/power/suspend.c, we don't have "struct device " because 
suspend.c is for core PM instead of device PM.
  So we propose to use sleep_timer struct.
  
  struct sleep_timer {
  struct task_struct*tsk;
  struct timer_list timer;
  };
  
  
  Q2: Why not use stack memory for timer struct?
  static void dpm_watchdog_set(struct dpm_watchdog *wd, struct device *dev)
  {
  ...   
  timer_setup_on_stack(timer, dpm_watchdog_handler, 0);  <== dpm_watchdog 
use stack memory for timer struct.
  /* use same timeout value for both suspend and resume */
  timer->expires = jiffies + HZ * CONFIG_DPM_WATCHDOG_TIMEOUT;
  add_timer(timer);
  }
  
  A2: dpm_watchdog use stack memory has limitation.
  It cannot support starting watchdog at end of function like s2idle_enter().
  We cannot use stack memory for this case because the timer struct will be 
free when go back to caller.
  
  File: kernel/power/suspend.c
  static void s2idle_enter(void)
{
  trace_suspend_resume(TPS("machine_suspend"), PM_SUSPEND_TO_IDLE, 
true);
  
  +   stop_sleep_timer();
  +
  raw_spin_lock_irq(_lock);
  if (pm_wakeup_pending())
  goto out;
  ... 
  s2idle_state = S2IDLE_STATE_NONE;
  raw_spin_unlock_irq(_lock);
  
  +   start_sleep_timer();
  +
  trace_suspend_resume(TPS("machine_suspend"), PM_SUSPEND_TO_IDLE, 
false);
}
  
  So we propose to