RE: [PATCH] ahci: Remove the exporting of ahci_em_messages
> -----Original Message-----
> From: Denis Efremov [mailto:efre...@linux.com]
> Sent: Wednesday, July 10, 2019 11:29 PM
> To: Liu, Chuansheng
> Cc: Denis Efremov; Jens Axboe; linux-...@vger.kernel.org;
> linux-kernel@vger.kernel.org
> Subject: [PATCH] ahci: Remove the exporting of ahci_em_messages
>
> The variable ahci_em_messages is declared static and marked
> EXPORT_SYMBOL_GPL, which is at best an odd combination. Because the
> variable is not used outside of the drivers/ata/libahci.c file it is
> defined in, this commit removes the EXPORT_SYMBOL_GPL() marking.

Sounds good to me, thanks.

Reviewed-by: Chuansheng Liu
RE: [PATCH] kernel/hung_task.c: force ignore_loglevel before panic
> -----Original Message-----
> From: Petr Mladek [mailto:pmla...@suse.com]
> Sent: Wednesday, December 12, 2018 6:18 PM
> To: Liu, Chuansheng
> Cc: Tetsuo Handa; Sergey Senozhatsky; a...@linux-foundation.org;
> sergey.senozhat...@gmail.com; rost...@goodmis.org; dvyu...@google.com;
> linux-kernel@vger.kernel.org
> Subject: Re: [PATCH] kernel/hung_task.c: force ignore_loglevel before panic
>
> On Wed 2018-12-12 01:16:11, Liu, Chuansheng wrote:
> > > -----Original Message-----
> > > From: Tetsuo Handa [mailto:penguin-ker...@i-love.sakura.ne.jp]
> > > Sent: Tuesday, December 11, 2018 6:02 PM
> > > To: Liu, Chuansheng; Sergey Senozhatsky
> > > Cc: a...@linux-foundation.org; pmla...@suse.com;
> > > sergey.senozhat...@gmail.com; rost...@goodmis.org; dvyu...@google.com;
> > > linux-kernel@vger.kernel.org
> > > Subject: Re: [PATCH] kernel/hung_task.c: force ignore_loglevel before panic
> > >
> > > On 2018/12/11 10:16, Liu, Chuansheng wrote:
> > > > We may enhance it by:
> > > > -	if (sysctl_hung_task_warnings) {
> > > > +	if (sysctl_hung_task_panic || sysctl_hung_task_warnings) {
> > > > 		if (sysctl_hung_task_warnings > 0)
> > > > 			sysctl_hung_task_warnings--;
> > >
> > > Why ignore sysctl_hung_task_warnings? The administrator can already
> > > configure sysctl_hung_task_warnings == -1 && sysctl_hung_task_panic == 1
> > > if he/she does not want to suppress either sched_show_task() or
> > > debug_show_all_locks()/trigger_all_cpu_backtrace(). Someone might want
> > > that sysctl_hung_task_warnings == 0 (which is a request to suppress only
> > > sched_show_task()) should not be ignored by sysctl_hung_task_panic == 1
> > > (which is a request to trigger panic).
> >
> > My complete idea is in patch V1 which has been sent. Paste here:
> > If sysctl_hung_task_panic == 1, I will force sched_show_task(t) and set
> > hung_task_call_panic = true
> > hung_task_show_lock = true
>
> Please, do not mix two changes into one patch.

Thanks for your suggestion.

> Add console_verbose() in one patch. It is simple and everyone has agreed
> with it so far.

Have sent it out, please help review.

> Force sched_show_task() when hung_task_call_panic == 1 in another patch.
> It seems to be controversial and should be discussed/changed separately.

I also found some other points, and will send out patches later.
[PATCH V2] kernel/hung_task.c: Force console verbose before panic
Based on commit 401c636a0eeb ("kernel/hung_task.c: show all hung tasks
before panic"), we can get the call stack of a hung task. However, if the
console loglevel is not high enough, the really useful panic information
still does not reach the console in practice, and in most cases users do
not raise the console loglevel. This patch forces console verbosity before
the system panics, so that the useful information is visible on the
console, instead of output like the following, which lacks the hung task
information:

[ 246.916600] INFO: task init:1 blocked for more than 120 seconds.
[ 246.922320] Tainted: G U W 4.19.0-quilt-2e5dc0ac-g51b6c21d76cc #1
[ 246.926790] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 246.932553] Kernel panic - not syncing: hung_task: blocked tasks
[ 246.938503] CPU: 2 PID: 479 Comm: khungtaskd Tainted: G U W 4.19.0-quilt-2e5dc0ac-g51b6c21d76cc #1
[ 246.990266] Call Trace:
[ 246.991707]  dump_stack+0x4f/0x65
[ 246.993710]  panic+0xde/0x231
[ 246.995445]  watchdog+0x290/0x410
[ 246.997390]  kthread+0x12c/0x150
[ 246.999301]  ? reset_hung_task_detector+0x20/0x20
[ 247.004825]  ? kthread_create_worker_on_cpu+0x70/0x70
[ 247.007735]  ret_from_fork+0x35/0x40
[ 247.010280] reboot: panic mode set: p,w
[ 247.012619] Kernel Offset: 0x3400 from 0x8100 (relocation range: 0x8000-0xbfff)

V1: Use console_verbose() instead of ignore_loglevel, as suggested by
Sergey. Tweak check_hung_task() as suggested by Tetsuo to make the code
more readable.

V2: Petr suggested splitting this into two patches. One addresses
console_verbose() only, which is this patch's aim. The other forces
sched_show_task() during panic, which still needs more discussion.
Cc: Sergey Senozhatsky
Cc: Tetsuo Handa
Cc: Petr Mladek
Signed-off-by: Chuansheng Liu
---
 kernel/hung_task.c | 12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/kernel/hung_task.c b/kernel/hung_task.c
index cb8e3e8..85af0cd 100644
--- a/kernel/hung_task.c
+++ b/kernel/hung_task.c
@@ -112,8 +112,11 @@ static void check_hung_task(struct task_struct *t, unsigned long timeout)
 
 	trace_sched_process_hang(t);
 
-	if (!sysctl_hung_task_warnings && !sysctl_hung_task_panic)
-		return;
+	if (sysctl_hung_task_panic) {
+		console_verbose();
+		hung_task_show_lock = true;
+		hung_task_call_panic = true;
+	}
 
 	/*
 	 * Ok, the task did not get scheduled for more than 2 minutes,
@@ -135,11 +138,6 @@ static void check_hung_task(struct task_struct *t, unsigned long timeout)
 	}
 
 	touch_nmi_watchdog();
-
-	if (sysctl_hung_task_panic) {
-		hung_task_show_lock = true;
-		hung_task_call_panic = true;
-	}
 }
 
 /*
-- 
2.7.4
[PATCH V1] kernel/hung_task.c: Force console verbose before panic
Based on commit 401c636a0eeb ("kernel/hung_task.c: show all hung tasks
before panic"), we can get the call stack of a hung task. However, if the
console loglevel is not high enough, the really useful panic information
still does not reach the console in practice, and in most cases users do
not raise the console loglevel. This patch forces console verbosity before
the system panics, so that the useful information is visible on the
console, instead of output like the following, which lacks the hung task
information:

[ 246.916600] INFO: task init:1 blocked for more than 120 seconds.
[ 246.922320] Tainted: G U W 4.19.0-quilt-2e5dc0ac-g51b6c21d76cc #1
[ 246.926790] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 246.932553] Kernel panic - not syncing: hung_task: blocked tasks
[ 246.938503] CPU: 2 PID: 479 Comm: khungtaskd Tainted: G U W 4.19.0-quilt-2e5dc0ac-g51b6c21d76cc #1
[ 246.990266] Call Trace:
[ 246.991707]  dump_stack+0x4f/0x65
[ 246.993710]  panic+0xde/0x231
[ 246.995445]  watchdog+0x290/0x410
[ 246.997390]  kthread+0x12c/0x150
[ 246.999301]  ? reset_hung_task_detector+0x20/0x20
[ 247.004825]  ? kthread_create_worker_on_cpu+0x70/0x70
[ 247.007735]  ret_from_fork+0x35/0x40
[ 247.010280] reboot: panic mode set: p,w
[ 247.012619] Kernel Offset: 0x3400 from 0x8100 (relocation range: 0x8000-0xbfff)

V1: Use console_verbose() instead of ignore_loglevel, as suggested by
Sergey. Tweak check_hung_task() as suggested by Tetsuo to make the code
more readable.
Cc: Sergey Senozhatsky
Cc: Tetsuo Handa
Signed-off-by: Chuansheng Liu
---
 kernel/hung_task.c | 17 +++++++----------
 1 file changed, 7 insertions(+), 10 deletions(-)

diff --git a/kernel/hung_task.c b/kernel/hung_task.c
index cb8e3e8..8db421f 100644
--- a/kernel/hung_task.c
+++ b/kernel/hung_task.c
@@ -112,14 +112,16 @@ static void check_hung_task(struct task_struct *t, unsigned long timeout)
 
 	trace_sched_process_hang(t);
 
-	if (!sysctl_hung_task_warnings && !sysctl_hung_task_panic)
-		return;
-
 	/*
 	 * Ok, the task did not get scheduled for more than 2 minutes,
 	 * complain:
 	 */
-	if (sysctl_hung_task_warnings) {
+	if (sysctl_hung_task_panic || sysctl_hung_task_warnings) {
+		if (sysctl_hung_task_panic) {
+			console_verbose();
+			hung_task_call_panic = true;
+		}
+
 		if (sysctl_hung_task_warnings > 0)
 			sysctl_hung_task_warnings--;
 		pr_err("INFO: task %s:%d blocked for more than %ld seconds.\n",
@@ -132,13 +134,8 @@ static void check_hung_task(struct task_struct *t, unsigned long timeout)
 			" disables this message.\n");
 		sched_show_task(t);
 		hung_task_show_lock = true;
-	}
-
-	touch_nmi_watchdog();
 
-	if (sysctl_hung_task_panic) {
-		hung_task_show_lock = true;
-		hung_task_call_panic = true;
+		touch_nmi_watchdog();
 	}
 }
 
-- 
2.7.4
RE: [PATCH] kernel/hung_task.c: force ignore_loglevel before panic
> -----Original Message-----
> From: Tetsuo Handa [mailto:penguin-ker...@i-love.sakura.ne.jp]
> Sent: Monday, December 10, 2018 5:59 PM
> To: Sergey Senozhatsky; Liu, Chuansheng
> Cc: a...@linux-foundation.org; pmla...@suse.com;
> sergey.senozhat...@gmail.com; rost...@goodmis.org; dvyu...@google.com;
> linux-kernel@vger.kernel.org
> Subject: Re: [PATCH] kernel/hung_task.c: force ignore_loglevel before panic
>
> On 2018/12/10 15:11, Sergey Senozhatsky wrote:
> > On (12/10/18 05:58), Liu, Chuansheng wrote:
> >>> On (12/10/18 05:40), Liu, Chuansheng wrote:
> >>>> @@ -130,6 +130,13 @@ static void check_hung_task(struct task_struct *t, unsigned long timeout)
> >>>> 			init_utsname()->version);
> >>>> 		pr_err("\"echo 0 > /proc/sys/kernel/hung_task_timeout_secs\""
> >>>> 			" disables this message.\n");
> >>>> +		/* When sysctl_hung_task_panic is set, we have to force
> >>>> +		 * ignore_loglevel to get really useful hung task
> >>>> +		 * information.
> >>>> +		 */
> >>>> +		if (sysctl_hung_task_panic && !ignore_loglevel)
> >>>> +			ignore_loglevel = true;
> >>>
> >>> console_verbose()?
> >>
> >> Thanks Sergey, it is really what I need. I will prepare a new version
> >> of the patch :)
> >
> > Let's wait for people to take a look at this patch first.
>
> Shouldn't console_verbose() be called like
>
> -	if (!sysctl_hung_task_warnings && !sysctl_hung_task_panic)
> +	if (sysctl_hung_task_panic)
> +		console_verbose();
> +	else if (!sysctl_hung_task_warnings)
> 		return;
>
> or
>
> -	if (!sysctl_hung_task_warnings && !sysctl_hung_task_panic)
> -		return;
> +	if (sysctl_hung_task_panic)
> +		console_verbose();
>
> or
>
> -	if (!sysctl_hung_task_warnings && !sysctl_hung_task_panic)
> -		return;
> +	if (sysctl_hung_task_panic) {
> +		console_verbose();
> +		hung_task_show_lock = true;
> +		hung_task_call_panic = true;
> +	}
> (...snipped...)
> -	if (sysctl_hung_task_panic) {
> -		hung_task_show_lock = true;
> -		hung_task_call_panic = true;
> -	}

Thanks Tetsuo, I prefer this last option, which makes the code more
readable.
> so that sysctl_hung_task_warnings == 0 && sysctl_hung_task_panic == 1 will
> call debug_show_all_locks() and trigger_all_cpu_backtrace() with verbose
> level?

More thoughts on this condition of sysctl_hung_task_warnings == 0 &&
sysctl_hung_task_panic == 1: in this case, debug_show_all_locks() may not
output useful information if the lock-debugging config is not enabled, and
trigger_all_cpu_backtrace() will not show the hung task for debugging
either. We may enhance it by:

-	if (sysctl_hung_task_warnings) {
+	if (sysctl_hung_task_panic || sysctl_hung_task_warnings) {
 		if (sysctl_hung_task_warnings > 0)
 			sysctl_hung_task_warnings--;
RE: [PATCH] kernel/hung_task.c: force ignore_loglevel before panic
> -----Original Message-----
> From: Sergey Senozhatsky [mailto:sergey.senozhatsky.w...@gmail.com]
> Sent: Monday, December 10, 2018 1:46 PM
> To: Liu, Chuansheng
> Cc: a...@linux-foundation.org; pmla...@suse.com;
> sergey.senozhat...@gmail.com; rost...@goodmis.org; dvyu...@google.com;
> penguin-ker...@i-love.sakura.ne.jp; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH] kernel/hung_task.c: force ignore_loglevel before panic
>
> On (12/10/18 05:40), Liu, Chuansheng wrote:
> > @@ -130,6 +130,13 @@ static void check_hung_task(struct task_struct *t, unsigned long timeout)
> > 			init_utsname()->version);
> > 		pr_err("\"echo 0 > /proc/sys/kernel/hung_task_timeout_secs\""
> > 			" disables this message.\n");
> > +		/* When sysctl_hung_task_panic is set, we have to force
> > +		 * ignore_loglevel to get really useful hung task
> > +		 * information.
> > +		 */
> > +		if (sysctl_hung_task_panic && !ignore_loglevel)
> > +			ignore_loglevel = true;
>
> console_verbose()?

Thanks Sergey, it is really what I need. I will prepare a new version of
the patch :)
[PATCH] kernel/hung_task.c: force ignore_loglevel before panic
Based on commit 401c636a0eeb ("kernel/hung_task.c: show all hung tasks
before panic"), we can get the call stack of a hung task. However, if the
console loglevel is not high enough, the useful information still does not
reach the console in practice, and in most cases users do not raise the
console loglevel. This patch forces ignore_loglevel before the system
panics, so that the really useful information is visible on the console,
instead of output like the following, which lacks the hung task
information:

[ 246.916600] INFO: task init:1 blocked for more than 120 seconds.
[ 246.922320] Tainted: G U W 4.19.0-quilt-2e5dc0ac-g51b6c21d76cc #1
[ 246.926790] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 246.932553] Kernel panic - not syncing: hung_task: blocked tasks
[ 246.938503] CPU: 2 PID: 479 Comm: khungtaskd Tainted: G U W 4.19.0-quilt-2e5dc0ac-g51b6c21d76cc #1
[ 246.990266] Call Trace:
[ 246.991707]  dump_stack+0x4f/0x65
[ 246.993710]  panic+0xde/0x231
[ 246.995445]  watchdog+0x290/0x410
[ 246.997390]  kthread+0x12c/0x150
[ 246.999301]  ? reset_hung_task_detector+0x20/0x20
[ 247.004825]  ? kthread_create_worker_on_cpu+0x70/0x70
[ 247.007735]  ret_from_fork+0x35/0x40
[ 247.010280] reboot: panic mode set: p,w
[ 247.012619] Kernel Offset: 0x3400 from 0x8100 (relocation range: 0x8000-0xbfff)

Signed-off-by: Chuansheng Liu
---
 include/linux/printk.h | 2 +-
 kernel/hung_task.c     | 7 +++++++
 kernel/printk/printk.c | 2 +-
 3 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/include/linux/printk.h b/include/linux/printk.h
index cf3eccf..24748c1 100644
--- a/include/linux/printk.h
+++ b/include/linux/printk.h
@@ -59,8 +59,8 @@ static inline const char *printk_skip_headers(const char *buffer)
  */
 #define CONSOLE_LOGLEVEL_DEFAULT CONFIG_CONSOLE_LOGLEVEL_DEFAULT
 #define CONSOLE_LOGLEVEL_QUIET	 CONFIG_CONSOLE_LOGLEVEL_QUIET
-
 extern int console_printk[];
+extern bool ignore_loglevel;
 
 #define console_loglevel (console_printk[0])
 #define default_message_loglevel (console_printk[1])
diff --git a/kernel/hung_task.c b/kernel/hung_task.c
index cb8e3e8..7d942d1 100644
--- a/kernel/hung_task.c
+++ b/kernel/hung_task.c
@@ -130,6 +130,13 @@ static void check_hung_task(struct task_struct *t, unsigned long timeout)
 			init_utsname()->version);
 		pr_err("\"echo 0 > /proc/sys/kernel/hung_task_timeout_secs\""
 			" disables this message.\n");
+		/* When sysctl_hung_task_panic is set, we have to force
+		 * ignore_loglevel to get really useful hung task
+		 * information.
+		 */
+		if (sysctl_hung_task_panic && !ignore_loglevel)
+			ignore_loglevel = true;
+
 		sched_show_task(t);
 		hung_task_show_lock = true;
 	}
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 1b2a029..31a7a56 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -1135,7 +1135,7 @@ void __init setup_log_buf(int early)
 		free, (free * 100) / __LOG_BUF_LEN);
 }
 
-static bool __read_mostly ignore_loglevel;
+bool __read_mostly ignore_loglevel;
 
 static int __init ignore_loglevel_setup(char *str)
 {
-- 
2.7.4
RE: Change behaviour when tracing ... nasty trap (was Re: [PATCH] PM/Trace: get rid of synchronous resume limit during PM trace)
Hello Pavel,

> -----Original Message-----
> From: Pavel Machek [mailto:pa...@ucw.cz]
> Sent: Monday, January 26, 2015 6:24 PM
> To: Fu, Zhonghui
> Cc: Rafael J. Wysocki; Brown, Len; Greg Kroah-Hartman;
> linux...@vger.kernel.org; linux-kernel@vger.kernel.org; Liu, Chuansheng
> Subject: Change behaviour when tracing ... nasty trap (was Re: [PATCH]
> PM/Trace: get rid of synchronous resume limit during PM trace)
>
> On Mon 2015-01-26 13:07:03, Fu, Zhonghui wrote:
> > From f9c841d1f943d81b5ab0aac7483e794a7f966296 Mon Sep 17 00:00:00 2001
> > From: Zhonghui Fu
> > Date: Mon, 26 Jan 2015 11:27:08 +0800
> > Subject: [PATCH] PM/Trace: get rid of synchronous resume limit during PM trace
> >
> > There are some kinds of dependency between devices in some
> > hardware platforms. So, asynchronously resuming devices may
> > hang the system due to a wrong resume order. As a result, we should
> > not force synchronously resuming devices during tracing
> > PM events.
> >
> > Signed-off-by: Zhonghui Fu
> > ---
> >  drivers/base/power/main.c    | 3 +--
> >  include/linux/resume-trace.h | 7 -------
> >  2 files changed, 1 insertions(+), 9 deletions(-)
> >
> > diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
> > index 9717d5f..5df148b 100644
> > --- a/drivers/base/power/main.c
> > +++ b/drivers/base/power/main.c
> > @@ -517,8 +517,7 @@ static int device_resume_noirq(struct device *dev, pm_message_t state, bool asyn
> >
> >  static bool is_async(struct device *dev)
> >  {
> > -	return dev->power.async_suspend && pm_async_enabled
> > -		&& !pm_trace_is_enabled();
> > +	return dev->power.async_suspend && pm_async_enabled;
> >  }
>
> Actually... whoever did the original patch was an evil person. Changing
> behaviour when tracing is requested is evil, evil, evil. Git blame
> tells me
>
>   Signed-off-by: Chuansheng Liu
>
> went to the dark side.
Although I don't see what is wrong here, is_async() was not created by my
commit; it comes from commit (PM: Start asynchronous resume threads
upfront), and I just moved it ahead. And like the other phases, I added it
into resume/suspend_noirq()...

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
RE: [PATCH v2] sched/fair: fix find_idlest_cpu return -1
> -----Original Message-----
> From: Vincent Guittot [mailto:vincent.guit...@linaro.org]
> Sent: Thursday, December 04, 2014 8:15 PM
> To: Zhang, Jun
> Cc: mi...@redhat.com; Peter Zijlstra; gre...@linuxfoundation.org; Hillf
> Danton; linux-kernel; sta...@vger.kernel.org; Liu, Chuansheng; Liu,
> Changcheng
> Subject: Re: [PATCH v2] sched/fair: fix find_idlest_cpu return -1
>
> On 4 December 2014 at 12:43, <jun.zh...@intel.com> wrote:
> > From: zhang jun
> >
> > In select_task_rq_fair(), when find_idlest_cpu() returns -1 and
> > sd->child == NULL, select_task_rq_fair() returns -1 and the system
> > panics.
>
> You forgot to note which kernel version this patch applies to.
>
> We don't have such an issue on 3.18-rc7, as find_idlest_cpu doesn't
> return -1. We only need a cleanup of the test against -1 for 3.18-rc.

Agreed, so Jun can make two patches:
The first patch makes find_idlest_cpu() not return -1, based on a stable
branch such as 3.14.x;
The second patch is a cleanup of the new_cpu == -1 check against 3.18-rc,
since that is no longer possible.

Best Regards
Chuansheng
RE: [PATCH] sched/fair: fix select_task_rq_fair return -1
> -----Original Message-----
> From: Vincent Guittot [mailto:vincent.guit...@linaro.org]
> Sent: Thursday, December 04, 2014 6:08 PM
> To: Hillf Danton
> Cc: Zhang, Jun; Ingo Molnar; Peter Zijlstra; linux-kernel; Liu, Chuansheng;
> Liu, Changcheng
> Subject: Re: [PATCH] sched/fair: fix select_task_rq_fair return -1
>
> On 4 December 2014 at 10:05, Hillf Danton wrote:
> >>
> >> From: zhang jun
> >>
> >> When cpu == -1 and sd->child == NULL, select_task_rq_fair returns -1
> >> and the system panics.
> >>
> >> [    0.738326] BUG: unable to handle kernel paging request at 8800997ea928
> >> [    0.746138] IP: [] wake_up_new_task+0x43/0x1b0
> >> [    0.752886] PGD 25df067 PUD 0
> >> [    0.756321] Oops: 1 PREEMPT SMP
> >> [    0.760743] Modules linked in:
> >> [    0.764179] CPU: 0 PID: 6 Comm: kworker/u8:0 Not tainted 3.14.19-quilt-b27ac761 #2
> >> [    0.772651] Hardware name: Intel Corporation CHERRYVIEW B1 PLATFORM/Cherry Trail CR, BIOS CHTTRVP1.X64.0003.R08.140453 11/11/2014
> >> [    0.786084] Workqueue: khelper __call_usermodehelper
> >> [    0.791649] task: 88007955a150 ti: 88007955c000 task.ti: 88007955c000
> >> [    0.800021] RIP: 0010:[] [] wake_up_new_task+0x43/0x1b0
> >> [    0.809478] RSP: :88007955dd58 EFLAGS: 00010092
> >> [    0.815422] RAX: RBX: 0001 RCX: 0020
> >> [    0.823404] RDX: RSI: 0020 RDI: 0020
> >> [    0.831386] RBP: 88007955dd80 R08: 880079604b58 R09:
> >> [    0.839368] R10: 0004 R11: eae0 R12: 8800797ea650
> >> [    0.847350] R13: 4000 R14: 8800797ead52 R15: 0206
> >> [    0.855335] FS: () GS:88007aa0() knlGS:
> >> [    0.864387] CS: 0010 DS: ES: CR0: 8005003b
> >> [    0.870817] CR2: 8800997ea928 CR3: 0220b000 CR4: 001007f0
> >> [    0.878796] Stack:
> >> [    0.881046]  0001 8800797ea650 4000
> >> [    0.889363]  003c 88007955ddf0 8107ddfd 810b6a95
> >> [    0.897680]  8800796beb00 8800 8100
> >> [    0.905998] Call Trace:
> >> [    0.908752]  [] do_fork+0x12d/0x3b0
> >> [    0.914416]  [] ? set_next_entity+0x95/0xb0
> >> [    0.920856]  [] kernel_thread+0x26/0x30
> >> [    0.926903]  [] __call_usermodehelper+0x2e/0x90
> >> [    0.933730]  [] process_one_work+0x171/0x490
> >> [    0.940264]  [] worker_thread+0x11b/0x3a0
> >> [    0.946508]  [] ? manage_workers.isra.27+0x2b0/0x2b0
> >> [    0.953821]  [] kthread+0xd2/0xf0
> >> [    0.959289]  [] ? kthread_create_on_node+0x170/0x170
> >> [    0.966602]  [] ret_from_fork+0x7c/0xb0
> >> [    0.972652]  [] ? kthread_create_on_node+0x170/0x170
> >> [    0.979956] Code: 49 89 fc 4c 89 f7 53 e8 bc 5c a4 00 49 8b 54 24 08 31 c9 49 89 c7 49 8b 44 24 60 4c 89 e7 8b 72 18 ba 08 00 00 00 ff 50 40 89 c2 <49> 0f a3 94 24 e0 02 00 00 19 c9 85 c9 0f 84 34 01 00 00 48 8b
> >> [    1.001809] RIP [] wake_up_new_task+0x43/0x1b0
> >> [    1.008641] RSP
> >> [    1.012544] CR2: 8800997ea928
> >> [    1.016279] ---[ end trace 9737aaa337a5ca10 ]---
> >>
> >> Signed-off-by: zhang jun
> >> Signed-off-by: Chuansheng Liu
> >> Signed-off-by: Changcheng Liu
> >> ---
> >>  kernel/sched/fair.c | 2 ++
> >>  1 file changed, 2 insertions(+)
> >>
> >> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> >> index 34baa60..123153f 100644
> >> --- a/kernel/sched/fair.c
> >> +++ b/kernel/sched/fair.c
> >> @@ -4587,6 +4587,8 @@ select_task_rq_fair(struct task_struct *p, int prev_cpu, int sd_flag, int wake_f
> >> 		if (new_cpu == -1 || new_cpu == cpu) {
> >> 			/* Now try balancing at a lower domain level of cpu */
> >> 			sd = sd->child;
> >> +			if ((!sd) && (new_cpu == -1))
> >> +				new_cpu = smp_processor_id();
> >> 			continue;
> >> 		}
> >>
> > In 3.18-rc7 is -1 still selected?
>
> find_idlest_cpu doesn't return -1 anymore but always returns a valid cpu.
> The local cpu will be used if no better cpu has been found.

So I guess we can make one similar patch based on the 3.14.x branch?

Latest:
find_idlest_cpu(struct sched_group *group, struct task_struct *p, int this_cpu)
	return shallowest_idle_cpu != -1 ? shallowest_idle_cpu : least_loaded_cpu;

3.14.X:
find_idlest_cpu(struct sched_group *group, struct task_struct *p, int this_cpu)
	return idlest;
RE: [PATCH urgent v2] sched: Add missing rcu protection to wake_up_all_idle_cpus
> -----Original Message-----
> From: Andy Lutomirski [mailto:l...@amacapital.net]
> Sent: Sunday, November 30, 2014 12:14 AM
> To: Ingo Molnar; Thomas Gleixner; linux-kernel@vger.kernel.org
> Cc: Peter Zijlstra; Andy Lutomirski; Liu, Chuansheng
> Subject: [PATCH urgent v2] sched: Add missing rcu protection to
> wake_up_all_idle_cpus
>
> Locklessly doing is_idle_task(rq->curr) is only okay because of RCU
> protection. The older variant of the broken code checked
> rq->curr == rq->idle instead and therefore didn't need RCU.
>
> Fixes: f6be8af1c95d sched: Add new API wake_up_if_idle() to wake up the idle cpu
> Cc: Chuansheng Liu
> Signed-off-by: Andy Lutomirski
> ---

Reviewed-by: Chuansheng Liu

Thanks Andy.
RE: [PATCH] PCI: Do not enable async suspend for JMicron chips
Hello Bjorn, Will send out one new quirk-solution, and some reaction below:) -Original Message- From: Bjorn Helgaas [mailto:bhelg...@google.com] Sent: Friday, November 07, 2014 1:39 AM To: Liu, Chuansheng Cc: Lu, Aaron; Barto; Tejun Heo (t...@kernel.org); Rafael Wysocki; linux-...@vger.kernel.org; linux-kernel@vger.kernel.org Subject: Re: [PATCH] PCI: Do not enable async suspend for JMicron chips On Wed, Nov 5, 2014 at 11:39 PM, Liu, Chuansheng chuansheng@intel.com wrote: -Original Message- From: Lu, Aaron Sent: Thursday, November 06, 2014 1:37 PM To: Liu, Chuansheng; Bjorn Helgaas Cc: Barto; Tejun Heo (t...@kernel.org); Rafael Wysocki; linux-...@vger.kernel.org; linux-kernel@vger.kernel.org Subject: Re: [PATCH] PCI: Do not enable async suspend for JMicron chips On 11/06/2014 01:29 PM, Liu, Chuansheng wrote: Hello Bjorn, -Original Message- From: Bjorn Helgaas [mailto:bhelg...@google.com] Sent: Thursday, November 06, 2014 12:09 PM To: Liu, Chuansheng Cc: Barto; Tejun Heo (t...@kernel.org); Lu, Aaron; Rafael Wysocki; linux-...@vger.kernel.org; linux-kernel@vger.kernel.org Subject: Re: [PATCH] PCI: Do not enable async suspend for JMicron chips On Wed, Nov 5, 2014 at 6:48 PM, Liu, Chuansheng chuansheng@intel.com wrote: Hello Bjorn, -Original Message- From: Bjorn Helgaas [mailto:bhelg...@google.com] Sent: Thursday, November 06, 2014 3:04 AM To: Barto Cc: Liu, Chuansheng; Lu, Aaron; Tejun Heo; Rafael Wysocki; linux-...@vger.kernel.org; linux-kernel@vger.kernel.org Subject: Re: [PATCH] PCI: Do not enable async suspend for JMicron chips On Wed, Nov 5, 2014 at 11:46 AM, Barto mister.free...@laposte.net wrote: this patch solves these 2 bug reports : https://bugzilla.kernel.org/show_bug.cgi?id=84861 https://bugzilla.kernel.org/show_bug.cgi?id=81551 Those bugs were already mentioned. But e6b7e41cdd8c claims to solve https://bugzilla.kernel.org/show_bug.cgi?id=81551, and 84861 is a duplicate of 81551, so it should also be fixed by e6b7e41cdd8c. 
So the question is, why was e6b7e41cdd8c insufficient? Presumably it was tested and somebody thought it did fix the problem. The first patch e6b7e41cdd8c which is just exclude some of JMicron chips(363/361) out of async_suspend, then Barto found the same issue on JMicron 368, so we need one more general patch to let JMicron chips out of async_suspend, so we make this patch. Bjorn, tj, Could you kindly take this patch? As Barto said, it effected the user experience indeed, thanks. Thanks for clarifying the changelog as far as the different chips and the different bugzillas. But you haven't addressed my concerns about (1) putting a PCI vendor ID check in the generic PCI core code, and (2) applying this to *all* JMicron devices. You might want to explore a quirk-type solution or maybe just add the JMicron 368 to the checks added by e6b7e41cdd8c. Understand your point, in fact, before this patch submitted, I had written another patch https://lkml.org/lkml/2014/9/24/68 which addressed to add the quirk-type solution in ATA code, and Aaron given better suggestion that implemented at pci_pm_init(). How do you think of it? Thanks. I think Bjorn means that we should place the code as a fixup somewhere in the quirks.c. I didn't take a closer look but DECLARE_PCI_FIXUP_FINAL for those JMicron PCI devices seems to be a proper phase. Thanks Aaron, then how about below patch? diff --git a/drivers/ata/pata_jmicron.c b/drivers/ata/pata_jmicron.c index 47e418b..9e85f86 100644 --- a/drivers/ata/pata_jmicron.c +++ b/drivers/ata/pata_jmicron.c @@ -158,6 +158,21 @@ static int jmicron_init_one (struct pci_dev *pdev, const struct pci_device_id *i return ata_pci_bmdma_init_one(pdev, ppi, jmicron_sht, NULL, 0); } +/* + * For JMicron chips, we need to disable the async_suspend method, otherwise + * they will hit the power-on issue when doing device resume, add one quick + * solution to disable the async_suspend method. 
A quick solution is a red flag for me, because it's a hint that you just want the problem to go away without fully understanding it. That's probably not the case; it's probably just that *I* don't understand it all yet. +static void pci_async_suspend_fixup(struct pci_dev *pdev) +{ + /* +* disabling the async_suspend method for JMicron chips to +* avoid device resuming issue. +*/ + device_disable_async_suspend(&pdev->dev); +} +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_JMICRON, PCI_ANY_ID, pci_async_suspend_fixup); I know Barto tested this and it didn't work, but this *strategy* is far better than adding the JMicron test in pci_pm_init(). It's ideal if a quirk like this could be in the pata_jmicron.c file, so it's only
RE: [PATCH] PCI: Do not enable async suspend for JMicron chips
> -Original Message- > From: Lu, Aaron > Sent: Thursday, November 06, 2014 1:37 PM > To: Liu, Chuansheng; Bjorn Helgaas > Cc: Barto; Tejun Heo (t...@kernel.org); Rafael Wysocki; > linux-...@vger.kernel.org; linux-kernel@vger.kernel.org > Subject: Re: [PATCH] PCI: Do not enable async suspend for JMicron chips > > On 11/06/2014 01:29 PM, Liu, Chuansheng wrote: > > Hello Bjorn, > > > >> -Original Message- > >> From: Bjorn Helgaas [mailto:bhelg...@google.com] > >> Sent: Thursday, November 06, 2014 12:09 PM > >> To: Liu, Chuansheng > >> Cc: Barto; Tejun Heo (t...@kernel.org); Lu, Aaron; Rafael Wysocki; > >> linux-...@vger.kernel.org; linux-kernel@vger.kernel.org > >> Subject: Re: [PATCH] PCI: Do not enable async suspend for JMicron chips > >> > >> On Wed, Nov 5, 2014 at 6:48 PM, Liu, Chuansheng > >> wrote: > >>> Hello Bjorn, > >>> > >>>> -Original Message- > >>>> From: Bjorn Helgaas [mailto:bhelg...@google.com] > >>>> Sent: Thursday, November 06, 2014 3:04 AM > >>>> To: Barto > >>>> Cc: Liu, Chuansheng; Lu, Aaron; Tejun Heo; Rafael Wysocki; > >>>> linux-...@vger.kernel.org; linux-kernel@vger.kernel.org > >>>> Subject: Re: [PATCH] PCI: Do not enable async suspend for JMicron chips > >>>> > >>>> On Wed, Nov 5, 2014 at 11:46 AM, Barto > >>>> wrote: > >>>>> this patch solves these 2 bug reports : > >>>>> > >>>>> https://bugzilla.kernel.org/show_bug.cgi?id=84861 > >>>>> > >>>>> https://bugzilla.kernel.org/show_bug.cgi?id=81551 > >>>> > >>>> Those bugs were already mentioned. But e6b7e41cdd8c claims to solve > >>>> https://bugzilla.kernel.org/show_bug.cgi?id=81551, and 84861 is a > >>>> duplicate of 81551, so it should also be fixed by e6b7e41cdd8c. > >>>> > >>>> So the question is, why was e6b7e41cdd8c insufficient? Presumably it > >>>> was tested and somebody thought it did fix the problem. 
> >>> > >>> The first patch e6b7e41cdd8c which is just exclude some of JMicron > >> chips(363/361) out of async_suspend, > >>> then Barto found the same issue on JMicron 368, so we need one more > >> general patch to let JMicron chips > >>> out of async_suspend, so we make this patch. > >>> > >>> Bjorn, tj, > >>> Could you kindly take this patch? As Barto said, it effected the user > >> experience indeed, thanks. > >> > >> Thanks for clarifying the changelog as far as the different chips and > >> the different bugzillas. > >> > >> But you haven't addressed my concerns about (1) putting a PCI vendor > >> ID check in the generic PCI core code, and (2) applying this to *all* > >> JMicron devices. You might want to explore a quirk-type solution or > >> maybe just add the JMicron 368 to the checks added by e6b7e41cdd8c. > > Understand your point, in fact, before this patch submitted, I had written > another patch https://lkml.org/lkml/2014/9/24/68 > > which addressed to add the quirk-type solution in ATA code, and Aaron given > better suggestion that implemented at pci_pm_init(). > > How do you think of it? Thanks. > > I think Bjorn means that we should place the code as a fixup somewhere > in the quirks.c. I didn't take a closer look but DECLARE_PCI_FIXUP_FINAL > for those JMicron PCI devices seems to be a proper phase. Thanks Aaron, then how about below patch? diff --git a/drivers/ata/pata_jmicron.c b/drivers/ata/pata_jmicron.c index 47e418b..9e85f86 100644 --- a/drivers/ata/pata_jmicron.c +++ b/drivers/ata/pata_jmicron.c @@ -158,6 +158,21 @@ static int jmicron_init_one (struct pci_dev *pdev, const struct pci_device_id *i return ata_pci_bmdma_init_one(pdev, ppi, &jmicron_sht, NULL, 0); } +/* + * For JMicron chips, we need to disable the async_suspend method, otherwise + * they will hit the power-on issue when doing device resume, add one quick + * solution to disable the async_suspend method. 
+ */ +static void pci_async_suspend_fixup(struct pci_dev *pdev) +{ + /* +* disabling the async_suspend method for JMicron chips to +* avoid device resuming issue. +*/ + device_disable_async_suspend(&pdev->dev); +} +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_JMICRON, PCI_ANY_ID, pci_async_suspend_fixup); + static const struct pci_device_id jmicron_pci_tbl[] = { { PCI_VENDOR_ID_JMICRON, PCI_ANY_ID, PCI_ANY_ID, PCI_ANY_ID, PCI_CLASS_STORAGE_IDE << 8, 0x00, 0 }, Barto, Could you have a try on your side? Thanks.
RE: [PATCH] PCI: Do not enable async suspend for JMicron chips
Hello Bjorn, > -Original Message- > From: Bjorn Helgaas [mailto:bhelg...@google.com] > Sent: Thursday, November 06, 2014 12:09 PM > To: Liu, Chuansheng > Cc: Barto; Tejun Heo (t...@kernel.org); Lu, Aaron; Rafael Wysocki; > linux-...@vger.kernel.org; linux-kernel@vger.kernel.org > Subject: Re: [PATCH] PCI: Do not enable async suspend for JMicron chips > > On Wed, Nov 5, 2014 at 6:48 PM, Liu, Chuansheng > wrote: > > Hello Bjorn, > > > >> -Original Message- > >> From: Bjorn Helgaas [mailto:bhelg...@google.com] > >> Sent: Thursday, November 06, 2014 3:04 AM > >> To: Barto > >> Cc: Liu, Chuansheng; Lu, Aaron; Tejun Heo; Rafael Wysocki; > >> linux-...@vger.kernel.org; linux-kernel@vger.kernel.org > >> Subject: Re: [PATCH] PCI: Do not enable async suspend for JMicron chips > >> > >> On Wed, Nov 5, 2014 at 11:46 AM, Barto > >> wrote: > >> > this patch solves these 2 bug reports : > >> > > >> > https://bugzilla.kernel.org/show_bug.cgi?id=84861 > >> > > >> > https://bugzilla.kernel.org/show_bug.cgi?id=81551 > >> > >> Those bugs were already mentioned. But e6b7e41cdd8c claims to solve > >> https://bugzilla.kernel.org/show_bug.cgi?id=81551, and 84861 is a > >> duplicate of 81551, so it should also be fixed by e6b7e41cdd8c. > >> > >> So the question is, why was e6b7e41cdd8c insufficient? Presumably it > >> was tested and somebody thought it did fix the problem. > > > > The first patch e6b7e41cdd8c which is just exclude some of JMicron > chips(363/361) out of async_suspend, > > then Barto found the same issue on JMicron 368, so we need one more > general patch to let JMicron chips > > out of async_suspend, so we make this patch. > > > > Bjorn, tj, > > Could you kindly take this patch? As Barto said, it effected the user > experience indeed, thanks. > > Thanks for clarifying the changelog as far as the different chips and > the different bugzillas. 
> > But you haven't addressed my concerns about (1) putting a PCI vendor > ID check in the generic PCI core code, and (2) applying this to *all* > JMicron devices. You might want to explore a quirk-type solution or > maybe just add the JMicron 368 to the checks added by e6b7e41cdd8c. Understand your point, in fact, before this patch submitted, I had written another patch https://lkml.org/lkml/2014/9/24/68 which addressed to add the quirk-type solution in ATA code, and Aaron given better suggestion that implemented at pci_pm_init(). How do you think of it? Thanks.
RE: [PATCH] PCI: Do not enable async suspend for JMicron chips
Hello Bjorn, > -Original Message- > From: Bjorn Helgaas [mailto:bhelg...@google.com] > Sent: Thursday, November 06, 2014 3:04 AM > To: Barto > Cc: Liu, Chuansheng; Lu, Aaron; Tejun Heo; Rafael Wysocki; > linux-...@vger.kernel.org; linux-kernel@vger.kernel.org > Subject: Re: [PATCH] PCI: Do not enable async suspend for JMicron chips > > On Wed, Nov 5, 2014 at 11:46 AM, Barto > wrote: > > this patch solves these 2 bug reports : > > > > https://bugzilla.kernel.org/show_bug.cgi?id=84861 > > > > https://bugzilla.kernel.org/show_bug.cgi?id=81551 > > Those bugs were already mentioned. But e6b7e41cdd8c claims to solve > https://bugzilla.kernel.org/show_bug.cgi?id=81551, and 84861 is a > duplicate of 81551, so it should also be fixed by e6b7e41cdd8c. > > So the question is, why was e6b7e41cdd8c insufficient? Presumably it > was tested and somebody thought it did fix the problem. The first patch e6b7e41cdd8c which is just exclude some of JMicron chips(363/361) out of async_suspend, then Barto found the same issue on JMicron 368, so we need one more general patch to let JMicron chips out of async_suspend, so we make this patch. Bjorn, tj, Could you kindly take this patch? As Barto said, it effected the user experience indeed, thanks.
RE: [PATCH] ata: Disabling the async PM for JMicron chips
> -Original Message- > From: Tejun Heo [mailto:hte...@gmail.com] On Behalf Of Tejun Heo > Sent: Tuesday, September 23, 2014 10:01 PM > To: Liu, Chuansheng > Cc: r...@rjwysocki.net; linux-...@vger.kernel.org; > linux-kernel@vger.kernel.org; mister.free...@laposte.net; Zhang, Rui > Subject: Re: [PATCH] ata: Disabling the async PM for JMicron chips > > On Tue, Sep 23, 2014 at 01:37:51PM +0800, Chuansheng Liu wrote: > > Be similar with commit (ata: Disabling the async PM for JMicron chip > 363/361), > > Please use 12_DIGIT_SHA1_PREFIX ("SUBJ") format when referring to > commits. > > > @@ -1345,10 +1345,10 @@ static int ahci_init_one(struct pci_dev *pdev, > const struct pci_device_id *ent) > > * follow the sequence one by one, otherwise one of them can not be > > * powered on successfully, so here we disable the async suspend > > * method for these chips. > > +* Jmicron chip 368 has been found has the similar issue, here we can > > +* exclude the Jmicron family directly to avoid other similar issues. > > Let's update the comment as a whole instead of appending to it. > Thanks your comments, Tejun, will send one new patch to review.
RE: [PATCH 3/3] cpuidle: Using the wake_up_all_idle_cpus() to wake up all idle cpus
> -Original Message- > From: Daniel Lezcano [mailto:daniel.lezc...@linaro.org] > Sent: Thursday, September 04, 2014 9:02 PM > To: Liu, Chuansheng; pet...@infradead.org; l...@amacapital.net; > r...@rjwysocki.net; mi...@redhat.com > Cc: linux...@vger.kernel.org; linux-kernel@vger.kernel.org; Liu, Changcheng; > Wang, Xiaoming; Chakravarty, Souvik K > Subject: Re: [PATCH 3/3] cpuidle: Using the wake_up_all_idle_cpus() to wake up > all idle cpus > > On 09/04/2014 09:17 AM, Chuansheng Liu wrote: > > Currently kick_all_cpus_sync() or smp_call_function() can not > > break the polling idle cpu immediately. > > > > Here using wake_up_all_idle_cpus() which can wake up the polling idle > > cpu quickly is much helpful for power. > > Acked-by: Daniel Lezcano > > So IIUC, kick_all_cpus_sync is a broken function, right ? > > Wouldn't make sense to replace all calls to kick_all_cpus by > wake_up_all_idle_cpus ? and remove this broken function ? My initial patch(V1) indeed do it, but Andy pointed out some callers of kick_all_cpus_sync() really want the old behavior. My fault again that do not have the detailed changelog for the patches. Pasted the comments from Andy: == > Currently using smp_call_function() just woke up the corresponding > cpu, but can not break the polling idle loop. > > Here using the new sched API wake_up_if_idle() to implement it. kick_all_cpus_sync has other callers, and those other callers want the old behavior. I think this should be a new function. --Andy ==
RE: [PATCH 1/3] sched: Add new API wake_up_if_idle() to wake up the idle cpu
> -Original Message- > From: Daniel Lezcano [mailto:daniel.lezc...@linaro.org] > Sent: Thursday, September 04, 2014 8:53 PM > To: Liu, Chuansheng; pet...@infradead.org; l...@amacapital.net; > r...@rjwysocki.net; mi...@redhat.com > Cc: linux...@vger.kernel.org; linux-kernel@vger.kernel.org; Liu, Changcheng; > Wang, Xiaoming; Chakravarty, Souvik K > Subject: Re: [PATCH 1/3] sched: Add new API wake_up_if_idle() to wake up the > idle cpu > > On 09/04/2014 09:17 AM, Chuansheng Liu wrote: > > Implementing one new API wake_up_if_idle(), which is used to > > wake up the idle CPU. > > > > Suggested-by: Andy Lutomirski > > Signed-off-by: Chuansheng Liu > > Hi Chuanseng, > > in the future, please prefix the patches with the version number and a > changelog. > Thanks your advice, Daniel, will do that next time:)
RE: [PATCH 1/3] sched: Add new API wake_up_if_idle() to wake up the idle cpu
> -Original Message- > From: Peter Zijlstra [mailto:pet...@infradead.org] > Sent: Wednesday, September 03, 2014 5:54 PM > To: Andy Lutomirski > Cc: Liu, Chuansheng; linux...@vger.kernel.org; Rafael J. Wysocki; Daniel > Lezcano; Wang, Xiaoming; Ingo Molnar; Chakravarty, Souvik K; Liu, Changcheng; > linux-kernel@vger.kernel.org > Subject: Re: [PATCH 1/3] sched: Add new API wake_up_if_idle() to wake up the > idle cpu > > On Thu, Aug 21, 2014 at 02:21:51PM -0700, Andy Lutomirski wrote: > > On Aug 18, 2014 3:52 AM, "Chuansheng Liu" > wrote: > > > > +void wake_up_if_idle(int cpu) > > > +{ > > > + struct rq *rq = cpu_rq(cpu); > > > + unsigned long flags; > > > + > > > + if (set_nr_if_polling(rq->idle)) { > > > + trace_sched_wake_idle_without_ipi(cpu); > > > + } else { > > > + > > > > FWIW, adding: > > > > if (rq->curr != rq->idle) > > return; > > > > Right here could improve performance on large, mostly non-idle > > systems. It would skip the spinlock in most cases. > > > > !is_idle_task() :-) Thanks Peter and Andy, will refine the patches again:) -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [PATCH 1/3] sched: Add new API wake_up_if_idle() to wake up the idle cpu
Hello Daniel,

> -Original Message-
> From: Daniel Lezcano [mailto:daniel.lezc...@linaro.org]
> Sent: Thursday, August 21, 2014 9:54 AM
> To: Liu, Chuansheng; l...@amacapital.net; pet...@infradead.org;
> r...@rjwysocki.net; mi...@redhat.com
> Cc: linux...@vger.kernel.org; linux-kernel@vger.kernel.org; Liu, Changcheng;
> Wang, Xiaoming; Chakravarty, Souvik K
> Subject: Re: [PATCH 1/3] sched: Add new API wake_up_if_idle() to wake up the
> idle cpu
>
> On 08/18/2014 10:37 AM, Chuansheng Liu wrote:
> > Implementing one new API wake_up_if_idle(), which is used to
> > wake up the idle CPU.
>
> Is this patchset tested ? Did you check it solves the issue you were
> facing ?

We have done the basic test, and found the cores can exit C0 quickly with this patchset. Basically, once _TIF_NEED_RESCHED is set, the poll_idle() loop can be broken. Please correct me if something is wrong, thanks.

> > Suggested-by: Andy Lutomirski
> > Signed-off-by: Chuansheng Liu
> > ---
> >  include/linux/sched.h |    1 +
> >  kernel/sched/core.c   |   16 ++++++++++++++++
> >  2 files changed, 17 insertions(+)
> >
> > diff --git a/include/linux/sched.h b/include/linux/sched.h
> > index 857ba40..3f89ac1 100644
> > --- a/include/linux/sched.h
> > +++ b/include/linux/sched.h
> > @@ -1024,6 +1024,7 @@ struct sched_domain_topology_level {
> >  extern struct sched_domain_topology_level *sched_domain_topology;
> >
> >  extern void set_sched_topology(struct sched_domain_topology_level *tl);
> > +extern void wake_up_if_idle(int cpu);
> >
> >  #ifdef CONFIG_SCHED_DEBUG
> >  # define SD_INIT_NAME(type) .name = #type
> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index 1211575..adf104f 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -1620,6 +1620,22 @@ static void ttwu_queue_remote(struct task_struct *p, int cpu)
> >     }
> >  }
> >
> > +void wake_up_if_idle(int cpu)
> > +{
> > +   struct rq *rq = cpu_rq(cpu);
> > +   unsigned long flags;
> > +
> > +   if (set_nr_if_polling(rq->idle)) {
> > +           trace_sched_wake_idle_without_ipi(cpu);
> > +   } else {
> > +           raw_spin_lock_irqsave(&rq->lock, flags);
> > +           if (rq->curr == rq->idle)
> > +                   smp_send_reschedule(cpu);
> > +           /* Else cpu is not in idle, do nothing here */
> > +           raw_spin_unlock_irqrestore(&rq->lock, flags);
> > +   }
> > +}
> > +
> >  bool cpus_share_cache(int this_cpu, int that_cpu)
> >  {
> >     return per_cpu(sd_llc_id, this_cpu) == per_cpu(sd_llc_id, that_cpu);
>
> --
>  <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs
>
> Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
> <http://twitter.com/#!/linaroorg> Twitter |
> <http://www.linaro.org/linaro-blog/> Blog
RE: [PATCH 2/3] smp: re-implement the kick_all_cpus_sync() with wake_up_if_idle()
Hello Andy,

> -Original Message-
> From: Andy Lutomirski [mailto:l...@amacapital.net]
> Sent: Friday, August 15, 2014 11:41 PM
> To: Liu, Chuansheng
> Cc: Peter Zijlstra; Daniel Lezcano; Rafael J. Wysocki; Ingo Molnar;
> linux...@vger.kernel.org; linux-kernel@vger.kernel.org; Liu, Changcheng;
> Wang, Xiaoming; Chakravarty, Souvik K
> Subject: Re: [PATCH 2/3] smp: re-implement the kick_all_cpus_sync() with
> wake_up_if_idle()
>
> On Fri, Aug 15, 2014 at 12:01 AM, Chuansheng Liu wrote:
> > Currently using smp_call_function() just woke up the corresponding
> > cpu, but can not break the polling idle loop.
> >
> > Here using the new sched API wake_up_if_idle() to implement it.
>
> kick_all_cpus_sync has other callers, and those other callers want the
> old behavior. I think this should be a new function.

Yes, it seems some current users of kick_all_cpus_sync() do need the IPI; will try to send out a patch V2 with one new function.
RE: [PATCH] cpuidle: Fix the CPU stuck at C0 for 2-3s after PM_QOS back to DEFAULT
> -Original Message- > From: Andy Lutomirski [mailto:l...@amacapital.net] > Sent: Friday, August 15, 2014 5:23 AM > To: Peter Zijlstra > Cc: Daniel Lezcano; Liu, Chuansheng; Rafael J. Wysocki; > linux...@vger.kernel.org; LKML; Liu, Changcheng; Wang, Xiaoming; > Chakravarty, Souvik K > Subject: Re: [PATCH] cpuidle: Fix the CPU stuck at C0 for 2-3s after PM_QOS > back to DEFAULT > > On Thu, Aug 14, 2014 at 2:16 PM, Peter Zijlstra > wrote: > > On Thu, Aug 14, 2014 at 02:12:27PM -0700, Andy Lutomirski wrote: > >> On 08/14/2014 04:14 AM, Daniel Lezcano wrote: > >> > On 08/14/2014 01:00 PM, Peter Zijlstra wrote: > >> >> > >> >> So seeing how you're from @intel.com I'm assuming you're using x86 > here. > >> >> > >> >> I'm not seeing how this can be possible, MWAIT is interrupted by IPIs > >> >> just fine, which means we'll fall out of the cpuidle_enter(), which > >> >> means we'll cpuidle_reflect(), and then leave cpuidle_idle_call(). > >> >> > >> >> It will indeed not leave the cpu_idle_loop() function and go right back > >> >> into cpuidle_idle_call(), but that will then call cpuidle_select() which > >> >> should pick a new C state. > >> >> > >> >> So the interrupt _should_ work. If it doesn't you need to explain why. > >> > > >> > I think the issue is related to the poll_idle state, in > >> > drivers/cpuidle/driver.c. This state is x86 specific and inserted in the > >> > cpuidle table as the state 0 (POLL). There is no mwait for this state. > >> > It is a bit confusing because this state is not listed in the acpi / > >> > intel idle driver but inserted implicitly at the beginning of the idle > >> > table by the cpuidle framework when the driver is registered. 
> >> >
> >> > static int poll_idle(struct cpuidle_device *dev,
> >> >                      struct cpuidle_driver *drv, int index)
> >> > {
> >> >         local_irq_enable();
> >> >         if (!current_set_polling_and_test()) {
> >> >                 while (!need_resched())
> >> >                         cpu_relax();
> >> >         }
> >> >         current_clr_polling();
> >> >
> >> >         return index;
> >> > }
> >>
> >> As the most recent person to have modified this function, and as an
> >> avowed hater of pointless IPIs, let me ask a rather different question:
> >> why are you sending IPIs at all? As of Linux 3.16, poll_idle actually
> >> supports the polling idle interface :)
> >>
> >> Can't you just do:
> >>
> >> if (set_nr_if_polling(rq->idle)) {
> >>         trace_sched_wake_idle_without_ipi(cpu);
> >> } else {
> >>         spin_lock_irqsave(&rq->lock, flags);
> >>         if (rq->curr == rq->idle)
> >>                 smp_send_reschedule(cpu);
> >>         // else the CPU wasn't idle; nothing to do
> >>         raw_spin_unlock_irqrestore(&rq->lock, flags);
> >> }
> >>
> >> In the common case (wake from C0, i.e. polling idle), this will skip the
> >> IPI entirely unless you race with idle entry/exit, saving a few more
> >> precious electrons and all of the latency involved in poking the APIC
> >> registers.
> >
> > They could and they probably should, but that logic should _not_ live in
> > the cpuidle driver.
>
> Sure. My point is that fixing the IPI handler is, I think, totally
> bogus, because the IPI API isn't the right way to do this at all.
>
> It would be straightforward to add a new function wake_if_idle(int
> cpu) to sched/core.c.

Thanks for Andy's and Peter's suggestion; it will save some IPIs in case the cores are not idle. There is one similar API in sched/core.c, wake_up_idle_cpu(), so we just need to add one new common smp API:

smp_wake_up_cpus()
{
    for_each_online_cpu()
        wake_up_idle_cpu();
}

Will try one patch for it.
RE: [PATCH] cpuidle: Fix the CPU stuck at C0 for 2-3s after PM_QOS back to DEFAULT
> -Original Message- > From: Daniel Lezcano [mailto:daniel.lezc...@linaro.org] > Sent: Thursday, August 14, 2014 10:17 PM > To: Liu, Chuansheng; Peter Zijlstra > Cc: Rafael J. Wysocki; linux...@vger.kernel.org; LKML; Liu, Changcheng; > Wang, Xiaoming; Chakravarty, Souvik K > Subject: Re: [PATCH] cpuidle: Fix the CPU stuck at C0 for 2-3s after PM_QOS > back to DEFAULT > > On 08/14/2014 04:10 PM, Liu, Chuansheng wrote: > > > > > >> -Original Message- > >> From: Peter Zijlstra [mailto:pet...@infradead.org] > >> Sent: Thursday, August 14, 2014 9:13 PM > >> To: Liu, Chuansheng > >> Cc: Daniel Lezcano; Rafael J. Wysocki; linux...@vger.kernel.org; LKML; Liu, > >> Changcheng; Wang, Xiaoming; Chakravarty, Souvik K > >> Subject: Re: [PATCH] cpuidle: Fix the CPU stuck at C0 for 2-3s after PM_QOS > >> back to DEFAULT > >> > >> On Thu, Aug 14, 2014 at 11:24:06AM +, Liu, Chuansheng wrote: > >>> If inspecting the polling flag, we can not fix the race between poll_idle > >>> and > >> smp_callback, > >>> since in poll_idle(), before set polling flag, if the smp_callback come > >>> in, then > >> no resched bit set, > >>> after that, poll_idle() will do the polling action, without reselection > >> immediately, it will bring power > >>> regression here. > >> > >> -ENOPARSE. Is there a question there? > > > > Lezcano suggest to inspect the polling flag, then code is like below: > > smp_callback() { > > if (polling_flag) > >set_resched_bit; > > } > > > > And the poll_idle code is like below: > > static int poll_idle(struct cpuidle_device *dev, > > struct cpuidle_driver *drv, int index) > > { > > local_irq_enable(); > > if (!current_set_polling_and_test()) { > > while (!need_resched()) > > Or alternatively, something like: > > while (!need_resched() || kickme) { > ... > } > > > smp_callback() > { > kickme = 1; > } > > kickme is a percpu variable and set to zero when exiting the 'enter' > callback. > > So we don't mess with the polling flag, which is already a bit tricky. 
> This patch is very straightforward to illustrate the idea.
>
> >                 cpu_relax();
> >     }
> >     current_clr_polling();
> >
> >     return index;
> > }

Thanks Lezcano, the new kickme flag sounds like it makes things simpler; will try to send one new patch for review :)
RE: [PATCH] cpuidle: Fix the CPU stuck at C0 for 2-3s after PM_QOS back to DEFAULT
> -Original Message-
> From: Peter Zijlstra [mailto:pet...@infradead.org]
> Sent: Thursday, August 14, 2014 9:13 PM
> To: Liu, Chuansheng
> Cc: Daniel Lezcano; Rafael J. Wysocki; linux...@vger.kernel.org; LKML; Liu,
> Changcheng; Wang, Xiaoming; Chakravarty, Souvik K
> Subject: Re: [PATCH] cpuidle: Fix the CPU stuck at C0 for 2-3s after PM_QOS
> back to DEFAULT
>
> On Thu, Aug 14, 2014 at 11:24:06AM +, Liu, Chuansheng wrote:
> > If inspecting the polling flag, we can not fix the race between poll_idle
> > and smp_callback, since in poll_idle(), before set polling flag, if the
> > smp_callback come in, then no resched bit set, after that, poll_idle()
> > will do the polling action, without reselection immediately, it will
> > bring power regression here.
>
> -ENOPARSE. Is there a question there?

Lezcano suggested inspecting the polling flag; the code would then be like below:

smp_callback()
{
    if (polling_flag)
        set_resched_bit;
}

And the poll_idle code is like below:

static int poll_idle(struct cpuidle_device *dev,
                     struct cpuidle_driver *drv, int index)
{
    local_irq_enable();
    if (!current_set_polling_and_test()) {
        while (!need_resched())
            cpu_relax();
    }
    current_clr_polling();

    return index;
}

The race is:

Idle task: poll_idle()
    local_irq_enable()
    <== IPI interrupt comes in here; it sees the polling flag is not set yet, so it does nothing;
    back in poll_idle(), the CPU will stay in the poll loop for a while,
    instead of breaking out immediately to let the governor reselect the right C-state.
RE: [PATCH] cpuidle: Fix the CPU stuck at C0 for 2-3s after PM_QOS back to DEFAULT
> -Original Message-
> From: Daniel Lezcano [mailto:daniel.lezc...@linaro.org]
> Sent: Thursday, August 14, 2014 9:30 PM
> To: Peter Zijlstra
> Cc: Liu, Chuansheng; Rafael J. Wysocki; linux...@vger.kernel.org; LKML; Liu,
> Changcheng; Wang, Xiaoming; Chakravarty, Souvik K
> Subject: Re: [PATCH] cpuidle: Fix the CPU stuck at C0 for 2-3s after PM_QOS
> back to DEFAULT
>
> I think the main issue here is to exit the poll_idle loop when an IPI is
> received. IIUC, there is a pm_qos user, perhaps a driver (Chuansheng can
> give more details), setting a very short latency, so the cpuidle
> framework chooses a shallow state like poll_idle, and then the driver
> sets a bigger latency, leading to the IPI to wake all the cpus. As the
> CPUs are in poll_idle, they don't exit until an event makes them
> exit the need_resched() loop (reschedule or whatever). This situation
> can leave the CPUs standing in the infinite loop for several seconds while
> we are expecting them to exit poll_idle and enter a deeper idle state,
> thus with extra energy consumption.

Exactly, there is no functional error here. But not entering the deeper C-state brings more power consumption; in some MP3 standby scenarios, even 10% of the power can be saved. And that is this patch's aim.
RE: [PATCH] cpuidle: Fix the CPU stuck at C0 for 2-3s after PM_QOS back to DEFAULT
> -Original Message- > From: Peter Zijlstra [mailto:pet...@infradead.org] > Sent: Thursday, August 14, 2014 6:54 PM > To: Daniel Lezcano > Cc: Liu, Chuansheng; Rafael J. Wysocki; linux...@vger.kernel.org; LKML; Liu, > Changcheng; Wang, Xiaoming; Chakravarty, Souvik K > Subject: Re: [PATCH] cpuidle: Fix the CPU stuck at C0 for 2-3s after PM_QOS > back to DEFAULT > > On Thu, Aug 14, 2014 at 12:29:32PM +0200, Daniel Lezcano wrote: > > Hi Chuansheng, > > > > On 14 August 2014 04:11, Chuansheng Liu > wrote: > > > > > We found sometimes even after we let PM_QOS back to DEFAULT, > > > the CPU still stuck at C0 for 2-3s, don't do the new suitable C-state > > > selection immediately after received the IPI interrupt. > > > > > > The code model is simply like below: > > > { > > > pm_qos_update_request(_qos, C1 - 1); > > > < == Here keep all cores at C0 > > > ...; > > > pm_qos_update_request(_qos, PM_QOS_DEFAULT_VALUE); > > > < == Here some cores still stuck at C0 for 2-3s > > > } > > > > > > The reason is when pm_qos come back to DEFAULT, there is IPI interrupt to > > > wake up the core, but when core is in poll idle state, the IPI interrupt > > > can not break the polling loop. > > > > > > So here in the IPI callback interrupt, when currently the idle task is > > > running, we need to forcedly set reschedule bit to break the polling loop, > > > as for other non-polling idle state, IPI interrupt can break them > > > directly, > > > and setting reschedule bit has no harm for them too. > > > > > > With this fix, we saved about 30mV power in our android platform. 
> > >
> > > Signed-off-by: Chuansheng Liu
> > > ---
> > >  drivers/cpuidle/cpuidle.c |    8 +++-
> > >  1 file changed, 7 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
> > > index ee9df5e..9e28a13 100644
> > > --- a/drivers/cpuidle/cpuidle.c
> > > +++ b/drivers/cpuidle/cpuidle.c
> > > @@ -532,7 +532,13 @@ EXPORT_SYMBOL_GPL(cpuidle_register);
> > >
> > >  static void smp_callback(void *v)
> > >  {
> > > - /* we already woke the CPU up, nothing more to do */
> > > + /* we already woke the CPU up, and when the corresponding
> > > +  * CPU is at polling idle state, we need to set the sched
> > > +  * bit to trigger reselect the new suitable C-state, it
> > > +  * will be helpful for power.
> > > +  */
> > > + if (is_idle_task(current))
> > > +         set_tsk_need_resched(current);
> >
> > Mmh, shouldn't we inspect the polling flag instead ? Peter (Cc'ed) did some
> > changes around this and I think we should ask his opinion. I am not sure
> > this code won't make all cpus return to the scheduler and go back to the
> > idle task.
>
> Yes, this is wrong.. Also cpuidle should not know about this, so this is
> very much the wrong place to go fix this. Lemme have a look.

If we only inspect the polling flag, we can not fix the race between poll_idle() and smp_callback(): in poll_idle(), if smp_callback() comes in before the polling flag is set, no resched bit is set; after that, poll_idle() will keep polling without an immediate reselection, which brings a power regression here.
RE: [PATCH] cpuidle: Fix the CPU stuck at C0 for 2-3s after PM_QOS back to DEFAULT
> -Original Message- > From: Daniel Lezcano [mailto:daniel.lezc...@linaro.org] > Sent: Thursday, August 14, 2014 7:15 PM > To: Peter Zijlstra > Cc: Liu, Chuansheng; Rafael J. Wysocki; linux...@vger.kernel.org; LKML; Liu, > Changcheng; Wang, Xiaoming; Chakravarty, Souvik K > Subject: Re: [PATCH] cpuidle: Fix the CPU stuck at C0 for 2-3s after PM_QOS > back to DEFAULT > > On 08/14/2014 01:00 PM, Peter Zijlstra wrote: > > On Thu, Aug 14, 2014 at 12:29:32PM +0200, Daniel Lezcano wrote: > >> Hi Chuansheng, > >> > >> On 14 August 2014 04:11, Chuansheng Liu > wrote: > >> > >>> We found sometimes even after we let PM_QOS back to DEFAULT, > >>> the CPU still stuck at C0 for 2-3s, don't do the new suitable C-state > >>> selection immediately after received the IPI interrupt. > >>> > >>> The code model is simply like below: > >>> { > >>> pm_qos_update_request(_qos, C1 - 1); > >>> < == Here keep all cores at C0 > >>> ...; > >>> pm_qos_update_request(_qos, > PM_QOS_DEFAULT_VALUE); > >>> < == Here some cores still stuck at C0 for 2-3s > >>> } > >>> > >>> The reason is when pm_qos come back to DEFAULT, there is IPI interrupt to > >>> wake up the core, but when core is in poll idle state, the IPI interrupt > >>> can not break the polling loop. > > > > So seeing how you're from @intel.com I'm assuming you're using x86 here. > > > > I'm not seeing how this can be possible, MWAIT is interrupted by IPIs > > just fine, which means we'll fall out of the cpuidle_enter(), which > > means we'll cpuidle_reflect(), and then leave cpuidle_idle_call(). > > > > It will indeed not leave the cpu_idle_loop() function and go right back > > into cpuidle_idle_call(), but that will then call cpuidle_select() which > > should pick a new C state. > > > > So the interrupt _should_ work. If it doesn't you need to explain why. > > I think the issue is related to the poll_idle state, in > drivers/cpuidle/driver.c. This state is x86 specific and inserted in the > cpuidle table as the state 0 (POLL). 
There is no mwait for this state. > It is a bit confusing because this state is not listed in the acpi / > intel idle driver but inserted implicitly at the beginning of the idle > table by the cpuidle framework when the driver is registered.

Yes, I am talking about the poll_idle() function, which does not use mwait. If we want the reselection to happen immediately, we need to break the poll while loop by setting the resched bit; we do not actually care whether a real reschedule happens or not.
RE: [PATCH] cpuidle: Fix the CPU stuck at C0 for 2-3s after PM_QOS back to DEFAULT
-Original Message- From: Daniel Lezcano [mailto:daniel.lezc...@linaro.org] Sent: Thursday, August 14, 2014 10:17 PM To: Liu, Chuansheng; Peter Zijlstra Cc: Rafael J. Wysocki; linux...@vger.kernel.org; LKML; Liu, Changcheng; Wang, Xiaoming; Chakravarty, Souvik K Subject: Re: [PATCH] cpuidle: Fix the CPU stuck at C0 for 2-3s after PM_QOS back to DEFAULT

On 08/14/2014 04:10 PM, Liu, Chuansheng wrote:

> Daniel suggested inspecting the polling flag; the code would then look like:
>
> smp_callback()
> {
> 	if (polling_flag)
> 		set_resched_bit;
> }
>
> And the poll_idle code is like below:
>
> static int poll_idle(struct cpuidle_device *dev, struct cpuidle_driver *drv, int index)
> {
> 	local_irq_enable();
> 	if (!current_set_polling_and_test()) {
> 		while (!need_resched())
> 			cpu_relax();
> 	}
> 	current_clr_polling();
> 	return index;
> }

Or alternatively, something like:

	while (!need_resched() && !kickme) {
		...
	}

	smp_callback()
	{
		kickme = 1;
	}

kickme is a percpu variable and set to zero when exiting the 'enter' callback. So we don't mess with the polling flag, which is already a bit tricky. This patch is very straightforward to illustrate the idea.

Thanks Lezcano, the new kickme flag sounds like it makes things simpler; I will try to send a new patch for review :)
RE: [PATCH] cpuidle: Fix the CPU stuck at C0 for 2-3s after PM_QOS back to DEFAULT
-Original Message- From: Andy Lutomirski [mailto:l...@amacapital.net] Sent: Friday, August 15, 2014 5:23 AM To: Peter Zijlstra Cc: Daniel Lezcano; Liu, Chuansheng; Rafael J. Wysocki; linux...@vger.kernel.org; LKML; Liu, Changcheng; Wang, Xiaoming; Chakravarty, Souvik K Subject: Re: [PATCH] cpuidle: Fix the CPU stuck at C0 for 2-3s after PM_QOS back to DEFAULT

On Thu, Aug 14, 2014 at 2:16 PM, Peter Zijlstra pet...@infradead.org wrote:
> On Thu, Aug 14, 2014 at 02:12:27PM -0700, Andy Lutomirski wrote:
>> On 08/14/2014 04:14 AM, Daniel Lezcano wrote:
>>> On 08/14/2014 01:00 PM, Peter Zijlstra wrote:
>>>> So seeing how you're from @intel.com I'm assuming you're using x86 here. I'm not seeing how this can be possible: MWAIT is interrupted by IPIs just fine, which means we'll fall out of the cpuidle_enter(), which means we'll cpuidle_reflect(), and then leave cpuidle_idle_call(). It will indeed not leave the cpu_idle_loop() function and go right back into cpuidle_idle_call(), but that will then call cpuidle_select() which should pick a new C state. So the interrupt _should_ work. If it doesn't you need to explain why.
>>>
>>> I think the issue is related to the poll_idle state, in drivers/cpuidle/driver.c. This state is x86 specific and inserted in the cpuidle table as the state 0 (POLL). There is no mwait for this state. It is a bit confusing because this state is not listed in the acpi / intel idle driver but inserted implicitly at the beginning of the idle table by the cpuidle framework when the driver is registered.
>>>
>>> static int poll_idle(struct cpuidle_device *dev, struct cpuidle_driver *drv, int index)
>>> {
>>> 	local_irq_enable();
>>> 	if (!current_set_polling_and_test()) {
>>> 		while (!need_resched())
>>> 			cpu_relax();
>>> 	}
>>> 	current_clr_polling();
>>> 	return index;
>>> }
>>
>> As the most recent person to have modified this function, and as an avowed hater of pointless IPIs, let me ask a rather different question: why are you sending IPIs at all?
As of Linux 3.16, poll_idle actually supports the polling idle interface :) Can't you just do:

if (set_nr_if_polling(rq->idle)) {
	trace_sched_wake_idle_without_ipi(cpu);
} else {
	raw_spin_lock_irqsave(&rq->lock, flags);
	if (rq->curr == rq->idle)
		smp_send_reschedule(cpu);
	// else the CPU wasn't idle; nothing to do
	raw_spin_unlock_irqrestore(&rq->lock, flags);
}

In the common case (wake from C0, i.e. polling idle), this will skip the IPI entirely unless you race with idle entry/exit, saving a few more precious electrons and all of the latency involved in poking the APIC registers.

They could and they probably should, but that logic should _not_ live in the cpuidle driver.

Sure. My point is that fixing the IPI handler is, I think, totally bogus, because the IPI API isn't the right way to do this at all. It would be straightforward to add a new function wake_if_idle(int cpu) to sched/core.c.

Thanks for Andy's and Peter's suggestions; this will save some IPI work in case the cores are not in idle. There is one similar API in sched/core.c, wake_up_idle_cpu(), so we just need to add one new common SMP API:

smp_wake_up_cpus()
{
	for_each_online_cpu(cpu)
		wake_up_idle_cpu(cpu);
}

Will try one patch for it.
RE: [PATCH] lib/spinlock_debug: Tweak the loop time to fit different _delay()
Hello Peter, > -Original Message- > From: Peter Zijlstra [mailto:pet...@infradead.org] > Sent: Wednesday, April 30, 2014 4:55 PM > To: Wang, Xiaoming > Cc: mi...@redhat.com; linux-kernel@vger.kernel.org; Liu, Chuansheng > Subject: Re: [PATCH] lib/spinlock_debug: Tweak the loop time to fit different > _delay() > > On Wed, Apr 30, 2014 at 06:40:38PM -0400, Wang, Xiaoming wrote: > > loops_per_jiffy*HZ is not always 1 second exactly > > it depends on the realization of _delay() . > > delay_tsc is used as _delay() in arch/x86/lib/delay.c > > This just states delay() is broken. The primary response should be to > try and fix that, no? delay(one_second_loop_count) is accurate, but delay(1) is indeed not accurate: by the time a few instructions have executed, the 1-loop delay may already have elapsed.
RE: [PATCH] [PATCH V2] tty: memleak in alloc_pid
> > Change-Id: Ic960dda039c8f99aad3e0f4d176489a966c62f6a > > Why is this line here? Tingjie, Greg is asking about the "Change-Id" line, which is not needed; please remove it and send one new patch.
RE: [PATCH] [staging][r8188eu]: memory leak in rtw_free_cmd_obj if command is (_Set_Drv_Extra)
Hi, > -Original Message- > From: Wang, Xiaoming > Sent: Monday, March 10, 2014 11:38 PM > To: gre...@linuxfoundation.org; valentina.mane...@gmail.com; > dan.carpen...@oracle.com; standby2...@gmail.com > Cc: de...@driverdev.osuosl.org; linux-kernel@vger.kernel.org; Zhang, > Dongxing; Wang, Xiaoming; Liu, Chuansheng > Subject: [PATCH] [staging][r8188eu]: memory leak in rtw_free_cmd_obj if > command is (_Set_Drv_Extra) > > pcmd->parmbuf->pbuf has been allocated if command is > GEN_CMD_CODE(_Set_Drv_Extra), > and it enqueued by rtw_enqueue_cmd. rtw_cmd_thread dequeue pcmd by > rtw_dequeue_cmd. > The memory leak happened on this branch "if( _FAIL == > rtw_cmd_filter(pcmdpriv, pcmd) )" > which goto post_process directly against freeing pcmd->parmbuf->pbuf in > rtw_drvextra_cmd_hdl which is the cmd_hdl if command is (_Set_Drv_Extra). > This patch free pcmd->parmbuf->pbuf on the forgotten branch to avoid memory > leak. > > Signed-off-by: Zhang Dongxing > Signed-off-by: xiaoming wang > > diff --git a/drivers/staging/rtl8188eu/core/rtw_cmd.c > b/drivers/staging/rtl8188eu/core/rtw_cmd.c > index c0a0a52..1c7f505 100644 > --- a/drivers/staging/rtl8188eu/core/rtw_cmd.c > +++ b/drivers/staging/rtl8188eu/core/rtw_cmd.c > @@ -288,7 +288,7 @@ int rtw_cmd_thread(void *context) > void (*pcmd_callback)(struct adapter *dev, struct cmd_obj *pcmd); > struct adapter *padapter = (struct adapter *)context; > struct cmd_priv *pcmdpriv = &(padapter->cmdpriv); > - > + struct drvextra_cmd_parm *extra_parm = NULL; > > thread_enter("RTW_CMD_THREAD"); > > @@ -323,6 +323,11 @@ _next: > > if (_FAIL == rtw_cmd_filter(pcmdpriv, pcmd)) { > pcmd->res = H2C_DROPPED; > + if (pcmd->cmdcode == > GEN_CMD_CODE(_Set_Drv_Extra)) { > + extra_parm = (struct > drvextra_cmd_parm *)pcmd->parmbuf; > + if (extra_parm && extra_parm->pbuf > && extra_parm->size > 0) > + rtw_mfree(extra_parm->pbuf, > extra_parm->size); > + } > goto post_process; > } > Reviewed-by: Chuansheng Liu Thanks. Best Regards Chuansheng
RE: [PATCH v2] usb: gadget: return the right length in ffs_epfile_io()
Hi Balbi,

> -Original Message-
> From: Felipe Balbi [mailto:ba...@ti.com]
> Sent: Wednesday, March 05, 2014 3:56 AM
> To: Michal Nazarewicz
> Cc: Robert Baldyga; Felipe Balbi; Sergei Shtylyov; Liu, Chuansheng; gre...@linuxfoundation.org; linux-...@vger.kernel.org; linux-kernel@vger.kernel.org; david.a.co...@linux.intel.com
> Subject: Re: [PATCH v2] usb: gadget: return the right length in ffs_epfile_io()
>
> On Tue, Mar 04, 2014 at 08:53:40PM +0100, Michal Nazarewicz wrote:
> > >> On 03/04/2014 10:34 AM, Chuansheng Liu wrote:
> > >> > @@ -845,12 +845,14 @@ static ssize_t ffs_epfile_io(struct file *file, struct ffs_io_data *io_data)
> > >> >  	 * we may end up with more data then user space has
> > >> >  	 * space for.
> > >> >  	 */
> > >> > -	ret = ep->status;
> > >> > -	if (io_data->read && ret > 0 &&
> > >> > -	    unlikely(copy_to_user(io_data->buf, data,
> > >> > -				  min_t(size_t, ret, io_data->len))))
> > >> > -		ret = -EFAULT;
> > >> > +	ret = ep->status;
> >
> > On Tue, Mar 04 2014, Felipe Balbi wrote:
> > >> Why the indentation jumped suddenly to the right?
> >
> > On Tue, Mar 04, 2014 at 08:01:15PM +0300, Sergei Shtylyov wrote:
> > > because it was wrong before ;-)
> >
> > Yep. It looks like Robert's [2e4c7553: add aio support] introduced an if-else-if-else flow but did not indent the code and I didn't catch it when reviewing that patch.
>
> it's in my testing/next now, I also fixed the comment indentation which was also wrong.

Thanks for your help, and for the comment-indentation fix as well :)

Best Regards
Chuansheng
RE: [PATCH] usb: gadget: return the right length in ffs_epfile_io()
Hello Balbi,

> -Original Message-
> From: Felipe Balbi [mailto:ba...@ti.com]
> Sent: Tuesday, March 04, 2014 12:30 AM
> To: Liu, Chuansheng
> Cc: ba...@ti.com; gre...@linuxfoundation.org; min...@mina86.com; linux-...@vger.kernel.org; linux-kernel@vger.kernel.org; Cohen, David A; Zhuang, Jin Can; Wang, Yu Y
> Subject: Re: [PATCH] usb: gadget: return the right length in ffs_epfile_io()
>
> Hi,
>
> On Thu, Feb 27, 2014 at 02:49:31PM +0800, Chuansheng Liu wrote:
> > When the request length is aligned to maxpacketsize, sometimes the return length ret > the len requested by user space.
> >
> > At that time, we will use min_t(size_t, ret, len) to limit the size in case of user data buffer overflow.
> >
> > But we need return the min_t(size_t, ret, len) to tell the user space rightly also.
> >
> > Signed-off-by: Chuansheng Liu
>
> please rebase on my "testing/next" branch

Based on your "testing/next" branch, I have sent patch v2, which has collected some acks and reviews; thanks.

[PATCH v2] usb: gadget: return the right length in ffs_epfile_io()

Best Regards
Chuansheng
RE: [PATCH 1/2] genirq: Fix the possible synchronize_irq() wait-forever
Hello Thomas,

> -Original Message-
> From: Thomas Gleixner [mailto:t...@linutronix.de]
> Sent: Friday, February 21, 2014 7:53 PM
> To: Liu, Chuansheng
> Cc: linux-kernel@vger.kernel.org; Wang, Xiaoming
> Subject: RE: [PATCH 1/2] genirq: Fix the possible synchronize_irq() wait-forever
>
> On Fri, 21 Feb 2014, Liu, Chuansheng wrote:
> > > > > I think you have a point there, but not on x86 where the atomic_dec and the spinlock on the queueing side are full barriers. For non-x86 there is definitely a potential issue.
> > > >
> > > > But even on x86, spin_unlock has no full barrier; the following scenario:
> > > > CPU0                      CPU1
> > > > spin_lock
> > > >                           atomic_dec_and_test
> > > > insert into queue
> > > > spin_unlock
> > > >                           checking waitqueue_active
> > >
> > > But CPU0 sees the 0, right?
> >
> > That is not clear :) the atomic_read has no barrier.
> >
> > Found that commit 6cb2a21049b89 has one similar smp_mb() call before waitqueue_active() on one x86 CPU.
>
> Indeed, you are completely right. Great detective work!

Thanks for the encouragement.

> I'm inclined to remove the waitqueue_active() altogether. It's creating more headache than it's worth.

If I understand correctly: remove the waitqueue_active() check and call wake_up() directly, which checks the list under spinlock protection. If so, I can prepare one patch for it :)
RE: [PATCH 1/2] genirq: Fix the possible synchronize_irq() wait-forever
> -Original Message- > From: Thomas Gleixner [mailto:t...@linutronix.de] > Sent: Friday, February 21, 2014 7:11 PM > To: Liu, Chuansheng > Cc: linux-kernel@vger.kernel.org; Wang, Xiaoming > Subject: RE: [PATCH 1/2] genirq: Fix the possible synchronize_irq() > wait-forever > > On Fri, 21 Feb 2014, Liu, Chuansheng wrote: > > Hello Thomas, > > > > > -Original Message- > > > From: Thomas Gleixner [mailto:t...@linutronix.de] > > > Sent: Friday, February 21, 2014 6:34 PM > > > To: Liu, Chuansheng > > > Cc: linux-kernel@vger.kernel.org; Wang, Xiaoming > > > Subject: RE: [PATCH 1/2] genirq: Fix the possible synchronize_irq() > wait-forever > > > > > > On Fri, 21 Feb 2014, Liu, Chuansheng wrote: > > > > But feels there is another case which the synchronize_irq waited there > > > forever, > > > > it is no waking up action from irq_thread(). > > > > > > > > CPU0 CPU1 > > > > disable_irq() irq_thread() > > > > synchronize_irq() > > > > wait_event() > > > > adding the __wait into the queue wake_threads_waitq > > > >test threads_active==0 > > > > > > > > atomic_dec_and_test(threads_active) 1 -- > 0 > > > > > > > waitqueue_active(>wait_for_threads) > > > > <== Here without smp_mb(), > CPU1 > > > maybe detect > > > > the queue is still empty?? > > > > schedule() > > > > > > > > It will cause although the threads_active is 0, but irq_thread() didn't > > > > do > the > > > waking up action. > > > > Is it reasonable? Then maybe we can add one smp_mb() before > > > waitqueue_active. > > > > > > I think you have a point there, but not on x86 wherre the atomic_dec > > > and the spinlock on the queueing side are full barriers. For non-x86 > > > there is definitely a potential issue. > > > > > But even on X86, spin_unlock has no full barrier, the following scenario: > > CPU0 CPU1 > > spin_lock > >atomic_dec_and_test > > insert into queue > > spin_unlock > >checking waitqueue_active > > But CPU0 sees the 0, right? Not be clear here:) The atomic_read has no barrier. 
Found that commit 6cb2a21049b89 has one similar smp_mb() call before waitqueue_active() on one x86 CPU.
RE: [PATCH 1/2] genirq: Fix the possible synchronize_irq() wait-forever
Hello Thomas,

> -Original Message-
> From: Thomas Gleixner [mailto:t...@linutronix.de]
> Sent: Friday, February 21, 2014 6:34 PM
> To: Liu, Chuansheng
> Cc: linux-kernel@vger.kernel.org; Wang, Xiaoming
> Subject: RE: [PATCH 1/2] genirq: Fix the possible synchronize_irq() wait-forever
>
> On Fri, 21 Feb 2014, Liu, Chuansheng wrote:
> > But feels there is another case where synchronize_irq() waits there forever: there is no waking-up action from irq_thread().
> >
> > CPU0                                    CPU1
> > disable_irq()                           irq_thread()
> > synchronize_irq()
> >   wait_event()
> >     adding the __wait into the queue    wake_threads_waitq()
> >     test threads_active==0
> >                                           atomic_dec_and_test(threads_active) 1 --> 0
> >                                           waitqueue_active(&desc->wait_for_threads)
> >                                           <== Here without smp_mb(), CPU1 maybe detects the queue is still empty??
> >     schedule()
> >
> > It will cause that although threads_active is 0, irq_thread() did not do the waking up.
> > Is it reasonable? Then maybe we can add one smp_mb() before waitqueue_active().
>
> I think you have a point there, but not on x86 where the atomic_dec and the spinlock on the queueing side are full barriers. For non-x86 there is definitely a potential issue.

But even on x86, spin_unlock has no full barrier; consider the following scenario:

CPU0                      CPU1
spin_lock
                          atomic_dec_and_test
insert into queue
spin_unlock
                          checking waitqueue_active

Here, after inserting into the queue and before waitqueue_active(), there is no mb(). So is it still the case? Thanks.
RE: [PATCH v4 0/5] Enabling the asynchronous threads for other phases
Hello Rafael,

> -Original Message-
> From: Liu, Chuansheng
> Sent: Tuesday, February 18, 2014 10:29 AM
> To: r...@rjwysocki.net; gre...@linuxfoundation.org; Brown, Len; pa...@ucw.cz
> Cc: linux...@vger.kernel.org; linux-kernel@vger.kernel.org; Li, Zhuangzhi; Liu, Chuansheng
> Subject: [PATCH v4 0/5] Enabling the asynchronous threads for other phases
>
> Hello,
>
> This patch series is for enabling the asynchronous threads for the phases resume_noirq, resume_early, suspend_noirq and suspend_late.

After digging more into the async threads, I found the dpm_prepare/complete phases need them as well:

dpm_prepare()
  --> calls pci_pm_prepare() one by one
  --> calls pm_runtime_resume() one by one
  --> calls pci_set_power_state() to bring the device from D3hot back to D0

This causes a long delay because of d3_delay. I made one draft patch implementing the asynchronous threads for dpm_prepare(), and it really saves much time. If you like it, I can prepare another series of patches for dpm_prepare() and dpm_complete(), thanks.

Best Regards,
RE: [PATCH 1/2] genirq: Fix the possible synchronize_irq() wait-forever
Hello Thomas, > -Original Message- > From: Thomas Gleixner [mailto:t...@linutronix.de] > Sent: Thursday, February 20, 2014 8:53 PM > To: Liu, Chuansheng > Cc: linux-kernel@vger.kernel.org; Wang, Xiaoming > Subject: RE: [PATCH 1/2] genirq: Fix the possible synchronize_irq() > wait-forever > > On Thu, 20 Feb 2014, Liu, Chuansheng wrote: > > Hello Thomas, > > > > > -Original Message- > > > From: Thomas Gleixner [mailto:t...@linutronix.de] > > > Sent: Monday, February 10, 2014 4:58 PM > > > To: Liu, Chuansheng > > > Cc: linux-kernel@vger.kernel.org; Wang, Xiaoming > > > Subject: Re: [PATCH 1/2] genirq: Fix the possible synchronize_irq() > wait-forever > > > > > > On Mon, 10 Feb 2014, Chuansheng Liu wrote: > > > > There is below race between irq handler and irq thread: > > > > irq handler irq thread > > > > > > > > irq_wake_thread() irq_thread() > > > > set bit RUNTHREAD > > > > ...clear bit RUNTHREAD > > > > thread_fn() > > > > [A]test_and_decrease > > > >thread_active > > > > [B]increase thread_active > > > > > > > > If action A is before action B, after that the thread_active > > > > will be always > 0, and for synchronize_irq() calling, which > > > > will be waiting there forever. > > > > > > No. thread_active is 0, simply because after the atomic_dec_and_test() > > > it is -1 and the atomic_inc on the other side will bring it back to 0. > > > > > Yes, you are right. The thread_active is back to 0 at last. > > > > The case we meet is: > > 1/ T1: blocking at disable_irq() -- > sync_irq() -- > wait_event() > > [ 142.678681] [] schedule+0x23/0x60 > > [ 142.683466] [] synchronize_irq+0x75/0xb0 > > [ 142.688931] [] ? 
wake_up_bit+0x30/0x30 > > [ 142.694201] [] disable_irq+0x1b/0x20 > > [ 142.699278] [] smb347_shutdown+0x2c/0x50 > > [ 142.704744] [] i2c_device_shutdown+0x2d/0x40 > > [ 142.710597] [] device_shutdown+0x14/0x140 > > [ 142.716161] [] kernel_restart_prepare+0x32/0x40 > > [ 142.722307] [] kernel_restart+0x13/0x60 > > > > 2/ The corresponding irq thread is at sleep state: > > [ 587.552408] irq/388-SMB0349 S f1c47620 7276 119 2 > 0x > > [ 587.552439] f1d6bf20 0046 f1c47a48 f1c47620 f1d6bec4 9e91731c > 0001 c1a5f3a5 > > [ 587.552468] c20469c0 0001 c20469c0 f36559c0 f1c47620 f307bde0 > c20469c0 f1d6bef0 > > [ 587.552497] 0296 0296 f1d6bef0 c1a5bfa6 > f1c47620 f1d6bf14 c126e329 > > [ 587.552501] Call Trace: > > [ 587.552519] [] ? sub_preempt_count+0x55/0xe0 > > [ 587.552535] [] ? _raw_spin_unlock_irqrestore+0x26/0x50 > > [ 587.552548] [] ? set_cpus_allowed_ptr+0x59/0xe0 > > [ 587.552563] [] schedule+0x23/0x60 > > [ 587.552576] [] irq_thread+0xa1/0x130 > > [ 587.552588] [] ? irq_thread_dtor+0xa0/0xa0 > > > > 3/ All the cpus are in the idle task; > > Let's look at it again: > > CPU 0 CPU1 > > irq handler irq thread > set IRQS_INPROGRESS > ... > irq_wake_thread() irq_thread() > set bit RUNTHREAD > ... clear bit RUNTHREAD > thread_fn() > atomic_dec_and_test(threads_active) ( 0 -> -1) > > atomic_inc(threads_active) ( -1 -> 0) > clr IRQS_INPROGRESS > > Now synchronize_irq comes into play, that's what caused you to look > into this. > > synchronize_irq() can never observe the -1 state because it is > serialized against IRQS_INPROGRESS. And when IRQS_INPROGRESS is > cleared, the threads_active state is back to 0. > > I'm really not seeing how this can happen. Any chance you can reproduce > this by executing the situation which led to this in a loop? We can try to forcibly reproduce the threads_active -1/0 case. But it feels like there is another case in which synchronize_irq() waits forever: there is no waking-up action from irq_thread().
CPU0                                  CPU1
disable_irq()                         irq_thread()
  synchronize_irq()
    wait_event()
      adding the __wait into the queue
                                      wake_threads_waitq()
      test threads_active==0
                                        atomic_dec_and_test(threads_active) 1 --> 0
                                        waitqueue_active(&desc->wait_for_threads)
                                        <== Here without smp_mb(), CPU1 maybe
                                            detect the queue is still empty??
      schedule()

It will cause that although threads_active is 0, irq_thread() did not do the waking up action. Is it reasonable? Then maybe we can add one smp_mb() before waitqueue_active(). Thanks.
RE: [PATCH 1/2] genirq: Fix the possible synchronize_irq() wait-forever
Hello Thomas, > -Original Message- > From: Thomas Gleixner [mailto:t...@linutronix.de] > Sent: Monday, February 10, 2014 4:58 PM > To: Liu, Chuansheng > Cc: linux-kernel@vger.kernel.org; Wang, Xiaoming > Subject: Re: [PATCH 1/2] genirq: Fix the possible synchronize_irq() > wait-forever > > On Mon, 10 Feb 2014, Chuansheng Liu wrote: > > There is below race between irq handler and irq thread: > > irq handler irq thread > > > > irq_wake_thread() irq_thread() > > set bit RUNTHREAD > > ...clear bit RUNTHREAD > > thread_fn() > > [A]test_and_decrease > >thread_active > > [B]increase thread_active > > > > If action A is before action B, after that the thread_active > > will be always > 0, and for synchronize_irq() calling, which > > will be waiting there forever. > > No. thread_active is 0, simply because after the atomic_dec_and_test() > it is -1 and the atomic_inc on the other side will bring it back to 0. > Yes, you are right. The thread_active is back to 0 at last. The case we meet is: 1/ T1: blocking at disable_irq() -- > sync_irq() -- > wait_event() [ 142.678681] [] schedule+0x23/0x60 [ 142.683466] [] synchronize_irq+0x75/0xb0 [ 142.688931] [] ? wake_up_bit+0x30/0x30 [ 142.694201] [] disable_irq+0x1b/0x20 [ 142.699278] [] smb347_shutdown+0x2c/0x50 [ 142.704744] [] i2c_device_shutdown+0x2d/0x40 [ 142.710597] [] device_shutdown+0x14/0x140 [ 142.716161] [] kernel_restart_prepare+0x32/0x40 [ 142.722307] [] kernel_restart+0x13/0x60 2/ The corresponding irq thread is at sleep state: [ 587.552408] irq/388-SMB0349 S f1c47620 7276 119 2 0x [ 587.552439] f1d6bf20 0046 f1c47a48 f1c47620 f1d6bec4 9e91731c 0001 c1a5f3a5 [ 587.552468] c20469c0 0001 c20469c0 f36559c0 f1c47620 f307bde0 c20469c0 f1d6bef0 [ 587.552497] 0296 0296 f1d6bef0 c1a5bfa6 f1c47620 f1d6bf14 c126e329 [ 587.552501] Call Trace: [ 587.552519] [] ? sub_preempt_count+0x55/0xe0 [ 587.552535] [] ? _raw_spin_unlock_irqrestore+0x26/0x50 [ 587.552548] [] ? 
set_cpus_allowed_ptr+0x59/0xe0 [ 587.552563] [] schedule+0x23/0x60 [ 587.552576] [] irq_thread+0xa1/0x130 [ 587.552588] [] ? irq_thread_dtor+0xa0/0xa0 3/ All the cpus are in the idle task; So we guess thread_active was not 0 at that time, while the irq thread was doing nothing. We have thought about it for a long time without finding an explanation, and it was hit only once. Appreciated if you have some idea, thanks.
RE: [PATCH v3 5/5] PM / sleep: Asynchronous threads for suspend_late
Hello Rafael, > -Original Message- > From: Rafael J. Wysocki [mailto:r...@rjwysocki.net] > Sent: Tuesday, February 18, 2014 11:29 PM > To: Liu, Chuansheng > Cc: gre...@linuxfoundation.org; Brown, Len; pa...@ucw.cz; > linux...@vger.kernel.org; linux-kernel@vger.kernel.org; Li, Zhuangzhi > Subject: Re: [PATCH v3 5/5] PM / sleep: Asynchronous threads for suspend_late > > On Tuesday, February 18, 2014 12:33:11 AM Liu, Chuansheng wrote: > > Hello Rafael, > > > > > -Original Message- > > > From: Rafael J. Wysocki [mailto:r...@rjwysocki.net] > > > Sent: Tuesday, February 18, 2014 1:28 AM > > > To: Liu, Chuansheng > > > Cc: gre...@linuxfoundation.org; Brown, Len; pa...@ucw.cz; > > > linux...@vger.kernel.org; linux-kernel@vger.kernel.org; Li, Zhuangzhi > > > Subject: Re: [PATCH v3 5/5] PM / sleep: Asynchronous threads for > suspend_late > > > > > > On Monday, February 17, 2014 02:19:14 PM Chuansheng Liu wrote: > > > > In analogy with commits 5af84b82701a and 97df8c12995, using > > > > asynchronous threads can improve the overall suspend_late > > > > time significantly. > > > > > > > > This patch is for suspend_late phase. > > > > > > > > Signed-off-by: Chuansheng Liu > > > > --- > > > > drivers/base/power/main.c | 66 > > > ++- > > > > 1 file changed, 54 insertions(+), 12 deletions(-) > > > > > > > > diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c > > > > index 72b4c9c..c031050 100644 > > > > --- a/drivers/base/power/main.c > > > > +++ b/drivers/base/power/main.c > > > > @@ -1127,16 +1127,26 @@ static int > dpm_suspend_noirq(pm_message_t > > > state) > > > > * > > > > * Runtime PM is disabled for @dev while this function is being > executed. 
> > > > */ > > > > -static int device_suspend_late(struct device *dev, pm_message_t state) > > > > +static int __device_suspend_late(struct device *dev, pm_message_t > state, > > > bool async) > > > > { > > > > pm_callback_t callback = NULL; > > > > char *info = NULL; > > > > - int error; > > > > + int error = 0; > > > > + > > > > + dpm_wait_for_children(dev, async); > > > > > > > > > > Like in patch [4/5], all of the "goto Complete" statements can go before > > > the waiting. > > > > > You are right, will do that in patch V4, thanks for your review. > > I've queued up your last series: > > http://marc.info/?l=linux-pm&m=139269109711745&w=4 > > for 3.15, but I've rebased it on some patches already in my queue, so please > double check the result in linux-pm.git/bleeding-edge. > Thanks. I have pulled the latest linux-pm.git/bleeding-edge branch and checked the patches, which are OK.
RE: [PATCH v3 5/5] PM / sleep: Asynchronous threads for suspend_late
Hello Rafael, > -Original Message- > From: Rafael J. Wysocki [mailto:r...@rjwysocki.net] > Sent: Tuesday, February 18, 2014 1:28 AM > To: Liu, Chuansheng > Cc: gre...@linuxfoundation.org; Brown, Len; pa...@ucw.cz; > linux...@vger.kernel.org; linux-kernel@vger.kernel.org; Li, Zhuangzhi > Subject: Re: [PATCH v3 5/5] PM / sleep: Asynchronous threads for suspend_late > > On Monday, February 17, 2014 02:19:14 PM Chuansheng Liu wrote: > > In analogy with commits 5af84b82701a and 97df8c12995, using > > asynchronous threads can improve the overall suspend_late > > time significantly. > > > > This patch is for suspend_late phase. > > > > Signed-off-by: Chuansheng Liu > > --- > > drivers/base/power/main.c | 66 > ++- > > 1 file changed, 54 insertions(+), 12 deletions(-) > > > > diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c > > index 72b4c9c..c031050 100644 > > --- a/drivers/base/power/main.c > > +++ b/drivers/base/power/main.c > > @@ -1127,16 +1127,26 @@ static int dpm_suspend_noirq(pm_message_t > state) > > * > > * Runtime PM is disabled for @dev while this function is being executed. > > */ > > -static int device_suspend_late(struct device *dev, pm_message_t state) > > +static int __device_suspend_late(struct device *dev, pm_message_t state, > bool async) > > { > > pm_callback_t callback = NULL; > > char *info = NULL; > > - int error; > > + int error = 0; > > + > > + dpm_wait_for_children(dev, async); > > > > Like in patch [4/5], all of the "goto Complete" statements can go before > the waiting. > You are right, will do that in patch V4, thanks for your review.
RE: [PATCH 0/5] Enabling the asynchronous threads for other phases
Hello Rafael, > -Original Message- > From: Liu, Chuansheng > Sent: Monday, February 17, 2014 9:44 AM > To: 'Rafael J. Wysocki' > Cc: gre...@linuxfoundation.org; pa...@ucw.cz; Brown, Len; > linux...@vger.kernel.org; linux-kernel@vger.kernel.org; Li, Zhuangzhi > Subject: RE: [PATCH 0/5] Enabling the asynchronous threads for other phases > > Hello Rafael, > > > -Original Message- > > From: Rafael J. Wysocki [mailto:r...@rjwysocki.net] > > Sent: Monday, February 17, 2014 7:41 AM > > To: Liu, Chuansheng > > Cc: gre...@linuxfoundation.org; pa...@ucw.cz; Brown, Len; > > linux...@vger.kernel.org; linux-kernel@vger.kernel.org; Li, Zhuangzhi > > Subject: Re: [PATCH 0/5] Enabling the asynchronous threads for other phases > > > > On Monday, February 10, 2014 08:36:57 AM Liu, Chuansheng wrote: > > > Hello Rafael, > > > > > > > -Original Message- > > > > From: Rafael J. Wysocki [mailto:r...@rjwysocki.net] > > > > Sent: Thursday, February 06, 2014 5:53 AM > > > > To: Liu, Chuansheng > > > > Cc: gre...@linuxfoundation.org; pa...@ucw.cz; Brown, Len; > > > > linux...@vger.kernel.org; linux-kernel@vger.kernel.org; Li, Zhuangzhi > > > > Subject: Re: [PATCH 0/5] Enabling the asynchronous threads for other > > phases > > > > > > > > On Monday, January 20, 2014 04:44:34 PM Liu, Chuansheng wrote: > > > > > Hello, > > > > > > > > > > This patch series are for enabling the asynchronous threads for the > > phases > > > > > resume_noirq, resume_early, suspend_noirq and suspend_late. > > > > > > > > > > Just like commit 5af84b82701a and 97df8c12995, with async threads it > > will > > > > > reduce the system suspending and resuming time significantly. > > > > > > > > > > With these patches, in my test platform, it saved 80% time in > > resume_noirq > > > > > phase. > > > > > > > > > > Has done the suspend-resume stress test for a long time, please help > > > > > to > > > > > review. 
> > > > > > > > > > Best Regards, > > > > > > > > > > [PATCH 1/5] PM: Adding two flags for async suspend_noirq and > > > > > [PATCH 2/5] PM: Enabling the asynchronous threads for resume_noirq > > > > > [PATCH 3/5] PM: Enabling the asyncronous threads for resume_early > > > > > [PATCH 4/5] PM: Enabling the asyncronous threads for suspend_noirq > > > > > [PATCH 5/5] PM: Enabling the asyncronous threads for suspend_late > > > > > > > > I've applied this to the bleeding-edge branch of the linux-pm.git tree, > > > > with > > > > minor changes related to coding style, white space etc. > > > Thanks your help. > > > > Unfortunately, I've just realized that your patches don't add any mechanism > > ensuring that parent devices' .suspend_noirq() will be called *after* the > > .suspend_noirq() of all their children. Of course, analogously for > > .suspend_late() and for the resume part (where children have to wait for > > their parents). > > > > In the original async suspend/resume code that is implemented using > > power.completion and I suppose you can do the same thing here. > > I think "parent devices suspend after their children" is not related directly > with using async or not; > In the original code, even without asyncing, the suspend/resume has the > waiting > action, but there no waiting action for suspend_noirq and other phases. > Sorry, I re-checked the code: the order in the list already ensures the parent-then-children sequence when not asyncing. Thanks for pointing this out; I will re-cook this patch series.
RE: [PATCH 0/5] Enabling the asynchronous threads for other phases
Hello Rafael, > -Original Message- > From: Rafael J. Wysocki [mailto:r...@rjwysocki.net] > Sent: Monday, February 17, 2014 7:41 AM > To: Liu, Chuansheng > Cc: gre...@linuxfoundation.org; pa...@ucw.cz; Brown, Len; > linux...@vger.kernel.org; linux-kernel@vger.kernel.org; Li, Zhuangzhi > Subject: Re: [PATCH 0/5] Enabling the asynchronous threads for other phases > > On Monday, February 10, 2014 08:36:57 AM Liu, Chuansheng wrote: > > Hello Rafael, > > > > > -Original Message- > > > From: Rafael J. Wysocki [mailto:r...@rjwysocki.net] > > > Sent: Thursday, February 06, 2014 5:53 AM > > > To: Liu, Chuansheng > > > Cc: gre...@linuxfoundation.org; pa...@ucw.cz; Brown, Len; > > > linux...@vger.kernel.org; linux-kernel@vger.kernel.org; Li, Zhuangzhi > > > Subject: Re: [PATCH 0/5] Enabling the asynchronous threads for other > phases > > > > > > On Monday, January 20, 2014 04:44:34 PM Liu, Chuansheng wrote: > > > > Hello, > > > > > > > > This patch series are for enabling the asynchronous threads for the > phases > > > > resume_noirq, resume_early, suspend_noirq and suspend_late. > > > > > > > > Just like commit 5af84b82701a and 97df8c12995, with async threads it > will > > > > reduce the system suspending and resuming time significantly. > > > > > > > > With these patches, in my test platform, it saved 80% time in > resume_noirq > > > > phase. > > > > > > > > Has done the suspend-resume stress test for a long time, please help to > > > > review. 
> > > > > > > > Best Regards, > > > > > > > > [PATCH 1/5] PM: Adding two flags for async suspend_noirq and > > > > [PATCH 2/5] PM: Enabling the asynchronous threads for resume_noirq > > > > [PATCH 3/5] PM: Enabling the asyncronous threads for resume_early > > > > [PATCH 4/5] PM: Enabling the asyncronous threads for suspend_noirq > > > > [PATCH 5/5] PM: Enabling the asyncronous threads for suspend_late > > > > > > I've applied this to the bleeding-edge branch of the linux-pm.git tree, > > > with > > > minor changes related to coding style, white space etc. > > Thanks your help. > > Unfortunately, I've just realized that your patches don't add any mechanism > ensuring that parent devices' .suspend_noirq() will be called *after* the > .suspend_noirq() of all their children. Of course, analogously for > .suspend_late() and for the resume part (where children have to wait for > their parents). > > In the original async suspend/resume code that is implemented using > power.completion and I suppose you can do the same thing here. I think "parent devices suspend after their children" is not directly related to using async or not; in the original code, even without asyncing, suspend/resume has the waiting action, but there is no waiting action for suspend_noirq and the other phases. If you think the parent waiting for children is necessary for suspend_noirq/suspend_late/resume_early/resume_noirq, I can cook another patch series, but we can separate it from the current asyncing series. Agree?
RE: [PATCH 0/5] Enabling the asynchronous threads for other phases
Hello Rafael,

> -Original Message-
> From: Rafael J. Wysocki [mailto:r...@rjwysocki.net]
> Sent: Thursday, February 06, 2014 5:53 AM
> To: Liu, Chuansheng
> Cc: gre...@linuxfoundation.org; pa...@ucw.cz; Brown, Len;
> linux...@vger.kernel.org; linux-kernel@vger.kernel.org; Li, Zhuangzhi
> Subject: Re: [PATCH 0/5] Enabling the asynchronous threads for other phases
>
> On Monday, January 20, 2014 04:44:34 PM Liu, Chuansheng wrote:
> > Hello,
> >
> > This patch series is for enabling the asynchronous threads for the phases
> > resume_noirq, resume_early, suspend_noirq and suspend_late.
> >
> > Just like commits 5af84b82701a and 97df8c12995, with async threads it will
> > reduce the system suspending and resuming time significantly.
> >
> > With these patches, on my test platform, it saved 80% of the time in the
> > resume_noirq phase.
> >
> > I have run the suspend-resume stress test for a long time; please help to
> > review.
> >
> > Best Regards,
> >
> > [PATCH 1/5] PM: Adding two flags for async suspend_noirq and
> > [PATCH 2/5] PM: Enabling the asynchronous threads for resume_noirq
> > [PATCH 3/5] PM: Enabling the asynchronous threads for resume_early
> > [PATCH 4/5] PM: Enabling the asynchronous threads for suspend_noirq
> > [PATCH 5/5] PM: Enabling the asynchronous threads for suspend_late
>
> I've applied this to the bleeding-edge branch of the linux-pm.git tree, with
> minor changes related to coding style, white space etc.

Thanks for your help.

> Can you please verify that the bleeding-edge branch works for you as
> expected?

I have picked them from your bleeding-edge branch and have run the
suspend-resume stress test about 200 times over 3 hours; it is still
working well.

Best Regards
Chuansheng Liu
[PATCH 5/5] PM: Enabling the asynchronous threads for suspend_late
From: "Liu, Chuansheng"

Just like commits 5af84b82701a and 97df8c12995, using asynchronous threads
can improve the overall suspend_late time significantly.

This patch is for the suspend_late phase.

Signed-off-by: Liu, Chuansheng
---
 drivers/base/power/main.c |   54 ++++++++++++++++++++++++++++++++++---------
 1 file changed, 45 insertions(+), 9 deletions(-)

diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
index ec946aa..19f5195 100644
--- a/drivers/base/power/main.c
+++ b/drivers/base/power/main.c
@@ -1094,7 +1094,7 @@ static int dpm_suspend_noirq(pm_message_t state)
  *
  * Runtime PM is disabled for @dev while this function is being executed.
  */
-static int device_suspend_late(struct device *dev, pm_message_t state)
+static int __device_suspend_late(struct device *dev, pm_message_t state)
 {
 	pm_callback_t callback = NULL;
 	char *info = NULL;
@@ -1102,6 +1102,14 @@ static int device_suspend_late(struct device *dev, pm_message_t state)
 
 	__pm_runtime_disable(dev, false);
 
+	if (async_error)
+		return 0;
+
+	if (pm_wakeup_pending()) {
+		async_error = -EBUSY;
+		return async_error;
+	}
+
 	if (dev->power.syscore)
 		return 0;
 
@@ -1127,9 +1135,35 @@ static int device_suspend_late(struct device *dev, pm_message_t state)
 	error = dpm_run_callback(callback, dev, state, info);
 	if (!error)
 		dev->power.is_late_suspended = true;
+	else
+		async_error = error;
 
 	return error;
 }
 
+static void async_suspend_late(void *data, async_cookie_t cookie)
+{
+	struct device *dev = (struct device *)data;
+	int error;
+
+	error = __device_suspend_late(dev, pm_transition);
+	if (error) {
+		dpm_save_failed_dev(dev_name(dev));
+		pm_dev_err(dev, pm_transition, " async", error);
+	}
+	put_device(dev);
+}
+
+static int device_suspend_late(struct device *dev)
+{
+	if (pm_async_enabled && dev->power.async_suspend) {
+		get_device(dev);
+		async_schedule(async_suspend_late, dev);
+		return 0;
+	}
+
+	return __device_suspend_late(dev, pm_transition);
+}
+
 /**
  * dpm_suspend_late - Execute "late suspend" callbacks for all devices.
  * @state: PM transition of the system being carried out.
@@ -1140,19 +1174,20 @@ static int dpm_suspend_late(pm_message_t state)
 	int error = 0;
 
 	mutex_lock(&dpm_list_mtx);
+	pm_transition = state;
+	async_error = 0;
+
 	while (!list_empty(&dpm_suspended_list)) {
 		struct device *dev = to_device(dpm_suspended_list.prev);
 
 		get_device(dev);
 		mutex_unlock(&dpm_list_mtx);
 
-		error = device_suspend_late(dev, state);
+		error = device_suspend_late(dev);
 
 		mutex_lock(&dpm_list_mtx);
 		if (error) {
 			pm_dev_err(dev, state, " late", error);
-			suspend_stats.failed_suspend_late++;
-			dpm_save_failed_step(SUSPEND_SUSPEND_LATE);
 			dpm_save_failed_dev(dev_name(dev));
 			put_device(dev);
 			break;
@@ -1161,15 +1196,16 @@ static int dpm_suspend_late(pm_message_t state)
 		list_move(&dev->power.entry, &dpm_late_early_list);
 		put_device(dev);
 
-		if (pm_wakeup_pending()) {
-			error = -EBUSY;
+		if (async_error)
 			break;
-		}
 	}
 	mutex_unlock(&dpm_list_mtx);
-	if (error)
+	async_synchronize_full();
+	if (error) {
+		suspend_stats.failed_suspend_late++;
+		dpm_save_failed_step(SUSPEND_SUSPEND_LATE);
 		dpm_resume_early(resume_event(state));
-	else
+	} else
 		dpm_show_time(starttime, state, "late");
 
 	return error;
--
1.7.9.5
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[PATCH 1/5] PM: Adding two flags for async suspend_noirq and suspend_late
From: "Liu, Chuansheng"

This patch is a helper that adds two new flags for implementing async
threads for suspend_noirq and suspend_late.

Signed-off-by: Liu, Chuansheng
---
 drivers/base/power/main.c |   22 ++++++++++++++++++++--
 include/linux/pm.h        |    2 ++
 2 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
index 6a33dd8..bf71ddb 100644
--- a/drivers/base/power/main.c
+++ b/drivers/base/power/main.c
@@ -72,6 +72,8 @@ void device_pm_sleep_init(struct device *dev)
 {
 	dev->power.is_prepared = false;
 	dev->power.is_suspended = false;
+	dev->power.is_noirq_suspended = false;
+	dev->power.is_late_suspended = false;
 	init_completion(&dev->power.completion);
 	complete_all(&dev->power.completion);
 	dev->power.wakeup = NULL;
@@ -464,6 +466,9 @@ static int device_resume_noirq(struct device *dev, pm_message_t state)
 	if (dev->power.syscore)
 		goto Out;
 
+	if (!dev->power.is_noirq_suspended)
+		goto Out;
+
 	if (dev->pm_domain) {
 		info = "noirq power domain ";
 		callback = pm_noirq_op(&dev->pm_domain->ops, state);
@@ -484,6 +489,7 @@ static int device_resume_noirq(struct device *dev, pm_message_t state)
 	}
 
 	error = dpm_run_callback(callback, dev, state, info);
+	dev->power.is_noirq_suspended = false;
 
  Out:
 	TRACE_RESUME(error);
@@ -546,6 +552,9 @@ static int device_resume_early(struct device *dev, pm_message_t state)
 	if (dev->power.syscore)
 		goto Out;
 
+	if (!dev->power.is_late_suspended)
+		goto Out;
+
 	if (dev->pm_domain) {
 		info = "early power domain ";
 		callback = pm_late_early_op(&dev->pm_domain->ops, state);
@@ -566,6 +575,7 @@ static int device_resume_early(struct device *dev, pm_message_t state)
 	}
 
 	error = dpm_run_callback(callback, dev, state, info);
+	dev->power.is_late_suspended = false;
 
  Out:
 	TRACE_RESUME(error);
@@ -902,6 +912,7 @@ static int device_suspend_noirq(struct device *dev, pm_message_t state)
 {
 	pm_callback_t callback = NULL;
 	char *info = NULL;
+	int error;
 
 	if (dev->power.syscore)
 		return 0;
@@ -925,7 +936,10 @@ static int device_suspend_noirq(struct device *dev, pm_message_t state)
 		callback = pm_noirq_op(dev->driver->pm, state);
 	}
 
-	return dpm_run_callback(callback, dev, state, info);
+	error = dpm_run_callback(callback, dev, state, info);
+	if (!error)
+		dev->power.is_noirq_suspended = true;
+	return error;
 }
 
 /**
@@ -988,6 +1002,7 @@ static int device_suspend_late(struct device *dev, pm_message_t state)
 {
 	pm_callback_t callback = NULL;
 	char *info = NULL;
+	int error;
 
 	__pm_runtime_disable(dev, false);
 
@@ -1013,7 +1028,10 @@ static int device_suspend_late(struct device *dev, pm_message_t state)
 		callback = pm_late_early_op(dev->driver->pm, state);
 	}
 
-	return dpm_run_callback(callback, dev, state, info);
+	error = dpm_run_callback(callback, dev, state, info);
+	if (!error)
+		dev->power.is_late_suspended = true;
+	return error;
 }
 
 /**
diff --git a/include/linux/pm.h b/include/linux/pm.h
index a224c7f..fe90a8b 100644
--- a/include/linux/pm.h
+++ b/include/linux/pm.h
@@ -521,6 +521,8 @@ struct dev_pm_info {
 	unsigned int		async_suspend:1;
 	bool			is_prepared:1;	/* Owned by the PM core */
 	bool			is_suspended:1;	/* Ditto */
+	bool			is_noirq_suspended:1;
+	bool			is_late_suspended:1;
 	bool			ignore_children:1;
 	bool			early_init:1;	/* Owned by the PM core */
 	spinlock_t		lock;
--
1.7.9.5
[PATCH 2/5] PM: Enabling the asynchronous threads for resume_noirq
From: "Liu, Chuansheng"

Just like commits 5af84b82701a and 97df8c12995, using asynchronous threads
can improve the overall resume_noirq time significantly.

One typical case is the resume_noirq phase for PCI devices: the function
pci_pm_resume_noirq() will be called, and there is at least one d3_delay
(10ms) per device. With asynchronous threads, these d3_delay waits run in
parallel, so the delay is effectively paid only once, which saves much time
and speeds up resuming.

Signed-off-by: Liu, Chuansheng
---
 drivers/base/power/main.c |   59 ++++++++++++++++++++++++++++++-----------
 1 file changed, 44 insertions(+), 15 deletions(-)

diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
index bf71ddb..c8a00fc 100644
--- a/drivers/base/power/main.c
+++ b/drivers/base/power/main.c
@@ -496,6 +496,23 @@ static int device_resume_noirq(struct device *dev, pm_message_t state)
 	return error;
 }
 
+static bool is_async(struct device *dev)
+{
+	return dev->power.async_suspend && pm_async_enabled
+		&& !pm_trace_is_enabled();
+}
+
+static void async_resume_noirq(void *data, async_cookie_t cookie)
+{
+	struct device *dev = (struct device *)data;
+	int error;
+
+	error = device_resume_noirq(dev, pm_transition);
+	if (error)
+		pm_dev_err(dev, pm_transition, " async", error);
+	put_device(dev);
+}
+
 /**
  * dpm_resume_noirq - Execute "noirq resume" callbacks for all devices.
  * @state: PM transition of the system being carried out.
@@ -505,29 +522,47 @@ static int device_resume_noirq(struct device *dev, pm_message_t state)
  */
 static void dpm_resume_noirq(pm_message_t state)
 {
+	struct device *dev;
 	ktime_t starttime = ktime_get();
 
 	mutex_lock(&dpm_list_mtx);
-	while (!list_empty(&dpm_noirq_list)) {
-		struct device *dev = to_device(dpm_noirq_list.next);
-		int error;
+	pm_transition = state;
 
+	/*
+	 * Advanced the async threads upfront,
+	 * in case the starting of async threads is
+	 * delayed by non-async resuming devices.
+	 */
+	list_for_each_entry(dev, &dpm_noirq_list, power.entry) {
+		if (is_async(dev)) {
+			get_device(dev);
+			async_schedule(async_resume_noirq, dev);
+		}
+	}
+
+	while (!list_empty(&dpm_noirq_list)) {
+		dev = to_device(dpm_noirq_list.next);
 		get_device(dev);
 		list_move_tail(&dev->power.entry, &dpm_late_early_list);
 		mutex_unlock(&dpm_list_mtx);
 
-		error = device_resume_noirq(dev, state);
-		if (error) {
-			suspend_stats.failed_resume_noirq++;
-			dpm_save_failed_step(SUSPEND_RESUME_NOIRQ);
-			dpm_save_failed_dev(dev_name(dev));
-			pm_dev_err(dev, state, " noirq", error);
+		if (!is_async(dev)) {
+			int error;
+
+			error = device_resume_noirq(dev, state);
+			if (error) {
+				suspend_stats.failed_resume_noirq++;
+				dpm_save_failed_step(SUSPEND_RESUME_NOIRQ);
+				dpm_save_failed_dev(dev_name(dev));
+				pm_dev_err(dev, state, " noirq", error);
+			}
 		}
 
 		mutex_lock(&dpm_list_mtx);
 		put_device(dev);
 	}
 	mutex_unlock(&dpm_list_mtx);
+	async_synchronize_full();
 	dpm_show_time(starttime, state, "noirq");
 	resume_device_irqs();
 	cpuidle_resume();
@@ -727,12 +762,6 @@ static void async_resume(void *data, async_cookie_t cookie)
 	put_device(dev);
 }
 
-static bool is_async(struct device *dev)
-{
-	return dev->power.async_suspend && pm_async_enabled
-		&& !pm_trace_is_enabled();
-}
-
 /**
  * dpm_resume - Execute "resume" callbacks for non-sysdev devices.
  * @state: PM transition of the system being carried out.
--
1.7.9.5
[PATCH 3/5] PM: Enabling the asynchronous threads for resume_early
From: "Liu, Chuansheng"

Just like commits 5af84b82701a and 97df8c12995, using asynchronous threads
can improve the overall resume_early time significantly.

This patch is for the resume_early phase.

Signed-off-by: Liu, Chuansheng
---
 drivers/base/power/main.c |   48 +++++++++++++++++++++++++++++++---------
 1 file changed, 38 insertions(+), 10 deletions(-)

diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
index c8a00fc..1bad6bd 100644
--- a/drivers/base/power/main.c
+++ b/drivers/base/power/main.c
@@ -619,35 +619,63 @@ static int device_resume_early(struct device *dev, pm_message_t state)
 	return error;
 }
 
+static void async_resume_early(void *data, async_cookie_t cookie)
+{
+	struct device *dev = (struct device *)data;
+	int error;
+
+	error = device_resume_early(dev, pm_transition);
+	if (error)
+		pm_dev_err(dev, pm_transition, " async", error);
+	put_device(dev);
+}
+
 /**
  * dpm_resume_early - Execute "early resume" callbacks for all devices.
  * @state: PM transition of the system being carried out.
  */
 static void dpm_resume_early(pm_message_t state)
 {
+	struct device *dev;
 	ktime_t starttime = ktime_get();
 
 	mutex_lock(&dpm_list_mtx);
-	while (!list_empty(&dpm_late_early_list)) {
-		struct device *dev = to_device(dpm_late_early_list.next);
-		int error;
+	pm_transition = state;
 
+	/*
+	 * Advanced the async threads upfront,
+	 * in case the starting of async threads is
+	 * delayed by non-async resuming devices.
+	 */
+	list_for_each_entry(dev, &dpm_late_early_list, power.entry) {
+		if (is_async(dev)) {
+			get_device(dev);
+			async_schedule(async_resume_early, dev);
+		}
+	}
+
+	while (!list_empty(&dpm_late_early_list)) {
+		dev = to_device(dpm_late_early_list.next);
 		get_device(dev);
 		list_move_tail(&dev->power.entry, &dpm_suspended_list);
 		mutex_unlock(&dpm_list_mtx);
 
-		error = device_resume_early(dev, state);
-		if (error) {
-			suspend_stats.failed_resume_early++;
-			dpm_save_failed_step(SUSPEND_RESUME_EARLY);
-			dpm_save_failed_dev(dev_name(dev));
-			pm_dev_err(dev, state, " early", error);
-		}
+		if (!is_async(dev)) {
+			int error;
 
+			error = device_resume_early(dev, state);
+			if (error) {
+				suspend_stats.failed_resume_early++;
+				dpm_save_failed_step(SUSPEND_RESUME_EARLY);
+				dpm_save_failed_dev(dev_name(dev));
+				pm_dev_err(dev, state, " early", error);
+			}
+		}
 		mutex_lock(&dpm_list_mtx);
 		put_device(dev);
 	}
 	mutex_unlock(&dpm_list_mtx);
+	async_synchronize_full();
 	dpm_show_time(starttime, state, "early");
 }
--
1.7.9.5
[PATCH 4/5] PM: Enabling the asynchronous threads for suspend_noirq
From: "Liu, Chuansheng"

Just like commits 5af84b82701a and 97df8c12995, using asynchronous threads
can improve the overall suspend_noirq time significantly.

This patch is for the suspend_noirq phase.

Signed-off-by: Liu, Chuansheng
---
 drivers/base/power/main.c |   57 ++++++++++++++++++++++++++++++---------
 1 file changed, 48 insertions(+), 9 deletions(-)

diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
index 1bad6bd..ec946aa 100644
--- a/drivers/base/power/main.c
+++ b/drivers/base/power/main.c
@@ -965,12 +965,20 @@ static pm_message_t resume_event(pm_message_t sleep_state)
  * The driver of @dev will not receive interrupts while this function is being
  * executed.
  */
-static int device_suspend_noirq(struct device *dev, pm_message_t state)
+static int __device_suspend_noirq(struct device *dev, pm_message_t state)
 {
 	pm_callback_t callback = NULL;
 	char *info = NULL;
 	int error;
 
+	if (async_error)
+		return 0;
+
+	if (pm_wakeup_pending()) {
+		async_error = -EBUSY;
+		return async_error;
+	}
+
 	if (dev->power.syscore)
 		return 0;
 
@@ -996,9 +1004,36 @@ static int device_suspend_noirq(struct device *dev, pm_message_t state)
 	error = dpm_run_callback(callback, dev, state, info);
 	if (!error)
 		dev->power.is_noirq_suspended = true;
+	else
+		async_error = error;
+
 	return error;
 }
 
+static void async_suspend_noirq(void *data, async_cookie_t cookie)
+{
+	struct device *dev = (struct device *)data;
+	int error;
+
+	error = __device_suspend_noirq(dev, pm_transition);
+	if (error) {
+		dpm_save_failed_dev(dev_name(dev));
+		pm_dev_err(dev, pm_transition, " async", error);
+	}
+
+	put_device(dev);
+}
+
+static int device_suspend_noirq(struct device *dev)
+{
+	if (pm_async_enabled && dev->power.async_suspend) {
+		get_device(dev);
+		async_schedule(async_suspend_noirq, dev);
+		return 0;
+	}
+	return __device_suspend_noirq(dev, pm_transition);
+}
+
 /**
  * dpm_suspend_noirq - Execute "noirq suspend" callbacks for all devices.
  * @state: PM transition of the system being carried out.
@@ -1014,19 +1049,20 @@ static int dpm_suspend_noirq(pm_message_t state)
 	cpuidle_pause();
 	suspend_device_irqs();
 	mutex_lock(&dpm_list_mtx);
+	pm_transition = state;
+	async_error = 0;
+
 	while (!list_empty(&dpm_late_early_list)) {
 		struct device *dev = to_device(dpm_late_early_list.prev);
 
 		get_device(dev);
 		mutex_unlock(&dpm_list_mtx);
 
-		error = device_suspend_noirq(dev, state);
+		error = device_suspend_noirq(dev);
 
 		mutex_lock(&dpm_list_mtx);
 		if (error) {
 			pm_dev_err(dev, state, " noirq", error);
-			suspend_stats.failed_suspend_noirq++;
-			dpm_save_failed_step(SUSPEND_SUSPEND_NOIRQ);
 			dpm_save_failed_dev(dev_name(dev));
 			put_device(dev);
 			break;
@@ -1035,15 +1071,18 @@ static int dpm_suspend_noirq(pm_message_t state)
 		list_move(&dev->power.entry, &dpm_noirq_list);
 		put_device(dev);
 
-		if (pm_wakeup_pending()) {
-			error = -EBUSY;
+		if (async_error)
 			break;
-		}
 	}
 	mutex_unlock(&dpm_list_mtx);
-	if (error)
+	async_synchronize_full();
+	if (!error)
+		error = async_error;
+	if (error) {
+		suspend_stats.failed_suspend_noirq++;
+		dpm_save_failed_step(SUSPEND_SUSPEND_NOIRQ);
 		dpm_resume_noirq(resume_event(state));
-	else
+	} else
 		dpm_show_time(starttime, state, "noirq");
 	return error;
 }
--
1.7.9.5
[PATCH 0/5] Enabling the asynchronous threads for other phases
Hello,

This patch series is for enabling the asynchronous threads for the phases
resume_noirq, resume_early, suspend_noirq and suspend_late.

Just like commits 5af84b82701a and 97df8c12995, with async threads it will
reduce the system suspending and resuming time significantly.

With these patches, on my test platform, it saved 80% of the time in the
resume_noirq phase.

I have run the suspend-resume stress test for a long time; please help to
review.

Best Regards,

[PATCH 1/5] PM: Adding two flags for async suspend_noirq and
[PATCH 2/5] PM: Enabling the asynchronous threads for resume_noirq
[PATCH 3/5] PM: Enabling the asynchronous threads for resume_early
[PATCH 4/5] PM: Enabling the asynchronous threads for suspend_noirq
[PATCH 5/5] PM: Enabling the asynchronous threads for suspend_late

 drivers/base/power/main.c |  240 +++++++++++++++++++++++++++++----------
 include/linux/pm.h        |    2 +
 2 files changed, 197 insertions(+), 45 deletions(-)
[PATCH 0/5] Enabling the asynchronous threads for other phases
Hello, This patch series are for enabling the asynchronous threads for the phases resume_noirq, resume_early, suspend_noirq and suspend_late. Just like commit 5af84b82701a and 97df8c12995, with async threads it will reduce the system suspending and resuming time significantly. With these patches, in my test platform, it saved 80% time in resume_noirq phase. Has done the suspend-resume stress test for a long time, please help to review. Best Regards, [PATCH 1/5] PM: Adding two flags for async suspend_noirq and [PATCH 2/5] PM: Enabling the asynchronous threads for resume_noirq [PATCH 3/5] PM: Enabling the asyncronous threads for resume_early [PATCH 4/5] PM: Enabling the asyncronous threads for suspend_noirq [PATCH 5/5] PM: Enabling the asyncronous threads for suspend_late drivers/base/power/main.c | 240 - include/linux/pm.h|2 + 2 files changed, 197 insertions(+), 45 deletions(-) -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 4/5] PM: Enabling the asyncronous threads for suspend_noirq
From: Liu, Chuansheng chuansheng@intel.com Just like commit 5af84b82701a and 97df8c12995, using the asynchronous threads can improve the overall suspend_noirq time significantly. This patch is for suspend_noirq phase. Signed-off-by: Liu, Chuansheng chuansheng@intel.com --- drivers/base/power/main.c | 57 ++--- 1 file changed, 48 insertions(+), 9 deletions(-) diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c index 1bad6bd..ec946aa 100644 --- a/drivers/base/power/main.c +++ b/drivers/base/power/main.c @@ -965,12 +965,20 @@ static pm_message_t resume_event(pm_message_t sleep_state) * The driver of @dev will not receive interrupts while this function is being * executed. */ -static int device_suspend_noirq(struct device *dev, pm_message_t state) +static int __device_suspend_noirq(struct device *dev, pm_message_t state) { pm_callback_t callback = NULL; char *info = NULL; int error; + if (async_error) + return 0; + + if (pm_wakeup_pending()) { + async_error = -EBUSY; + return async_error; + } + if (dev-power.syscore) return 0; @@ -996,9 +1004,36 @@ static int device_suspend_noirq(struct device *dev, pm_message_t state) error = dpm_run_callback(callback, dev, state, info); if (!error) dev-power.is_noirq_suspended = true; + else + async_error = error; + return error; } +static void async_suspend_noirq(void *data, async_cookie_t cookie) +{ + struct device *dev = (struct device *)data; + int error; + + error = __device_suspend_noirq(dev, pm_transition); + if (error) { + dpm_save_failed_dev(dev_name(dev)); + pm_dev_err(dev, pm_transition, async, error); + } + + put_device(dev); +} + +static int device_suspend_noirq(struct device *dev) +{ + if (pm_async_enabled dev-power.async_suspend) { + get_device(dev); + async_schedule(async_suspend_noirq, dev); + return 0; + } + return __device_suspend_noirq(dev, pm_transition); +} + /** * dpm_suspend_noirq - Execute noirq suspend callbacks for all devices. * @state: PM transition of the system being carried out. 
@@ -1014,19 +1049,20 @@ static int dpm_suspend_noirq(pm_message_t state) cpuidle_pause(); suspend_device_irqs(); mutex_lock(dpm_list_mtx); + pm_transition = state; + async_error = 0; + while (!list_empty(dpm_late_early_list)) { struct device *dev = to_device(dpm_late_early_list.prev); get_device(dev); mutex_unlock(dpm_list_mtx); - error = device_suspend_noirq(dev, state); + error = device_suspend_noirq(dev); mutex_lock(dpm_list_mtx); if (error) { pm_dev_err(dev, state, noirq, error); - suspend_stats.failed_suspend_noirq++; - dpm_save_failed_step(SUSPEND_SUSPEND_NOIRQ); dpm_save_failed_dev(dev_name(dev)); put_device(dev); break; @@ -1035,15 +1071,18 @@ static int dpm_suspend_noirq(pm_message_t state) list_move(dev-power.entry, dpm_noirq_list); put_device(dev); - if (pm_wakeup_pending()) { - error = -EBUSY; + if (async_error) break; - } } mutex_unlock(dpm_list_mtx); - if (error) + async_synchronize_full(); + if (!error) + error = async_error; + if (error) { + suspend_stats.failed_suspend_noirq++; + dpm_save_failed_step(SUSPEND_SUSPEND_NOIRQ); dpm_resume_noirq(resume_event(state)); - else + } else dpm_show_time(starttime, state, noirq); return error; } -- 1.7.9.5 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 1/5] PM: Adding two flags for async suspend_noirq and suspend_late
From: Liu, Chuansheng chuansheng@intel.com The patch is one helper that adding two new flags for implementing async threads for suspend_noirq and suspend_late. Signed-off-by: Liu, Chuansheng chuansheng@intel.com --- drivers/base/power/main.c | 22 -- include/linux/pm.h|2 ++ 2 files changed, 22 insertions(+), 2 deletions(-) diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c index 6a33dd8..bf71ddb 100644 --- a/drivers/base/power/main.c +++ b/drivers/base/power/main.c @@ -72,6 +72,8 @@ void device_pm_sleep_init(struct device *dev) { dev-power.is_prepared = false; dev-power.is_suspended = false; + dev-power.is_noirq_suspended = false; + dev-power.is_late_suspended = false; init_completion(dev-power.completion); complete_all(dev-power.completion); dev-power.wakeup = NULL; @@ -464,6 +466,9 @@ static int device_resume_noirq(struct device *dev, pm_message_t state) if (dev-power.syscore) goto Out; + if (!dev-power.is_noirq_suspended) + goto Out; + if (dev-pm_domain) { info = noirq power domain ; callback = pm_noirq_op(dev-pm_domain-ops, state); @@ -484,6 +489,7 @@ static int device_resume_noirq(struct device *dev, pm_message_t state) } error = dpm_run_callback(callback, dev, state, info); + dev-power.is_noirq_suspended = false; Out: TRACE_RESUME(error); @@ -546,6 +552,9 @@ static int device_resume_early(struct device *dev, pm_message_t state) if (dev-power.syscore) goto Out; + if (!dev-power.is_late_suspended) + goto Out; + if (dev-pm_domain) { info = early power domain ; callback = pm_late_early_op(dev-pm_domain-ops, state); @@ -566,6 +575,7 @@ static int device_resume_early(struct device *dev, pm_message_t state) } error = dpm_run_callback(callback, dev, state, info); + dev-power.is_late_suspended = false; Out: TRACE_RESUME(error); @@ -902,6 +912,7 @@ static int device_suspend_noirq(struct device *dev, pm_message_t state) { pm_callback_t callback = NULL; char *info = NULL; + int error; if (dev-power.syscore) return 0; @@ -925,7 +936,10 @@ static int 
device_suspend_noirq(struct device *dev, pm_message_t state) callback = pm_noirq_op(dev-driver-pm, state); } - return dpm_run_callback(callback, dev, state, info); + error = dpm_run_callback(callback, dev, state, info); + if (!error) + dev-power.is_noirq_suspended = true; + return error; } /** @@ -988,6 +1002,7 @@ static int device_suspend_late(struct device *dev, pm_message_t state) { pm_callback_t callback = NULL; char *info = NULL; + int error; __pm_runtime_disable(dev, false); @@ -1013,7 +1028,10 @@ static int device_suspend_late(struct device *dev, pm_message_t state) callback = pm_late_early_op(dev-driver-pm, state); } - return dpm_run_callback(callback, dev, state, info); + error = dpm_run_callback(callback, dev, state, info); + if (!error) + dev-power.is_late_suspended = true; + return error; } /** diff --git a/include/linux/pm.h b/include/linux/pm.h index a224c7f..fe90a8b 100644 --- a/include/linux/pm.h +++ b/include/linux/pm.h @@ -521,6 +521,8 @@ struct dev_pm_info { unsigned intasync_suspend:1; boolis_prepared:1; /* Owned by the PM core */ boolis_suspended:1; /* Ditto */ + boolis_noirq_suspended:1; + boolis_late_suspended:1; boolignore_children:1; boolearly_init:1; /* Owned by the PM core */ spinlock_t lock; -- 1.7.9.5 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 2/5] PM: Enabling the asynchronous threads for resume_noirq
From: Liu, Chuansheng chuansheng@intel.com Just like commit 5af84b82701a and 97df8c12995, using the asynchronous threads can improve the overall resume_noirq time significantly. One typical case is: In resume_noirq phase and for the PCI devices, the function pci_pm_resume_noirq() will be called, and there is one d3_delay (10ms) at least. With the way of asynchronous threads, we just need wait d3_delay time once in parallel for each calling, which save much time to resume quickly. Signed-off-by: Liu, Chuansheng chuansheng@intel.com --- drivers/base/power/main.c | 59 + 1 file changed, 44 insertions(+), 15 deletions(-) diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c index bf71ddb..c8a00fc 100644 --- a/drivers/base/power/main.c +++ b/drivers/base/power/main.c @@ -496,6 +496,23 @@ static int device_resume_noirq(struct device *dev, pm_message_t state) return error; } +static bool is_async(struct device *dev) +{ + return dev-power.async_suspend pm_async_enabled +!pm_trace_is_enabled(); +} + +static void async_resume_noirq(void *data, async_cookie_t cookie) +{ + struct device *dev = (struct device *)data; + int error; + + error = device_resume_noirq(dev, pm_transition); + if (error) + pm_dev_err(dev, pm_transition, async, error); + put_device(dev); +} + /** * dpm_resume_noirq - Execute noirq resume callbacks for all devices. * @state: PM transition of the system being carried out. @@ -505,29 +522,47 @@ static int device_resume_noirq(struct device *dev, pm_message_t state) */ static void dpm_resume_noirq(pm_message_t state) { + struct device *dev; ktime_t starttime = ktime_get(); mutex_lock(dpm_list_mtx); - while (!list_empty(dpm_noirq_list)) { - struct device *dev = to_device(dpm_noirq_list.next); - int error; + pm_transition = state; + /* +* Advanced the async threads upfront, +* in case the starting of async threads is +* delayed by non-async resuming devices. 
+*/ + list_for_each_entry(dev, dpm_noirq_list, power.entry) { + if (is_async(dev)) { + get_device(dev); + async_schedule(async_resume_noirq, dev); + } + } + + while (!list_empty(dpm_noirq_list)) { + dev = to_device(dpm_noirq_list.next); get_device(dev); list_move_tail(dev-power.entry, dpm_late_early_list); mutex_unlock(dpm_list_mtx); - error = device_resume_noirq(dev, state); - if (error) { - suspend_stats.failed_resume_noirq++; - dpm_save_failed_step(SUSPEND_RESUME_NOIRQ); - dpm_save_failed_dev(dev_name(dev)); - pm_dev_err(dev, state, noirq, error); + if (!is_async(dev)) { + int error; + + error = device_resume_noirq(dev, state); + if (error) { + suspend_stats.failed_resume_noirq++; + dpm_save_failed_step(SUSPEND_RESUME_NOIRQ); + dpm_save_failed_dev(dev_name(dev)); + pm_dev_err(dev, state, noirq, error); + } } mutex_lock(dpm_list_mtx); put_device(dev); } mutex_unlock(dpm_list_mtx); + async_synchronize_full(); dpm_show_time(starttime, state, noirq); resume_device_irqs(); cpuidle_resume(); @@ -727,12 +762,6 @@ static void async_resume(void *data, async_cookie_t cookie) put_device(dev); } -static bool is_async(struct device *dev) -{ - return dev-power.async_suspend pm_async_enabled -!pm_trace_is_enabled(); -} - /** * dpm_resume - Execute resume callbacks for non-sysdev devices. * @state: PM transition of the system being carried out. -- 1.7.9.5 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 3/5] PM: Enabling the asynchronous threads for resume_early
From: Liu, Chuansheng chuansheng@intel.com

Just like commits 5af84b82701a and 97df8c12995, using asynchronous
threads can improve the overall resume_early time significantly.
This patch is for the resume_early phase.

Signed-off-by: Liu, Chuansheng chuansheng@intel.com
---
 drivers/base/power/main.c | 48 +++++++++++++++++++++++++++++---------
 1 file changed, 38 insertions(+), 10 deletions(-)

diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
index c8a00fc..1bad6bd 100644
--- a/drivers/base/power/main.c
+++ b/drivers/base/power/main.c
@@ -619,35 +619,63 @@ static int device_resume_early(struct device *dev, pm_message_t state)
 	return error;
 }
 
+static void async_resume_early(void *data, async_cookie_t cookie)
+{
+	struct device *dev = (struct device *)data;
+	int error;
+
+	error = device_resume_early(dev, pm_transition);
+	if (error)
+		pm_dev_err(dev, pm_transition, " async", error);
+	put_device(dev);
+}
+
 /**
  * dpm_resume_early - Execute early resume callbacks for all devices.
  * @state: PM transition of the system being carried out.
  */
 static void dpm_resume_early(pm_message_t state)
 {
+	struct device *dev;
 	ktime_t starttime = ktime_get();
 
 	mutex_lock(&dpm_list_mtx);
-	while (!list_empty(&dpm_late_early_list)) {
-		struct device *dev = to_device(dpm_late_early_list.next);
-		int error;
+	pm_transition = state;
+
+	/*
+	 * Advance the async threads upfront,
+	 * in case the starting of async threads is
+	 * delayed by non-async resuming devices.
+	 */
+	list_for_each_entry(dev, &dpm_late_early_list, power.entry) {
+		if (is_async(dev)) {
+			get_device(dev);
+			async_schedule(async_resume_early, dev);
+		}
+	}
 
+	while (!list_empty(&dpm_late_early_list)) {
+		dev = to_device(dpm_late_early_list.next);
 		get_device(dev);
 		list_move_tail(&dev->power.entry, &dpm_suspended_list);
 		mutex_unlock(&dpm_list_mtx);
 
-		error = device_resume_early(dev, state);
-		if (error) {
-			suspend_stats.failed_resume_early++;
-			dpm_save_failed_step(SUSPEND_RESUME_EARLY);
-			dpm_save_failed_dev(dev_name(dev));
-			pm_dev_err(dev, state, " early", error);
-		}
+		if (!is_async(dev)) {
+			int error;
+
+			error = device_resume_early(dev, state);
+			if (error) {
+				suspend_stats.failed_resume_early++;
+				dpm_save_failed_step(SUSPEND_RESUME_EARLY);
+				dpm_save_failed_dev(dev_name(dev));
+				pm_dev_err(dev, state, " early", error);
+			}
+		}
 
 		mutex_lock(&dpm_list_mtx);
 		put_device(dev);
 	}
 	mutex_unlock(&dpm_list_mtx);
+	async_synchronize_full();
 	dpm_show_time(starttime, state, "early");
 }
-- 
1.7.9.5
[PATCH 5/5] PM: Enabling the asynchronous threads for suspend_late
From: Liu, Chuansheng chuansheng@intel.com

Just like commits 5af84b82701a and 97df8c12995, using asynchronous
threads can improve the overall suspend_late time significantly.
This patch is for the suspend_late phase.

Signed-off-by: Liu, Chuansheng chuansheng@intel.com
---
 drivers/base/power/main.c | 54 ++++++++++++++++++++++++++++++-------
 1 file changed, 45 insertions(+), 9 deletions(-)

diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
index ec946aa..19f5195 100644
--- a/drivers/base/power/main.c
+++ b/drivers/base/power/main.c
@@ -1094,7 +1094,7 @@ static int dpm_suspend_noirq(pm_message_t state)
  *
  * Runtime PM is disabled for @dev while this function is being executed.
  */
-static int device_suspend_late(struct device *dev, pm_message_t state)
+static int __device_suspend_late(struct device *dev, pm_message_t state)
 {
 	pm_callback_t callback = NULL;
 	char *info = NULL;
@@ -1102,6 +1102,14 @@ static int device_suspend_late(struct device *dev, pm_message_t state)
 
 	__pm_runtime_disable(dev, false);
 
+	if (async_error)
+		return 0;
+
+	if (pm_wakeup_pending()) {
+		async_error = -EBUSY;
+		return async_error;
+	}
+
 	if (dev->power.syscore)
 		return 0;
 
@@ -1127,9 +1135,35 @@ static int device_suspend_late(struct device *dev, pm_message_t state)
 	error = dpm_run_callback(callback, dev, state, info);
 	if (!error)
 		dev->power.is_late_suspended = true;
+	else
+		async_error = error;
 
 	return error;
 }
 
+static void async_suspend_late(void *data, async_cookie_t cookie)
+{
+	struct device *dev = (struct device *)data;
+	int error;
+
+	error = __device_suspend_late(dev, pm_transition);
+	if (error) {
+		dpm_save_failed_dev(dev_name(dev));
+		pm_dev_err(dev, pm_transition, " async", error);
+	}
+	put_device(dev);
+}
+
+static int device_suspend_late(struct device *dev)
+{
+	if (pm_async_enabled && dev->power.async_suspend) {
+		get_device(dev);
+		async_schedule(async_suspend_late, dev);
+		return 0;
+	}
+
+	return __device_suspend_late(dev, pm_transition);
+}
+
 /**
  * dpm_suspend_late - Execute late suspend callbacks for all devices.
  * @state: PM transition of the system being carried out.
@@ -1140,19 +1174,20 @@ static int dpm_suspend_late(pm_message_t state)
 	int error = 0;
 
 	mutex_lock(&dpm_list_mtx);
+	pm_transition = state;
+	async_error = 0;
+
 	while (!list_empty(&dpm_suspended_list)) {
 		struct device *dev = to_device(dpm_suspended_list.prev);
 
 		get_device(dev);
 		mutex_unlock(&dpm_list_mtx);
 
-		error = device_suspend_late(dev, state);
+		error = device_suspend_late(dev);
 
 		mutex_lock(&dpm_list_mtx);
 		if (error) {
 			pm_dev_err(dev, state, " late", error);
-			suspend_stats.failed_suspend_late++;
-			dpm_save_failed_step(SUSPEND_SUSPEND_LATE);
 			dpm_save_failed_dev(dev_name(dev));
 			put_device(dev);
 			break;
@@ -1161,15 +1196,16 @@ static int dpm_suspend_late(pm_message_t state)
 		list_move(&dev->power.entry, &dpm_late_early_list);
 		put_device(dev);
 
-		if (pm_wakeup_pending()) {
-			error = -EBUSY;
+		if (async_error)
 			break;
-		}
 	}
 	mutex_unlock(&dpm_list_mtx);
+	async_synchronize_full();
-	if (error)
+	if (error) {
+		suspend_stats.failed_suspend_late++;
+		dpm_save_failed_step(SUSPEND_SUSPEND_LATE);
 		dpm_resume_early(resume_event(state));
-	else
+	} else
 		dpm_show_time(starttime, state, "late");
 
 	return error;
-- 
1.7.9.5
RE: [PATCH] PM: Enable asynchronous noirq resume threads to save the resuming time
> -Original Message- > From: Rafael J. Wysocki [mailto:r...@rjwysocki.net] > Sent: Wednesday, January 15, 2014 9:28 PM > To: Liu, Chuansheng > Cc: gre...@linuxfoundation.org; Brown, Len; pa...@ucw.cz; > linux...@vger.kernel.org; linux-kernel@vger.kernel.org; Li, Zhuangzhi > Subject: Re: [PATCH] PM: Enable asynchronous noirq resume threads to save > the resuming time > > On Wednesday, January 15, 2014 12:35:16 AM Liu, Chuansheng wrote: > > Hello Rafael, > > > > > -Original Message- > > > From: Rafael J. Wysocki [mailto:r...@rjwysocki.net] > > > Sent: Wednesday, January 15, 2014 6:55 AM > > > To: Liu, Chuansheng > > > Cc: gre...@linuxfoundation.org; Brown, Len; pa...@ucw.cz; > > > linux...@vger.kernel.org; linux-kernel@vger.kernel.org; Li, Zhuangzhi > > > Subject: Re: [PATCH] PM: Enable asynchronous noirq resume threads to > save > > > the resuming time > > > > > > On Tuesday, January 14, 2014 03:18:08 PM Chuansheng Liu wrote: > > > > > > > > Currently, the dpm_resume_noirq() is done synchronously, and for PCI > devices > > > > pci_pm_resume_noirq(): > > > > > > > > pci_pm_resume_noirq() > > > > pci_pm_default_resume_early() > > > > pci_power_up() > > > >pci_raw_set_power_state() > > > > Which set the device from D3hot to D0 mostly, for every device, there > > > > will > > > > be one 10ms(pci_pm_d3_delay) to wait. > > > > > > > > Hence normally dpm_resume_noirq() will cost > 100ms, which is bigger > for > > > mobile > > > > platform. > > > > > > pci_pm_d3_delay usually is not 10 ms. What is 10 ms is dev->d3_delay, > but > > > that also is not on every platform now. > > Yes, it is d3_delay exactly, thanks your correction. > > > > > > > > That said the approach here makes sense, but I have two questions: > > > > > > 1. On what systems has it been tested? 
> > I have tested on the Baytrail platform. At the beginning the d3_delay
> > was set to 0, but sometimes we really hit a HANG issue if we operated
> > the device immediately, so currently most PCI devices on Baytrail have
> > the 10ms delay. That made the whole resume time bigger than on the
> > Clovertrail platform.
>
> Does setting d3_delay to something smaller than 10 ms work on BayTrail?
>
> You can try to set 5 ms and then 2.5 ms if that works, and so on. That
> still would be an improvement over the 10 ms.

Yes, for most devices on the Baytrail platform we already set the default
value to 3ms and it works well now, but for a rare device we set the
d3_delay to 10ms. So with this patch, the noirq execution time is ~10ms.

> > > 2. What about suspend_late/resume_early? If the _noirq things are to
> > > be executed asynchronously, those should be too.
> >
> > I noticed they often take only a little time, and I could not find a
> > reason in theory for changing them, so if you like, I will update them
> > in patch v2 as well :)
>
> Yes, please do that.

Will do it, thanks.
RE: [PATCH] PM: Enable asynchronous noirq resume threads to save the resuming time
Hello Rafael,

> -Original Message-
> From: Rafael J. Wysocki [mailto:r...@rjwysocki.net]
> Sent: Wednesday, January 15, 2014 6:55 AM
> To: Liu, Chuansheng
> Cc: gre...@linuxfoundation.org; Brown, Len; pa...@ucw.cz;
> linux...@vger.kernel.org; linux-kernel@vger.kernel.org; Li, Zhuangzhi
> Subject: Re: [PATCH] PM: Enable asynchronous noirq resume threads to save
> the resuming time
>
> On Tuesday, January 14, 2014 03:18:08 PM Chuansheng Liu wrote:
> >
> > Currently, dpm_resume_noirq() is done synchronously, and for PCI
> > devices pci_pm_resume_noirq():
> >
> > pci_pm_resume_noirq()
> >   pci_pm_default_resume_early()
> >     pci_power_up()
> >       pci_raw_set_power_state()
> >
> > which mostly sets the device from D3hot to D0; for every device, there
> > will be at least one 10ms (pci_pm_d3_delay) wait.
> >
> > Hence dpm_resume_noirq() will normally cost > 100ms, which is a lot
> > for a mobile platform.
>
> pci_pm_d3_delay usually is not 10 ms. What is 10 ms is dev->d3_delay,
> but that also is not on every platform now.

Yes, it is d3_delay exactly, thanks for your correction.

> That said, the approach here makes sense, but I have two questions:
>
> 1. On what systems has it been tested?

I have tested on the Baytrail platform. At the beginning the d3_delay was
set to 0, but sometimes we really hit a HANG issue if we operated the
device immediately, so currently most PCI devices on Baytrail have the
10ms delay. That made the whole resume time bigger than on the
Clovertrail platform.

> 2. What about suspend_late/resume_early? If the _noirq things are to be
> executed asynchronously, those should be too.

I noticed they often take only a little time, and I could not find a
reason in theory for changing them, so if you like, I will update them in
patch v2 as well :)
RE: [PATCH 2/3] xfs: Calling destroy_work_on_stack() to pair with INIT_WORK_ONSTACK()
Hello Jeff,

> -Original Message-
> From: Jeff Liu [mailto:jeff@oracle.com]
> Sent: Tuesday, January 07, 2014 7:21 PM
> To: Liu, Chuansheng; dchin...@fromorbit.com; b...@sgi.com
> Cc: linux-kernel@vger.kernel.org; x...@oss.sgi.com
> Subject: Re: [PATCH 2/3] xfs: Calling destroy_work_on_stack() to pair with
> INIT_WORK_ONSTACK()
>
> Hi Chuansheng,
>
> On 01/07 2014 16:53 PM, Chuansheng Liu wrote:
> >
> > In case CONFIG_DEBUG_OBJECTS_WORK is defined, it is needed to
> > call destroy_work_on_stack() which frees the debug object to pair
> > with INIT_WORK_ONSTACK().
> >
> > Signed-off-by: Liu, Chuansheng
> > ---
> Thanks for your patch, and it works fine in my testing. I missed this in
> an old commit: [ 3b876c8f2a xfs: fix debug_object WARN at xfs_alloc_vextent() ]

Thanks for your testing.

> Just out of curiosity, did you notice memory leaks or other hints that
> helped you find this problem?

I was trying to use INIT_WORK_ONSTACK() in my own code, and I found that
in some places the call to destroy_work_on_stack() is missing.
RE: [fuse-devel] fuse try to freeze with lock held
Hello Miklos,

> -Original Message-
> From: Miklos Szeredi [mailto:mik...@szeredi.hu]
> Sent: Wednesday, December 11, 2013 11:35 PM
> To: Liu, Chuansheng
> Cc: mszer...@suse.cz; fuse-de...@lists.sourceforge.net;
> linux-kernel@vger.kernel.org
> Subject: Re: [fuse-devel] fuse try to freeze with lock held
>
> On Wed, Nov 27, 2013 at 3:19 AM, Liu, Chuansheng wrote:
> > Hit the below warning when enabling CONFIG_DEBUG_LOCK_ALLOC:
> >
> > __fuse_request_send ==> request_wait_answer ==> wait_event_freezable ==> try_to_freeze()
> >
> > But here the inode->i_mutex is still held in vfs_readdir().
> >
> > Could anyone give some help? Thanks.
>
> Mainline doesn't have wait_event_freezable() in fuse. What kernel
> variant is this?

Thanks for pointing that out; it is my fault since it is not mainline.
fuse try to freeze with lock held
Hit the below warning when enabling CONFIG_DEBUG_LOCK_ALLOC:

__fuse_request_send ==> request_wait_answer ==> wait_event_freezable ==> try_to_freeze()

But here the inode->i_mutex is still held in vfs_readdir().

Could anyone give some help? Thanks.

[ 3148.552435] Freezing user space processes ...
[ 3148.557781]
[ 3148.559505] =====================================
[ 3148.564803] [ BUG: du/21716 still has locks held! ]
[ 3148.570319] 3.10.16-261493-g1308ae2 #8 Tainted: G        W  O
[ 3148.586082] #0:  (&type->i_mutex_dir_key#3){..}, at: [] vfs_readdir+0x66/0xd0
[ 3148.595967]
[ 3148.595967] stack backtrace:
[ 3148.600875] CPU: 2 PID: 21716 Comm: du Tainted: G        W  O 3.10.16-261493-g1308ae2 #8
[ 3148.609973] d7e494e0 d7e494e0 cca6be78 c1b0fccb cca6be90 c12913f1 c1db2cc5 d7e497e4
[ 3148.618872] 54d4 cca6beac cca6becc c1440d8b d7e494e0 f2557800 f12acedc
[ 3148.627747] d7e494e0 c125f100 cca6beb8 cca6beb8 f12ace40 f2557800
[ 3148.636614] Call Trace:
[ 3148.639395] [] dump_stack+0x16/0x18
[ 3148.644375] [] debug_check_no_locks_held+0x91/0xa0
[ 3148.650862] [] __fuse_request_send+0x14b/0x320
[ 3148.656961] [] ? wake_up_bit+0x30/0x30
[ 3148.662242] [] fuse_request_send+0x11/0x20
[ 3148.667941] [] fuse_readdir+0x109/0x690
[ 3148.673327] [] ? vfs_readdir+0x66/0xd0
[ 3148.678636] [] ? sub_preempt_count+0x45/0x60
[ 3148.684503] [] ? mutex_lock_killable_nested+0x214/0x300
[ 3148.691457] [] ? SyS_ioctl+0x90/0x90
[ 3148.696553] [] ? vfs_readdir+0x66/0xd0
[ 3148.701857] [] ? SyS_ioctl+0x90/0x90