RE: [PATCH V2] output the cpu number when printking.
Thanks, Greg. I did use this patch to fix many races on SMP. But to respect maintainer, I stop pushing the patch to upstream. -Original Message- From: Greg KH [mailto:gre...@linuxfoundation.org] Sent: Thursday, December 27, 2012 1:50 AM To: Yanmin Zhang Cc: He, Bo; Randy Dunlap; a...@linux-foundation.org; mi...@elte.hu; linux-kernel@vger.kernel.org; a.p.zijls...@chello.nl Subject: Re: [PATCH V2] output the cpu number when printking. On Tue, Dec 25, 2012 at 09:09:05AM +0800, Yanmin Zhang wrote: > On Mon, 2012-12-24 at 09:55 -0800, Greg KH wrote: > > On Mon, Dec 24, 2012 at 01:01:55PM +0800, he, bo wrote: > > > From: "he, bo" > > > > > > We often hit kernel panic issues on SMP machines because processes > > > race on multiple cpu. By adding a new parameter printk.cpu, kernel > > > prints cpu number at printk information line. It’s useful to debug > > > what cpus are racing. > > > > How useful is this really for normal developers? > It's very useful to debug race conditions under SMP environment. > We applied the patch to our Android build image on our smartphones. That's fine for your application, and seemed to be useful to others with their first interactions with SMP systems. However, once you start to get to "real" numbers of CPUs, this information turns pretty pointless, which is why the patch was originally rejected. sorry, greg k-h
Re: [PATCH V2] output the cpu number when printking.
On Tue, Dec 25, 2012 at 09:09:05AM +0800, Yanmin Zhang wrote: > On Mon, 2012-12-24 at 09:55 -0800, Greg KH wrote: > > On Mon, Dec 24, 2012 at 01:01:55PM +0800, he, bo wrote: > > > From: "he, bo" > > > > > > We often hit kernel panic issues on SMP machines because processes race > > > on multiple cpu. By adding a new parameter printk.cpu, kernel prints > > > cpu number at printk information line. It’s useful to debug what cpus > > > are racing. > > > > How useful is this really for normal developers? > It's very useful to debug race conditions under SMP environment. > We applied the patch to our Android build image on our smartphones. That's fine for your application, and seemed to be useful to others with their first interactions with SMP systems. However, once you start to get to "real" numbers of CPUs, this information turns pretty pointless, which is why the patch was originally rejected. sorry, greg k-h -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH V2] output the cpu number when printking.
On Tue, Dec 25, 2012 at 09:09:05AM +0800, Yanmin Zhang wrote: On Mon, 2012-12-24 at 09:55 -0800, Greg KH wrote: On Mon, Dec 24, 2012 at 01:01:55PM +0800, he, bo wrote: From: he, bo bo...@intel.com We often hit kernel panic issues on SMP machines because processes race on multiple cpu. By adding a new parameter printk.cpu, kernel prints cpu number at printk information line. It’s useful to debug what cpus are racing. How useful is this really for normal developers? It's very useful to debug race conditions under SMP environment. We applied the patch to our Android build image on our smartphones. That's fine for your application, and seemed to be useful to others with their first interactions with SMP systems. However, once you start to get to real numbers of CPUs, this information turns pretty pointless, which is why the patch was originally rejected. sorry, greg k-h -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [PATCH V2] output the cpu number when printking.
Thanks, Greg. I did use this patch to fix many races on SMP. But to respect maintainer, I stop pushing the patch to upstream. -Original Message- From: Greg KH [mailto:gre...@linuxfoundation.org] Sent: Thursday, December 27, 2012 1:50 AM To: Yanmin Zhang Cc: He, Bo; Randy Dunlap; a...@linux-foundation.org; mi...@elte.hu; linux-kernel@vger.kernel.org; a.p.zijls...@chello.nl Subject: Re: [PATCH V2] output the cpu number when printking. On Tue, Dec 25, 2012 at 09:09:05AM +0800, Yanmin Zhang wrote: On Mon, 2012-12-24 at 09:55 -0800, Greg KH wrote: On Mon, Dec 24, 2012 at 01:01:55PM +0800, he, bo wrote: From: he, bo bo...@intel.com We often hit kernel panic issues on SMP machines because processes race on multiple cpu. By adding a new parameter printk.cpu, kernel prints cpu number at printk information line. It’s useful to debug what cpus are racing. How useful is this really for normal developers? It's very useful to debug race conditions under SMP environment. We applied the patch to our Android build image on our smartphones. That's fine for your application, and seemed to be useful to others with their first interactions with SMP systems. However, once you start to get to real numbers of CPUs, this information turns pretty pointless, which is why the patch was originally rejected. sorry, greg k-h
Re: [PATCH V2] output the cpu number when printking.
On Mon, 2012-12-24 at 09:55 -0800, Greg KH wrote: > On Mon, Dec 24, 2012 at 01:01:55PM +0800, he, bo wrote: > > From: "he, bo" > > > > We often hit kernel panic issues on SMP machines because processes race > > on multiple cpu. By adding a new parameter printk.cpu, kernel prints > > cpu number at printk information line. It’s useful to debug what cpus > > are racing. > > How useful is this really for normal developers? It's very useful to debug race conditions under SMP environment. We applied the patch to our Android build image on our smartphones. > We complained when > this option was proposed by the ARM developers who were, for the first > time, handling more than one processor and the issues involved with > that. You are enabling this as a default option, for all developers, > and almost no one will ever need it. At least, I am a developer to use it every day. :). I need debug various weird issues and tried many debug tools. Anyway, we would change the default to _disable_. > > So, if you really want this, don't enable this by default. Also, go > back and read the old thread about this option and why it was rejeted > previously. Thanks for your kind reminder. I googled it. Is the link like https://lkml.org/lkml/2012/8/3/121? Or http://lkml.indiana.edu/hypermail/linux/kernel/1208.0/02097.html? Above links raise a good question that we might be able to use ftrace (trace_printk) instead of to extend printk. 1) There are indeed lots of debug methods/tools in kernel because of many excellent developers' contributions. However, the most frequently-used tool is still printk. 2) ftrace (I use it often) is good when system mostly doesn't crash. If system crashes, we couldn't get ftrace info sometimes. When we debug some hard issues on a multi-core device, the system often crashes when Graphics driver accesses registers incorrectly. There is no chance for kernel to dump ftrace or panic info. With printk, we find that's because 2 cpu are racing. One cpu turns off Graphics while another cpu is accessing it. Of course, we also applied the Android persistent ram patch, so after rebooting, we still could see the printk buffer of previous booting. 3) Current drivers usually have many dev_dbg statements. We could enable dynamic debug control to get such info quickly. With the printk cpu number patch, we quickly enhance them to dump info about potential races among cpu. With ftrace, we need change all these dev_dbg to trace_printk and that's not the original objective of driver developers. Mostly, the patch is to reuse all current printk statements in kernel with a little overhead. Thanks for your kind comments. Yanmin -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH V2] output the cpu number when printking.
On Mon, Dec 24, 2012 at 01:01:55PM +0800, he, bo wrote: > From: "he, bo" > > We often hit kernel panic issues on SMP machines because processes race > on multiple cpu. By adding a new parameter printk.cpu, kernel prints > cpu number at printk information line. It’s useful to debug what cpus > are racing. How useful is this really for normal developers? We complained when this option was proposed by the ARM developers who were, for the first time, handling more than one processor and the issues involved with that. You are enabling this as a default option, for all developers, and almost no one will ever need it. So, if you really want this, don't enable this by default. Also, go back and read the old thread about this option and why it was rejeted previously. thanks, greg k-h -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH V2] output the cpu number when printking.
On Mon, Dec 24, 2012 at 01:01:55PM +0800, he, bo wrote: From: he, bo bo...@intel.com We often hit kernel panic issues on SMP machines because processes race on multiple cpu. By adding a new parameter printk.cpu, kernel prints cpu number at printk information line. It’s useful to debug what cpus are racing. How useful is this really for normal developers? We complained when this option was proposed by the ARM developers who were, for the first time, handling more than one processor and the issues involved with that. You are enabling this as a default option, for all developers, and almost no one will ever need it. So, if you really want this, don't enable this by default. Also, go back and read the old thread about this option and why it was rejeted previously. thanks, greg k-h -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH V2] output the cpu number when printking.
On Mon, 2012-12-24 at 09:55 -0800, Greg KH wrote: On Mon, Dec 24, 2012 at 01:01:55PM +0800, he, bo wrote: From: he, bo bo...@intel.com We often hit kernel panic issues on SMP machines because processes race on multiple cpu. By adding a new parameter printk.cpu, kernel prints cpu number at printk information line. It’s useful to debug what cpus are racing. How useful is this really for normal developers? It's very useful to debug race conditions under SMP environment. We applied the patch to our Android build image on our smartphones. We complained when this option was proposed by the ARM developers who were, for the first time, handling more than one processor and the issues involved with that. You are enabling this as a default option, for all developers, and almost no one will ever need it. At least, I am a developer to use it every day. :). I need debug various weird issues and tried many debug tools. Anyway, we would change the default to _disable_. So, if you really want this, don't enable this by default. Also, go back and read the old thread about this option and why it was rejeted previously. Thanks for your kind reminder. I googled it. Is the link like https://lkml.org/lkml/2012/8/3/121? Or http://lkml.indiana.edu/hypermail/linux/kernel/1208.0/02097.html? Above links raise a good question that we might be able to use ftrace (trace_printk) instead of to extend printk. 1) There are indeed lots of debug methods/tools in kernel because of many excellent developers' contributions. However, the most frequently-used tool is still printk. 2) ftrace (I use it often) is good when system mostly doesn't crash. If system crashes, we couldn't get ftrace info sometimes. When we debug some hard issues on a multi-core device, the system often crashes when Graphics driver accesses registers incorrectly. There is no chance for kernel to dump ftrace or panic info. With printk, we find that's because 2 cpu are racing. One cpu turns off Graphics while another cpu is accessing it. Of course, we also applied the Android persistent ram patch, so after rebooting, we still could see the printk buffer of previous booting. 3) Current drivers usually have many dev_dbg statements. We could enable dynamic debug control to get such info quickly. With the printk cpu number patch, we quickly enhance them to dump info about potential races among cpu. With ftrace, we need change all these dev_dbg to trace_printk and that's not the original objective of driver developers. Mostly, the patch is to reuse all current printk statements in kernel with a little overhead. Thanks for your kind comments. Yanmin -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH V2] output the cpu number when printking.
From: "he, bo" We often hit kernel panic issues on SMP machines because processes race on multiple cpu. By adding a new parameter printk.cpu, kernel prints cpu number at printk information line. It’s useful to debug what cpus are racing. Signed-off-by: he, bo --- Documentation/kernel-parameters.txt |4 kernel/printk.c | 18 -- 2 files changed, 20 insertions(+), 2 deletions(-) diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index ddd84d6..3387a7f 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -2378,6 +2378,10 @@ bytes respectively. Such letter suffixes can also be entirely omitted. printk.time=Show timing data prefixed to each printk message line Format: (1/Y/y=enable, 0/N/n=disable) + printk.cpu= Show cpu data prefixed to each printk message line + Format: (1/Y/y=enable, 0/N/n=disable) + printk.cpu takes effect only when printk.time=y. + processor.max_cstate= [HW,ACPI] Limit processor to maximum C-state max_cstate=9 overrides any DMI blacklist limit. diff --git a/kernel/printk.c b/kernel/printk.c index 19c0d7b..873a226 100644 --- a/kernel/printk.c +++ b/kernel/printk.c @@ -863,9 +863,14 @@ static bool printk_time; #endif module_param_named(time, printk_time, bool, S_IRUGO | S_IWUSR); +static bool printk_cpu = 1; +module_param_named(cpu, printk_cpu, bool, S_IRUGO | S_IWUSR); + static size_t print_time(u64 ts, char *buf) { unsigned long rem_nsec; + size_t len = 0; + int this_cpu; if (!printk_time) return 0; @@ -874,8 +879,17 @@ static size_t print_time(u64 ts, char *buf) return 15; rem_nsec = do_div(ts, 10); - return sprintf(buf, "[%5lu.%06lu] ", - (unsigned long)ts, rem_nsec / 1000); + + if (printk_cpu) { + this_cpu = raw_smp_processor_id(); + len = sprintf(buf, "[%5lu.%06lu,%u] ", + (unsigned long)ts, rem_nsec / 1000, this_cpu); + } else { + len = sprintf(buf, "[%5lu.%06lu] ", + (unsigned long)ts, rem_nsec / 1000); + } + + return len; } static size_t print_prefix(const struct log *msg, bool syslog, char *buf) -- 1.7.6 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH V2] output the cpu number when printking.
From: he, bo bo...@intel.com We often hit kernel panic issues on SMP machines because processes race on multiple cpu. By adding a new parameter printk.cpu, kernel prints cpu number at printk information line. It’s useful to debug what cpus are racing. Signed-off-by: he, bo bo...@intel.com --- Documentation/kernel-parameters.txt |4 kernel/printk.c | 18 -- 2 files changed, 20 insertions(+), 2 deletions(-) diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index ddd84d6..3387a7f 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -2378,6 +2378,10 @@ bytes respectively. Such letter suffixes can also be entirely omitted. printk.time=Show timing data prefixed to each printk message line Format: bool (1/Y/y=enable, 0/N/n=disable) + printk.cpu= Show cpu data prefixed to each printk message line + Format: bool (1/Y/y=enable, 0/N/n=disable) + printk.cpu takes effect only when printk.time=y. + processor.max_cstate= [HW,ACPI] Limit processor to maximum C-state max_cstate=9 overrides any DMI blacklist limit. diff --git a/kernel/printk.c b/kernel/printk.c index 19c0d7b..873a226 100644 --- a/kernel/printk.c +++ b/kernel/printk.c @@ -863,9 +863,14 @@ static bool printk_time; #endif module_param_named(time, printk_time, bool, S_IRUGO | S_IWUSR); +static bool printk_cpu = 1; +module_param_named(cpu, printk_cpu, bool, S_IRUGO | S_IWUSR); + static size_t print_time(u64 ts, char *buf) { unsigned long rem_nsec; + size_t len = 0; + int this_cpu; if (!printk_time) return 0; @@ -874,8 +879,17 @@ static size_t print_time(u64 ts, char *buf) return 15; rem_nsec = do_div(ts, 10); - return sprintf(buf, [%5lu.%06lu] , - (unsigned long)ts, rem_nsec / 1000); + + if (printk_cpu) { + this_cpu = raw_smp_processor_id(); + len = sprintf(buf, [%5lu.%06lu,%u] , + (unsigned long)ts, rem_nsec / 1000, this_cpu); + } else { + len = sprintf(buf, [%5lu.%06lu] , + (unsigned long)ts, rem_nsec / 1000); + } + + return len; } static size_t print_prefix(const struct log *msg, bool syslog, char *buf) -- 1.7.6 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/