RE: [RFC PATCH 22/22] x86/fpu/xstate: Introduce boot-parameters for control some state component support
> From: Andy Lutomirski > On Thu, Oct 1, 2020 at 1:43 PM Chang S. Bae wrote: > > "xstate.disable=0x6000" will disable AMX on a system that has AMX > > compiled into XFEATURE_MASK_USER_SUPPORTED. > Can we please use words for this? Perhaps: > xstate.disable=amx,zmm Yes, I think it is reasonable to add support for keywords for the features that are both supported by the hardware and known by the kernel. However, we need to continue to support numerical state-component bitmap format. Otherwise, it would not be possible to coerce a legacy kernel (eg. a distro kernel) to enable a feature on a new machine until it has been updated to know that keyword. > and maybe add a list in /proc/cpuinfo or somewhere in /proc or /sys that > shows all the currently enabled xsave states. Agreed, if we invent keywords, the list of supported+known keywords should be available on the system. I do not advocate /proc/cpuinfo -- which is already an out of control parsing mess. We could add the keywords to dmesg, since we already print the supported XSAVE BV: [0.00] x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers' [0.00] x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers' [0.00] x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers' [0.00] x86/fpu: Supporting XSAVE feature 0x008: 'MPX bounds registers' [0.00] x86/fpu: Supporting XSAVE feature 0x010: 'MPX CSR' Or maybe a sysfs attribute or a modparam that simply lists them all. We wouldn't be able to dynamically _write_ to that attribute, since the cmdline is boot-time only. > > "xstate.enable=0x6000" will enable AMX on a system that does NOT have > > AMX compiled into XFEATURE_MASK_USER_SUPPORTED (assuming the kernel is > > new enough to support this feature). > > This sounds like it will be quite confusing to anyone reading the kernel code > to discover that a feature that is not "SUPPORTED" is nonetheless enabled. Right now, this cmdline will only allow a new kernel to enable/disable kernel support for AMX on hardware that also supports AMX. But we hope to re-use this XSTATE code -- unchanged -- for future features that require just this simple state management support from the kernel. We anticipate a future hardware enumeration mechanism to identify such qualified features to assist the kernel in deciding whether to support a feature or not, by default. The kernel can use the combination of its build-time config with advice from this boot-time enumeration to decide if it wants to enable a particular feature or not. And at the end, a user is empowered to override that default using this cmdline. Thanks, -Len
RE: [RFC PATCH 22/22] x86/fpu/xstate: Introduce boot-parameters for control some state component support
> From: Randy Dunlap > What do these bitmasks look like? what do the bits mean? > Where does a user find this info? The XSAVE state component bitmaps are detailed in the Intel Software Developer's Manual, volume 1, Chapter 13: "Managing State using the XSAVE Feature Set". http://intel.com/sdm In the kernel source, they are enumerated in xstate.c and you can observe them in dmesg: [0.00] x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers' [0.00] x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers' [0.00] x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers' [0.00] x86/fpu: Supporting XSAVE feature 0x008: 'MPX bounds registers' [0.00] x86/fpu: Supporting XSAVE feature 0x010: 'MPX CSR' Thanks, -Len
RE: [RFC PATCH 13/22] x86/fpu/xstate: Expand dynamic user state area on first use
> From: Andy Lutomirski > > diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c > > @@ -518,3 +518,40 @@ int fpu__exception_code(struct fpu *fpu, int trap_nr) .. > > +bool xfirstuse_event_handler(struct fpu *fpu) > > +{ > > + bool handled = false; > > + u64 event_mask; > > + > > + /* Check whether the first-use detection is running. */ > > + if (!static_cpu_has(X86_FEATURE_XFD) || !xfirstuse_enabled()) > > + return handled; > > + > MSR_IA32_XFD_ERR needs to be wired up in the exception handler, not in > some helper called farther down the stack xfirstuse_event_handler() is called directly from the IDTENTRY exc_device_not_available(): > > @@ -1028,6 +1028,9 @@ DEFINE_IDTENTRY(exc_device_not_available) > > { > > unsigned long cr0 = read_cr0(); > > > > + if (xfirstuse_event_handler(>thread.fpu)) > > + return; Are you suggesting we should instead open code it inside that routine? > But this raises an interesting point -- what happens if allocation > fails? I think that, from kernel code, we simply cannot support this > exception mechanism. If kernel code wants to use AMX (and that would > be very strange indeed), it should call x86_i_am_crazy_amx_begin() and > handle errors, not rely on exceptions. From user code, I assume we > send a signal if allocation fails. The XFD feature allows us to transparently expand the kernel context switch buffer for a user task, when that task first touches this associated hardware. It allows applications to operate as if the kernel had allocated the backing AMX context switch buffer at initialization time. However, since we expect only a sub-set of tasks to actually use AMX, we instead defer allocation until we actually see first use for that task, rather than allocating for all tasks. While we currently don't propose AMX use inside the kernel, it is conceivable that could be done in the future, just like AVX is used by the RAID code; and it would be done the same way -- kernel_fpu_begin()/kernel_fpu_end(). Such future Kernel AMX use would _not_ arm XFD, and would _not_ provoke this fault. (note that we context switch the XFD-armed state per task) vmalloc() does not fail, and does not return an error, and so there is no concept of returning a signal. If we got to the point where vmalloc() sleeps, then the system has bigger OOM issues, and the OOM killer would be on the prowl. If we were concerned about using vmalloc for a couple of pages in the task structure, Then we could implement a routine to harvest unused buffers and free them -- but that didn't seem worth the complexity. Note that this feature is 64-bit only. Thanks, -Len
RE: [RFC PATCH 07/22] x86/fpu/xstate: Introduce helpers to manage an xstate area dynamically
> > From: Andy Lutomirski > > > + /* > > +* The caller may be under interrupt disabled condition. Ensure > > interrupt > > +* allowance before the memory allocation, which may involve with > > page faults. > > +*/ > > + local_irq_enable(); > ... you can't just enable IRQs here. If IRQs are off, they're off for a > reason. Secondly, if they're *on*, you just forgot that fact. Good catch. This is a trap handler from user-space and should never be called with irqs disabled, So the local_irq_enable() should be dead code. Chang suggested that he erroneously left it in from a previous implementation. > > + /* We need 64B aligned pointer, but vmalloc() returns a > > page-aligned address */ > > + state_ptr = vmalloc(newsz); > And allocating this state from vmalloc() seems questionable. Why are you > doing this? While the buffer needs to be virtually contiguous, it doesn't need to be physically contiguous, And so vmalloc() is less overhead than kmalloc() for this. Thanks, -Len
RE: [PATCH AUTOSEL 5.2 82/94] tools/power turbostat: fix file descriptor leaks
FWIW, The latest turbostat and x86_energy_perf_policy utilities in the upstream kernel tree should always be backward compatible with all old kernels. If that is EVER not the case, I want to know about it. Yes, I know that some distros ship old versions of these utilities built out of their matching kernel tree snapshots. Yes, applying upstream fixes to .stable for such distros is a good thing. However, the better solution for these particular utilities, is that they simply always use upstream utilities -- even with old kernels. When somebody reports a problem and I need them to run these tools, 100% of the time, I start by sending them the latest upstream version to replace the old version shipped by the distro. Cheers, -Len -Original Message- From: Sasha Levin [mailto:sas...@kernel.org] Sent: Wednesday, September 04, 2019 11:57 AM To: linux-kernel@vger.kernel.org; sta...@vger.kernel.org Cc: Gustavo A. R. Silva ; Prarit Bhargava ; Brown, Len ; Sasha Levin ; linux...@vger.kernel.org Subject: [PATCH AUTOSEL 5.2 82/94] tools/power turbostat: fix file descriptor leaks From: "Gustavo A. R. Silva" [ Upstream commit 605736c6929d541c78a85dffae4d33a23b6b2149 ] Fix file descriptor leaks by closing fp before return. Addresses-Coverity-ID: 1444591 ("Resource leak") Addresses-Coverity-ID: 1444592 ("Resource leak") Fixes: 5ea7647b333f ("tools/power turbostat: Warn on bad ACPI LPIT data") Signed-off-by: Gustavo A. R. Silva Reviewed-by: Prarit Bhargava Signed-off-by: Len Brown Signed-off-by: Sasha Levin --- tools/power/x86/turbostat/turbostat.c | 1 + 1 file changed, 1 insertion(+) diff --git a/tools/power/x86/turbostat/turbostat.c b/tools/power/x86/turbostat/turbostat.c index 71a931813de00..066bd43ed6c9f 100644 --- a/tools/power/x86/turbostat/turbostat.c +++ b/tools/power/x86/turbostat/turbostat.c @@ -2912,6 +2912,7 @@ int snapshot_cpu_lpi_us(void) if (retval != 1) { fprintf(stderr, "Disabling Low Power Idle CPU output\n"); BIC_NOT_PRESENT(BIC_CPU_LPI); + fclose(fp); return -1; } -- 2.20.1
RE: [PATCH 09/14] thermal/x86_pkg_temp_thermal: Support multi-die/package
> please add a new helper function topology_max_dies() and retrieve it from > that. I'll do this for Rui -- since his patch is in my x86 branch, and it is already (very early) Saturday morning for him:-) FYI, there were a couple of other small changes I made to that branch in response to list feedback that I pushed, but didn't re-email out the series. Thomas, Let me know if you prefer me to re-email the series, or just send a git pull request. Thanks, -Len
RE: [PATCH 03/11] x86 topology: Add CPUID.1F multi-die/package support
Hi Brice, Yes, you re-discovered the bug that Kan Liang pointed out last week. I have updated this patch set, and the latest for testing is in my git tree on kernel.org or https://github.com/lenb/linux.git x86 Note that I took your advice and left the core_siblings with its original definition, And created package_threads as a synonym. I will e-mail out the patch set again When I do 2 more things: 1. add core_threads map to sysfs 2. replace unique_die_id with logical_die_id -- turns out it is useful for same reason as logical_package_id. Thanks, -Len -Original Message- From: Brice Goglin [mailto:brice.gog...@inria.fr] Sent: Sunday, February 24, 2019 5:04 AM To: Len Brown ; x...@kernel.org Cc: linux-kernel@vger.kernel.org; Brown, Len ; linux-...@vger.kernel.org; Like Xu Subject: Re: [PATCH 03/11] x86 topology: Add CPUID.1F multi-die/package support Le 19/02/2019 à 04:40, Len Brown a écrit : > diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c > index ccd1f2a8e557..4250a87f57db 100644 > --- a/arch/x86/kernel/smpboot.c > +++ b/arch/x86/kernel/smpboot.c > @@ -393,6 +393,7 @@ static bool match_smt(struct cpuinfo_x86 *c, struct > cpuinfo_x86 *o) > int cpu1 = c->cpu_index, cpu2 = o->cpu_index; > > if (c->phys_proc_id == o->phys_proc_id && > + c->cpu_die_id == o->cpu_die_id && > per_cpu(cpu_llc_id, cpu1) == per_cpu(cpu_llc_id, cpu2)) { > if (c->cpu_core_id == o->cpu_core_id) > return topology_sane(c, o, "smt"); @@ -404,6 > +405,7 @@ static > bool match_smt(struct cpuinfo_x86 *c, struct cpuinfo_x86 *o) > } > > } else if (c->phys_proc_id == o->phys_proc_id && > +c->cpu_die_id == o->cpu_die_id && > c->cpu_core_id == o->cpu_core_id) { > return topology_sane(c, o, "smt"); > } > @@ -461,7 +463,7 @@ static bool match_llc(struct cpuinfo_x86 *c, struct > cpuinfo_x86 *o) > */ > static bool match_die(struct cpuinfo_x86 *c, struct cpuinfo_x86 *o) > { > - if (c->phys_proc_id == o->phys_proc_id) > + if (c->cpu_die_id == o->cpu_die_id) > return true; > return false; > } Hello Len, I am testing your patches in a VM with CPUID.1f QEMU patches (author Like Xu is CC'ed), booted to get 2 packages with 1 NUMA node each and 2 dies each: -smp cpus=16,threads=2,cores=2,dies=2,sockets=2 -numa node,cpus=0-7,nodeid=0 -numa node,cpus=8-15,nodeid=1 sysfs files expose wrong information: core_siblings contains threads of the local die AND of die with same number in other processors, eg cpu 0-3 and 8-11 instead of 0-3 only. The issue is that you seem to assume that die ids will always be unique across multiple packages. Is this a valid assumption? If so, QEMU patches should be fixed. If not, I fixed the issue by changing match_die() to check both phys_proc_id and cpu_die_id: static bool match_die(struct cpuinfo_x86 *c, struct cpuinfo_x86 *o) { if (c->phys_proc_id == o->phys_proc_id && c->cpu_die_id == o->cpu_die_id) return true; return false; } Thanks Brice
RE: [PATCH] MAINTAINERS: update simple firmware interface (SFI) section entry
Yes, this can be scheduled to be deleted.
RE: [linux-drivers-review] [PATCH 02/11] topolgy: simplify cputopology.txt formatting and wording
>> +if CONFIG_SCHED_BOOK and CONFIG_SCHED_DRAWER are selected, respectively. >> >> CONFIG_SCHED_BOOK and CONFIG_DRAWER are currently only used on s390, >> where >fwiw:CONFIG_SCHED_DRAWER Heh. Seems that every line of update to a .txt files uncovers two lines of existing bugs. Hopefully we are better at writing software, than documentation! Indeed, maybe if we had a compiler or checkpatch.pl file for documentation files, They would approach the quality of the source code? I updated the patch to fix this too. Thanks, -Len
RE: [PATCH 03/11] x86 topology: Add CPUID.1F multi-die/package support
>> @@ -461,7 +463,7 @@ static bool match_llc(struct cpuinfo_x86 *c, struct >> cpuinfo_x86 *o) >>*/ >> static bool match_die(struct cpuinfo_x86 *c, struct cpuinfo_x86 *o) >> { >> -if (c->>phys_proc_id == o->>phys_proc_id) >> +if (c->>cpu_die_id == o->>cpu_die_id) >> return true; >> return false; >> } > Shouldn't we check the unique_die_id here? > Die from different package can have the same cpu_die_id. Good catch. Updated this hunk as below. Thanks! -Len @@ -461,7 +463,8 @@ static bool match_llc(struct cpuinfo_x86 *c, struct cpuinfo_x86 *o) */ static bool match_die(struct cpuinfo_x86 *c, struct cpuinfo_x86 *o) { - if (c->phys_proc_id == o->phys_proc_id) + if ((c->phys_proc_id == o->phys_proc_id) && + (c->cpu_die_id == o->cpu_die_id)) return true; return false; }
RE: [PATCH 05/11] x86 topology: export die_siblings
Thanks for the comments, Kan, >> diff --git a/Documentation/cputopology.txt >> b/Documentation/cputopology.txt index 287213b4517b..7dd2ae3df233 >> 100644 >> --- a/Documentation/cputopology.txt >> +++ b/Documentation/cputopology.txt >> @@ -56,6 +56,16 @@ core_siblings_list: >> human-readable list of cpuX's hardware threads within the same >> die_id. >> >> +die_siblings: >> + >> +internal kernel map of cpuX's hardware threads within the same >> +physical_package_id. >> + >> +die_siblings_list: >> + >> +human-readable list of cpuX's hardware threads within the same >> +physical_package_id. >> + >> book_siblings: >> >> internal kernel map of cpuX's hardware threads within the same diff > Could you please update the document regarding to topology_die_cpumask and > topology_core_cpumask in Documentation/x86/topology.txt I agree that the top part of this file, as updated above, should document the external sysfs interface... I'm less excited about the center of the file trying to document the internal implementation -- as the source code is actually more clear than the document, but here is an update, consistent with the existing file. Let me know if it does not fully address your comment. Thanks, -Len --- diff --git a/Documentation/cputopology.txt b/Documentation/cputopology.txt index 7dd2ae3..f39c759 100644 --- a/Documentation/cputopology.txt +++ b/Documentation/cputopology.txt @@ -102,6 +102,7 @@ these macros in include/asm-XXX/topology.h:: #define topology_drawer_id(cpu) #define topology_sibling_cpumask(cpu) #define topology_core_cpumask(cpu) + #define topology_die_cpumask(cpu) #define topology_book_cpumask(cpu) #define topology_drawer_cpumask(cpu) @@ -114,10 +115,11 @@ To be consistent on all architectures, include/linux/topology.h provides default definitions for any of the above macros that are not defined by include/asm-XXX/topology.h: -1) physical_package_id: -1 -2) core_id: 0 -3) sibling_cpumask: just the given CPU -4) core_cpumask: just the given CPU +1) topology_physical_package_id: -1 +2) topology_core_id: 0 +3) topology_sibling_cpumask: just the given CPU +4) topology_core_cpumask: just the given CPU +5) topology_die_cpumask: just the given CPU For architectures that don't support books (CONFIG_SCHED_BOOK) there are no default definitions for topology_book_id() and topology_book_cpumask().
RE: [RFC] x86, tsc: Add kcmdline args for skipping tsc calibration sequences
We disabled CPUID-based TSC calibration on SKX in December for several reasons. If you still have it enabled, you need this patch: commit b511203093489eb1829cb4de86e8214752205ac6 x86/tsc: Fix erroneous TSC rate on Skylake Xeon If you are referring to another platform that has CPUID-TSC calibration... it should still work on an over-clocked system. Over-clocked platforms should use exactly the same reference crystal as non-overclocked platforms, but should modify the crystal/core multiplier. If you are changing the reference crystal, then I believe you are using an un-supported hardware configuration, and my ability to support you is limited. -Len
RE: [RFC] x86, tsc: Add kcmdline args for skipping tsc calibration sequences
We disabled CPUID-based TSC calibration on SKX in December for several reasons. If you still have it enabled, you need this patch: commit b511203093489eb1829cb4de86e8214752205ac6 x86/tsc: Fix erroneous TSC rate on Skylake Xeon If you are referring to another platform that has CPUID-TSC calibration... it should still work on an over-clocked system. Over-clocked platforms should use exactly the same reference crystal as non-overclocked platforms, but should modify the crystal/core multiplier. If you are changing the reference crystal, then I believe you are using an un-supported hardware configuration, and my ability to support you is limited. -Len
RE: [PATCH 1/2] tools/power turbostat: Correct SNB_C1/C3_AUTO_UNDEMOTE defines
You just did it:-) -Original Message- From: Matt Turner [mailto:matts...@gmail.com] Sent: Saturday, April 07, 2018 6:25 PM To: Brown, Len <len.br...@intel.com> Cc: Thomas Gleixner <t...@linutronix.de>; Ingo Molnar <mi...@redhat.com>; H. Peter Anvin <h...@zytor.com>; Borislav Petkov <b...@suse.de>; LKML <linux-kernel@vger.kernel.org>; x...@kernel.org; Matt Turner <matts...@gmail.com> Subject: Re: [PATCH 1/2] tools/power turbostat: Correct SNB_C1/C3_AUTO_UNDEMOTE defines On Tue, Feb 13, 2018 at 11:12 AM, Matt Turner <matts...@gmail.com> wrote: > According to the Intel Software Developers' Manual, Vol. 4, Order No. > 335592, these macros have been reversed since they were added. > > Fixes: 889facbee3e6 ("tools/power turbostat: v3.0: monitor Watts and > Temperature") > Signed-off-by: Matt Turner <matts...@gmail.com> Is there something I need to do to ensure these two trivial patches get picked up?
RE: [PATCH 1/2] tools/power turbostat: Correct SNB_C1/C3_AUTO_UNDEMOTE defines
You just did it:-) -Original Message- From: Matt Turner [mailto:matts...@gmail.com] Sent: Saturday, April 07, 2018 6:25 PM To: Brown, Len Cc: Thomas Gleixner ; Ingo Molnar ; H. Peter Anvin ; Borislav Petkov ; LKML ; x...@kernel.org; Matt Turner Subject: Re: [PATCH 1/2] tools/power turbostat: Correct SNB_C1/C3_AUTO_UNDEMOTE defines On Tue, Feb 13, 2018 at 11:12 AM, Matt Turner wrote: > According to the Intel Software Developers' Manual, Vol. 4, Order No. > 335592, these macros have been reversed since they were added. > > Fixes: 889facbee3e6 ("tools/power turbostat: v3.0: monitor Watts and > Temperature") > Signed-off-by: Matt Turner Is there something I need to do to ensure these two trivial patches get picked up?
RE: [PATCH] tools/power x86_energy_perf_policy: fix resource leak on file descriptor
Acked-by: Len Brown <len.br...@intel.com> -Original Message- From: Colin King [mailto:colin.k...@canonical.com] Sent: Wednesday, May 17, 2017 8:52 AM To: Brown, Len <len.br...@intel.com> Cc: kernel-janit...@vger.kernel.org; linux-kernel@vger.kernel.org Subject: [PATCH] tools/power x86_energy_perf_policy: fix resource leak on file descriptor From: Colin Ian King <colin.k...@canonical.com> Function get_pkg_num is leaking an open file, fix this with a fclose(). Detected with static analysis by cppcheck: [tools/power/x86/x86_energy_perf_policy/x86_energy_perf_policy.c:1115]: (error) Resource leak: fp Fixes: 4beec1d7519691 ("tools/power x86_energy_perf_policy: support HWP.EPP") Signed-off-by: Colin Ian King <colin.k...@canonical.com> --- tools/power/x86/x86_energy_perf_policy/x86_energy_perf_policy.c | 1 + 1 file changed, 1 insertion(+) diff --git a/tools/power/x86/x86_energy_perf_policy/x86_energy_perf_policy.c b/tools/power/x86/x86_energy_perf_policy/x86_energy_perf_policy.c index 65bbe627a425..26a5b5265290 100644 --- a/tools/power/x86/x86_energy_perf_policy/x86_energy_perf_policy.c +++ b/tools/power/x86/x86_energy_perf_policy/x86_energy_perf_policy.c @@ -1110,6 +1110,7 @@ unsigned int get_pkg_num(int cpu) fp = fopen_or_die(pathname, "r"); retval = fscanf(fp, "%d\n", ); + fclose(fp); if (retval != 1) errx(1, "%s: failed to parse", pathname); return pkg; -- 2.11.0
RE: [PATCH] tools/power x86_energy_perf_policy: fix resource leak on file descriptor
Acked-by: Len Brown -Original Message- From: Colin King [mailto:colin.k...@canonical.com] Sent: Wednesday, May 17, 2017 8:52 AM To: Brown, Len Cc: kernel-janit...@vger.kernel.org; linux-kernel@vger.kernel.org Subject: [PATCH] tools/power x86_energy_perf_policy: fix resource leak on file descriptor From: Colin Ian King Function get_pkg_num is leaking an open file, fix this with a fclose(). Detected with static analysis by cppcheck: [tools/power/x86/x86_energy_perf_policy/x86_energy_perf_policy.c:1115]: (error) Resource leak: fp Fixes: 4beec1d7519691 ("tools/power x86_energy_perf_policy: support HWP.EPP") Signed-off-by: Colin Ian King --- tools/power/x86/x86_energy_perf_policy/x86_energy_perf_policy.c | 1 + 1 file changed, 1 insertion(+) diff --git a/tools/power/x86/x86_energy_perf_policy/x86_energy_perf_policy.c b/tools/power/x86/x86_energy_perf_policy/x86_energy_perf_policy.c index 65bbe627a425..26a5b5265290 100644 --- a/tools/power/x86/x86_energy_perf_policy/x86_energy_perf_policy.c +++ b/tools/power/x86/x86_energy_perf_policy/x86_energy_perf_policy.c @@ -1110,6 +1110,7 @@ unsigned int get_pkg_num(int cpu) fp = fopen_or_die(pathname, "r"); retval = fscanf(fp, "%d\n", ); + fclose(fp); if (retval != 1) errx(1, "%s: failed to parse", pathname); return pkg; -- 2.11.0
RE: [PATCH] tools/power turbostat: turbostat.8 add missing column definitions
> On Saturday, March 04, 2017 06:36:29 PM Len Brown wrote: > > Applied -- Thanks! > > I guess I should include this into fixes too? > > Or are you going to send a pull request with it to me? When I apply it, you don't have to. I have a couple of other fixes being tested on my turbostat branch now, and this is in w/ those. thanks, -Len
RE: [PATCH] tools/power turbostat: turbostat.8 add missing column definitions
> On Saturday, March 04, 2017 06:36:29 PM Len Brown wrote: > > Applied -- Thanks! > > I guess I should include this into fixes too? > > Or are you going to send a pull request with it to me? When I apply it, you don't have to. I have a couple of other fixes being tested on my turbostat branch now, and this is in w/ those. thanks, -Len
RE: [PATCH 2/2] x86/tsc: Add additional Intel CPU models to crystal_khz whitelist
> + crystal_khz = 24000;/* 25.0 MHz */ I guess I prefer no comment over an incorrect comment.
RE: [PATCH 2/2] x86/tsc: Add additional Intel CPU models to crystal_khz whitelist
> + crystal_khz = 24000;/* 25.0 MHz */ I guess I prefer no comment over an incorrect comment.
RE: [PATCH] drivers/idle: make intel_idle.c driver more explicitly non-modular
> -Original Message- > From: rcoch...@linutronix.de [mailto:rcoch...@linutronix.de] > Sent: Tuesday, April 05, 2016 12:30 AM > To: Brown, Len > Cc: Gortmaker, Paul (Wind River); linux-kernel@vger.kernel.org; Len Brown; > linux...@vger.kernel.org > Subject: Re: [PATCH] drivers/idle: make intel_idle.c driver more > explicitly non-modular > > On Tue, Apr 05, 2016 at 04:20:47AM +, Brown, Len wrote: > > The first idle driver to register with cpuidle wins. > > > > intel_idle should always get the opportunity > > to probe and register before acpi_idle (processor_idle.c) > > > > When intel_idle was allowed to be modular, > > some distros build their kernel and loaded modules > > such that acpi_idle could probe first. In such > > a kernel, intel_idle became dead code. > > > > As intel_idle is a small driver, the q uick fix > > was to make it Y/N so that it would always probe > > before acpi_idle, no matter how acpi_idle > > is build and loaded. > > > > Yes, even though intel_idle is a tiny driver, I think > > it would be good to be able to unload it when it doesn't probe. > > And that means fixing the race with acpi_idle, right? > > > Today, it appears that acpi_idle (acpi/processor_idle.c) > > is compiled Y/N. > > So it, too, needs work? "needs" is somewhat subjective. Some may argue that this driver is so small, that an effort to save memory might be more effectively directed elsewhere. But if the goal is to save the memory consumed by this driver when the driver doesn't probe, then yes, it would have to be made modular. I don't remember what ACPI dependency made it non-modular. ACPI has some tricky initialization ordering issues that are BIOS dependent, and sometimes out of our control. > > No, I do not believe that cpuidle should bother > > supporting changing idle drivers at run-time. > > Huh? But you just said, "it would be good to be able to unload it > when it doesn't probe." being able to switch the registered driver at run-time does not require the driver to be modular. cheers, -Len
RE: [PATCH] drivers/idle: make intel_idle.c driver more explicitly non-modular
> -Original Message- > From: rcoch...@linutronix.de [mailto:rcoch...@linutronix.de] > Sent: Tuesday, April 05, 2016 12:30 AM > To: Brown, Len > Cc: Gortmaker, Paul (Wind River); linux-kernel@vger.kernel.org; Len Brown; > linux...@vger.kernel.org > Subject: Re: [PATCH] drivers/idle: make intel_idle.c driver more > explicitly non-modular > > On Tue, Apr 05, 2016 at 04:20:47AM +, Brown, Len wrote: > > The first idle driver to register with cpuidle wins. > > > > intel_idle should always get the opportunity > > to probe and register before acpi_idle (processor_idle.c) > > > > When intel_idle was allowed to be modular, > > some distros build their kernel and loaded modules > > such that acpi_idle could probe first. In such > > a kernel, intel_idle became dead code. > > > > As intel_idle is a small driver, the q uick fix > > was to make it Y/N so that it would always probe > > before acpi_idle, no matter how acpi_idle > > is build and loaded. > > > > Yes, even though intel_idle is a tiny driver, I think > > it would be good to be able to unload it when it doesn't probe. > > And that means fixing the race with acpi_idle, right? > > > Today, it appears that acpi_idle (acpi/processor_idle.c) > > is compiled Y/N. > > So it, too, needs work? "needs" is somewhat subjective. Some may argue that this driver is so small, that an effort to save memory might be more effectively directed elsewhere. But if the goal is to save the memory consumed by this driver when the driver doesn't probe, then yes, it would have to be made modular. I don't remember what ACPI dependency made it non-modular. ACPI has some tricky initialization ordering issues that are BIOS dependent, and sometimes out of our control. > > No, I do not believe that cpuidle should bother > > supporting changing idle drivers at run-time. > > Huh? But you just said, "it would be good to be able to unload it > when it doesn't probe." being able to switch the registered driver at run-time does not require the driver to be modular. cheers, -Len
RE: [PATCH] drivers/idle: make intel_idle.c driver more explicitly non-modular
The first idle driver to register with cpuidle wins. intel_idle should always get the opportunity to probe and register before acpi_idle (processor_idle.c) When intel_idle was allowed to be modular, some distros build their kernel and loaded modules such that acpi_idle could probe first. In such a kernel, intel_idle became dead code. As intel_idle is a small driver, the q uick fix was to make it Y/N so that it would always probe before acpi_idle, no matter how acpi_idle is build and loaded. Yes, even though intel_idle is a tiny driver, I think it would be good to be able to unload it when it doesn't probe. Today, it appears that acpi_idle (acpi/processor_idle.c) is compiled Y/N. thanks, -Len ps. No, I do not believe that cpuidle should bother supporting changing idle drivers at run-time.
RE: [PATCH] drivers/idle: make intel_idle.c driver more explicitly non-modular
The first idle driver to register with cpuidle wins. intel_idle should always get the opportunity to probe and register before acpi_idle (processor_idle.c) When intel_idle was allowed to be modular, some distros build their kernel and loaded modules such that acpi_idle could probe first. In such a kernel, intel_idle became dead code. As intel_idle is a small driver, the q uick fix was to make it Y/N so that it would always probe before acpi_idle, no matter how acpi_idle is build and loaded. Yes, even though intel_idle is a tiny driver, I think it would be good to be able to unload it when it doesn't probe. Today, it appears that acpi_idle (acpi/processor_idle.c) is compiled Y/N. thanks, -Len ps. No, I do not believe that cpuidle should bother supporting changing idle drivers at run-time.
RE: [PATCH 01/30] ACPICA: Linuxize: reduce divergences for 20160212 release
> > > > > > -acpi_status acpi_hw_read(u32 *value, struct acpi_generic_address > *reg) > > > +acpi_status acpi_hw_read(u32 *value, struct acpi_generic_address * > reg) > > > > The second argument * style appears the opposite of normal style > > and a different style than the first argument * style. > [Lv Zheng] > The file is drivers/acpi/acpica/hwregs.c, which is coming from ACPICA > upstream. > So this is a result of "ACPICA release". > In other words, this is a result of a "process". > In order to fix this, things need to be done in "ACPICA release scripts". > Which should be done in > https://github.com/acpica/acpica/blob/master/generate/linux/libacpica.sh. > Otherwise, "ACPICA release" process will require human intervention. > > So please leave this patch fragment as is. > It will be automatically fixed if the "ACPICA release" process is fixed. > And if you don't leave this fragment as is, the "ACPICA release" process > will get hurt. I disagree. The patch should be correct when it hits the Linux kernel tree. If the process is broken, then fix the process and re-send a fixed patch. Linux doesn't care if the process is a program that runs with the click of a button, or the result of 1000 engineering laboring day and night. Only the result matters, the result should be correct, and this patch is not correct. -Len
RE: [PATCH 01/30] ACPICA: Linuxize: reduce divergences for 20160212 release
> > > > > > -acpi_status acpi_hw_read(u32 *value, struct acpi_generic_address > *reg) > > > +acpi_status acpi_hw_read(u32 *value, struct acpi_generic_address * > reg) > > > > The second argument * style appears the opposite of normal style > > and a different style than the first argument * style. > [Lv Zheng] > The file is drivers/acpi/acpica/hwregs.c, which is coming from ACPICA > upstream. > So this is a result of "ACPICA release". > In other words, this is a result of a "process". > In order to fix this, things need to be done in "ACPICA release scripts". > Which should be done in > https://github.com/acpica/acpica/blob/master/generate/linux/libacpica.sh. > Otherwise, "ACPICA release" process will require human intervention. > > So please leave this patch fragment as is. > It will be automatically fixed if the "ACPICA release" process is fixed. > And if you don't leave this fragment as is, the "ACPICA release" process > will get hurt. I disagree. The patch should be correct when it hits the Linux kernel tree. If the process is broken, then fix the process and re-send a fixed patch. Linux doesn't care if the process is a program that runs with the click of a button, or the result of 1000 engineering laboring day and night. Only the result matters, the result should be correct, and this patch is not correct. -Len
RE: C1E auto-promotion suspend/resume
> By BIOS (1.2.3 on a Dell XPS 13 9350) seems to want to enable C1E > auto-promotion (ugh!), which results in this difference across > suspend/resume according to turbostat: > > -cpu3: MSR_IA32_POWER_CTL: 0x0024005d (C1E auto-promotion: DISabled) > +cpu3: MSR_IA32_POWER_CTL: 0x0024005f (C1E auto-promotion: ENabled) > > Should intel_idle learn to re-disable idle promotion on resume? Yes, it seems that way. Go ahead and send a patch, or file a bug at bugzilla.kernel.org and we'll get to it. thanks, -len
RE: C1E auto-promotion suspend/resume
> By BIOS (1.2.3 on a Dell XPS 13 9350) seems to want to enable C1E > auto-promotion (ugh!), which results in this difference across > suspend/resume according to turbostat: > > -cpu3: MSR_IA32_POWER_CTL: 0x0024005d (C1E auto-promotion: DISabled) > +cpu3: MSR_IA32_POWER_CTL: 0x0024005f (C1E auto-promotion: ENabled) > > Should intel_idle learn to re-disable idle promotion on resume? Yes, it seems that way. Go ahead and send a patch, or file a bug at bugzilla.kernel.org and we'll get to it. thanks, -len
RE: Skylake (XPS 13 9350) TSC is way off
> >> [0.00] tsc: PIT calibration matches HPET. 2 loops > >> [0.00] tsc: Detected 2399.975 MHz processor > >> [0.090897] TSC deadline timer enabled > >> [1.960034] tsc: Refined TSC clocksource calibration: 2400.007 MHz > turbostat version 4.10 10 Dec, 2015 - Len Brown > CPUID(0): GenuineIntel 22 CPUID levels; family:model:stepping 0x6:4e:3 > (6:78:3) > CPUID(1): SSE3 MONITOR EIST TM2 TSC MSR ACPI-TM TM > CPUID(6): APERF, DTS, PTM, HWP, HWPnotify, HWPwindow, HWPepp, No-HWPpkg, EPB > cpu1: MSR_IA32_MISC_ENABLE: 0x00850089 (TCC EIST MONITOR) > CPUID(0x15): eax_crystal: 2 ebx_tsc: 200 ecx_crystal_hz: 0 > TSC: 2400 MHz (2400 Hz * 200 / 2 / 100) > CPUID(0x16): base_mhz: 2400 max_mhz: 2800 bus_mhz: 100 Both the initial and refined TSC calibration are right on the money -- 2400 MHz. Further, this system happens to also have a base frequency of 2400 MHz, so tsc_khz = cpu_khz = 2,400,000, exactly. It would stray from that value only based on the ppm error of the base 24 MHz crystal. Anything other than that value is error. -Len > RAPL: 17476 sec. Joule Counter Range, at 15 Watts > cpu1: MSR_PLATFORM_INFO: 0x4043df1011800 > 4 * 100 = 400 MHz max efficiency frequency > 24 * 100 = 2400 MHz base frequency > cpu1: MSR_IA32_POWER_CTL: 0x0024005d (C1E auto-promotion: DISabled) > cpu1: MSR_TURBO_RATIO_LIMIT: 0x1b1b1b1c > 27 * 100 = 2700 MHz max turbo 4 active cores > 27 * 100 = 2700 MHz max turbo 3 active cores > 27 * 100 = 2700 MHz max turbo 2 active cores > 28 * 100 = 2800 MHz max turbo 1 active cores > cpu1: MSR_CONFIG_TDP_NOMINAL: 0x0017 (base_ratio=7) > cpu1: MSR_CONFIG_TDP_LEVEL_1: 0x0008003c (PKG_MIN_PWR_LVL1=0 > PKG_MAX_PWR_LVL1=0 LVL1_RATIO=8 PKG_TDP_LVL1=60) > cpu1: MSR_CONFIG_TDP_LEVEL_2: 0x001800c8 (PKG_MIN_PWR_LVL2=0 > PKG_MAX_PWR_LVL2=0 LVL2_RATIO=8 PKG_TDP_LVL2=200) > cpu1: MSR_CONFIG_TDP_CONTROL: 0x ( lock=0) > cpu1: MSR_TURBO_ACTIVATION_RATIO: 0x (MAX_NON_TURBO_RATIO=0 > lock=0) > cpu1: MSR_NHM_SNB_PKG_CST_CFG_CTL: 0x1e008006 (UNdemote-C3, > UNdemote-C1, demote-C3, demote-C1, locked: pkg-cstate-limit=6: pc8) > cpu0: MSR_PM_ENABLE: 0x0001 (HWP) > cpu0: MSR_HWP_CAPABILITIES: 0x0105171c (high 0x1c guar 0x17 eff 0x5 low > 0x1) > cpu0: MSR_HWP_REQUEST: 0x80001c04 (min 0x4 max 0x1c des 0x0 epp 0x80 > window 0x0 pkg 0x0) > cpu0: MSR_HWP_INTERRUPT: 0x0001 (EN_Guaranteed_Perf_Change, > Dis_Excursion_Min) > cpu0: MSR_HWP_STATUS: 0x (No-Guaranteed_Perf_Change, No- > Excursion_Min) > cpu0: MSR_IA32_ENERGY_PERF_BIAS: 0x0006 (balanced) > cpu0: MSR_RAPL_POWER_UNIT: 0x000a0e03 (0.125000 Watts, 0.61 > Joules, 0.000977 sec.) > cpu0: MSR_PKG_POWER_INFO: 0x0078 (15 W TDP, RAPL 0 - 0 W, 0.00 > sec.) > cpu0: MSR_PKG_POWER_LIMIT: 0x4280c800dd8078 (UNlocked) > cpu0: PKG Limit #1: ENabled (15.00 Watts, 28.00 sec, clamp > ENabled) > cpu0: PKG Limit #2: ENabled (25.00 Watts, 0.002441* sec, clamp > DISabled) > cpu0: MSR_DRAM_POWER_LIMIT: 0x5400de (UNlocked) > cpu0: DRAM Limit: DISabled (0.00 Watts, 0.000977 sec, clamp DISabled) > cpu0: MSR_IA32_TEMPERATURE_TARGET: 0x0064 (100 C) > cpu0: MSR_IA32_PACKAGE_THERM_STATUS: 0x88340800 (48 C) > cpu0: MSR_IA32_THERM_STATUS: 0x88350800 (47 C +/- 1) > cpu1: MSR_IA32_THERM_STATUS: 0x88340800 (48 C +/- 1) > Core CPU Avg_MHz %Busy Bzy_MHz TSC_MHz SMI CPU%c1 > CPU%c3 CPU%c6 CPU%c7 CoreTmp PkgTmp Totl%C0 Any%C0 GFX%C0 CPUGFX% > Pkg%pc2 Pkg%pc3 Pkg%pc6 Pkg%pc7 Pkg%pc8 Pkg%pc9 Pk%pc10 PkgWatt > RAMWatt PKG_% RAM_% >- - 162.84 5492399 04.13 > 0.020.63 92.39 46 46 12.21 10.772.480.80 > 14.33 71.530.000.000.000.000.001.950.54 > 0.000.00 >0 0 111.92 5632399 02.48 > 0.020.80 94.78 46 46 12.21 10.782.480.80 > 14.34 71.550.000.000.000.000.001.950.54 > 0.000.00 >0 2 81.39 5502399 03.04 >1 1 132.04 6222400 07.48 > 0.020.45 90.01 45 >1 3 316.01 5202400 03.51 > 1.002561 sec > > In case it's at all useful, adjtimex -p says: > > mode: 0 >offset: 0 > frequency: 135641 > maxerror: 37498 > esterror: 1532 >status: 8192 > time_constant: 2 > precision: 1 > tolerance: 32768000 > tick: 1 > raw time: 1449098317s 671243180us = 1449098317.671243180 > > this suggests a rather small correction, so I really have no idea what > "Adjusting tsc more than 11% (8039115 vs 7759462)" means. > > John, you wrote this code. What does the error message mean? > > --Andy N�r��yb�X��ǧv�^�){.n�+{zX����ܨ}���Ơz�:+v���zZ+��+zf���h���~i���z��w���?�&�)ߢf��^jǫy�m��@A�a��� 0��h���i
RE: Skylake (XPS 13 9350) TSC is way off
+adrian > [0.00] tsc: PIT calibration matches HPET. 2 loops > [0.00] tsc: Detected 2399.975 MHz processor > [0.090897] TSC deadline timer enabled > [1.960034] tsc: Refined TSC clocksource calibration: 2400.007 MHz > [1.960039] clocksource: tsc: mask: 0x max_cycles: > 0x22983e30402, max_idle_ns: 440795260848 ns > [2.959936] clocksource: Switched to clocksource tsc > [ 87.168211] Adjusting tsc more than 11% (5941981 vs 7759439) > > This is more or less Linus' latest tree (v4.4-rc3 plus some unrelated > platform driver patches). Hi Andy, Thanks for the note. I'll send you a debug version of turbostat, off list, since the list will just block mail with that attachment. thanks, -Len
RE: Skylake (XPS 13 9350) TSC is way off
+adrian > [0.00] tsc: PIT calibration matches HPET. 2 loops > [0.00] tsc: Detected 2399.975 MHz processor > [0.090897] TSC deadline timer enabled > [1.960034] tsc: Refined TSC clocksource calibration: 2400.007 MHz > [1.960039] clocksource: tsc: mask: 0x max_cycles: > 0x22983e30402, max_idle_ns: 440795260848 ns > [2.959936] clocksource: Switched to clocksource tsc > [ 87.168211] Adjusting tsc more than 11% (5941981 vs 7759439) > > This is more or less Linus' latest tree (v4.4-rc3 plus some unrelated > platform driver patches). Hi Andy, Thanks for the note. I'll send you a debug version of turbostat, off list, since the list will just block mail with that attachment. thanks, -Len
RE: Skylake (XPS 13 9350) TSC is way off
> >> [0.00] tsc: PIT calibration matches HPET. 2 loops > >> [0.00] tsc: Detected 2399.975 MHz processor > >> [0.090897] TSC deadline timer enabled > >> [1.960034] tsc: Refined TSC clocksource calibration: 2400.007 MHz > turbostat version 4.10 10 Dec, 2015 - Len Brown> CPUID(0): GenuineIntel 22 CPUID levels; family:model:stepping 0x6:4e:3 > (6:78:3) > CPUID(1): SSE3 MONITOR EIST TM2 TSC MSR ACPI-TM TM > CPUID(6): APERF, DTS, PTM, HWP, HWPnotify, HWPwindow, HWPepp, No-HWPpkg, EPB > cpu1: MSR_IA32_MISC_ENABLE: 0x00850089 (TCC EIST MONITOR) > CPUID(0x15): eax_crystal: 2 ebx_tsc: 200 ecx_crystal_hz: 0 > TSC: 2400 MHz (2400 Hz * 200 / 2 / 100) > CPUID(0x16): base_mhz: 2400 max_mhz: 2800 bus_mhz: 100 Both the initial and refined TSC calibration are right on the money -- 2400 MHz. Further, this system happens to also have a base frequency of 2400 MHz, so tsc_khz = cpu_khz = 2,400,000, exactly. It would stray from that value only based on the ppm error of the base 24 MHz crystal. Anything other than that value is error. -Len > RAPL: 17476 sec. Joule Counter Range, at 15 Watts > cpu1: MSR_PLATFORM_INFO: 0x4043df1011800 > 4 * 100 = 400 MHz max efficiency frequency > 24 * 100 = 2400 MHz base frequency > cpu1: MSR_IA32_POWER_CTL: 0x0024005d (C1E auto-promotion: DISabled) > cpu1: MSR_TURBO_RATIO_LIMIT: 0x1b1b1b1c > 27 * 100 = 2700 MHz max turbo 4 active cores > 27 * 100 = 2700 MHz max turbo 3 active cores > 27 * 100 = 2700 MHz max turbo 2 active cores > 28 * 100 = 2800 MHz max turbo 1 active cores > cpu1: MSR_CONFIG_TDP_NOMINAL: 0x0017 (base_ratio=7) > cpu1: MSR_CONFIG_TDP_LEVEL_1: 0x0008003c (PKG_MIN_PWR_LVL1=0 > PKG_MAX_PWR_LVL1=0 LVL1_RATIO=8 PKG_TDP_LVL1=60) > cpu1: MSR_CONFIG_TDP_LEVEL_2: 0x001800c8 (PKG_MIN_PWR_LVL2=0 > PKG_MAX_PWR_LVL2=0 LVL2_RATIO=8 PKG_TDP_LVL2=200) > cpu1: MSR_CONFIG_TDP_CONTROL: 0x ( lock=0) > cpu1: MSR_TURBO_ACTIVATION_RATIO: 0x (MAX_NON_TURBO_RATIO=0 > lock=0) > cpu1: MSR_NHM_SNB_PKG_CST_CFG_CTL: 0x1e008006 (UNdemote-C3, > UNdemote-C1, demote-C3, demote-C1, locked: pkg-cstate-limit=6: pc8) > cpu0: MSR_PM_ENABLE: 0x0001 (HWP) > cpu0: MSR_HWP_CAPABILITIES: 0x0105171c (high 0x1c guar 0x17 eff 0x5 low > 0x1) > cpu0: MSR_HWP_REQUEST: 0x80001c04 (min 0x4 max 0x1c des 0x0 epp 0x80 > window 0x0 pkg 0x0) > cpu0: MSR_HWP_INTERRUPT: 0x0001 (EN_Guaranteed_Perf_Change, > Dis_Excursion_Min) > cpu0: MSR_HWP_STATUS: 0x (No-Guaranteed_Perf_Change, No- > Excursion_Min) > cpu0: MSR_IA32_ENERGY_PERF_BIAS: 0x0006 (balanced) > cpu0: MSR_RAPL_POWER_UNIT: 0x000a0e03 (0.125000 Watts, 0.61 > Joules, 0.000977 sec.) > cpu0: MSR_PKG_POWER_INFO: 0x0078 (15 W TDP, RAPL 0 - 0 W, 0.00 > sec.) > cpu0: MSR_PKG_POWER_LIMIT: 0x4280c800dd8078 (UNlocked) > cpu0: PKG Limit #1: ENabled (15.00 Watts, 28.00 sec, clamp > ENabled) > cpu0: PKG Limit #2: ENabled (25.00 Watts, 0.002441* sec, clamp > DISabled) > cpu0: MSR_DRAM_POWER_LIMIT: 0x5400de (UNlocked) > cpu0: DRAM Limit: DISabled (0.00 Watts, 0.000977 sec, clamp DISabled) > cpu0: MSR_IA32_TEMPERATURE_TARGET: 0x0064 (100 C) > cpu0: MSR_IA32_PACKAGE_THERM_STATUS: 0x88340800 (48 C) > cpu0: MSR_IA32_THERM_STATUS: 0x88350800 (47 C +/- 1) > cpu1: MSR_IA32_THERM_STATUS: 0x88340800 (48 C +/- 1) > Core CPU Avg_MHz %Busy Bzy_MHz TSC_MHz SMI CPU%c1 > CPU%c3 CPU%c6 CPU%c7 CoreTmp PkgTmp Totl%C0 Any%C0 GFX%C0 CPUGFX% > Pkg%pc2 Pkg%pc3 Pkg%pc6 Pkg%pc7 Pkg%pc8 Pkg%pc9 Pk%pc10 PkgWatt > RAMWatt PKG_% RAM_% >- - 162.84 5492399 04.13 > 0.020.63 92.39 46 46 12.21 10.772.480.80 > 14.33 71.530.000.000.000.000.001.950.54 > 0.000.00 >0 0 111.92 5632399 02.48 > 0.020.80 94.78 46 46 12.21 10.782.480.80 > 14.34 71.550.000.000.000.000.001.950.54 > 0.000.00 >0 2 81.39 5502399 03.04 >1 1 132.04 6222400 07.48 > 0.020.45 90.01 45 >1 3 316.01 5202400 03.51 > 1.002561 sec > > In case it's at all useful, adjtimex -p says: > > mode: 0 >offset: 0 > frequency: 135641 > maxerror: 37498 > esterror: 1532 >status: 8192 > time_constant: 2 > precision: 1 > tolerance: 32768000 > tick: 1 > raw time: 1449098317s 671243180us = 1449098317.671243180 > > this suggests a rather small correction, so I really have no idea what > "Adjusting tsc more than 11% (8039115 vs 7759462)" means. > > John, you wrote this code. What does the error message mean? > > --Andy N�r��yb�X��ǧv�^�){.n�+{zX����ܨ}���Ơz�:+v���zZ+��+zf���h���~i���z��w���?�&�)ߢf��^jǫy�m��@A�a��� 0��h���i
RE: [PATCH] intel_idle: Don't use on Lenovo Ideapad S10-3t
> Lenovo Ideapad S10-3t hangs coming out of S3 with intel_idle. > The two workaround that seem to help are "intel_idle.max_cstate=0" > or "nohz=off highres=off". > > At a first glance quirk_tigerpoint_bm_sts() seemed promising, but > even when moved to early_resume it didn't do anything. > > I have no idea what's wrong here, so let's just disable intel_idle > for these machines using a DMI match. Ville, It is great that several workarounds have been discovered. But it would be better to get a good idea of the root-cause before permanently ignoring the problem via a new black-list in the upstream kernel. Is it possible for you to file a bug at bugzilla.kernel.org against Product: power-management; component: intel_idle? In it, please put the following information. If this is a regression, the oldest kernel that broke. When booted with intel_idle, and then without: dmesg | grep idle grep . /sys/devices/system/cpu/cpu0/cpuidle/*/* # turbostat --debug sleep 10 2> turbostat.out # cd /sys/devices/system/clocksource/clocksource0 grep . available_clocksource current_clocksource Other boot options to test: maxcpus=1 nohpet intel_idle.max_cstate=3 and if that fails intel_idle.max_cstate=2 and if that fails intel_idle.max_cstate=1 thanks, -Len N�r��yb�X��ǧv�^�){.n�+{zX����ܨ}���Ơz�:+v���zZ+��+zf���h���~i���z��w���?�&�)ߢf��^jǫy�m��@A�a��� 0��h���i
RE: [PATCH] intel_idle: Don't use on Lenovo Ideapad S10-3t
> Lenovo Ideapad S10-3t hangs coming out of S3 with intel_idle. > The two workaround that seem to help are "intel_idle.max_cstate=0" > or "nohz=off highres=off". > > At a first glance quirk_tigerpoint_bm_sts() seemed promising, but > even when moved to early_resume it didn't do anything. > > I have no idea what's wrong here, so let's just disable intel_idle > for these machines using a DMI match. Ville, It is great that several workarounds have been discovered. But it would be better to get a good idea of the root-cause before permanently ignoring the problem via a new black-list in the upstream kernel. Is it possible for you to file a bug at bugzilla.kernel.org against Product: power-management; component: intel_idle? In it, please put the following information. If this is a regression, the oldest kernel that broke. When booted with intel_idle, and then without: dmesg | grep idle grep . /sys/devices/system/cpu/cpu0/cpuidle/*/* # turbostat --debug sleep 10 2> turbostat.out # cd /sys/devices/system/clocksource/clocksource0 grep . available_clocksource current_clocksource Other boot options to test: maxcpus=1 nohpet intel_idle.max_cstate=3 and if that fails intel_idle.max_cstate=2 and if that fails intel_idle.max_cstate=1 thanks, -Len N�r��yb�X��ǧv�^�){.n�+{zX����ܨ}���Ơz�:+v���zZ+��+zf���h���~i���z��w���?�&�)ߢf��^jǫy�m��@A�a��� 0��h���i
RE: Re[6]: 3.4-rc smpboot: do_boot_cpu failed(-1) to wakeup CPU#1
try booting upstream with "cpu_init_udelay=1". If it works, then it actually implicates this commit: a9bcaa02a5104ace6a9d9e4a9cd9192a9e7744d6 ("x86/smpboot: Remove SIPI delays from cpu_up()") Unfortunately the commit message for that on is erroneous -- "cpu_init_udelay=1" is actually a NO-OP, because that matches the compiled-in default. Indeed, any non-zero value bug 1 should work. thanks, -Len
RE: Re[2]: 3.4-rc smpboot: do_boot_cpu failed(-1) to wakeup CPU#1
> I did the revert in linux-stable (last tag being v4.3-rc4) gave a revert > description so it would be applied. > > built and tested. Result: did not help, still missing the second core. Same result here. upstream failed to bring up CPU #1 on 5/5 boots Revert "x86/smpboot: Remove APIC.wait_for_init_deassert and atomic init_deasserted" This reverts commit 656bba306827a44ed73b3f93f75bb3147de17fae. Still fails the same way. Adding "cpu_init_udelay=1" does not help. commence bisect... cheers, -Len -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: Re[4]: 3.4-rc smpboot: do_boot_cpu failed(-1) to wakeup CPU#1
> > > You have similar hardware: > > > > > > Shane: > > > > > > smpboot: CPU0: Intel(R) Core(TM)2 CPU 6400 @ 2.13GHz (fam: > 06, > > model: 0f, stepping: 06) > > > > > > Donald: > > > > > > CPU : Intel Core 2 CPU 6600 @ 2.4GHz > > > > > > I think I can get ahold of a core2 6xxx box tomorrow. > > Intel(R) Core(TM)2 CPU E6800 @ 2.93GHz > > is working for me with latest upstream. > (It is on an Intel D975 XBX motherboard) Good news - I reproduced the failure on a similar box, an Intel D975xbx2: [0.00] Linux version 4.3.0-rc5+ (lenb@z87) (gcc version 4.9.2 20150212 (Red Hat 4.9.2-6) (GCC) ) #375 SMP Thu Oct 15 18:17:04 EDT 2015 ... [0.084000] smpboot: CPU0: Intel(R) Core(TM)2 Quad CPU @ 2.66GHz (family: 0x6, model: 0xf, stepping: 0x7) [0.084000] Performance Events: PEBS fmt0-, 4-deep LBR, Core2 events, Intel PMU driver. [0.084000] perf_event_intel: PEBS disabled due to CPU errata [0.084000] ... version:2 [0.084000] ... bit width: 40 [0.084000] ... generic registers: 2 [0.084000] ... value mask: 00ff [0.084000] ... max period: 7fff [0.084000] ... fixed-purpose events: 3 [0.084000] ... event mask: 00070003 [0.084000] x86: Booting SMP configuration: [0.084000] node #0, CPUs: #1 [ 10.080003] smpboot: do_boot_cpu failed(-1) to wakeup CPU#1 [ 10.080175] NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter. [ 10.080334] #2 #3 [ 10.084017] x86: Booted up 1 node, 3 CPUs [ 10.084120] smpboot: Total of 3 processors activated (16001.58 BogoMIPS)
RE: Re[4]: 3.4-rc smpboot: do_boot_cpu failed(-1) to wakeup CPU#1
> > You have similar hardware: > > > > Shane: > > > > smpboot: CPU0: Intel(R) Core(TM)2 CPU 6400 @ 2.13GHz (fam: 06, > model: 0f, stepping: 06) > > > > Donald: > > > > CPU : Intel Core 2 CPU 6600 @ 2.4GHz > > > > I think I can get ahold of a core2 6xxx box tomorrow. Intel(R) Core(TM)2 CPU E6800 @ 2.93GHz is working for me with latest upstream. (It is on an Intel D975 XBX motherboard) please send me the .config you are using that fails and I'll try that. thanks, -Len N�r��yb�X��ǧv�^�){.n�+{zX����ܨ}���Ơz�:+v���zZ+��+zf���h���~i���z��w���?�&�)ߢf��^jǫy�m��@A�a��� 0��h���i
RE: Re[4]: 3.4-rc smpboot: do_boot_cpu failed(-1) to wakeup CPU#1
> > > You have similar hardware: > > > > > > Shane: > > > > > > smpboot: CPU0: Intel(R) Core(TM)2 CPU 6400 @ 2.13GHz (fam: > 06, > > model: 0f, stepping: 06) > > > > > > Donald: > > > > > > CPU : Intel Core 2 CPU 6600 @ 2.4GHz > > > > > > I think I can get ahold of a core2 6xxx box tomorrow. > > Intel(R) Core(TM)2 CPU E6800 @ 2.93GHz > > is working for me with latest upstream. > (It is on an Intel D975 XBX motherboard) Good news - I reproduced the failure on a similar box, an Intel D975xbx2: [0.00] Linux version 4.3.0-rc5+ (lenb@z87) (gcc version 4.9.2 20150212 (Red Hat 4.9.2-6) (GCC) ) #375 SMP Thu Oct 15 18:17:04 EDT 2015 ... [0.084000] smpboot: CPU0: Intel(R) Core(TM)2 Quad CPU @ 2.66GHz (family: 0x6, model: 0xf, stepping: 0x7) [0.084000] Performance Events: PEBS fmt0-, 4-deep LBR, Core2 events, Intel PMU driver. [0.084000] perf_event_intel: PEBS disabled due to CPU errata [0.084000] ... version:2 [0.084000] ... bit width: 40 [0.084000] ... generic registers: 2 [0.084000] ... value mask: 00ff [0.084000] ... max period: 7fff [0.084000] ... fixed-purpose events: 3 [0.084000] ... event mask: 00070003 [0.084000] x86: Booting SMP configuration: [0.084000] node #0, CPUs: #1 [ 10.080003] smpboot: do_boot_cpu failed(-1) to wakeup CPU#1 [ 10.080175] NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter. [ 10.080334] #2 #3 [ 10.084017] x86: Booted up 1 node, 3 CPUs [ 10.084120] smpboot: Total of 3 processors activated (16001.58 BogoMIPS)
RE: Re[6]: 3.4-rc smpboot: do_boot_cpu failed(-1) to wakeup CPU#1
try booting upstream with "cpu_init_udelay=1". If it works, then it actually implicates this commit: a9bcaa02a5104ace6a9d9e4a9cd9192a9e7744d6 ("x86/smpboot: Remove SIPI delays from cpu_up()") Unfortunately the commit message for that on is erroneous -- "cpu_init_udelay=1" is actually a NO-OP, because that matches the compiled-in default. Indeed, any non-zero value bug 1 should work. thanks, -Len
RE: Re[4]: 3.4-rc smpboot: do_boot_cpu failed(-1) to wakeup CPU#1
> > You have similar hardware: > > > > Shane: > > > > smpboot: CPU0: Intel(R) Core(TM)2 CPU 6400 @ 2.13GHz (fam: 06, > model: 0f, stepping: 06) > > > > Donald: > > > > CPU : Intel Core 2 CPU 6600 @ 2.4GHz > > > > I think I can get ahold of a core2 6xxx box tomorrow. Intel(R) Core(TM)2 CPU E6800 @ 2.93GHz is working for me with latest upstream. (It is on an Intel D975 XBX motherboard) please send me the .config you are using that fails and I'll try that. thanks, -Len N�r��yb�X��ǧv�^�){.n�+{zX����ܨ}���Ơz�:+v���zZ+��+zf���h���~i���z��w���?�&�)ߢf��^jǫy�m��@A�a��� 0��h���i
RE: Re[2]: 3.4-rc smpboot: do_boot_cpu failed(-1) to wakeup CPU#1
> I did the revert in linux-stable (last tag being v4.3-rc4) gave a revert > description so it would be applied. > > built and tested. Result: did not help, still missing the second core. Same result here. upstream failed to bring up CPU #1 on 5/5 boots Revert "x86/smpboot: Remove APIC.wait_for_init_deassert and atomic init_deasserted" This reverts commit 656bba306827a44ed73b3f93f75bb3147de17fae. Still fails the same way. Adding "cpu_init_udelay=1" does not help. commence bisect... cheers, -Len -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: Re[2]: 3.4-rc smpboot: do_boot_cpu failed(-1) to wakeup CPU#1
> > > Did you try reverting the "x86/smpboot: Remove > APIC.wait_for_init_deassert > > > and atomic init_deasserted" patch? > > > > Yes, please let me know if reverting that patch helps you too. > > How? Please send a patch or git cmd(s).I have the > git/stable/linux-stable.git on my PC. Thanks. git log calls it this: commit 656bba306827a44ed73b3f93f75bb3147de17fae Author: Len Brown Date: Sun Aug 16 11:45:48 2015 -0400 x86/smpboot: Remove APIC.wait_for_init_deassert and atomic init_deasserted So you want to simply do this: $ git revert 656bba306827a44ed73b3f93f75bb3147de17fae build and test. cheers, -Len -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: Re[2]: 3.4-rc smpboot: do_boot_cpu failed(-1) to wakeup CPU#1
Donald, Shane, Thanks for reporting this. > > I also tried suggested /vmlinuz-4.3.0-rc3 parameter in grub: > > "cpu_init_udelay=1" > > which did not help getting missing CPU back online. right, if the issue is caused by the patch below, that cmdline will not help. > Did you try reverting the "x86/smpboot: Remove APIC.wait_for_init_deassert > and atomic init_deasserted" patch? Yes, please let me know if reverting that patch helps you too. You have similar hardware: Shane: smpboot: CPU0: Intel(R) Core(TM)2 CPU 6400 @ 2.13GHz (fam: 06, model: 0f, stepping: 06) Donald: CPU : Intel Core 2 CPU 6600 @ 2.4GHz I think I can get ahold of a core2 6xxx box tomorrow. Please send me your .config directly, and I'll see if I can reproduce the issue. thanks, -Len N�r��yb�X��ǧv�^�){.n�+{zX����ܨ}���Ơz�:+v���zZ+��+zf���h���~i���z��w���?�&�)ߢf��^jǫy�m��@A�a��� 0��h���i
RE: Re[2]: 3.4-rc smpboot: do_boot_cpu failed(-1) to wakeup CPU#1
Donald, Shane, Thanks for reporting this. > > I also tried suggested /vmlinuz-4.3.0-rc3 parameter in grub: > > "cpu_init_udelay=1" > > which did not help getting missing CPU back online. right, if the issue is caused by the patch below, that cmdline will not help. > Did you try reverting the "x86/smpboot: Remove APIC.wait_for_init_deassert > and atomic init_deasserted" patch? Yes, please let me know if reverting that patch helps you too. You have similar hardware: Shane: smpboot: CPU0: Intel(R) Core(TM)2 CPU 6400 @ 2.13GHz (fam: 06, model: 0f, stepping: 06) Donald: CPU : Intel Core 2 CPU 6600 @ 2.4GHz I think I can get ahold of a core2 6xxx box tomorrow. Please send me your .config directly, and I'll see if I can reproduce the issue. thanks, -Len N�r��yb�X��ǧv�^�){.n�+{zX����ܨ}���Ơz�:+v���zZ+��+zf���h���~i���z��w���?�&�)ߢf��^jǫy�m��@A�a��� 0��h���i
RE: Re[2]: 3.4-rc smpboot: do_boot_cpu failed(-1) to wakeup CPU#1
> > > Did you try reverting the "x86/smpboot: Remove > APIC.wait_for_init_deassert > > > and atomic init_deasserted" patch? > > > > Yes, please let me know if reverting that patch helps you too. > > How? Please send a patch or git cmd(s).I have the > git/stable/linux-stable.git on my PC. Thanks. git log calls it this: commit 656bba306827a44ed73b3f93f75bb3147de17fae Author: Len BrownDate: Sun Aug 16 11:45:48 2015 -0400 x86/smpboot: Remove APIC.wait_for_init_deassert and atomic init_deasserted So you want to simply do this: $ git revert 656bba306827a44ed73b3f93f75bb3147de17fae build and test. cheers, -Len -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [PATCH] power: print function name of callbacks
have you used analyze_suspend? It used to parse this output, but that was abandoned when it cut over to using ftrace directly. https://01.org/suspendresume cheers, -Len > -Original Message- > From: Douglas Anderson [mailto:diand...@chromium.org] > Sent: Tuesday, September 22, 2015 1:27 PM > To: r...@rjwysocki.net > Cc: Dmitry Torokhov; Douglas Anderson; pa...@ucw.cz; Brown, Len; > gre...@linuxfoundation.org; linux...@vger.kernel.org; linux- > ker...@vger.kernel.org > Subject: [PATCH] power: print function name of callbacks > > The printouts writen to the logs by suspend can be a bit opaque: it can > be hard to track them down to the actual function called. You might > see: > > calling rfkill1+ @ 19473, parent: phy0 > call rfkill1+ returned 0 after 1 usecs > calling phy0+ @ 19473, parent: mmc2:0001:1 > call phy0+ returned 0 after 19 usecs > > It's a bit hard to know what's actually happening. Instead, it's nice > to see: > > calling rfkill1+ @ 15793, parent: phy0, cb: rfkill_suspend > call rfkill1+ returned 0 after 1 usecs > calling phy0+ @ 15793, parent: mmc2:0001:1, cb: wiphy_suspend > [cfg80211] > call phy0+ returned 0 after 7 usecs > > That makes it very obvious what's going on. It also has the nice side > effect of making the suspend/resume spew a little more obvious, since > many resume functions have the word "resume" in the name: > > calling phy0+ @ 15793, parent: mmc2:0001:1, cb: wiphy_resume [cfg80211] > call phy0+ returned 0 after 12 usecs > calling rfkill1+ @ 15793, parent: phy0, cb: rfkill_resume > call rfkill1+ returned 0 after 1 usecs > > Signed-off-by: Douglas Anderson > --- > drivers/base/power/main.c | 10 +- > 1 file changed, 5 insertions(+), 5 deletions(-) > > diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c > index 1710c26..e2b1f14 100644 > --- a/drivers/base/power/main.c > +++ b/drivers/base/power/main.c > @@ -188,14 +188,14 @@ void device_pm_move_last(struct device *dev) > list_move_tail(>power.entry, _list); > } > > -static ktime_t initcall_debug_start(struct device *dev) > +static ktime_t initcall_debug_start(struct device *dev, void *cb) > { > ktime_t calltime = ktime_set(0, 0); > > if (pm_print_times_enabled) { > - pr_info("calling %s+ @ %i, parent: %s\n", > + pr_info("calling %s+ @ %i, parent: %s, cb: %pf\n", > dev_name(dev), task_pid_nr(current), > - dev->parent ? dev_name(dev->parent) : "none"); > + dev->parent ? dev_name(dev->parent) : "none", cb); > calltime = ktime_get(); > } > > @@ -382,7 +382,7 @@ static int dpm_run_callback(pm_callback_t cb, struct > device *dev, > if (!cb) > return 0; > > - calltime = initcall_debug_start(dev); > + calltime = initcall_debug_start(dev, cb); > > pm_dev_dbg(dev, state, info); > trace_device_pm_callback_start(dev, info, state.event); > @@ -1324,7 +1324,7 @@ static int legacy_suspend(struct device *dev, > pm_message_t state, > int error; > ktime_t calltime; > > - calltime = initcall_debug_start(dev); > + calltime = initcall_debug_start(dev, cb); > > trace_device_pm_callback_start(dev, info, state.event); > error = cb(dev, state); > -- > 2.6.0.rc0.131.gf624c3d -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [PATCH] power: print function name of callbacks
have you used analyze_suspend? It used to parse this output, but that was abandoned when it cut over to using ftrace directly. https://01.org/suspendresume cheers, -Len > -Original Message- > From: Douglas Anderson [mailto:diand...@chromium.org] > Sent: Tuesday, September 22, 2015 1:27 PM > To: r...@rjwysocki.net > Cc: Dmitry Torokhov; Douglas Anderson; pa...@ucw.cz; Brown, Len; > gre...@linuxfoundation.org; linux...@vger.kernel.org; linux- > ker...@vger.kernel.org > Subject: [PATCH] power: print function name of callbacks > > The printouts writen to the logs by suspend can be a bit opaque: it can > be hard to track them down to the actual function called. You might > see: > > calling rfkill1+ @ 19473, parent: phy0 > call rfkill1+ returned 0 after 1 usecs > calling phy0+ @ 19473, parent: mmc2:0001:1 > call phy0+ returned 0 after 19 usecs > > It's a bit hard to know what's actually happening. Instead, it's nice > to see: > > calling rfkill1+ @ 15793, parent: phy0, cb: rfkill_suspend > call rfkill1+ returned 0 after 1 usecs > calling phy0+ @ 15793, parent: mmc2:0001:1, cb: wiphy_suspend > [cfg80211] > call phy0+ returned 0 after 7 usecs > > That makes it very obvious what's going on. It also has the nice side > effect of making the suspend/resume spew a little more obvious, since > many resume functions have the word "resume" in the name: > > calling phy0+ @ 15793, parent: mmc2:0001:1, cb: wiphy_resume [cfg80211] > call phy0+ returned 0 after 12 usecs > calling rfkill1+ @ 15793, parent: phy0, cb: rfkill_resume > call rfkill1+ returned 0 after 1 usecs > > Signed-off-by: Douglas Anderson <diand...@chromium.org> > --- > drivers/base/power/main.c | 10 +- > 1 file changed, 5 insertions(+), 5 deletions(-) > > diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c > index 1710c26..e2b1f14 100644 > --- a/drivers/base/power/main.c > +++ b/drivers/base/power/main.c > @@ -188,14 +188,14 @@ void device_pm_move_last(struct device *dev) > list_move_tail(>power.entry, _list); > } > > -static ktime_t initcall_debug_start(struct device *dev) > +static ktime_t initcall_debug_start(struct device *dev, void *cb) > { > ktime_t calltime = ktime_set(0, 0); > > if (pm_print_times_enabled) { > - pr_info("calling %s+ @ %i, parent: %s\n", > + pr_info("calling %s+ @ %i, parent: %s, cb: %pf\n", > dev_name(dev), task_pid_nr(current), > - dev->parent ? dev_name(dev->parent) : "none"); > + dev->parent ? dev_name(dev->parent) : "none", cb); > calltime = ktime_get(); > } > > @@ -382,7 +382,7 @@ static int dpm_run_callback(pm_callback_t cb, struct > device *dev, > if (!cb) > return 0; > > - calltime = initcall_debug_start(dev); > + calltime = initcall_debug_start(dev, cb); > > pm_dev_dbg(dev, state, info); > trace_device_pm_callback_start(dev, info, state.event); > @@ -1324,7 +1324,7 @@ static int legacy_suspend(struct device *dev, > pm_message_t state, > int error; > ktime_t calltime; > > - calltime = initcall_debug_start(dev); > + calltime = initcall_debug_start(dev, cb); > > trace_device_pm_callback_start(dev, info, state.event); > error = cb(dev, state); > -- > 2.6.0.rc0.131.gf624c3d -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [PATCH 1/1] suspend: make sync() on suspend-to-RAM optional
> Who does enter suspend to ram multiple times a second? Only android One significant customer for fast suspend/resume is sufficient to justify Linux supporting fast suspend/resume. > Can you run android on mainline kernel? No This is something we need to work together to help fix. thanks, -Len -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [PATCH 1/1] suspend: make sync() on suspend-to-RAM optional
Who does enter suspend to ram multiple times a second? Only android One significant customer for fast suspend/resume is sufficient to justify Linux supporting fast suspend/resume. Can you run android on mainline kernel? No This is something we need to work together to help fix. thanks, -Len -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [PATCH 1/1] suspend: make sync() on suspend-to-RAM optional
> -Original Message- > From: Austin S Hemmelgarn [mailto:ahferro...@gmail.com] > Sent: Wednesday, July 15, 2015 10:07 AM > To: Pavel Machek; Len Brown > Cc: r...@rjwysocki.net; linux...@vger.kernel.org; linux- > ker...@vger.kernel.org; Brown, Len > Subject: Re: [PATCH 1/1] suspend: make sync() on suspend-to-RAM optional > > On 2015-07-15 02:43, Pavel Machek wrote: > > On Tue 2015-07-14 22:24:51, Len Brown wrote: > >> From: Len Brown > >> > >> The Linux kernel suspend path has traditionally invoked sys_sync(). > >> > >> But sys_sync() can be expensive, and some systems do not want > >> to pay the cost of sys_sync() on every suspend. > > > > Have you measured how expensive it can be, and why it is expensive? > How expensive it is can vary widely, but it pretty much boils down to > how much dirty data still needs written out, and how slow the storage it > needs written to is. There's not really much that can be done in the > kernel to change this, and most userspace suspend systems call sync > themselves during the suspend sequence. Right. And now, user-space gets is no longer forced to incur that delay on every suspend if they do not want it. Yes, have measured this under many conditions. The bottom line is that sys_sync() is rarely as fast as 1ms, and is sometimes as slow as hundreds of ms. >> Why do you need CONFIG parameter? So that an OS that doesn't want to change their user-space, can build a kernel that does what they want by default. Originally I had the config parameter remove this code entirely, which would achieve the same goal. But Rafael prefers the sysfs attribute always exist and the config simply set the default. thanks, -Len -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [PATCH 1/1] suspend: make sync() on suspend-to-RAM optional
-Original Message- From: Austin S Hemmelgarn [mailto:ahferro...@gmail.com] Sent: Wednesday, July 15, 2015 10:07 AM To: Pavel Machek; Len Brown Cc: r...@rjwysocki.net; linux...@vger.kernel.org; linux- ker...@vger.kernel.org; Brown, Len Subject: Re: [PATCH 1/1] suspend: make sync() on suspend-to-RAM optional On 2015-07-15 02:43, Pavel Machek wrote: On Tue 2015-07-14 22:24:51, Len Brown wrote: From: Len Brown len.br...@intel.com The Linux kernel suspend path has traditionally invoked sys_sync(). But sys_sync() can be expensive, and some systems do not want to pay the cost of sys_sync() on every suspend. Have you measured how expensive it can be, and why it is expensive? How expensive it is can vary widely, but it pretty much boils down to how much dirty data still needs written out, and how slow the storage it needs written to is. There's not really much that can be done in the kernel to change this, and most userspace suspend systems call sync themselves during the suspend sequence. Right. And now, user-space gets is no longer forced to incur that delay on every suspend if they do not want it. Yes, have measured this under many conditions. The bottom line is that sys_sync() is rarely as fast as 1ms, and is sometimes as slow as hundreds of ms. Why do you need CONFIG parameter? So that an OS that doesn't want to change their user-space, can build a kernel that does what they want by default. Originally I had the config parameter remove this code entirely, which would achieve the same goal. But Rafael prefers the sysfs attribute always exist and the config simply set the default. thanks, -Len -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [PATCH] x86, msr: Allow read access to /dev/cpu/X/msr
> This driver only knows how to handle counters, though. I'm not sure > whether all of the MSRs that turbostat needs are counters. turbostat --debug dumps a lot of configuration MSRs that are not counters. "--debug" is not an obscure option, it is the only way that the turbostat is used by advanced users, since it shows not just how fast a system is running, but explains why. turbostat -M or -C 'address' will dump an arbitrary MSR at offset 'address' in hex or counter format. This allows somebody to use yesterday's turbostat application to dump an MSR that didn't exist when the application was written. (and since the MSR driver doesn't have specific MSR addresses hard-coded into it, that is permitted) > I knew that turbostat only did MSR reads FYI, There have been proposals for turbostat to do MSR writes, and I've resisted them because I like that multiple turbostats can run at different intervals and even different users, and not interfere with each other. One of the proposals was due to a short counter that tends to over-flow. That, I think would be better done in a central place in the kernel, though it shouldn't poll unless it is actually being used. The other is to be able to fix configuration bits that are recognized to be incorrect or non-optimal. BTW. I've had a discussion w/ LLNL about their needs, both for security and performance. For security, as concluded by this thread, a white list is the only way to go. I'm thinking a bit-vector of allowed MSR offsets... For performance, they absolutely can not afford a system call for every single MSR access. Here an ioctl to have the msr driver perform a vector of accesses in a single system call seems the way to go. I can prototype both of these using turbostat as the customer. -Len N�r��yb�X��ǧv�^�){.n�+{zX����ܨ}���Ơz�:+v���zZ+��+zf���h���~i���z��w���?�&�)ߢf��^jǫy�m��@A�a��� 0��h���i
RE: [PATCH] x86, msr: Allow read access to /dev/cpu/X/msr
This driver only knows how to handle counters, though. I'm not sure whether all of the MSRs that turbostat needs are counters. turbostat --debug dumps a lot of configuration MSRs that are not counters. --debug is not an obscure option, it is the only way that the turbostat is used by advanced users, since it shows not just how fast a system is running, but explains why. turbostat -M or -C 'address' will dump an arbitrary MSR at offset 'address' in hex or counter format. This allows somebody to use yesterday's turbostat application to dump an MSR that didn't exist when the application was written. (and since the MSR driver doesn't have specific MSR addresses hard-coded into it, that is permitted) I knew that turbostat only did MSR reads FYI, There have been proposals for turbostat to do MSR writes, and I've resisted them because I like that multiple turbostats can run at different intervals and even different users, and not interfere with each other. One of the proposals was due to a short counter that tends to over-flow. That, I think would be better done in a central place in the kernel, though it shouldn't poll unless it is actually being used. The other is to be able to fix configuration bits that are recognized to be incorrect or non-optimal. BTW. I've had a discussion w/ LLNL about their needs, both for security and performance. For security, as concluded by this thread, a white list is the only way to go. I'm thinking a bit-vector of allowed MSR offsets... For performance, they absolutely can not afford a system call for every single MSR access. Here an ioctl to have the msr driver perform a vector of accesses in a single system call seems the way to go. I can prototype both of these using turbostat as the customer. -Len N�r��yb�X��ǧv�^�){.n�+{zX����ܨ}���Ơz�j:+v���zZ+��+zf���h���~i���z��w���?��)ߢf��^jǫy�m��@A�a��� 0��h���i
RE: [PATCH 2/2] turbostat, add set_base_cpu()
> -Original Message- > From: Prarit Bhargava [mailto:pra...@redhat.com] > Sent: Friday, May 22, 2015 6:30 PM > To: Brown, Len > Cc: linux-kernel@vger.kernel.org; linux...@vger.kernel.org; Semin, Andrey > Subject: Re: [PATCH 2/2] turbostat, add set_base_cpu() > > > > On 05/22/2015 11:55 AM, Brown, Len wrote: > >> +void set_base_cpu(void) > >> +{ > >> + int cpu; > >> + > >> + for (cpu = 0; cpu <= topo.max_cpu_num; ++cpu) { > >> + if (cpu_is_not_present(cpu)) > >> + continue; > >> + base_cpu = cpu; > >> + break; > >> + } > >> + > >> + if (base_cpu == -1) > >> + err(-ENODEV, "No valid cpus found"); > >> +} > > > > > > cpu0 hard-coding is indeed arbitrary. > > However, so is this proposed replacement, base_cpu. > > Either may not match where turbostat is currently running, > > and thus could provoke unnecessary cross-calls to get there. > > > > I think it would be better to ask getcpu(2) where we are already > running, > > and simply use that one. I think we can call it once and cache it, > > as you proposed, rather than multiple system calls. > > Any objection to sched_getcpu()? That way the code is simply > > base_cpu = sched_getcpu(); > > if (base_cpu == -1) > err(-ENODEV, "No valid cpus found"); Agreed, that is better than invoking the syscall directly, as we already are using the sched.h interface in this code. thanks, -Len -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [PATCH 2/2] turbostat, add set_base_cpu()
-Original Message- From: Prarit Bhargava [mailto:pra...@redhat.com] Sent: Friday, May 22, 2015 6:30 PM To: Brown, Len Cc: linux-kernel@vger.kernel.org; linux...@vger.kernel.org; Semin, Andrey Subject: Re: [PATCH 2/2] turbostat, add set_base_cpu() On 05/22/2015 11:55 AM, Brown, Len wrote: +void set_base_cpu(void) +{ + int cpu; + + for (cpu = 0; cpu = topo.max_cpu_num; ++cpu) { + if (cpu_is_not_present(cpu)) + continue; + base_cpu = cpu; + break; + } + + if (base_cpu == -1) + err(-ENODEV, No valid cpus found); +} cpu0 hard-coding is indeed arbitrary. However, so is this proposed replacement, base_cpu. Either may not match where turbostat is currently running, and thus could provoke unnecessary cross-calls to get there. I think it would be better to ask getcpu(2) where we are already running, and simply use that one. I think we can call it once and cache it, as you proposed, rather than multiple system calls. Any objection to sched_getcpu()? That way the code is simply base_cpu = sched_getcpu(); if (base_cpu == -1) err(-ENODEV, No valid cpus found); Agreed, that is better than invoking the syscall directly, as we already are using the sched.h interface in this code. thanks, -Len -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [PATCH 2/2] turbostat, add set_base_cpu()
> +void set_base_cpu(void) > +{ > + int cpu; > + > + for (cpu = 0; cpu <= topo.max_cpu_num; ++cpu) { > + if (cpu_is_not_present(cpu)) > + continue; > + base_cpu = cpu; > + break; > + } > + > + if (base_cpu == -1) > + err(-ENODEV, "No valid cpus found"); > +} cpu0 hard-coding is indeed arbitrary. However, so is this proposed replacement, base_cpu. Either may not match where turbostat is currently running, and thus could provoke unnecessary cross-calls to get there. I think it would be better to ask getcpu(2) where we are already running, and simply use that one. I think we can call it once and cache it, as you proposed, rather than multiple system calls. thanks, -Len ps. patches to turbostat should go to linux...@vger.kernel.org -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [PATCH 2/2] turbostat, add set_base_cpu()
+void set_base_cpu(void) +{ + int cpu; + + for (cpu = 0; cpu = topo.max_cpu_num; ++cpu) { + if (cpu_is_not_present(cpu)) + continue; + base_cpu = cpu; + break; + } + + if (base_cpu == -1) + err(-ENODEV, No valid cpus found); +} cpu0 hard-coding is indeed arbitrary. However, so is this proposed replacement, base_cpu. Either may not match where turbostat is currently running, and thus could provoke unnecessary cross-calls to get there. I think it would be better to ask getcpu(2) where we are already running, and simply use that one. I think we can call it once and cache it, as you proposed, rather than multiple system calls. thanks, -Len ps. patches to turbostat should go to linux...@vger.kernel.org -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [RFC] x86, perf: Add an aperfmperf driver
> I think that turbostat could do some of its work without being > root if we had a driver like this. Note that turbostat can be run as non-root this way: # setcap cap_sys_rawio=ep ./turbostat # chmod +r /dev/cpu/*/msr For the debug case, there are a number of MSRs that turbostat must access, so would still need permission for that case (which is the only case I use:-) > Thoughts? Would it make sense at all? Did I wire it up right? This is > the only PMU driver I've ever written, and it could have any number of > issues. APERF/MPERF, as with all per-thread MSRs, must be accessed from the local processor. I didn't see where this driver distinguishes the CPU. Also, I assume the intent is to return a snapshot, rather than sampling, yes? Note that turbostat binds itself to a remote CPU so that MSR reads are all local, then it binds to the next CPU etc. In the old days, we read everything without this binding, and the kernel overhead of the remote reads was too high, making it difficult to measure profoundly idle systems. cheers, -Len -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [RFC] x86, perf: Add an aperfmperf driver
I think that turbostat could do some of its work without being root if we had a driver like this. Note that turbostat can be run as non-root this way: # setcap cap_sys_rawio=ep ./turbostat # chmod +r /dev/cpu/*/msr For the debug case, there are a number of MSRs that turbostat must access, so would still need permission for that case (which is the only case I use:-) Thoughts? Would it make sense at all? Did I wire it up right? This is the only PMU driver I've ever written, and it could have any number of issues. APERF/MPERF, as with all per-thread MSRs, must be accessed from the local processor. I didn't see where this driver distinguishes the CPU. Also, I assume the intent is to return a snapshot, rather than sampling, yes? Note that turbostat binds itself to a remote CPU so that MSR reads are all local, then it binds to the next CPU etc. In the old days, we read everything without this binding, and the kernel overhead of the remote reads was too high, making it difficult to measure profoundly idle systems. cheers, -Len -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [PATCH 1/1] x86: replace cpu_up hard-coded mdelay with variable
> What's the cutoff for 'modern hardware' - which CPUs stopped requiring > the delay? This is the topic of ongoing research, and I'm not ready to send the patch setting a new default until I've heard back from a few more HW people. Every system I've tested appears to work with delay 0. Were I to guess, I'd venture that every system that runs an X86_64 kernel might count as "modern" -- even the 2005 AMD Turion laptop I've got in the bone pile. > Also, is there any public document where the no-delay is specified for > 'modern hardware'? Unfortunately only the converse is true. There is an ancient document specifying that the delay may be necessary. cheers, -Len -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [PATCH 1/1] x86: replace cpu_up hard-coded mdelay with variable
What's the cutoff for 'modern hardware' - which CPUs stopped requiring the delay? This is the topic of ongoing research, and I'm not ready to send the patch setting a new default until I've heard back from a few more HW people. Every system I've tested appears to work with delay 0. Were I to guess, I'd venture that every system that runs an X86_64 kernel might count as modern -- even the 2005 AMD Turion laptop I've got in the bone pile. Also, is there any public document where the no-delay is specified for 'modern hardware'? Unfortunately only the converse is true. There is an ancient document specifying that the delay may be necessary. cheers, -Len -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: Kernel bug 60770
ISTR that I got interrupted when working on that one and never got back to it. Also, I think I had a puzzling power measurement result -- like I had measured a power benefit, and then i couldn't find a benefit. (No matter, should be a perf benefit even if ISO power) Next week I should have access to a core2 DT and Xeon and can take another look at this. thanks, -Len
RE: Kernel bug 60770
ISTR that I got interrupted when working on that one and never got back to it. Also, I think I had a puzzling power measurement result -- like I had measured a power benefit, and then i couldn't find a benefit. (No matter, should be a perf benefit even if ISO power) Next week I should have access to a core2 DT and Xeon and can take another look at this. thanks, -Len
RE: [tip:x86/urgent] x86 idle: Repair large-server 50-watt idle-power regression
Davidlohr, Thanks for the note. Ideally (on Linux in general, and on servers, in particular) we strive for the performance impact of power saving features to be small enough to be considered in "measurement noise". Your report for 160 core Westmere AIM numbers being hit at 10-25% shows 15% measurement noise? But even if true, this looks bad. Any chance you can re-run, with the following two tweaks, one at a time? I'd be curious if you can wrap the invocation in turbostat -v and capture that output to how what states we are seeing during the benchmark run. thanks, -Len #1: skip flush for C1 diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c index f80b700..6027d06 100644 --- a/drivers/idle/intel_idle.c +++ b/drivers/idle/intel_idle.c @@ -377,7 +377,7 @@ static int intel_idle(struct cpuidle_device *dev, if (!current_set_polling_and_test()) { - if (this_cpu_has(X86_FEATURE_CLFLUSH_MONITOR)) + if ((eax > 0) && this_cpu_has(X86_FEATURE_CLFLUSH_MONITOR)) clflush((void *)_thread_info()->flags); __monitor((void *)_thread_info()->flags, 0, 0); #2: skip flush for C1 and C1E diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c index f80b700..6027d06 100644 --- a/drivers/idle/intel_idle.c +++ b/drivers/idle/intel_idle.c @@ -377,7 +377,7 @@ static int intel_idle(struct cpuidle_device *dev, if (!current_set_polling_and_test()) { - if (this_cpu_has(X86_FEATURE_CLFLUSH_MONITOR)) + if ((eax > 1) && this_cpu_has(X86_FEATURE_CLFLUSH_MONITOR)) clflush((void *)_thread_info()->flags); __monitor((void *)_thread_info()->flags, 0, 0);
RE: [tip:x86/urgent] x86 idle: Repair large-server 50-watt idle-power regression
Davidlohr, Thanks for the note. Ideally (on Linux in general, and on servers, in particular) we strive for the performance impact of power saving features to be small enough to be considered in measurement noise. Your report for 160 core Westmere AIM numbers being hit at 10-25% shows 15% measurement noise? But even if true, this looks bad. Any chance you can re-run, with the following two tweaks, one at a time? I'd be curious if you can wrap the invocation in turbostat -v and capture that output to how what states we are seeing during the benchmark run. thanks, -Len #1: skip flush for C1 diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c index f80b700..6027d06 100644 --- a/drivers/idle/intel_idle.c +++ b/drivers/idle/intel_idle.c @@ -377,7 +377,7 @@ static int intel_idle(struct cpuidle_device *dev, if (!current_set_polling_and_test()) { - if (this_cpu_has(X86_FEATURE_CLFLUSH_MONITOR)) + if ((eax 0) this_cpu_has(X86_FEATURE_CLFLUSH_MONITOR)) clflush((void *)current_thread_info()-flags); __monitor((void *)current_thread_info()-flags, 0, 0); #2: skip flush for C1 and C1E diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c index f80b700..6027d06 100644 --- a/drivers/idle/intel_idle.c +++ b/drivers/idle/intel_idle.c @@ -377,7 +377,7 @@ static int intel_idle(struct cpuidle_device *dev, if (!current_set_polling_and_test()) { - if (this_cpu_has(X86_FEATURE_CLFLUSH_MONITOR)) + if ((eax 1) this_cpu_has(X86_FEATURE_CLFLUSH_MONITOR)) clflush((void *)current_thread_info()-flags); __monitor((void *)current_thread_info()-flags, 0, 0);
RE: Regression in intel_idle on Avaton/Rangely Mohon Peak board
> [0.840668] intel_idle: lapic_timer_reliable_states 0x2 vs. > [0.877528] intel_idle: lapic_timer_reliable_states 0x This means CPUID.ARAT is set for the new board, and not set for the old board. You can observe that also in /proc/cpuinfo flags where you will likely also find a visible stepping difference between these two boards. Bring-up Avoton had ARAT disabled, and also had broken deep C-states. It looks like that is what your old board has. Throw it away and use only production steppings. If you are stuck w/ pre-production hardware, then you need to manually modify upstream Linux to make it happy -- since upstream Linux only cares about production hardware. thanks, -Len -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: Regression in intel_idle on Avaton/Rangely Mohon Peak board
> > I'd be interested in the acpi_idle output above for both the > > new and old boards to see if they are exporting different states > > on the two boards. > > Could be ; I can probably get access to the newer one again too, if > that will be useful. yes, please. > > > > dmidecode isn't useful in this case. The CPUID in /proc/cpuinfo > > may be useful if the problem turns out to be associated with > > some stepping. > > The dmidecode info I'd posted indicated that the steppings were > unnchanged. I can get the /proc/cpuinfo tomorrow, but I figured > the dmidecode stepping info was accurate. Is it not reliable? DMI is useful to get the BIOS version string, not much else. Here you want the output from the CPUID instruction, which is displayed as family/model/stepping in /proc/cpuinfo thanks, -Len -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: Regression in intel_idle on Avaton/Rangely Mohon Peak board
I'd be interested in the acpi_idle output above for both the new and old boards to see if they are exporting different states on the two boards. Could be ; I can probably get access to the newer one again too, if that will be useful. yes, please. dmidecode isn't useful in this case. The CPUID in /proc/cpuinfo may be useful if the problem turns out to be associated with some stepping. The dmidecode info I'd posted indicated that the steppings were unnchanged. I can get the /proc/cpuinfo tomorrow, but I figured the dmidecode stepping info was accurate. Is it not reliable? DMI is useful to get the BIOS version string, not much else. Here you want the output from the CPUID instruction, which is displayed as family/model/stepping in /proc/cpuinfo thanks, -Len -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: Regression in intel_idle on Avaton/Rangely Mohon Peak board
[0.840668] intel_idle: lapic_timer_reliable_states 0x2 vs. [0.877528] intel_idle: lapic_timer_reliable_states 0x This means CPUID.ARAT is set for the new board, and not set for the old board. You can observe that also in /proc/cpuinfo flags where you will likely also find a visible stepping difference between these two boards. Bring-up Avoton had ARAT disabled, and also had broken deep C-states. It looks like that is what your old board has. Throw it away and use only production steppings. If you are stuck w/ pre-production hardware, then you need to manually modify upstream Linux to make it happy -- since upstream Linux only cares about production hardware. thanks, -Len -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: Regression in intel_idle on Avaton/Rangely Mohon Peak board
> I've got an eval board with a 1.7GHz Avaton/C2000 that hangs at boot > shortly after the idle driver registration -- typically 1/2 dozen > dmesg lines later, around rtc init, or net stack init. Paul, Please boot the failing board with "intel_idle.max_cstate=0" to disable intel_idle entirely, and then show the C-states exported by acpi_idle, that predumably, are stable on both boards: dmesg | grep idle grep . /sys/devices/system/cpu/cpu0/cpuidle/*/* Then go back and boot with "intel_idle.max_cstate=N" where N is incremented by 1 until when the system fails and note the largest N that still works. > The interesting part is that a nearly identical board, but with > different (newer/faster) CPU and newer BIOS doesn't have the hang. Possibly an electrical bug in the earlier board. Maybe they worked around it by disabling a C-state in ACPI and didn't test upstream Linux? I'd be interested in the acpi_idle output above for both the new and old boards to see if they are exporting different states on the two boards. dmidecode isn't useful in this case. The CPUID in /proc/cpuinfo may be useful if the problem turns out to be associated with some stepping. thanks, -Len -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: Regression in intel_idle on Avaton/Rangely Mohon Peak board
I've got an eval board with a 1.7GHz Avaton/C2000 that hangs at boot shortly after the idle driver registration -- typically 1/2 dozen dmesg lines later, around rtc init, or net stack init. Paul, Please boot the failing board with intel_idle.max_cstate=0 to disable intel_idle entirely, and then show the C-states exported by acpi_idle, that predumably, are stable on both boards: dmesg | grep idle grep . /sys/devices/system/cpu/cpu0/cpuidle/*/* Then go back and boot with intel_idle.max_cstate=N where N is incremented by 1 until when the system fails and note the largest N that still works. The interesting part is that a nearly identical board, but with different (newer/faster) CPU and newer BIOS doesn't have the hang. Possibly an electrical bug in the earlier board. Maybe they worked around it by disabling a C-state in ACPI and didn't test upstream Linux? I'd be interested in the acpi_idle output above for both the new and old boards to see if they are exporting different states on the two boards. dmidecode isn't useful in this case. The CPUID in /proc/cpuinfo may be useful if the problem turns out to be associated with some stepping. thanks, -Len -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [RFC PATCH 3/3] idle: store the idle state index in the struct rq
> And your point is? It is a bad idea for an individual CPU to track the C-state of another CPU, which can change the cycle after it was checked. We know it is a bad idea because we used to do it, until we realized code here can easily impact the performance critical path. In general, it is the OS's job to communicate constraints to the HW, and the HW to act on them. Not all HW is smart, so sometimes the OS has to do more hand-holding -- but less is more. thanks, -Len -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [RFC PATCH 3/3] idle: store the idle state index in the struct rq
And your point is? It is a bad idea for an individual CPU to track the C-state of another CPU, which can change the cycle after it was checked. We know it is a bad idea because we used to do it, until we realized code here can easily impact the performance critical path. In general, it is the OS's job to communicate constraints to the HW, and the HW to act on them. Not all HW is smart, so sometimes the OS has to do more hand-holding -- but less is more. thanks, -Len -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [RFC PATCH 3/3] idle: store the idle state index in the struct rq
> Right now (on ARM at least but I imagine this is pretty universal), the > biggest impact on information accuracy for a CPU depends on what the > other CPUs are doing. The most obvious example is cluster power down. > For a cluster to be powered down, all the CPUs sharing this cluster must > also be powered down. And all those CPUs must have agreed to a possible > cluster power down in advance as well. But it is not because an idle > CPU has agreed to the extra latency imposed by a cluster power down that > the cluster has actually powered down since another CPU in that cluster > might still be running, in which case the recorded latency information > for that idle CPU would be higher than it would be in practice at that > moment. That will not work. When a CPU goes idle, it uses the CURRENT criteria for entering that state. If the criteria change after it has entered the state, are you going to wake it up so it can re-evaluate? No. That is why the state must describe the worst case latency that CPU may see when waking from the state on THAT entry. That is why we use the package C-state numbers to describe core C-states on IA. -Len -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [RFC PATCH 3/3] idle: store the idle state index in the struct rq
Right now (on ARM at least but I imagine this is pretty universal), the biggest impact on information accuracy for a CPU depends on what the other CPUs are doing. The most obvious example is cluster power down. For a cluster to be powered down, all the CPUs sharing this cluster must also be powered down. And all those CPUs must have agreed to a possible cluster power down in advance as well. But it is not because an idle CPU has agreed to the extra latency imposed by a cluster power down that the cluster has actually powered down since another CPU in that cluster might still be running, in which case the recorded latency information for that idle CPU would be higher than it would be in practice at that moment. That will not work. When a CPU goes idle, it uses the CURRENT criteria for entering that state. If the criteria change after it has entered the state, are you going to wake it up so it can re-evaluate? No. That is why the state must describe the worst case latency that CPU may see when waking from the state on THAT entry. That is why we use the package C-state numbers to describe core C-states on IA. -Len -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [PATCH 1/1] suspend: make sync() on suspend-to-RAM optional
> How about naming it as PM_SLEEP_FS_SYNC (and similarly in the sysfs > files > and variable names as well). Just to avoid confusion with > "synchronous/async". good point -- thanks! -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [PATCH 1/1] suspend: make sync() on suspend-to-RAM optional
> > + depends on PM_SLEEP > > this is actually a suspend specific feature, and it should depends on > SUSPEND instead? yup, will update. thanks, -Len
RE: [PATCH 1/1] suspend: make sync() on suspend-to-RAM optional
+ depends on PM_SLEEP this is actually a suspend specific feature, and it should depends on SUSPEND instead? yup, will update. thanks, -Len
RE: [PATCH 1/1] suspend: make sync() on suspend-to-RAM optional
How about naming it as PM_SLEEP_FS_SYNC (and similarly in the sysfs files and variable names as well). Just to avoid confusion with synchronous/async. good point -- thanks! -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [PATCH 1/1] x86, acpi: Fix wrong checking condition in acpi_register_lapic().
Reviewed-by: Len Brown -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [PATCH 1/1] x86, acpi: Fix wrong checking condition in acpi_register_lapic().
Reviewed-by: Len Brown len.br...@intel.com -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [GIT PULL] Some error injection fixes to queue for 3.11
> >> Pulled, thanks Tony! > >> > >> Len, are you fine with this route [tip:x86/ras tree] for the > >> drivers/acpi/apei/einj.c changes? > > > > Yes, the RAS guys basically own that code. > > These patches also got picked up by Rafael and are in his ACPI tree > too. I think the patches were applied identically, so there should not > be any merge conflicts when this all comes back together in the 3.11 > merge window. > > Rafael already had a chat about who will take future apei changes > so that we won't have this happen again. I think the only real problem with being in two places at once is if the patches _change_. When that happens, you need to get them _both_ changed, else merge conflict happens at the worst time. -L -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [GIT PULL] Some error injection fixes to queue for 3.11
+rafael > Pulled, thanks Tony! > > Len, are you fine with this route [tip:x86/ras tree] for the > drivers/acpi/apei/einj.c changes? Yes, the RAS guys basically own that code. thanks, -Len -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [GIT PULL] Some error injection fixes to queue for 3.11
+rafael Pulled, thanks Tony! Len, are you fine with this route [tip:x86/ras tree] for the drivers/acpi/apei/einj.c changes? Yes, the RAS guys basically own that code. thanks, -Len -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [GIT PULL] Some error injection fixes to queue for 3.11
Pulled, thanks Tony! Len, are you fine with this route [tip:x86/ras tree] for the drivers/acpi/apei/einj.c changes? Yes, the RAS guys basically own that code. These patches also got picked up by Rafael and are in his ACPI tree too. I think the patches were applied identically, so there should not be any merge conflicts when this all comes back together in the 3.11 merge window. Rafael already had a chat about who will take future apei changes so that we won't have this happen again. I think the only real problem with being in two places at once is if the patches _change_. When that happens, you need to get them _both_ changed, else merge conflict happens at the worst time. -L -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [PATCH 09/16] ia64 idle: delete pm_idle
> > - idle = pm_idle; > > if (!idle) > > Hm, if I'm not mistaken idle will uninitialized at this point, so this > could quite likely lead to a crash. thanks, will fix. -Len -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [PATCH 09/16] ia64 idle: delete pm_idle
- idle = pm_idle; if (!idle) Hm, if I'm not mistaken idle will uninitialized at this point, so this could quite likely lead to a crash. thanks, will fix. -Len -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [patch] Add IVB model 3e support to /drivers/idle/intel_idle.c
> diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c index > e872617..b0f6b4c 100644 > --- a/drivers/idle/intel_idle.c > +++ b/drivers/idle/intel_idle.c > @@ -413,6 +413,7 @@ static const struct x86_cpu_id intel_idle_ids[] = { > ICPU(0x2a, idle_cpu_snb), > ICPU(0x2d, idle_cpu_snb), > ICPU(0x3a, idle_cpu_ivb), > + ICPU(0x3e, idle_cpu_ivb), > {} already in Linux-3.7 thanks, -Len -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [patch] Add IVB model 3e support to /drivers/idle/intel_idle.c
diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c index e872617..b0f6b4c 100644 --- a/drivers/idle/intel_idle.c +++ b/drivers/idle/intel_idle.c @@ -413,6 +413,7 @@ static const struct x86_cpu_id intel_idle_ids[] = { ICPU(0x2a, idle_cpu_snb), ICPU(0x2d, idle_cpu_snb), ICPU(0x3a, idle_cpu_ivb), + ICPU(0x3e, idle_cpu_ivb), {} already in Linux-3.7 thanks, -Len -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: 2.6.25-rc2-mm1: WARNING at arch/x86/mm/ioremap.c:129
>> evxfevnt-0091 [00] enable: Transition to >ACPI mode successful >> khelper used greatest stack depth: 3144 bytes left >> net_namespace: 304 bytes >> NET: Registered protocol family 16 >> ACPI: bus type pci registered >> khelper used greatest stack depth: 3032 bytes left >> PCI: PCI BIOS revision 2.10 entry at 0xf1180, last bus=1 >> PCI: Using configuration type 1 >> Setting up standard PCI resources >> evgpeblk-0956 [00] ev_create_gpe_block : GPE 00 to 0F >[_GPE] 2 regs on int 0x9 >> evgpeblk-1052 [00] ev_initialize_gpe_bloc: Found 4 Wake, >Enabled 0 Runtime GPEs in this block >> ACPI: EC: Look up EC in DSDT >> [ cut here ] >> WARNING: at arch/x86/mm/ioremap.c:129 __ioremap+0xc7/0x182() >> Modules linked in: >> Pid: 1, comm: swapper Not tainted 2.6.25-rc2-mm1 #40 >> [] warn_on_slowpath+0x41/0x6d >> [] ? trace_hardirqs_on+0xb/0xd >> [] ? acpi_os_release_object+0x8/0xc >> [] ? acpi_ut_delete_object_desc+0x39/0x3e >> [] ? acpi_ut_delete_internal_obj+0x2c1/0x2c9 >> [] ? trace_hardirqs_on+0xb/0xd >> [] ? trace_hardirqs_on_caller+0xdf/0x100 >> [] ? trace_hardirqs_on+0xb/0xd >> [] __ioremap+0xc7/0x182 >> [] ioremap_nocache+0xa/0xc >> [] acpi_os_map_memory+0x11/0x1a >> [] acpi_ex_system_memory_space_handler+0xd3/0x228 >> [] ? acpi_ev_address_space_dispatch+0x142/0x1a8 >> [] ? acpi_ex_system_memory_space_handler+0x0/0x228 >> [] acpi_ev_address_space_dispatch+0x167/0x1a8 >> [] acpi_ex_access_region+0x1e4/0x270 >> [] acpi_ex_field_datum_io+0x153/0x2a1 >> [] ? cache_alloc_debugcheck_after+0xe9/0x165 >> [] acpi_ex_extract_from_field+0x91/0x224 >> [] ? acpi_ex_read_data_from_field+0x163/0x1b0 >> [] acpi_ex_read_data_from_field+0x180/0x1b0 >> [] acpi_ex_resolve_node_to_value+0x1aa/0x230 >> [] acpi_ex_resolve_to_value+0x270/0x2aa >> [] acpi_ex_resolve_operands+0x24e/0x52f > >Len: This WARN_ON says that ACPI is trying to call ioremap() >on memory that the e820_table >lists as "kernel owned". Do you know why ACPI would do this? >Would ACPI get upset if >the kernel would tell it to take a hike? Depends on the BIOS -- as it is the BIOS AML that is making this request. -Len -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: 2.6.25-rc2-mm1: WARNING at arch/x86/mm/ioremap.c:129
evxfevnt-0091 [00] enable: Transition to ACPI mode successful khelper used greatest stack depth: 3144 bytes left net_namespace: 304 bytes NET: Registered protocol family 16 ACPI: bus type pci registered khelper used greatest stack depth: 3032 bytes left PCI: PCI BIOS revision 2.10 entry at 0xf1180, last bus=1 PCI: Using configuration type 1 Setting up standard PCI resources evgpeblk-0956 [00] ev_create_gpe_block : GPE 00 to 0F [_GPE] 2 regs on int 0x9 evgpeblk-1052 [00] ev_initialize_gpe_bloc: Found 4 Wake, Enabled 0 Runtime GPEs in this block ACPI: EC: Look up EC in DSDT [ cut here ] WARNING: at arch/x86/mm/ioremap.c:129 __ioremap+0xc7/0x182() Modules linked in: Pid: 1, comm: swapper Not tainted 2.6.25-rc2-mm1 #40 [c0118955] warn_on_slowpath+0x41/0x6d [c0131d0a] ? trace_hardirqs_on+0xb/0xd [c01fdd89] ? acpi_os_release_object+0x8/0xc [c02188d4] ? acpi_ut_delete_object_desc+0x39/0x3e [c0217f05] ? acpi_ut_delete_internal_obj+0x2c1/0x2c9 [c0131d0a] ? trace_hardirqs_on+0xb/0xd [c0131cde] ? trace_hardirqs_on_caller+0xdf/0x100 [c0131d0a] ? trace_hardirqs_on+0xb/0xd [c01129c5] __ioremap+0xc7/0x182 [c0112a99] ioremap_nocache+0xa/0xc [c02a6877] acpi_os_map_memory+0x11/0x1a [c020b697] acpi_ex_system_memory_space_handler+0xd3/0x228 [c0203bd8] ? acpi_ev_address_space_dispatch+0x142/0x1a8 [c020b5c4] ? acpi_ex_system_memory_space_handler+0x0/0x228 [c0203bfd] acpi_ev_address_space_dispatch+0x167/0x1a8 [c02083dd] acpi_ex_access_region+0x1e4/0x270 [c02085bc] acpi_ex_field_datum_io+0x153/0x2a1 [c0158ac0] ? cache_alloc_debugcheck_after+0xe9/0x165 [c020879b] acpi_ex_extract_from_field+0x91/0x224 [c0206bcf] ? acpi_ex_read_data_from_field+0x163/0x1b0 [c0206bec] acpi_ex_read_data_from_field+0x180/0x1b0 [c020d256] acpi_ex_resolve_node_to_value+0x1aa/0x230 [c0207a32] acpi_ex_resolve_to_value+0x270/0x2aa [c0209e47] acpi_ex_resolve_operands+0x24e/0x52f Len: This WARN_ON says that ACPI is trying to call ioremap() on memory that the e820_table lists as kernel owned. Do you know why ACPI would do this? Would ACPI get upset if the kernel would tell it to take a hike? Depends on the BIOS -- as it is the BIOS AML that is making this request. -Len -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [PATCH] reverse CONFIG_ACPI_PROC_EVENT default
Thanks Hugh. -Len >-Original Message- >From: Hugh Dickins [mailto:[EMAIL PROTECTED] >Sent: Monday, August 27, 2007 11:05 AM >To: Linus Torvalds >Cc: Andrew Morton; Brown, Len; linux-kernel@vger.kernel.org >Subject: [PATCH] reverse CONFIG_ACPI_PROC_EVENT default > >Sigh. Again an ACPI assault on the Thinkpad's Fn+F4 to suspend to RAM. >The default and text for CONFIG_THINKPAD_ACPI_INPUT_ENABLED were fixed >in -rc3, but now 14e04fb34ffa82ee61ae69f98d8fca12d2e8e31c introduces >CONFIG_ACPI_PROC_EVENT default n to disable it again. Change default >to y, and add comment to make it clearer that n is for future distros. > >Signed-off-by: Hugh Dickins <[EMAIL PROTECTED]> > >--- 2.6.23-rc3-git10/drivers/acpi/Kconfig 2007-08-26 >18:10:15.0 +0100 >+++ linux/drivers/acpi/Kconfig 2007-08-26 18:59:16.0 +0100 >@@ -71,6 +71,7 @@ config ACPI_PROCFS > config ACPI_PROC_EVENT > bool "Deprecated /proc/acpi/event support" > depends on PROC_FS >+ default y > ---help--- > A user-space daemon, acpi, typically read /proc/acpi/event > and handled all ACPI sub-system generated events. >@@ -78,10 +79,13 @@ config ACPI_PROC_EVENT > These events are now delivered to user-space via > either the input layer, or as netlink events. > >-This build option enables the old code for for legacy >+This build option enables the old code for legacy > user-space implementation. After some time, this will > be moved under CONFIG_ACPI_PROCFS, and then deleted. > >+Say Y here to retain the old behaviour. Say N if your >+user-space is newer than kernel 2.6.23 (September 2007). >+ > config ACPI_AC > tristate "AC Adapter" > depends on X86 > - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [PATCH] reverse CONFIG_ACPI_PROC_EVENT default
Thanks Hugh. -Len -Original Message- From: Hugh Dickins [mailto:[EMAIL PROTECTED] Sent: Monday, August 27, 2007 11:05 AM To: Linus Torvalds Cc: Andrew Morton; Brown, Len; linux-kernel@vger.kernel.org Subject: [PATCH] reverse CONFIG_ACPI_PROC_EVENT default Sigh. Again an ACPI assault on the Thinkpad's Fn+F4 to suspend to RAM. The default and text for CONFIG_THINKPAD_ACPI_INPUT_ENABLED were fixed in -rc3, but now 14e04fb34ffa82ee61ae69f98d8fca12d2e8e31c introduces CONFIG_ACPI_PROC_EVENT default n to disable it again. Change default to y, and add comment to make it clearer that n is for future distros. Signed-off-by: Hugh Dickins [EMAIL PROTECTED] --- 2.6.23-rc3-git10/drivers/acpi/Kconfig 2007-08-26 18:10:15.0 +0100 +++ linux/drivers/acpi/Kconfig 2007-08-26 18:59:16.0 +0100 @@ -71,6 +71,7 @@ config ACPI_PROCFS config ACPI_PROC_EVENT bool Deprecated /proc/acpi/event support depends on PROC_FS + default y ---help--- A user-space daemon, acpi, typically read /proc/acpi/event and handled all ACPI sub-system generated events. @@ -78,10 +79,13 @@ config ACPI_PROC_EVENT These events are now delivered to user-space via either the input layer, or as netlink events. -This build option enables the old code for for legacy +This build option enables the old code for legacy user-space implementation. After some time, this will be moved under CONFIG_ACPI_PROCFS, and then deleted. +Say Y here to retain the old behaviour. Say N if your +user-space is newer than kernel 2.6.23 (September 2007). + config ACPI_AC tristate AC Adapter depends on X86 - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [Git Patch] ACPI: Fix a warning of discarding qualifiers from pointer target type
>On Tue, Aug 21, 2007 at 07:22:51PM +0400, Alexey Starikovskiy wrote: >> Yes, that is the proper solution with a single drawback: >> it touches ACPICA dual-licensed code and would take ages to commit, >> and Len would probably ask you to give permission to >re-license it under >> BSD. > >The latter is certainly not a problem (assuming that it's non-trivial >enough to be copyrightable, in the first place). The dual license is at the top of the file. I just need to know when you touch one of these dual-license files that Intel has your permission to ship your change to folks we send the same file to under the BSD-style non-GPL license. eg. BSD, Solaris and HPUX uses this same ACPICA files -- though in a different format. I've actually never had anybody say no when I asked. And I've had some people who play laywers on TV tell me this is not strictly necessary, but other people who play laywers on TV tell me it is best to ask, so I do. thanks, -Len - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [Git Patch] ACPI: Fix a warning of discarding qualifiers from pointer target type
On Tue, Aug 21, 2007 at 07:22:51PM +0400, Alexey Starikovskiy wrote: Yes, that is the proper solution with a single drawback: it touches ACPICA dual-licensed code and would take ages to commit, and Len would probably ask you to give permission to re-license it under BSD. The latter is certainly not a problem (assuming that it's non-trivial enough to be copyrightable, in the first place). The dual license is at the top of the file. I just need to know when you touch one of these dual-license files that Intel has your permission to ship your change to folks we send the same file to under the BSD-style non-GPL license. eg. BSD, Solaris and HPUX uses this same ACPICA files -- though in a different format. I've actually never had anybody say no when I asked. And I've had some people who play laywers on TV tell me this is not strictly necessary, but other people who play laywers on TV tell me it is best to ask, so I do. thanks, -Len - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [PATCH][ACPI][BUTTON] remove procfs-interface
The last time I tried to remove this code, we discovered that people were using it to query the lid (button) status. -Len >-Original Message- >From: Richard Hughes [mailto:[EMAIL PROTECTED] >Sent: Thursday, July 12, 2007 6:26 AM >To: Zhang, Rui >Cc: Henne; Arjan van de Ven; Brown, Len; >[EMAIL PROTECTED]; [EMAIL PROTECTED]; >linux-kernel@vger.kernel.org >Subject: RE: [PATCH][ACPI][BUTTON] remove procfs-interface > >On Thu, 2007-07-12 at 17:46 +0800, Zhang, Rui wrote: >> >> I'm not sure if the button sysfs I/F is already finished. >> We'd better make a double check. :) > >We need a button sysfs interface? What's wrong with just using input? > >Richard. > - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [PATCH][ACPI][BUTTON] remove procfs-interface
The last time I tried to remove this code, we discovered that people were using it to query the lid (button) status. -Len -Original Message- From: Richard Hughes [mailto:[EMAIL PROTECTED] Sent: Thursday, July 12, 2007 6:26 AM To: Zhang, Rui Cc: Henne; Arjan van de Ven; Brown, Len; [EMAIL PROTECTED]; [EMAIL PROTECTED]; linux-kernel@vger.kernel.org Subject: RE: [PATCH][ACPI][BUTTON] remove procfs-interface On Thu, 2007-07-12 at 17:46 +0800, Zhang, Rui wrote: I'm not sure if the button sysfs I/F is already finished. We'd better make a double check. :) We need a button sysfs interface? What's wrong with just using input? Richard. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: drivers/video/output.c
>Asides from git-bisect failing me again[1], what gives with this file? it supports output switching, which didn't make it into 2.6.21 -- probably will be 2.6.22. -Len - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/