Re: i915.ko WC writes are slow after ea8596bb2d8d379
* Chris Wilson wrote:

> > > A bisection pointed to
> > >
> > > commit ea8596bb2d8d37957f3e92db9511c50801689180
> > > Author: Masami Hiramatsu
> > > Date: Thu Jul 18 20:47:53 2013 +0900
> > >
> > > kprobes/x86: Remove unused text_poke_smp() and text_poke_smp_batch()
> > > functions
> > >
> > > of which the active ingredient was just
> > >
> > > diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> > > index b32ebf9..f4001e0 100644
> > > --- a/arch/x86/Kconfig
> > > +++ b/arch/x86/Kconfig
> > > @@ -2334,7 +2334,6 @@ config HAVE_ATOMIC_IOMAP
> > >
> > > config HAVE_TEXT_POKE_SMP
> > > bool
> > > - select STOP_MACHINE if SMP

Ouch... This is certainly an educative example of how pure 'code removal'
patches can have unintended side effects.

Is there a full fix patch available, and is anyone pushing that to Linus?

Thanks,

	Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: i915.ko WC writes are slow after ea8596bb2d8d379
* Andy Lutomirski wrote:

> On Wed, Nov 18, 2015 at 6:48 AM, Chris Wilson
> wrote:
> > Although
> >
> > diff --git a/include/linux/stop_machine.h b/include/linux/stop_machine.h
> > index d2abbdb..ff4f029 100644
> > --- a/include/linux/stop_machine.h
> > +++ b/include/linux/stop_machine.h
> > @@ -97,7 +97,7 @@ static inline int try_stop_cpus(const struct cpumask
> > *cpumask,
> > * grabbing every spinlock (and more). So the "read" side to such a
> > * lock is anything which disables preemption.
> > */
> > -#if defined(CONFIG_STOP_MACHINE) && defined(CONFIG_SMP)
> > +#if defined(CONFIG_SMP) || defined(CONFIG_HOTPLUG_CPU)
> > [...]
>
> This seems much better. Having a set of stop_machine functions around
> that don't work depending on config seems dangerous.

Agreed.

Acked-by: Ingo Molnar

Thanks,

	Ingo
Re: i915.ko WC writes are slow after ea8596bb2d8d379
On Wed, Nov 18, 2015 at 6:48 AM, Chris Wilson wrote:
> Although
>
> diff --git a/include/linux/stop_machine.h b/include/linux/stop_machine.h
> index d2abbdb..ff4f029 100644
> --- a/include/linux/stop_machine.h
> +++ b/include/linux/stop_machine.h
> @@ -97,7 +97,7 @@ static inline int try_stop_cpus(const struct cpumask
> *cpumask,
> * grabbing every spinlock (and more). So the "read" side to such a
> * lock is anything which disables preemption.
> */
> -#if defined(CONFIG_STOP_MACHINE) && defined(CONFIG_SMP)
> +#if defined(CONFIG_SMP) || defined(CONFIG_HOTPLUG_CPU)

[...]

This seems much better. Having a set of stop_machine functions around
that don't work depending on config seems dangerous.

--Andy
Re: i915.ko WC writes are slow after ea8596bb2d8d379
On Wed, Oct 08, 2014 at 05:10:59AM -0500, Chuck Ebbert wrote:
> On Wed, 8 Oct 2014 10:03:36 +0100
> Chris Wilson wrote:
>
> >
> > I ran into a problem on a Sandybridge i5-2500s whilst measuring the
> > performance of GTT write-combining access. I found subsequent runs were
> > about 10-40x slower than the first. For example,
> >
> > igt/gem_gtt_speed:
> >
> > Time to read 16k through a GTT map: 325.285µs
> > Time to write 16k through a GTT map: 4.729µs
> > Time to clear 16k through a GTT map: 4.584µs
> > Time to clear 16k through a cached GTT map: 1.342µs
> >
> > on the second run became:
> >
> > Time to read 16k through a GTT map: 332.148µs
> > Time to write 16k through a GTT map:209.411µs
> > Time to clear 16k through a GTT map: 56.460µs
> > Time to clear 16k through a cached GTT map: 50.897µs
> >
> > Naively I would say that we lost the wc on our ioremap.
> > /sys/kernel/debug/x86/pat_memtype_list remained the same across repeated
> > runs.
> >
> > A bisection pointed to
> >
> > commit ea8596bb2d8d37957f3e92db9511c50801689180
> > Author: Masami Hiramatsu
> > Date: Thu Jul 18 20:47:53 2013 +0900
> >
> > kprobes/x86: Remove unused text_poke_smp() and text_poke_smp_batch()
> > functions
> >
> > of which the active ingredient was just
> >
> > diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> > index b32ebf9..f4001e0 100644
> > --- a/arch/x86/Kconfig
> > +++ b/arch/x86/Kconfig
> > @@ -2334,7 +2334,6 @@ config HAVE_ATOMIC_IOMAP
> >
> > config HAVE_TEXT_POKE_SMP
> > bool
> > - select STOP_MACHINE if SMP
> >
> > config X86_DEV_DMA_OPS
> > bool
> >
> > and adding that back into the current build, e.g.
>
> Hmm, set_mtrr() uses stop_machine(). I wonder if your MTRRs are out of
> sync and your results depend on which CPU the test runs on?

(From the other reply, it did and is still required).

I have run into other issues where stop_machine() tries to only do an
irq-disabled callback on the local CPU as opposed to halting all CPUs
and running the callback universally.

My understanding is that the root cause of the issue is:

diff --git a/init/Kconfig b/init/Kconfig
index af09b4f..8235e0b 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1993,8 +1993,7 @@ config INIT_ALL_POSSIBLE

 config STOP_MACHINE
 	bool
-	default y
-	depends on (SMP && MODULE_UNLOAD) || HOTPLUG_CPU
+	default y if SMP || HOTPLUG_CPU
 	help
 	  Need stop_machine() primitive.

Although

diff --git a/include/linux/stop_machine.h b/include/linux/stop_machine.h
index d2abbdb..ff4f029 100644
--- a/include/linux/stop_machine.h
+++ b/include/linux/stop_machine.h
@@ -97,7 +97,7 @@ static inline int try_stop_cpus(const struct cpumask *cpumask,
  * grabbing every spinlock (and more). So the "read" side to such a
  * lock is anything which disables preemption.
  */
-#if defined(CONFIG_STOP_MACHINE) && defined(CONFIG_SMP)
+#if defined(CONFIG_SMP) || defined(CONFIG_HOTPLUG_CPU)

 /**
  * stop_machine: freeze the machine on all CPUs and run this function
@@ -128,7 +128,7 @@ int __stop_machine(int (*fn)(void *), void *data, const struct cpumask *cpus);
 int stop_machine_from_inactive_cpu(int (*fn)(void *), void *data,
 				   const struct cpumask *cpus);

-#else /* CONFIG_STOP_MACHINE && CONFIG_SMP */
+#else /* CONFIG_SMP */

 static inline int __stop_machine(int (*fn)(void *), void *data,
 				 const struct cpumask *cpus)
@@ -153,5 +153,5 @@ static inline int stop_machine_from_inactive_cpu(int (*fn)(void *), void *data,
 	return __stop_machine(fn, data, cpus);
 }

-#endif /* CONFIG_STOP_MACHINE && CONFIG_SMP */
+#endif /* CONFIG_SMP || CONFIG_HOTPLUG_CPU */
 #endif /* _LINUX_STOP_MACHINE */
diff --git a/init/Kconfig b/init/Kconfig
index af09b4f..44600a8 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1991,13 +1991,6 @@ config INIT_ALL_POSSIBLE
 	  it was better to provide this option than to break all the archs
 	  and have several arch maintainers pursuing me down dark alleys.

-config STOP_MACHINE
-	bool
-	default y
-	depends on (SMP && MODULE_UNLOAD) || HOTPLUG_CPU
-	help
-	  Need stop_machine() primitive.
-
 source "block/Kconfig"

 config PREEMPT_NOTIFIERS
diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index fd643d8..2dd1f306 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -513,7 +513,7 @@ static int __init cpu_stop_init(void)
 }
 early_initcall(cpu_stop_init);

-#ifdef CONFIG_STOP_MACHINE
+#if defined(CONFIG_SMP) || defined(CONFIG_HOTPLUG_CPU)

 int __stop_machine(int (*fn)(void *), void *data, const struct cpumask *cpus)
 {
@@ -613,4 +613,4 @@ int stop_machine_from_inactive_cpu(int (*fn)(void *), void *data,
 	return ret ?: done.ret;
 }

-#endif /* CONFIG_STOP_MACHINE
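The failure mode described in the message above (a stubbed stop_machine()
that runs the callback only on the calling CPU, so per-CPU state such as
MTRRs ends up out of sync) can be illustrated with a small user-space
model. This is only a sketch under stated assumptions, not kernel code:
cpu_reg[], write_reg(), stop_machine_stub(), stop_machine_all() and
regs_in_sync() are all hypothetical names invented for the illustration,
and stop_machine_all() models only the case where the caller asks for the
callback to run on every CPU.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4	/* assumption: matches the 4-CPU machine in the thread */

/* Toy stand-in for one per-CPU register, e.g. a memory-type setting. */
static int cpu_reg[NR_CPUS];

/* Hypothetical callback type: the real kernel callback takes only data. */
typedef int (*stop_fn)(int cpu, void *data);

/* Callback that programs the register of the CPU it runs on. */
static int write_reg(int cpu, void *data)
{
	cpu_reg[cpu] = *(int *)data;
	return 0;
}

/* Models the stub: the callback runs once, on the calling CPU only. */
static int stop_machine_stub(stop_fn fn, void *data, int this_cpu)
{
	return fn(this_cpu, data);
}

/* Models the full primitive for an all-CPUs mask: run fn everywhere. */
static int stop_machine_all(stop_fn fn, void *data)
{
	int ret = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		int r = fn(cpu, data);

		if (r)
			ret = r;	/* report the last failure */
	}
	return ret;
}

/* True when every CPU holds the same register value. */
static bool regs_in_sync(void)
{
	for (int cpu = 1; cpu < NR_CPUS; cpu++)
		if (cpu_reg[cpu] != cpu_reg[0])
			return false;
	return true;
}
```

In this model, calling stop_machine_stub() from CPU 2 changes only
cpu_reg[2], so regs_in_sync() reports false until stop_machine_all()
repeats the write on every CPU. A test that happens to run on an
un-updated CPU would then see the old memory type.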
Re: i915.ko WC writes are slow after ea8596bb2d8d379
On Thu, Oct 09, 2014 at 09:46:37AM -0500, Chuck Ebbert wrote:
> Well they're all the same.
>
> Hmm, x86info is not dumping all the variable MTRRs. You have 10, but
> it only prints the first 8. I don't know if it will show anything
> different, but can you try fixing it with this patch?

Source (https://github.com/dankamongmen/x86info) was slightly different,
but I followed the drift.

tldr: 8,9 appear to be identical on all cpus as well.

$ sudo ./x86info --mtrr --all-cpus
x86info v1.31pre
Found 4 CPUs.
CPU #1:
Extended Family: 0 Extended Model: 2 Family: 6 Model: 42 Stepping: 7
Type: 0 (Original OEM)
CPU Model (x86info's best guess): Core i7 (SandyBridge)
Processor name string (BIOS programmed): Intel(R) Core(TM) i5-2500 CPU @ 3.30GHz

MTRR registers:
MTRRcap (0xfe): 0x0d0a wc:1 fix:1 vcnt:10
MTRRphysBase0 (0x200): 0x0006 (physbase:0x00 type: 0x06 (write-back))
MTRRphysMask0 (0x201): 0x000f8800 (physmask:0xf8 valid:1)
MTRRphysBase1 (0x202): 0x8006 (physbase:0x08 type: 0x06 (write-back))
MTRRphysMask1 (0x203): 0x000ff800 (physmask:0xff valid:1)
MTRRphysBase2 (0x204): 0x8e00 (physbase:0x08e000 type: 0x00 (uncacheable))
MTRRphysMask2 (0x205): 0x000ffe000800 (physmask:0xffe000 valid:1)
MTRRphysBase3 (0x206): 0x8d00 (physbase:0x08d000 type: 0x00 (uncacheable))
MTRRphysMask3 (0x207): 0x000fff000800 (physmask:0xfff000 valid:1)
MTRRphysBase4 (0x208): 0x00010006 (physbase:0x10 type: 0x06 (write-back))
MTRRphysMask4 (0x209): 0x000f8800 (physmask:0xf8 valid:1)
MTRRphysBase5 (0x20a): 0x00017000 (physbase:0x17 type: 0x00 (uncacheable))
MTRRphysMask5 (0x20b): 0x000ff800 (physmask:0xff valid:1)
MTRRphysBase6 (0x20c): 0x00016f00 (physbase:0x16f000 type: 0x00 (uncacheable))
MTRRphysMask6 (0x20d): 0x000fff000800 (physmask:0xfff000 valid:1)
MTRRphysBase7 (0x20e): 0x00016e80 (physbase:0x16e800 type: 0x00 (uncacheable))
MTRRphysMask7 (0x20f): 0x000fff800800 (physmask:0xfff800 valid:1)
MTRRphysBase8 (0x210): 0x00016e60 (physbase:0x16e600 type: 0x00 (uncacheable))
MTRRphysMask8 (0x211): 0x000fffe00800 (physmask:0xfffe00 valid:1)
MTRRphysBase9 (0x212): 0x (physbase:0x00 type: 0x00 (uncacheable))
MTRRphysMask9 (0x213): 0x (physmask:0x00 valid:0)
MTRRfix64K_0 (0x250): 0x0606060606060606
MTRRfix16K_8 (0x258): 0x0606060606060606
MTRRfix16K_A (0x259): 0x
MTRRfix4K_C8000 (0x269): 0x0505050505050505
MTRRfix4K_D 0x26a: 0x
MTRRfix4K_D8000 0x26b: 0x
MTRRfix4K_E 0x26c: 0x
MTRRfix4K_E8000 0x26d: 0x0505050505050505
MTRRfix4K_F 0x26e: 0x0505050505050505
MTRRfix4K_F8000 0x26f: 0x0505050505050505
MTRRdefType (0x2ff): 0x0c00 (fixed-range flag:1 enable flag:1 default type:0x00 (uncacheable))
--
CPU #2:
Extended Family: 0 Extended Model: 2 Family: 6 Model: 42 Stepping: 7
Type: 0 (Original OEM)
CPU Model (x86info's best guess): Core i7 (SandyBridge)
Processor name string (BIOS programmed): Intel(R) Core(TM) i5-2500 CPU @ 3.30GHz

MTRR registers:
MTRRcap (0xfe): 0x0d0a wc:1 fix:1 vcnt:10
MTRRphysBase0 (0x200): 0x0006 (physbase:0x00 type: 0x06 (write-back))
MTRRphysMask0 (0x201): 0x000f8800 (physmask:0xf8 valid:1)
MTRRphysBase1 (0x202): 0x8006 (physbase:0x08 type: 0x06 (write-back))
MTRRphysMask1 (0x203): 0x000ff800 (physmask:0xff valid:1)
MTRRphysBase2 (0x204): 0x8e00 (physbase:0x08e000 type: 0x00 (uncacheable))
MTRRphysMask2 (0x205): 0x000ffe000800 (physmask:0xffe000 valid:1)
MTRRphysBase3 (0x206): 0x8d00 (physbase:0x08d000 type: 0x00 (uncacheable))
MTRRphysMask3 (0x207): 0x000fff000800 (physmask:0xfff000 valid:1)
MTRRphysBase4 (0x208): 0x00010006 (physbase:0x10 type: 0x06 (write-back))
MTRRphysMask4 (0x209): 0x000f8800 (physmask:0xf8 valid:1)
MTRRphysBase5 (0x20a): 0x00017000 (physbase:0x17 type: 0x00 (uncacheable))
MTRRphysMask5 (0x20b): 0x000ff800 (physmask:0xff valid:1)
MTRRphysBase6 (0x20c): 0x00016f00 (physbase:0x16f000 type: 0x00 (uncacheable))
MTRRphysMask6 (0x20d): 0x000fff000800 (physmask:0xfff000 valid:1)
MTRRphysBase7 (0x20e): 0x00016e80 (physbase:0x16e800 type: 0x00 (uncacheable))
MTRRphysMask7 (0x20f): 0x000fff800800 (physmask:0xfff800 valid:1)
MTRRphysBase8 (0x210): 0x00016e60 (physbase:0x16e600 type: 0x00 (uncacheable))
MTRRphysMask8 (0x211): 0x000fffe00800 (physmask:0xfffe00 valid:1)
MTRRphysBase9 (0x212): 0x (physbase:0x00 type: 0x00 (uncacheable))
MTRRphysMask9 (0x213): 0x (physmask:0x00 valid:0)
MTRRfix64K_0 (0x250): 0x0606060606060606
MTRRfix16K_8 (0x258): 0x0606060606060606
Re: i915.ko WC writes are slow after ea8596bb2d8d379
On Thu, 9 Oct 2014 14:00:47 +0100
Chris Wilson wrote:

> On Thu, Oct 09, 2014 at 07:44:16AM -0500, Chuck Ebbert wrote:
> > Could you try installing x86info and running "x86info --mtrr
> > --all-cpus" while running the broken kernel?
>
> # /opt/xorg/src/intel-gpu-tools/tests/gem_gtt_speed
> IGT-Version: 1.8-g32a0308 (x86_64) (Linux: 3.17.0+ x86_64)
> Time to read 16k through a GTT map: 318.643µs
> Time to write 16k through a GTT map:203.103µs
> Time to clear 16k through a GTT map: 53.098µs
> Time to clear 16k through a cached GTT map: 49.925µs
>
> (i.e. bad kernel)
>
> # x86info --mtrr --all-cpus
> x86info v1.30. Dave Jones 2001-2011
> Feedback to .
>
> Found 4 CPUs.
> CPU #1:
> Extended Family: 0 Extended Model: 2 Family: 6 Model: 42 Stepping: 7
> Type: 0 (Original OEM)
> CPU Model (x86info's best guess): Unknown model.
> Processor name string (BIOS programmed): Intel(R) Core(TM) i5-2500 CPU @
> 3.30GHz
>
> MTRR registers:
> MTRRcap (0xfe): 0x0d0a (smrr flag: 0x1, wc flag: 0x1, fix flag:
> 0x1, vcnt field: 0x0a (10))
> MTRRphysBase0 (0x200): 0x0006 (physbase field:0x00 type
> field: 0x06 (write-back))
> MTRRphysMask0 (0x201): 0x000f8800 (physmask field:0xf8 valid
> flag: 1)
> MTRRphysBase1 (0x202): 0x8006 (physbase field:0x08 type
> field: 0x06 (write-back))
> MTRRphysMask1 (0x203): 0x000ff800 (physmask field:0xff valid
> flag: 1)
> MTRRphysBase2 (0x204): 0x8e00 (physbase field:0x08e000 type
> field: 0x00 (uncacheable))
> MTRRphysMask2 (0x205): 0x000ffe000800 (physmask field:0xffe000 valid
> flag: 1)
> MTRRphysBase3 (0x206): 0x8d00 (physbase field:0x08d000 type
> field: 0x00 (uncacheable))
> MTRRphysMask3 (0x207): 0x000fff000800 (physmask field:0xfff000 valid
> flag: 1)
> MTRRphysBase4 (0x208): 0x00010006 (physbase field:0x10 type
> field: 0x06 (write-back))
> MTRRphysMask4 (0x209): 0x000f8800 (physmask field:0xf8 valid
> flag: 1)
> MTRRphysBase5 (0x20a): 0x00017000 (physbase field:0x17 type
> field: 0x00 (uncacheable))
> MTRRphysMask5 (0x20b): 0x000ff800 (physmask field:0xff valid
> flag: 1)
> MTRRphysBase6 (0x20c): 0x00016f00 (physbase field:0x16f000 type
> field: 0x00 (uncacheable))
> MTRRphysMask6 (0x20d): 0x000fff000800 (physmask field:0xfff000 valid
> flag: 1)
> MTRRphysBase7 (0x20e): 0x00016e80 (physbase field:0x16e800 type
> field: 0x00 (uncacheable))
> MTRRphysMask7 (0x20f): 0x000fff800800 (physmask field:0xfff800 valid
> flag: 1)
> MTRRfix64K_0 (0x250): 0x0606060606060606
> MTRRfix16K_8 (0x258): 0x0606060606060606
> MTRRfix16K_A (0x259): 0x
> MTRRfix4K_C8000 (0x269): 0x0505050505050505
> MTRRfix4K_D 0x26a: 0x
> MTRRfix4K_D8000 0x26b: 0x
> MTRRfix4K_E 0x26c: 0x
> MTRRfix4K_E8000 0x26d: 0x0505050505050505
> MTRRfix4K_F 0x26e: 0x0505050505050505
> MTRRfix4K_F8000 0x26f: 0x0505050505050505
> MTRRdefType (0x2ff): 0x0c00 (fixed-range flag: 0x1, mtrr flag:
> 0x1, type field: 0x00 (uncacheable))
>

Well they're all the same.

Hmm, x86info is not dumping all the variable MTRRs. You have 10, but
it only prints the first 8. I don't know if it will show anything
different, but can you try fixing it with this patch?

--- a/mtrr.c
+++ b/mtrr.c
@@ -75,19 +75,23 @@
 		printf("0x%016llx\n", val);
 }

-static void decode_mtrrcap(int cpu, int msr)
+unsigned int decode_mtrrcap(int cpu, int msr)
 {
 	unsigned long long val;
+	unsigned int vcnt = 0;
 	int ret;

 	ret = mtrr_value(cpu,msr,&val);
 	if (ret) {
+		vcnt = (unsigned int)(val & IA32_MTRRCAP_VCNT);
 		printf("0x%016llx ", val);
 		printf("(smrr flag: 0x%01x, ",(unsigned int) (val & IA32_MTRRCAP_SMRR) >> 11 );
 		printf("wc flag: 0x%01x, ",(unsigned int) (val&IA32_MTRRCAP_WC) >> 10);
 		printf("fix flag: 0x%01x, ",(unsigned int) (val&IA32_MTRRCAP_FIX) >> 8);
-		printf("vcnt field: 0x%02x (%d))\n",(unsigned int) (val&IA32_MTRRCAP_VCNT) , (int) (val&IA32_MTRRCAP_VCNT));
+		printf("vcnt field: 0x%02x (%u))\n", vcnt, vcnt);
 	}
+
+	return vcnt;
 }

 static void decode_mtrr_deftype(int cpu, int msr)
@@ -142,7 +146,7 @@ void dump_mtrrs(struct cpudata *cpu)
 {
 	unsigned long long val = 0;
-	unsigned int i;
+	unsigned int i, vcnt;

 	if (!(cpu->flags_edx & (X86_FEATURE_MTRR)))
 		return;
@@ -157,11 +161,11 @@
 	printf("MTRR registers:\n");

 	printf("MTRRcap (0xfe): ");
-	decode_mtrrcap(cpu->number, 0xfe);
+	vcnt = decode_mtrrcap(cpu->number, 0xfe);

 	set_max_phy_addr(cpu);

-	for (i = 0; i < 16; i+=2) {
+	for (i = 0; i < 2 * vcnt; i += 2) {
 		printf("MTRRphysBase%u (0x%x): ", i/2, (unsigned int)
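For readers following the patch above, the MTRRcap decoding it relies on
can be sketched in a few standalone lines. This is an illustration, not
x86info's code: the helper and macro names below are hypothetical, and the
bit positions are the architectural IA32_MTRRCAP layout (VCNT in bits 7:0,
FIX in bit 8, WC in bit 10, SMRR in bit 11). The variable-range registers
start at MSR 0x200 and come in base/mask pairs, which is why the loop
bound in the patch becomes 2 * vcnt.

```c
#include <assert.h>
#include <stdint.h>

/* IA32_MTRRCAP (MSR 0xfe) field layout; macro names are illustrative. */
#define MTRRCAP_VCNT_MASK	0xffu	/* bits 7:0: variable-range count */
#define MTRRCAP_FIX_BIT		8	/* fixed-range MTRRs supported */
#define MTRRCAP_WC_BIT		10	/* write-combining type supported */
#define MTRRCAP_SMRR_BIT	11	/* SMRR interface supported */

/* Number of variable-range MTRR pairs advertised by the CPU. */
static unsigned int mtrrcap_vcnt(uint64_t cap)
{
	return (unsigned int)(cap & MTRRCAP_VCNT_MASK);
}

/* Extract a single capability flag (0 or 1) from the MTRRcap value. */
static unsigned int mtrrcap_flag(uint64_t cap, unsigned int bit)
{
	return (unsigned int)((cap >> bit) & 1u);
}

/* MSR number of MTRRphysBase<n>; MTRRphysMask<n> is the next MSR up. */
static unsigned int mtrr_physbase_msr(unsigned int n)
{
	return 0x200u + 2u * n;
}
```

With the MTRRcap value 0x0d0a from the dumps in this thread,
mtrrcap_vcnt() returns 10 and the fix, wc and smrr flags all decode to 1,
and mtrr_physbase_msr(9) gives 0x212, the address x86info prints for
MTRRphysBase9 once it walks all ten pairs.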
Re: i915.ko WC writes are slow after ea8596bb2d8d379
On Thu, Oct 09, 2014 at 07:44:16AM -0500, Chuck Ebbert wrote:
> Could you try installing x86info and running "x86info --mtrr
> --all-cpus" while running the broken kernel?

# /opt/xorg/src/intel-gpu-tools/tests/gem_gtt_speed
IGT-Version: 1.8-g32a0308 (x86_64) (Linux: 3.17.0+ x86_64)
Time to read 16k through a GTT map: 318.643µs
Time to write 16k through a GTT map:203.103µs
Time to clear 16k through a GTT map: 53.098µs
Time to clear 16k through a cached GTT map: 49.925µs

(i.e. bad kernel)

# x86info --mtrr --all-cpus
x86info v1.30. Dave Jones 2001-2011
Feedback to .

Found 4 CPUs.
CPU #1:
Extended Family: 0 Extended Model: 2 Family: 6 Model: 42 Stepping: 7
Type: 0 (Original OEM)
CPU Model (x86info's best guess): Unknown model.
Processor name string (BIOS programmed): Intel(R) Core(TM) i5-2500 CPU @ 3.30GHz

MTRR registers:
MTRRcap (0xfe): 0x0d0a (smrr flag: 0x1, wc flag: 0x1, fix flag: 0x1, vcnt field: 0x0a (10))
MTRRphysBase0 (0x200): 0x0006 (physbase field:0x00 type field: 0x06 (write-back))
MTRRphysMask0 (0x201): 0x000f8800 (physmask field:0xf8 valid flag: 1)
MTRRphysBase1 (0x202): 0x8006 (physbase field:0x08 type field: 0x06 (write-back))
MTRRphysMask1 (0x203): 0x000ff800 (physmask field:0xff valid flag: 1)
MTRRphysBase2 (0x204): 0x8e00 (physbase field:0x08e000 type field: 0x00 (uncacheable))
MTRRphysMask2 (0x205): 0x000ffe000800 (physmask field:0xffe000 valid flag: 1)
MTRRphysBase3 (0x206): 0x8d00 (physbase field:0x08d000 type field: 0x00 (uncacheable))
MTRRphysMask3 (0x207): 0x000fff000800 (physmask field:0xfff000 valid flag: 1)
MTRRphysBase4 (0x208): 0x00010006 (physbase field:0x10 type field: 0x06 (write-back))
MTRRphysMask4 (0x209): 0x000f8800 (physmask field:0xf8 valid flag: 1)
MTRRphysBase5 (0x20a): 0x00017000 (physbase field:0x17 type field: 0x00 (uncacheable))
MTRRphysMask5 (0x20b): 0x000ff800 (physmask field:0xff valid flag: 1)
MTRRphysBase6 (0x20c): 0x00016f00 (physbase field:0x16f000 type field: 0x00 (uncacheable))
MTRRphysMask6 (0x20d): 0x000fff000800 (physmask field:0xfff000 valid flag: 1)
MTRRphysBase7 (0x20e): 0x00016e80 (physbase field:0x16e800 type field: 0x00 (uncacheable))
MTRRphysMask7 (0x20f): 0x000fff800800 (physmask field:0xfff800 valid flag: 1)
MTRRfix64K_0 (0x250): 0x0606060606060606
MTRRfix16K_8 (0x258): 0x0606060606060606
MTRRfix16K_A (0x259): 0x
MTRRfix4K_C8000 (0x269): 0x0505050505050505
MTRRfix4K_D 0x26a: 0x
MTRRfix4K_D8000 0x26b: 0x
MTRRfix4K_E 0x26c: 0x
MTRRfix4K_E8000 0x26d: 0x0505050505050505
MTRRfix4K_F 0x26e: 0x0505050505050505
MTRRfix4K_F8000 0x26f: 0x0505050505050505
MTRRdefType (0x2ff): 0x0c00 (fixed-range flag: 0x1, mtrr flag: 0x1, type field: 0x00 (uncacheable))
--
CPU #2:
Extended Family: 0 Extended Model: 2 Family: 6 Model: 42 Stepping: 7
Type: 0 (Original OEM)
CPU Model (x86info's best guess): Unknown model.
Processor name string (BIOS programmed): Intel(R) Core(TM) i5-2500 CPU @ 3.30GHz

MTRR registers:
MTRRcap (0xfe): 0x0d0a (smrr flag: 0x1, wc flag: 0x1, fix flag: 0x1, vcnt field: 0x0a (10))
MTRRphysBase0 (0x200): 0x0006 (physbase field:0x00 type field: 0x06 (write-back))
MTRRphysMask0 (0x201): 0x000f8800 (physmask field:0xf8 valid flag: 1)
MTRRphysBase1 (0x202): 0x8006 (physbase field:0x08 type field: 0x06 (write-back))
MTRRphysMask1 (0x203): 0x000ff800 (physmask field:0xff valid flag: 1)
MTRRphysBase2 (0x204): 0x8e00 (physbase field:0x08e000 type field: 0x00 (uncacheable))
MTRRphysMask2 (0x205): 0x000ffe000800 (physmask field:0xffe000 valid flag: 1)
MTRRphysBase3 (0x206): 0x8d00 (physbase field:0x08d000 type field: 0x00 (uncacheable))
MTRRphysMask3 (0x207): 0x000fff000800 (physmask field:0xfff000 valid flag: 1)
MTRRphysBase4 (0x208): 0x00010006 (physbase field:0x10 type field: 0x06 (write-back))
MTRRphysMask4 (0x209): 0x000f8800 (physmask field:0xf8 valid flag: 1)
MTRRphysBase5 (0x20a): 0x00017000 (physbase field:0x17 type field: 0x00 (uncacheable))
MTRRphysMask5 (0x20b): 0x000ff800 (physmask field:0xff valid flag: 1)
MTRRphysBase6 (0x20c): 0x00016f00 (physbase field:0x16f000 type field: 0x00 (uncacheable))
MTRRphysMask6 (0x20d): 0x000fff000800 (physmask field:0xfff000 valid flag: 1)
MTRRphysBase7 (0x20e): 0x00016e80 (physbase field:0x16e800 type field: 0x00 (uncacheable))
MTRRphysMask7 (0x20f): 0x000fff800800 (physmask field:0xfff800 valid flag: 1)
MTRRfix64K_0 (0x250): 0x0606060606060606
MTRRfix16K_8 (0x258): 0x0606060606060606
MTRRfix16K_A (0x259):
Re: i915.ko WC writes are slow after ea8596bb2d8d379
On Thu, 9 Oct 2014 07:53:31 +0100
Chris Wilson wrote:

> # cat /proc/mtrr
> reg00: base=0x0 (0MB), size= 2048MB, count=1: write-back
> reg01: base=0x08000 ( 2048MB), size= 256MB, count=1: write-back
> reg02: base=0x08e00 ( 2272MB), size= 32MB, count=1: uncachable
> reg03: base=0x08d00 ( 2256MB), size= 16MB, count=1: uncachable
> reg04: base=0x1 ( 4096MB), size= 2048MB, count=1: write-back
> reg05: base=0x17000 ( 5888MB), size= 256MB, count=1: uncachable
> reg06: base=0x16f00 ( 5872MB), size= 16MB, count=1: uncachable
> reg07: base=0x16e80 ( 5864MB), size=8MB, count=1: uncachable
> reg08: base=0x16e60 ( 5862MB), size=2MB, count=1: uncachable
>

Well that's what the kernel thinks is in every CPU. Could you try
installing x86info and running "x86info --mtrr --all-cpus" while
running the broken kernel?
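As a cross-check between the two views being compared in this part of the
thread, the size column of /proc/mtrr can be derived from a raw
MTRRphysMask value. The sketch below is a rough illustration only:
mtrr_range_size() is a hypothetical helper, the 36-bit physical address
width is an assumption, and the example mask values in the note after the
code are illustrative inputs chosen to match the 2048MB and 32MB sizes
listed for reg00 and reg02, not values copied from the dumps. It also
assumes the usual contiguous mask, for which the size equals the lowest
set address bit.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Size in bytes of the range described by a variable-range MTRR mask.
 * physmask_msr: raw MTRRphysMask value (bit 11 is the valid flag).
 * phys_bits:    physical address width of the CPU (assumed < 64).
 */
static uint64_t mtrr_range_size(uint64_t physmask_msr, unsigned int phys_bits)
{
	/* Keep only address bits 12..phys_bits-1; drops the valid flag. */
	uint64_t addr_bits = ((UINT64_C(1) << phys_bits) - 1) & ~UINT64_C(0xfff);
	uint64_t mask = physmask_msr & addr_bits;

	if (mask == 0)
		return 0;	/* nothing described */

	/* For a contiguous mask the size is its lowest set bit. */
	return mask & (~mask + 1);
}
```

Under those assumptions, mtrr_range_size(0xf80000800, 36) evaluates to
0x80000000 (2048MB) and mtrr_range_size(0xffe000800, 36) to 0x2000000
(32MB).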
Re: i915.ko WC writes are slow after ea8596bb2d8d379
On Wed, Oct 08, 2014 at 02:36:49PM -0700, H. Peter Anvin wrote:
> On 10/08/2014 12:49 PM, Chris Wilson wrote:
> >
> > Indeed, this appears to be the explanation. (And here I thought PAT
> > superseded mtrrs - i915.ko stopped trying to use assign an mtrr for its
> > GTT quite a while ago.)
> >
> > Replacing the stop_machine there with on_each_cpu does the trick:
> >
>
> It should, but there seem to be quite a few drivers which still muck
> with MTRRs. However, i915 is not one of them, it calls
> io_mapping_create_wc() followed by arch_phys_wc_add(), so I'm wondering
> what the heck is going on here.

This system also has a radeon GPU. Disabling it (not building in the
module) makes no difference to the wc speed.

> > Naively I would say that we lost the wc on our ioremap.
> > /sys/kernel/debug/x86/pat_memtype_list remained the same across repeated
> > runs.
>
> Could you tell me what the above looks like?

# cat /sys/kernel/debug/x86/pat_memtype_list
PAT memtype list:
write-back @ 0x8cf34000-0x8cf43000
write-back @ 0x8cf4d000-0x8cf4e000
write-back @ 0x8cf4d000-0x8cf5
write-back @ 0x8cf5-0x8cf51000
write-back @ 0x8cf51000-0x8cf52000
write-back @ 0x8cf52000-0x8cf53000
write-back @ 0x8cf53000-0x8cf55000
write-back @ 0x8cf55000-0x8cf56000
write-back @ 0x8cf9d000-0x8cf9e000
write-back @ 0x8cf9f000-0x8cfa
write-back @ 0x8cffc000-0x8cffd000
uncached-minus @ 0x8fc0-0x8fe0
write-combining @ 0x8fe0-0x9000
uncached-minus @ 0x9022-0x9024
uncached-minus @ 0x9030-0x9032
uncached-minus @ 0x9034-0x90341000
uncached-minus @ 0x9038-0x90381000
write-combining @ 0xa000-0xc000
write-combining @ 0xa0139000-0xa0159000
write-combining @ 0xa0159000-0xa0179000
write-combining @ 0xa0179000-0xa0199000
write-combining @ 0xc004-0xc025e000
write-combining @ 0xc025e000-0xc045e000
write-combining @ 0xc045e000-0xc045f000
write-combining @ 0xc045f000-0xc075f000
uncached-minus @ 0xf800-0xfc00
uncached-minus @ 0xfed0-0xfed01000
uncached-minus @ 0xfed1-0xfed16000
uncached-minus @ 0xfed1f000-0xfed2

(identical for good/bad runs)

# cat /proc/mtrr
reg00: base=0x0 (0MB), size= 2048MB, count=1: write-back
reg01: base=0x08000 ( 2048MB), size= 256MB, count=1: write-back
reg02: base=0x08e00 ( 2272MB), size= 32MB, count=1: uncachable
reg03: base=0x08d00 ( 2256MB), size= 16MB, count=1: uncachable
reg04: base=0x1 ( 4096MB), size= 2048MB, count=1: write-back
reg05: base=0x17000 ( 5888MB), size= 256MB, count=1: uncachable
reg06: base=0x16f00 ( 5872MB), size= 16MB, count=1: uncachable
reg07: base=0x16e80 ( 5864MB), size=8MB, count=1: uncachable
reg08: base=0x16e60 ( 5862MB), size=2MB, count=1: uncachable

# cat /proc/iomem:
-0fff : reserved
1000-0009bbff : System RAM
0009bc00-0009 : reserved
000a-000b : PCI Bus :00
000c-000cdfff : Video ROM
000d-000d3fff : PCI Bus :00
000d4000-000d7fff : PCI Bus :00
000d8000-000dbfff : PCI Bus :00
000dc000-000d : PCI Bus :00
000e-000f : reserved
000e-000e3fff : PCI Bus :00
000e4000-000e7fff : PCI Bus :00
000f-000f : System ROM
0010-1fff : System RAM
0100-0161981b : Kernel code
0161981c-01ca20ff : Kernel data
01dac000-01e2dfff : Kernel bss
2000-201f : reserved
2000-201f : pnp 00:05
2020-3fff : System RAM
4000-401f : reserved
4000-401f : pnp 00:05
4020-8ccd2fff : System RAM
8ccd3000-8cd66fff : reserved
8cd67000-8cfe6fff : ACPI Non-volatile Storage
8cfe7000-8cffefff : ACPI Tables
8cfff000-8cff : System RAM
8d00-8f9f : reserved
8da0-8f9f : Graphics Stolen Memory
8fa0-feaf : PCI Bus :00
8fa0-8fa00fff : pnp 00:03
8fc0-8fff : :00:02.0
9000-900f : PCI Bus :04
9000-900f : PCI Bus :05
9000-90003fff : :05:00.0
9001-900107ff : :05:00.0
9010-901f : PCI Bus :03
9010-90101fff : :03:00.0
9020-902f : PCI Bus :01
9020-9021 : :01:00.0
9022-9023 : :01:00.0
9024-90243fff : :01:00.1
9030-9031 : :00:19.0
9030-9031 : e1000e
9033-903300ff : :00:1f.3
9034-903407ff : :00:1f.2
9034-903407ff : ahci
9035-903503ff : :00:1d.0
9036-90363fff : :00:1b.0
9037-903703ff : :00:1a.0
9038-90380fff : :00:19.0
9038-90380fff : e1000e
9039-90390fff : :00:16.3
903a-903a000f : :00:16.0
a000-bfff : :00:02.0 c000-cfff : PCI Bus :01 c000-cfff : :01:00.0 f800-fbff : PCI MMCONFIG [bus 00-3f] f800-fbff : reserved f800-fbff : pnp 00:03 fec0-fec00fff : reserved fec0-fec003ff : IOAPIC 0 fed0-fed003ff : HPET 0 fed0-fed003ff : PNP0103:00
Re: i915.ko WC writes are slow after ea8596bb2d8d379
On Thu, 9 Oct 2014 07:53:31 +0100 Chris Wilson ch...@chris-wilson.co.uk wrote:
> # cat /proc/mtrr
> reg00: base=0x0 (0MB), size= 2048MB, count=1: write-back
> reg01: base=0x08000 ( 2048MB), size= 256MB, count=1: write-back
> reg02: base=0x08e00 ( 2272MB), size= 32MB, count=1: uncachable
> reg03: base=0x08d00 ( 2256MB), size= 16MB, count=1: uncachable
> reg04: base=0x1 ( 4096MB), size= 2048MB, count=1: write-back
> reg05: base=0x17000 ( 5888MB), size= 256MB, count=1: uncachable
> reg06: base=0x16f00 ( 5872MB), size= 16MB, count=1: uncachable
> reg07: base=0x16e80 ( 5864MB), size=8MB, count=1: uncachable
> reg08: base=0x16e60 ( 5862MB), size=2MB, count=1: uncachable

Well, that's what the kernel thinks is in every CPU. Could you try installing x86info and running

x86info --mtrr --all-cpus

while running the broken kernel?
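The check requested above (comparing what each CPU actually holds, rather than what /proc/mtrr reports) can be sketched in a few lines of C. This is only a minimal sketch, not what x86info does internally: it assumes the Linux msr driver is loaded so that /dev/cpu/N/msr exists, that the caller is root, and that the variable-range count (vcnt) has already been read from MTRRcap. The helper names read_msr, snapshot_mtrrs and first_mismatch are made up for illustration.

```c
#define _XOPEN_SOURCE 700	/* for pread() */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

#define MSR_MTRR_PHYSBASE0 0x200u	/* base/mask pairs follow consecutively */
#define MSR_MTRR_DEF_TYPE  0x2ffu

/* Read one MSR on one CPU via the msr driver; returns 0 on success. */
int read_msr(int cpu, uint32_t msr, uint64_t *val)
{
	char path[64];
	ssize_t n;
	int fd;

	snprintf(path, sizeof(path), "/dev/cpu/%d/msr", cpu);
	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;
	/* The file offset selects which MSR is read. */
	n = pread(fd, val, sizeof(*val), (off_t)msr);
	close(fd);
	return n == (ssize_t)sizeof(*val) ? 0 : -1;
}

/*
 * Snapshot the variable-range MTRRs plus the default type for one CPU.
 * 'out' must have room for 2 * vcnt + 1 entries.
 */
int snapshot_mtrrs(int cpu, unsigned int vcnt, uint64_t *out)
{
	unsigned int i;

	for (i = 0; i < 2 * vcnt; i++)
		if (read_msr(cpu, MSR_MTRR_PHYSBASE0 + i, &out[i]) != 0)
			return -1;
	return read_msr(cpu, MSR_MTRR_DEF_TYPE, &out[2 * vcnt]);
}

/* Index of the first register that differs, or -1 if both sets match. */
int first_mismatch(const uint64_t *a, const uint64_t *b, unsigned int n)
{
	unsigned int i;

	for (i = 0; i < n; i++)
		if (a[i] != b[i])
			return (int)i;
	return -1;
}
```

A caller would snapshot CPU 0 once, snapshot each other CPU in turn, and report any index where first_mismatch() returns something other than -1.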
Re: i915.ko WC writes are slow after ea8596bb2d8d379
On Thu, Oct 09, 2014 at 07:44:16AM -0500, Chuck Ebbert wrote: Could you try installing x86info and running x86info --mtrr --all-cpus while running the broken kernel? # /opt/xorg/src/intel-gpu-tools/tests/gem_gtt_speed IGT-Version: 1.8-g32a0308 (x86_64) (Linux: 3.17.0+ x86_64) Time to read 16k through a GTT map: 318.643µs Time to write 16k through a GTT map:203.103µs Time to clear 16k through a GTT map: 53.098µs Time to clear 16k through a cached GTT map: 49.925µs (i.e. bad kernel) # x86info --mtrr --all-cpus x86info v1.30. Dave Jones 2001-2011 Feedback to da...@redhat.com. Found 4 CPUs. CPU #1: Extended Family: 0 Extended Model: 2 Family: 6 Model: 42 Stepping: 7 Type: 0 (Original OEM) CPU Model (x86info's best guess): Unknown model. Processor name string (BIOS programmed): Intel(R) Core(TM) i5-2500 CPU @ 3.30GHz MTRR registers: MTRRcap (0xfe): 0x0d0a (smrr flag: 0x1, wc flag: 0x1, fix flag: 0x1, vcnt field: 0x0a (10)) MTRRphysBase0 (0x200): 0x0006 (physbase field:0x00 type field: 0x06 (write-back)) MTRRphysMask0 (0x201): 0x000f8800 (physmask field:0xf8 valid flag: 1) MTRRphysBase1 (0x202): 0x8006 (physbase field:0x08 type field: 0x06 (write-back)) MTRRphysMask1 (0x203): 0x000ff800 (physmask field:0xff valid flag: 1) MTRRphysBase2 (0x204): 0x8e00 (physbase field:0x08e000 type field: 0x00 (uncacheable)) MTRRphysMask2 (0x205): 0x000ffe000800 (physmask field:0xffe000 valid flag: 1) MTRRphysBase3 (0x206): 0x8d00 (physbase field:0x08d000 type field: 0x00 (uncacheable)) MTRRphysMask3 (0x207): 0x000fff000800 (physmask field:0xfff000 valid flag: 1) MTRRphysBase4 (0x208): 0x00010006 (physbase field:0x10 type field: 0x06 (write-back)) MTRRphysMask4 (0x209): 0x000f8800 (physmask field:0xf8 valid flag: 1) MTRRphysBase5 (0x20a): 0x00017000 (physbase field:0x17 type field: 0x00 (uncacheable)) MTRRphysMask5 (0x20b): 0x000ff800 (physmask field:0xff valid flag: 1) MTRRphysBase6 (0x20c): 0x00016f00 (physbase field:0x16f000 type field: 0x00 (uncacheable)) MTRRphysMask6 (0x20d): 
0x000fff000800 (physmask field:0xfff000 valid flag: 1) MTRRphysBase7 (0x20e): 0x00016e80 (physbase field:0x16e800 type field: 0x00 (uncacheable)) MTRRphysMask7 (0x20f): 0x000fff800800 (physmask field:0xfff800 valid flag: 1) MTRRfix64K_0 (0x250): 0x0606060606060606 MTRRfix16K_8 (0x258): 0x0606060606060606 MTRRfix16K_A (0x259): 0x MTRRfix4K_C8000 (0x269): 0x0505050505050505 MTRRfix4K_D 0x26a: 0x MTRRfix4K_D8000 0x26b: 0x MTRRfix4K_E 0x26c: 0x MTRRfix4K_E8000 0x26d: 0x0505050505050505 MTRRfix4K_F 0x26e: 0x0505050505050505 MTRRfix4K_F8000 0x26f: 0x0505050505050505 MTRRdefType (0x2ff): 0x0c00 (fixed-range flag: 0x1, mtrr flag: 0x1, type field: 0x00 (uncacheable)) -- CPU #2: Extended Family: 0 Extended Model: 2 Family: 6 Model: 42 Stepping: 7 Type: 0 (Original OEM) CPU Model (x86info's best guess): Unknown model. Processor name string (BIOS programmed): Intel(R) Core(TM) i5-2500 CPU @ 3.30GHz MTRR registers: MTRRcap (0xfe): 0x0d0a (smrr flag: 0x1, wc flag: 0x1, fix flag: 0x1, vcnt field: 0x0a (10)) MTRRphysBase0 (0x200): 0x0006 (physbase field:0x00 type field: 0x06 (write-back)) MTRRphysMask0 (0x201): 0x000f8800 (physmask field:0xf8 valid flag: 1) MTRRphysBase1 (0x202): 0x8006 (physbase field:0x08 type field: 0x06 (write-back)) MTRRphysMask1 (0x203): 0x000ff800 (physmask field:0xff valid flag: 1) MTRRphysBase2 (0x204): 0x8e00 (physbase field:0x08e000 type field: 0x00 (uncacheable)) MTRRphysMask2 (0x205): 0x000ffe000800 (physmask field:0xffe000 valid flag: 1) MTRRphysBase3 (0x206): 0x8d00 (physbase field:0x08d000 type field: 0x00 (uncacheable)) MTRRphysMask3 (0x207): 0x000fff000800 (physmask field:0xfff000 valid flag: 1) MTRRphysBase4 (0x208): 0x00010006 (physbase field:0x10 type field: 0x06 (write-back)) MTRRphysMask4 (0x209): 0x000f8800 (physmask field:0xf8 valid flag: 1) MTRRphysBase5 (0x20a): 0x00017000 (physbase field:0x17 type field: 0x00 (uncacheable)) MTRRphysMask5 (0x20b): 0x000ff800 (physmask field:0xff valid flag: 1) MTRRphysBase6 (0x20c): 0x00016f00 (physbase 
field:0x16f000 type field: 0x00 (uncacheable)) MTRRphysMask6 (0x20d): 0x000fff000800 (physmask field:0xfff000 valid flag: 1) MTRRphysBase7 (0x20e): 0x00016e80 (physbase field:0x16e800 type field: 0x00 (uncacheable)) MTRRphysMask7 (0x20f): 0x000fff800800 (physmask field:0xfff800 valid flag: 1) MTRRfix64K_0 (0x250): 0x0606060606060606 MTRRfix16K_8 (0x258): 0x0606060606060606 MTRRfix16K_A
Re: i915.ko WC writes are slow after ea8596bb2d8d379
On Thu, 9 Oct 2014 14:00:47 +0100 Chris Wilson ch...@chris-wilson.co.uk wrote: On Thu, Oct 09, 2014 at 07:44:16AM -0500, Chuck Ebbert wrote: Could you try installing x86info and running x86info --mtrr --all-cpus while running the broken kernel? # /opt/xorg/src/intel-gpu-tools/tests/gem_gtt_speed IGT-Version: 1.8-g32a0308 (x86_64) (Linux: 3.17.0+ x86_64) Time to read 16k through a GTT map: 318.643µs Time to write 16k through a GTT map:203.103µs Time to clear 16k through a GTT map: 53.098µs Time to clear 16k through a cached GTT map: 49.925µs (i.e. bad kernel) # x86info --mtrr --all-cpus x86info v1.30. Dave Jones 2001-2011 Feedback to da...@redhat.com. Found 4 CPUs. CPU #1: Extended Family: 0 Extended Model: 2 Family: 6 Model: 42 Stepping: 7 Type: 0 (Original OEM) CPU Model (x86info's best guess): Unknown model. Processor name string (BIOS programmed): Intel(R) Core(TM) i5-2500 CPU @ 3.30GHz MTRR registers: MTRRcap (0xfe): 0x0d0a (smrr flag: 0x1, wc flag: 0x1, fix flag: 0x1, vcnt field: 0x0a (10)) MTRRphysBase0 (0x200): 0x0006 (physbase field:0x00 type field: 0x06 (write-back)) MTRRphysMask0 (0x201): 0x000f8800 (physmask field:0xf8 valid flag: 1) MTRRphysBase1 (0x202): 0x8006 (physbase field:0x08 type field: 0x06 (write-back)) MTRRphysMask1 (0x203): 0x000ff800 (physmask field:0xff valid flag: 1) MTRRphysBase2 (0x204): 0x8e00 (physbase field:0x08e000 type field: 0x00 (uncacheable)) MTRRphysMask2 (0x205): 0x000ffe000800 (physmask field:0xffe000 valid flag: 1) MTRRphysBase3 (0x206): 0x8d00 (physbase field:0x08d000 type field: 0x00 (uncacheable)) MTRRphysMask3 (0x207): 0x000fff000800 (physmask field:0xfff000 valid flag: 1) MTRRphysBase4 (0x208): 0x00010006 (physbase field:0x10 type field: 0x06 (write-back)) MTRRphysMask4 (0x209): 0x000f8800 (physmask field:0xf8 valid flag: 1) MTRRphysBase5 (0x20a): 0x00017000 (physbase field:0x17 type field: 0x00 (uncacheable)) MTRRphysMask5 (0x20b): 0x000ff800 (physmask field:0xff valid flag: 1) MTRRphysBase6 (0x20c): 0x00016f00 
(physbase field:0x16f000 type field: 0x00 (uncacheable)) MTRRphysMask6 (0x20d): 0x000fff000800 (physmask field:0xfff000 valid flag: 1) MTRRphysBase7 (0x20e): 0x00016e80 (physbase field:0x16e800 type field: 0x00 (uncacheable)) MTRRphysMask7 (0x20f): 0x000fff800800 (physmask field:0xfff800 valid flag: 1) MTRRfix64K_0 (0x250): 0x0606060606060606 MTRRfix16K_8 (0x258): 0x0606060606060606 MTRRfix16K_A (0x259): 0x MTRRfix4K_C8000 (0x269): 0x0505050505050505 MTRRfix4K_D 0x26a: 0x MTRRfix4K_D8000 0x26b: 0x MTRRfix4K_E 0x26c: 0x MTRRfix4K_E8000 0x26d: 0x0505050505050505 MTRRfix4K_F 0x26e: 0x0505050505050505 MTRRfix4K_F8000 0x26f: 0x0505050505050505 MTRRdefType (0x2ff): 0x0c00 (fixed-range flag: 0x1, mtrr flag: 0x1, type field: 0x00 (uncacheable)) snip Well they're all the same. Hmm, x86info is not dumping all the variable MTRRs. You have 10, but it only prints the first 8. I don't know if it will show anything different, but can you try fixing it with this patch? --- a/mtrr.c +++ b/mtrr.c @@ -75,19 +75,23 @@ printf(0x%016llx\n, val); } -static void decode_mtrrcap(int cpu, int msr) +unsigned int decode_mtrrcap(int cpu, int msr) { unsigned long long val; + unsigned int vcnt = 0; int ret; ret = mtrr_value(cpu,msr,val); if (ret) { + vcnt = (unsigned int)(val IA32_MTRRCAP_VCNT); printf(0x%016llx , val); printf((smrr flag: 0x%01x, ,(unsigned int) (val IA32_MTRRCAP_SMRR) 11 ); printf(wc flag: 0x%01x, ,(unsigned int) (valIA32_MTRRCAP_WC) 10); printf(fix flag: 0x%01x, ,(unsigned int) (valIA32_MTRRCAP_FIX) 8); - printf(vcnt field: 0x%02x (%d))\n,(unsigned int) (valIA32_MTRRCAP_VCNT) , (int) (valIA32_MTRRCAP_VCNT)); + printf(vcnt field: 0x%02x (%u))\n, vcnt, vcnt); } + + return vcnt; } static void decode_mtrr_deftype(int cpu, int msr) @@ -142,7 +146,7 @@ void dump_mtrrs(struct cpudata *cpu) { unsigned long long val = 0; - unsigned int i; + unsigned int i, vcnt; if (!(cpu-flags_edx (X86_FEATURE_MTRR))) return; @@ -157,11 +161,11 @@ printf(MTRR registers:\n); printf(MTRRcap (0xfe): ); - 
decode_mtrrcap(cpu->number, 0xfe);
+ vcnt = decode_mtrrcap(cpu->number, 0xfe);
  set_max_phy_addr(cpu);
- for (i = 0; i < 16; i+=2) {
+ for (i = 0; i < 2 * vcnt; i += 2) {
  printf("MTRRphysBase%u (0x%x): ", i/2, (unsigned int) 0x200+i);
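For reference, the field layout the patch relies on can be written out on its own. The sketch below decodes an MTRRcap value using the bit positions from the Intel SDM (VCNT in bits 7:0, FIX in bit 8, WC in bit 10, SMRR in bit 11) and derives the MSR index of each variable-range pair. The struct and function names are illustrative only and are not taken from x86info.

```c
#include <stdint.h>

/* Decoded IA32_MTRRCAP (MSR 0xfe) fields. */
struct mtrrcap {
	unsigned int vcnt;	/* number of variable-range MTRR pairs */
	unsigned int fix;	/* fixed-range MTRRs supported */
	unsigned int wc;	/* write-combining type supported */
	unsigned int smrr;	/* SMRR supported */
};

struct mtrrcap decode_mtrrcap_value(uint64_t val)
{
	struct mtrrcap c;

	c.vcnt = (unsigned int)(val & 0xff);
	c.fix  = (unsigned int)((val >> 8) & 1);
	c.wc   = (unsigned int)((val >> 10) & 1);
	c.smrr = (unsigned int)((val >> 11) & 1);
	return c;
}

/* MSR index of MTRRphysBase<n> and MTRRphysMask<n>. */
uint32_t mtrr_physbase_msr(unsigned int n)
{
	return 0x200u + 2 * n;
}

uint32_t mtrr_physmask_msr(unsigned int n)
{
	return 0x201u + 2 * n;
}
```

With the 0x0d0a value shown in the dumps this gives vcnt = 10, so a loop bounded by the decoded count covers pairs 0 through 9 instead of stopping after the first eight.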
Re: i915.ko WC writes are slow after ea8596bb2d8d379
On Thu, Oct 09, 2014 at 09:46:37AM -0500, Chuck Ebbert wrote: Well they're all the same. Hmm, x86info is not dumping all the variable MTRRs. You have 10, but it only prints the first 8. I don't know if it will show anything different, but can you try fixing it with this patch? Source (https://github.com/dankamongmen/x86info) was slightly different, but I followed the drift. tldr: 8,9 appear to be identical on all cpus as well. $ sudo ./x86info --mtrr --all-cpus x86info v1.31pre Found 4 CPUs. CPU #1: Extended Family: 0 Extended Model: 2 Family: 6 Model: 42 Stepping: 7 Type: 0 (Original OEM) CPU Model (x86info's best guess): Core i7 (SandyBridge) Processor name string (BIOS programmed): Intel(R) Core(TM) i5-2500 CPU @ 3.30GHz MTRR registers: MTRRcap (0xfe): 0x0d0a wc:1 fix:1 vcnt:10 MTRRphysBase0 (0x200): 0x0006 (physbase:0x00 type: 0x06 (write-back)) MTRRphysMask0 (0x201): 0x000f8800 (physmask:0xf8 valid:1) MTRRphysBase1 (0x202): 0x8006 (physbase:0x08 type: 0x06 (write-back)) MTRRphysMask1 (0x203): 0x000ff800 (physmask:0xff valid:1) MTRRphysBase2 (0x204): 0x8e00 (physbase:0x08e000 type: 0x00 (uncacheable)) MTRRphysMask2 (0x205): 0x000ffe000800 (physmask:0xffe000 valid:1) MTRRphysBase3 (0x206): 0x8d00 (physbase:0x08d000 type: 0x00 (uncacheable)) MTRRphysMask3 (0x207): 0x000fff000800 (physmask:0xfff000 valid:1) MTRRphysBase4 (0x208): 0x00010006 (physbase:0x10 type: 0x06 (write-back)) MTRRphysMask4 (0x209): 0x000f8800 (physmask:0xf8 valid:1) MTRRphysBase5 (0x20a): 0x00017000 (physbase:0x17 type: 0x00 (uncacheable)) MTRRphysMask5 (0x20b): 0x000ff800 (physmask:0xff valid:1) MTRRphysBase6 (0x20c): 0x00016f00 (physbase:0x16f000 type: 0x00 (uncacheable)) MTRRphysMask6 (0x20d): 0x000fff000800 (physmask:0xfff000 valid:1) MTRRphysBase7 (0x20e): 0x00016e80 (physbase:0x16e800 type: 0x00 (uncacheable)) MTRRphysMask7 (0x20f): 0x000fff800800 (physmask:0xfff800 valid:1) MTRRphysBase8 (0x210): 0x00016e60 (physbase:0x16e600 type: 0x00 (uncacheable)) MTRRphysMask8 (0x211): 
0x000fffe00800 (physmask:0xfffe00 valid:1) MTRRphysBase9 (0x212): 0x (physbase:0x00 type: 0x00 (uncacheable)) MTRRphysMask9 (0x213): 0x (physmask:0x00 valid:0) MTRRfix64K_0 (0x250): 0x0606060606060606 MTRRfix16K_8 (0x258): 0x0606060606060606 MTRRfix16K_A (0x259): 0x MTRRfix4K_C8000 (0x269): 0x0505050505050505 MTRRfix4K_D 0x26a: 0x MTRRfix4K_D8000 0x26b: 0x MTRRfix4K_E 0x26c: 0x MTRRfix4K_E8000 0x26d: 0x0505050505050505 MTRRfix4K_F 0x26e: 0x0505050505050505 MTRRfix4K_F8000 0x26f: 0x0505050505050505 MTRRdefType (0x2ff): 0x0c00 (fixed-range flag:1 enable flag:1 default type:0x00 (uncacheable)) -- CPU #2: Extended Family: 0 Extended Model: 2 Family: 6 Model: 42 Stepping: 7 Type: 0 (Original OEM) CPU Model (x86info's best guess): Core i7 (SandyBridge) Processor name string (BIOS programmed): Intel(R) Core(TM) i5-2500 CPU @ 3.30GHz MTRR registers: MTRRcap (0xfe): 0x0d0a wc:1 fix:1 vcnt:10 MTRRphysBase0 (0x200): 0x0006 (physbase:0x00 type: 0x06 (write-back)) MTRRphysMask0 (0x201): 0x000f8800 (physmask:0xf8 valid:1) MTRRphysBase1 (0x202): 0x8006 (physbase:0x08 type: 0x06 (write-back)) MTRRphysMask1 (0x203): 0x000ff800 (physmask:0xff valid:1) MTRRphysBase2 (0x204): 0x8e00 (physbase:0x08e000 type: 0x00 (uncacheable)) MTRRphysMask2 (0x205): 0x000ffe000800 (physmask:0xffe000 valid:1) MTRRphysBase3 (0x206): 0x8d00 (physbase:0x08d000 type: 0x00 (uncacheable)) MTRRphysMask3 (0x207): 0x000fff000800 (physmask:0xfff000 valid:1) MTRRphysBase4 (0x208): 0x00010006 (physbase:0x10 type: 0x06 (write-back)) MTRRphysMask4 (0x209): 0x000f8800 (physmask:0xf8 valid:1) MTRRphysBase5 (0x20a): 0x00017000 (physbase:0x17 type: 0x00 (uncacheable)) MTRRphysMask5 (0x20b): 0x000ff800 (physmask:0xff valid:1) MTRRphysBase6 (0x20c): 0x00016f00 (physbase:0x16f000 type: 0x00 (uncacheable)) MTRRphysMask6 (0x20d): 0x000fff000800 (physmask:0xfff000 valid:1) MTRRphysBase7 (0x20e): 0x00016e80 (physbase:0x16e800 type: 0x00 (uncacheable)) MTRRphysMask7 (0x20f): 0x000fff800800 (physmask:0xfff800 valid:1) 
MTRRphysBase8 (0x210): 0x00016e60 (physbase:0x16e600 type: 0x00 (uncacheable)) MTRRphysMask8 (0x211): 0x000fffe00800 (physmask:0xfffe00 valid:1) MTRRphysBase9 (0x212): 0x (physbase:0x00 type: 0x00 (uncacheable)) MTRRphysMask9 (0x213): 0x (physmask:0x00 valid:0) MTRRfix64K_0 (0x250): 0x0606060606060606 MTRRfix16K_8 (0x258): 0x0606060606060606 MTRRfix16K_A
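To relate the raw base/mask pairs above to the ranges printed by /proc/mtrr, the pair can be decoded by hand. The following is a sketch under stated assumptions: the physical address width is passed in by the caller (36 bits is assumed for this CPU in the usage below), the mask is assumed to be contiguous, and the function name decode_var_mtrr is hypothetical. The register values in the archived dumps are visibly truncated, so any full-width values fed to this function are reconstructions from the /proc/mtrr figures, not copies of the dump.

```c
#include <stdint.h>

#define MTRR_PHYSMASK_VALID (1ULL << 11)

struct mtrr_range {
	uint64_t base;		/* start of the range in bytes */
	uint64_t size;		/* length in bytes, 0 if the pair is not valid */
	unsigned int type;	/* memory type, e.g. 0x00 uncacheable, 0x06 write-back */
	int valid;
};

/*
 * Decode one MTRRphysBase/MTRRphysMask pair.
 * phys_bits is the CPU's physical address width and must be below 64.
 */
struct mtrr_range decode_var_mtrr(uint64_t physbase, uint64_t physmask,
				  unsigned int phys_bits)
{
	/* Address bits live in [phys_bits-1:12] of both registers. */
	uint64_t addr_mask = ((1ULL << phys_bits) - 1) & ~0xfffULL;
	struct mtrr_range r;

	r.valid = (physmask & MTRR_PHYSMASK_VALID) != 0;
	r.type = (unsigned int)(physbase & 0xff);
	r.base = physbase & addr_mask;
	/* For a contiguous mask, the cleared address bits give size - 4K. */
	r.size = r.valid ? ((~physmask) & addr_mask) + 0x1000 : 0;
	return r;
}
```

For example, a base of 0x16e600000 with a mask of 0xfffe00800 and a 36-bit address width decodes to an uncacheable 2MB range starting at 5862MB, which matches reg08 in the /proc/mtrr listing.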
Re: i915.ko WC writes are slow after ea8596bb2d8d379
(2014/10/09 2:47), Chuck Ebbert wrote:
> On Wed, 8 Oct 2014 10:03:36 +0100
> Chris Wilson wrote:
>
>> and adding that back into the current build, e.g.
>>
>> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
>> index 3632743..48a8a69 100644
>> --- a/arch/x86/Kconfig
>> +++ b/arch/x86/Kconfig
>> @@ -87,6 +87,7 @@ config X86
>>   select HAVE_USER_RETURN_NOTIFIER
>>   select ARCH_BINFMT_ELF_RANDOMIZE_PIE
>>   select HAVE_ARCH_JUMP_LABEL
>> + select STOP_MACHINE
>>   select ARCH_HAS_ATOMIC64_DEC_IF_POSITIVE
>>   select SPARSE_IRQ
>>   select GENERIC_FIND_FIRST_BIT
>>
>> fixes the regression.
>>
>
> Looking closer at this, it seems most configs work by accident,
> because they have MOD_UNLOAD and/or HOTPLUG_CPU enabled. I take it
> you disabled both of those? stop_machine() is called from all kinds
> of places and almost none of them make sure STOP_MACHINE is selected.

I guess most of them expect that stop_machine() is not a configurable feature... If some of them require stop_machine(), they should enable it in their Kconfig entries (including ftrace and kprobes).

> $ find -name Kconf\* | xargs grep STOP_MACHINE
> ./init/Kconfig:config STOP_MACHINE
>
> All these places use stop_machine():
>
> mm/page_alloc.c, line 3886
> drivers/xen/manage.c, line 130
> drivers/char/hw_random/intel-rng.c, line 373
> arch/powerpc/mm/numa.c:
>   line 1616
>   line 1623
> arch/powerpc/platforms/powernv/subcore.c, line 324
> arch/arm/kernel/kprobes.c, line 165
> arch/arm/kernel/patch.c:
>   line 64
>   line 71
> arch/s390/kernel/jump_label.c, line 61
> arch/s390/kernel/kprobes.c:
>   line 311
>   line 320
> arch/s390/kernel/time.c:
>   line 820
>   line 1590
> arch/x86/kernel/cpu/mtrr/main.c, line 231
> arch/arm64/kernel/insn.c, line 181
> kernel/time/timekeeping.c, line 892
> kernel/trace/ftrace.c, line 2219
> kernel/module.c:
>   line 770
>   line 1861
>

BTW, since I sent a series of patches, the last two can be removed.

https://lkml.org/lkml/2014/8/25/142

Thank you,

--
Masami HIRAMATSU
Software Platform Research Dept.
Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu...@hitachi.com
Re: i915.ko WC writes are slow after ea8596bb2d8d379
On 10/08/2014 12:49 PM, Chris Wilson wrote:
>
> Indeed, this appears to be the explanation. (And here I thought PAT
> superseded mtrrs - i915.ko stopped trying to use assign an mtrr for its
> GTT quite a while ago.)
>
> Replacing the stop_machine there with on_each_cpu does the trick:
>

It should, but there seem to be quite a few drivers which still muck with MTRRs. However, i915 is not one of them: it calls io_mapping_create_wc() followed by arch_phys_wc_add(), so I'm wondering what the heck is going on here.

> Naively I would say that we lost the wc on our ioremap.
> /sys/kernel/debug/x86/pat_memtype_list remained the same across repeated
> runs.

Could you tell me what the above looks like?

	-hpa
Re: i915.ko WC writes are slow after ea8596bb2d8d379
On Wed, Oct 08, 2014 at 05:10:59AM -0500, Chuck Ebbert wrote:
> On Wed, 8 Oct 2014 10:03:36 +0100
> Chris Wilson wrote:
>
> >
> > I ran into a problem on a Sandybridge i5-2500s whilst measuring the
> > performance of GTT write-combining access. I found subsequent runs were
> > about 10-40x slower than the first. For example,
> >
> > igt/gem_gtt_speed:
> >
> > Time to read 16k through a GTT map: 325.285µs
> > Time to write 16k through a GTT map: 4.729µs
> > Time to clear 16k through a GTT map: 4.584µs
> > Time to clear 16k through a cached GTT map: 1.342µs
> >
> > on the second run became:
> >
> > Time to read 16k through a GTT map: 332.148µs
> > Time to write 16k through a GTT map: 209.411µs
> > Time to clear 16k through a GTT map: 56.460µs
> > Time to clear 16k through a cached GTT map: 50.897µs
> >
> > Naively I would say that we lost the wc on our ioremap.
> > /sys/kernel/debug/x86/pat_memtype_list remained the same across repeated
> > runs.
> >
> > A bisection pointed to
> >
> > commit ea8596bb2d8d37957f3e92db9511c50801689180
> > Author: Masami Hiramatsu
> > Date: Thu Jul 18 20:47:53 2013 +0900
> >
> > kprobes/x86: Remove unused text_poke_smp() and text_poke_smp_batch()
> > functions
> >
> > of which the active ingredient was just
> >
> > diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> > index b32ebf9..f4001e0 100644
> > --- a/arch/x86/Kconfig
> > +++ b/arch/x86/Kconfig
> > @@ -2334,7 +2334,6 @@ config HAVE_ATOMIC_IOMAP
> >
> >   config HAVE_TEXT_POKE_SMP
> >   bool
> > - select STOP_MACHINE if SMP
> >
> >   config X86_DEV_DMA_OPS
> >   bool
> >
> > and adding that back into the current build, e.g.
>
> Hmm, set_mtrr() uses stop_machine(). I wonder if your MTRRs are out of
> sync and your results depend on which CPU the test runs on?

Indeed, this appears to be the explanation. (And here I thought PAT superseded mtrrs - i915.ko stopped trying to assign an mtrr for its GTT quite a while ago.)

Replacing the stop_machine there with on_each_cpu does the trick:

diff --git a/arch/x86/kernel/cpu/mtrr/main.c b/arch/x86/kernel/cpu/mtrr/main.c
index f961de9..c0e37d5 100644
--- a/arch/x86/kernel/cpu/mtrr/main.c
+++ b/arch/x86/kernel/cpu/mtrr/main.c
@@ -151,7 +151,7 @@ struct set_mtrr_data {
  *
  * Returns nothing.
  */
-static int mtrr_rendezvous_handler(void *info)
+static void mtrr_rendezvous_handler(void *info)
 {
 	struct set_mtrr_data *data = info;
 
@@ -174,7 +174,6 @@ static int mtrr_rendezvous_handler(void *info)
 	} else if (mtrr_aps_delayed_init || !cpu_online(smp_processor_id())) {
 		mtrr_if->set_all();
 	}
-	return 0;
 }
 
 static inline int types_compatible(mtrr_type type1, mtrr_type type2)
@@ -228,7 +227,7 @@ set_mtrr(unsigned int reg, unsigned long base, unsigned long size, mtrr_type typ
 		.smp_type = type
 	};
 
-	stop_machine(mtrr_rendezvous_handler, &data, cpu_online_mask);
+	on_each_cpu_mask(cpu_online_mask, mtrr_rendezvous_handler, &data, true);
 }
 
 static void set_mtrr_from_inactive_cpu(unsigned int reg, unsigned long base,
@@ -240,8 +239,7 @@ static void set_mtrr_from_inactive_cpu(unsigned int reg, unsigned long base,
 		.smp_type = type
 	};
 
-	stop_machine_from_inactive_cpu(mtrr_rendezvous_handler, &data,
-				       cpu_callout_mask);
+	on_each_cpu_mask(cpu_callout_mask, mtrr_rendezvous_handler, &data, true);
 }
 
 /**

--
Chris Wilson, Intel Open Source Technology Centre
Re: i915.ko WC writes are slow after ea8596bb2d8d379
On Wed, 8 Oct 2014 10:03:36 +0100 Chris Wilson wrote:
> and adding that back into the current build, e.g.
>
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index 3632743..48a8a69 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -87,6 +87,7 @@ config X86
>   select HAVE_USER_RETURN_NOTIFIER
>   select ARCH_BINFMT_ELF_RANDOMIZE_PIE
>   select HAVE_ARCH_JUMP_LABEL
> + select STOP_MACHINE
>   select ARCH_HAS_ATOMIC64_DEC_IF_POSITIVE
>   select SPARSE_IRQ
>   select GENERIC_FIND_FIRST_BIT
>
> fixes the regression.
>

Looking closer at this, it seems most configs work by accident, because they have MOD_UNLOAD and/or HOTPLUG_CPU enabled. I take it you disabled both of those? stop_machine() is called from all kinds of places and almost none of them make sure STOP_MACHINE is selected.

$ find -name Kconf\* | xargs grep STOP_MACHINE
./init/Kconfig:config STOP_MACHINE

All these places use stop_machine():

mm/page_alloc.c, line 3886
drivers/xen/manage.c, line 130
drivers/char/hw_random/intel-rng.c, line 373
arch/powerpc/mm/numa.c:
  line 1616
  line 1623
arch/powerpc/platforms/powernv/subcore.c, line 324
arch/arm/kernel/kprobes.c, line 165
arch/arm/kernel/patch.c:
  line 64
  line 71
arch/s390/kernel/jump_label.c, line 61
arch/s390/kernel/kprobes.c:
  line 311
  line 320
arch/s390/kernel/time.c:
  line 820
  line 1590
arch/x86/kernel/cpu/mtrr/main.c, line 231
arch/arm64/kernel/insn.c, line 181
kernel/time/timekeeping.c, line 892
kernel/trace/ftrace.c, line 2219
kernel/module.c:
  line 770
  line 1861
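The hazard described above can be illustrated with a toy userspace model. In kernels of this era, when the full stop_machine() implementation is compiled out, include/linux/stop_machine.h supplies an inline fallback that simply runs the callback on the calling CPU with interrupts disabled. The sketch below imitates both behaviours against a made-up per-CPU register array. Everything in it (fake_reg, stop_machine_stub, stop_machine_all, the four-CPU count) is hypothetical scaffolding rather than kernel code, and it demonstrates the hypothesis raised in this thread, not a confirmed root cause.

```c
#include <stdint.h>

#define NR_CPUS 4

/* Hypothetical stand-in for one MTRR-like register per CPU. */
uint64_t fake_reg[NR_CPUS];

/* Which simulated CPU the callback is currently "running on". */
int current_cpu;

struct update {
	uint64_t value;
};

/* Callback in the style of a rendezvous handler: program the local CPU. */
int apply_update(void *info)
{
	struct update *u = info;

	fake_reg[current_cpu] = u->value;
	return 0;
}

/* Models the fallback stub: the callback runs once, on the caller only. */
int stop_machine_stub(int (*fn)(void *), void *data, int calling_cpu)
{
	current_cpu = calling_cpu;
	return fn(data);
}

/* Models the full version called with a mask covering every online CPU. */
int stop_machine_all(int (*fn)(void *), void *data)
{
	int ret = 0;

	for (current_cpu = 0; current_cpu < NR_CPUS; current_cpu++)
		ret |= fn(data);
	return ret;
}

/* Returns 1 if every simulated CPU holds the same register value. */
int cpus_in_sync(void)
{
	int i;

	for (i = 1; i < NR_CPUS; i++)
		if (fake_reg[i] != fake_reg[0])
			return 0;
	return 1;
}
```

After one call through the stub, only the calling CPU has the new value, so any result that depends on the register varies with where the task happens to be scheduled. The same call through the full version leaves all CPUs in agreement.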
Re: i915.ko WC writes are slow after ea8596bb2d8d379
On Wed, 8 Oct 2014 10:03:36 +0100 Chris Wilson wrote:
>
> I ran into a problem on a Sandybridge i5-2500s whilst measuring the
> performance of GTT write-combining access. I found subsequent runs were
> about 10-40x slower than the first. For example,
>
> igt/gem_gtt_speed:
>
> Time to read 16k through a GTT map: 325.285µs
> Time to write 16k through a GTT map: 4.729µs
> Time to clear 16k through a GTT map: 4.584µs
> Time to clear 16k through a cached GTT map: 1.342µs
>
> on the second run became:
>
> Time to read 16k through a GTT map: 332.148µs
> Time to write 16k through a GTT map: 209.411µs
> Time to clear 16k through a GTT map: 56.460µs
> Time to clear 16k through a cached GTT map: 50.897µs
>
> Naively I would say that we lost the wc on our ioremap.
> /sys/kernel/debug/x86/pat_memtype_list remained the same across repeated
> runs.
>
> A bisection pointed to
>
> commit ea8596bb2d8d37957f3e92db9511c50801689180
> Author: Masami Hiramatsu
> Date: Thu Jul 18 20:47:53 2013 +0900
>
> kprobes/x86: Remove unused text_poke_smp() and text_poke_smp_batch()
> functions
>
> of which the active ingredient was just
>
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index b32ebf9..f4001e0 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -2334,7 +2334,6 @@ config HAVE_ATOMIC_IOMAP
>
>   config HAVE_TEXT_POKE_SMP
>   bool
> - select STOP_MACHINE if SMP
>
>   config X86_DEV_DMA_OPS
>   bool
>
> and adding that back into the current build, e.g.

Hmm, set_mtrr() uses stop_machine(). I wonder if your MTRRs are out of sync and your results depend on which CPU the test runs on?

>
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index 3632743..48a8a69 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -87,6 +87,7 @@ config X86
>   select HAVE_USER_RETURN_NOTIFIER
>   select ARCH_BINFMT_ELF_RANDOMIZE_PIE
>   select HAVE_ARCH_JUMP_LABEL
> + select STOP_MACHINE
>   select ARCH_HAS_ATOMIC64_DEC_IF_POSITIVE
>   select SPARSE_IRQ
>   select GENERIC_FIND_FIRST_BIT
>
> fixes the regression.
>
> For the record, this kernel build doesn't use modules, which seems relevant
> in light of ea8596bb2 "fixes a Kconfig dependency issue on STOP_MACHINE
> in the case of CONFIG_SMP && !CONFIG_MODULE_UNLOAD".
i915.ko WC writes are slow after ea8596bb2d8d379
I ran into a problem on a Sandybridge i5-2500s whilst measuring the
performance of GTT write-combining access. I found subsequent runs were
about 10-40x slower than the first. For example,

igt/gem_gtt_speed:

Time to read 16k through a GTT map:          325.285µs
Time to write 16k through a GTT map:           4.729µs
Time to clear 16k through a GTT map:           4.584µs
Time to clear 16k through a cached GTT map:    1.342µs

on the second run became:

Time to read 16k through a GTT map:          332.148µs
Time to write 16k through a GTT map:         209.411µs
Time to clear 16k through a GTT map:          56.460µs
Time to clear 16k through a cached GTT map:   50.897µs

Naively I would say that we lost the wc on our ioremap.
/sys/kernel/debug/x86/pat_memtype_list remained the same across repeated
runs.

A bisection pointed to

commit ea8596bb2d8d37957f3e92db9511c50801689180
Author: Masami Hiramatsu
Date:   Thu Jul 18 20:47:53 2013 +0900

    kprobes/x86: Remove unused text_poke_smp() and text_poke_smp_batch()
    functions

of which the active ingredient was just

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index b32ebf9..f4001e0 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -2334,7 +2334,6 @@ config HAVE_ATOMIC_IOMAP

 config HAVE_TEXT_POKE_SMP
 	bool
-	select STOP_MACHINE if SMP

 config X86_DEV_DMA_OPS
 	bool

and adding that back into the current build, e.g.

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 3632743..48a8a69 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -87,6 +87,7 @@ config X86
 	select HAVE_USER_RETURN_NOTIFIER
 	select ARCH_BINFMT_ELF_RANDOMIZE_PIE
 	select HAVE_ARCH_JUMP_LABEL
+	select STOP_MACHINE
 	select ARCH_HAS_ATOMIC64_DEC_IF_POSITIVE
 	select SPARSE_IRQ
 	select GENERIC_FIND_FIRST_BIT

fixes the regression.

For the record, this kernel build doesn't use modules, which seems relevant
in light of ea8596bb2 "fixes a Kconfig dependency issue on STOP_MACHINE
in the case of CONFIG_SMP && !CONFIG_MODULE_UNLOAD".
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
Re: i915.ko WC writes are slow after ea8596bb2d8d379
On Wed, 8 Oct 2014 10:03:36 +0100
Chris Wilson ch...@chris-wilson.co.uk wrote:

> and adding that back into the current build, e.g.
>
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index 3632743..48a8a69 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -87,6 +87,7 @@ config X86
>  	select HAVE_USER_RETURN_NOTIFIER
>  	select ARCH_BINFMT_ELF_RANDOMIZE_PIE
>  	select HAVE_ARCH_JUMP_LABEL
> +	select STOP_MACHINE
>  	select ARCH_HAS_ATOMIC64_DEC_IF_POSITIVE
>  	select SPARSE_IRQ
>  	select GENERIC_FIND_FIRST_BIT
>
> fixes the regression.

Looking closer at this, it seems most configs work by accident, because
they have MOD_UNLOAD and/or HOTPLUG_CPU enabled. I take it you disabled
both of those?

stop_machine() is called from all kinds of places and almost none of
them make sure STOP_MACHINE is selected.

$ find -name Kconf\* | xargs grep STOP_MACHINE
./init/Kconfig:config STOP_MACHINE

All these places use stop_machine():

mm/page_alloc.c, line 3886
drivers/xen/manage.c, line 130
drivers/char/hw_random/intel-rng.c, line 373
arch/powerpc/mm/numa.c:
    line 1616
    line 1623
arch/powerpc/platforms/powernv/subcore.c, line 324
arch/arm/kernel/kprobes.c, line 165
arch/arm/kernel/patch.c:
    line 64
    line 71
arch/s390/kernel/jump_label.c, line 61
arch/s390/kernel/kprobes.c:
    line 311
    line 320
arch/s390/kernel/time.c:
    line 820
    line 1590
arch/x86/kernel/cpu/mtrr/main.c, line 231
arch/arm64/kernel/insn.c, line 181
kernel/time/timekeeping.c, line 892
kernel/trace/ftrace.c, line 2219
kernel/module.c:
    line 770
    line 1861
Re: i915.ko WC writes are slow after ea8596bb2d8d379
On Wed, Oct 08, 2014 at 05:10:59AM -0500, Chuck Ebbert wrote:
> On Wed, 8 Oct 2014 10:03:36 +0100
> Chris Wilson ch...@chris-wilson.co.uk wrote:
>
> > I ran into a problem on a Sandybridge i5-2500s whilst measuring the
> > performance of GTT write-combining access. I found subsequent runs were
> > about 10-40x slower than the first. For example,
> >
> > igt/gem_gtt_speed:
> >
> > Time to read 16k through a GTT map:          325.285µs
> > Time to write 16k through a GTT map:           4.729µs
> > Time to clear 16k through a GTT map:           4.584µs
> > Time to clear 16k through a cached GTT map:    1.342µs
> >
> > on the second run became:
> >
> > Time to read 16k through a GTT map:          332.148µs
> > Time to write 16k through a GTT map:         209.411µs
> > Time to clear 16k through a GTT map:          56.460µs
> > Time to clear 16k through a cached GTT map:   50.897µs
> >
> > Naively I would say that we lost the wc on our ioremap.
> > /sys/kernel/debug/x86/pat_memtype_list remained the same across repeated
> > runs.
> >
> > A bisection pointed to
> >
> > commit ea8596bb2d8d37957f3e92db9511c50801689180
> > Author: Masami Hiramatsu masami.hiramatsu...@hitachi.com
> > Date:   Thu Jul 18 20:47:53 2013 +0900
> >
> >     kprobes/x86: Remove unused text_poke_smp() and text_poke_smp_batch()
> >     functions
> >
> > of which the active ingredient was just
> >
> > diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> > index b32ebf9..f4001e0 100644
> > --- a/arch/x86/Kconfig
> > +++ b/arch/x86/Kconfig
> > @@ -2334,7 +2334,6 @@ config HAVE_ATOMIC_IOMAP
> >
> >  config HAVE_TEXT_POKE_SMP
> >  	bool
> > -	select STOP_MACHINE if SMP
> >
> >  config X86_DEV_DMA_OPS
> >  	bool
> >
> > and adding that back into the current build, e.g.
>
> Hmm, set_mtrr() uses stop_machine(). I wonder if your MTRRs are out
> of sync and your results depend on which CPU the test runs on?

Indeed, this appears to be the explanation. (And here I thought PAT
superseded mtrrs - i915.ko stopped trying to assign an mtrr for its
GTT quite a while ago.)

Replacing the stop_machine there with on_each_cpu does the trick:

diff --git a/arch/x86/kernel/cpu/mtrr/main.c b/arch/x86/kernel/cpu/mtrr/main.c
index f961de9..c0e37d5 100644
--- a/arch/x86/kernel/cpu/mtrr/main.c
+++ b/arch/x86/kernel/cpu/mtrr/main.c
@@ -151,7 +151,7 @@ struct set_mtrr_data {
  *
  * Returns nothing.
  */
-static int mtrr_rendezvous_handler(void *info)
+static void mtrr_rendezvous_handler(void *info)
 {
 	struct set_mtrr_data *data = info;

@@ -174,7 +174,6 @@ static int mtrr_rendezvous_handler(void *info)
 	} else if (mtrr_aps_delayed_init || !cpu_online(smp_processor_id())) {
 		mtrr_if->set_all();
 	}
-	return 0;
 }

 static inline int types_compatible(mtrr_type type1, mtrr_type type2)
@@ -228,7 +227,7 @@ set_mtrr(unsigned int reg, unsigned long base, unsigned long size, mtrr_type typ
 		.smp_type = type
 	};

-	stop_machine(mtrr_rendezvous_handler, &data, cpu_online_mask);
+	on_each_cpu_mask(cpu_online_mask, mtrr_rendezvous_handler, &data, true);
 }

 static void set_mtrr_from_inactive_cpu(unsigned int reg, unsigned long base,
@@ -240,8 +239,7 @@ static void set_mtrr_from_inactive_cpu(unsigned int reg, unsigned long base,
 		.smp_type = type
 	};

-	stop_machine_from_inactive_cpu(mtrr_rendezvous_handler, &data,
-				       cpu_callout_mask);
+	on_each_cpu_mask(cpu_callout_mask, mtrr_rendezvous_handler, &data, true);
 }

 /**

-- 
Chris Wilson, Intel Open Source Technology Centre
Re: i915.ko WC writes are slow after ea8596bb2d8d379
On 10/08/2014 12:49 PM, Chris Wilson wrote:
>
> Indeed, this appears to be the explanation. (And here I thought PAT
> superseded mtrrs - i915.ko stopped trying to use assign an mtrr for its
> GTT quite a while ago.)
>
> Replacing the stop_machine there with on_each_cpu does the trick:
>

It should, but there seem to be quite a few drivers which still muck
with MTRRs. However, i915 is not one of them, it calls
io_mapping_create_wc() followed by arch_phys_wc_add(), so I'm wondering
what the heck is going on here.

>> Naively I would say that we lost the wc on our ioremap.
>> /sys/kernel/debug/x86/pat_memtype_list remained the same across repeated
>> runs.

Could you tell me what the above looks like?

	-hpa
Re: i915.ko WC writes are slow after ea8596bb2d8d379
(2014/10/09 2:47), Chuck Ebbert wrote:
> On Wed, 8 Oct 2014 10:03:36 +0100
> Chris Wilson ch...@chris-wilson.co.uk wrote:
>
>> and adding that back into the current build, e.g.
>>
>> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
>> index 3632743..48a8a69 100644
>> --- a/arch/x86/Kconfig
>> +++ b/arch/x86/Kconfig
>> @@ -87,6 +87,7 @@ config X86
>>  	select HAVE_USER_RETURN_NOTIFIER
>>  	select ARCH_BINFMT_ELF_RANDOMIZE_PIE
>>  	select HAVE_ARCH_JUMP_LABEL
>> +	select STOP_MACHINE
>>  	select ARCH_HAS_ATOMIC64_DEC_IF_POSITIVE
>>  	select SPARSE_IRQ
>>  	select GENERIC_FIND_FIRST_BIT
>>
>> fixes the regression.
>
> Looking closer at this, it seems most configs work by accident, because
> they have MOD_UNLOAD and/or HOTPLUG_CPU enabled. I take it you disabled
> both of those?
>
> stop_machine() is called from all kinds of places and almost none of
> them make sure STOP_MACHINE is selected.

I guess most of them expect that stop_machine() is not a configurable
feature... If any of them requires stop_machine(), it should enable it
in its Kconfig entry (including ftrace, kprobes).

> $ find -name Kconf\* | xargs grep STOP_MACHINE
> ./init/Kconfig:config STOP_MACHINE
>
> All these places use stop_machine():
>
> mm/page_alloc.c, line 3886
> drivers/xen/manage.c, line 130
> drivers/char/hw_random/intel-rng.c, line 373
> arch/powerpc/mm/numa.c:
>     line 1616
>     line 1623
> arch/powerpc/platforms/powernv/subcore.c, line 324
> arch/arm/kernel/kprobes.c, line 165
> arch/arm/kernel/patch.c:
>     line 64
>     line 71
> arch/s390/kernel/jump_label.c, line 61
> arch/s390/kernel/kprobes.c:
>     line 311
>     line 320
> arch/s390/kernel/time.c:
>     line 820
>     line 1590
> arch/x86/kernel/cpu/mtrr/main.c, line 231
> arch/arm64/kernel/insn.c, line 181
> kernel/time/timekeeping.c, line 892
> kernel/trace/ftrace.c, line 2219
> kernel/module.c:
>     line 770
>     line 1861

BTW, as I sent a series of patches, the last two can be removed.

https://lkml.org/lkml/2014/8/25/142

Thank you,

-- 
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu...@hitachi.com