Re: [ANNOUNCE] v5.9.1-rt18
On 10/27/20 1:22 AM, Sebastian Andrzej Siewior wrote:
> On 2020-10-26 23:53:20 [-0700], Fernando Lopez-Lezcano wrote:
>> Maybe I'm doing something wrong but I get a compilation error (see below)
>> when trying to do a debug build (building rpm packages for Fedora).
>> 5.9.1 + rt19... Builds fine otherwise...
>
> If you could remove CONFIG_TEST_LOCKUP then it should work. I will think
> of something.

Thanks much, I should have figured this out for myself :-( Just too busy.
The compilation process went ahead (not finished yet), let me know if there
is a proper patch. No hurry...

Thanks!
-- Fernando
Re: [ANNOUNCE] v5.9.1-rt18
On 10/21/20 6:14 AM, Sebastian Andrzej Siewior wrote:
> On 2020-10-21 14:53:27 [+0200], To Thomas Gleixner wrote:
> Dear RT folks!
>
> I'm pleased to announce the v5.9.1-rt18 patch set.

Maybe I'm doing something wrong but I get a compilation error (see below) when trying to do a debug build (building rpm packages for Fedora). 5.9.1 + rt19...

Builds fine otherwise...

Thanks,
-- Fernando

+ make -s 'HOSTCFLAGS=-O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fexceptions -fstack-protector-strong -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -fcommon -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection' 'HOSTLDFLAGS=-Wl,-z,relro -Wl,--as-needed -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld' ARCH=x86_64 KCFLAGS= WITH_GCOV=0 -j4 modules
BUILDSTDERR: In file included from :
BUILDSTDERR: lib/test_lockup.c: In function 'test_lockup_init':
BUILDSTDERR: lib/test_lockup.c:484:31: error: 'spinlock_t' {aka 'struct spinlock'} has no member named 'rlock'; did you mean 'lock'?
BUILDSTDERR: 484 | offsetof(spinlock_t, rlock.magic),
BUILDSTDERR: | ^
BUILDSTDERR: ././include/linux/compiler_types.h:135:57: note: in definition of macro '__compiler_offsetof'
BUILDSTDERR: 135 | #define __compiler_offsetof(a, b) __builtin_offsetof(a, b)
BUILDSTDERR: | ^
BUILDSTDERR: lib/test_lockup.c:484:10: note: in expansion of macro 'offsetof'
BUILDSTDERR: 484 | offsetof(spinlock_t, rlock.magic),
BUILDSTDERR: | ^~~~
BUILDSTDERR: ././include/linux/compiler_types.h:135:35: error: 'rwlock_t' {aka 'struct rt_rw_lock'} has no member named 'magic'
BUILDSTDERR: 135 | #define __compiler_offsetof(a, b) __builtin_offsetof(a, b)
BUILDSTDERR: | ^~
BUILDSTDERR: ./include/linux/stddef.h:17:32: note: in expansion of macro '__compiler_offsetof'
BUILDSTDERR: 17 | #define offsetof(TYPE, MEMBER) __compiler_offsetof(TYPE, MEMBER)
BUILDSTDERR: | ^~~
BUILDSTDERR: lib/test_lockup.c:487:10: note: in expansion of macro 'offsetof'
BUILDSTDERR: 487 | offsetof(rwlock_t, magic),
BUILDSTDERR: | ^~~~
BUILDSTDERR: lib/test_lockup.c:488:10: error: 'RWLOCK_MAGIC' undeclared (first use in this function); did you mean 'STACK_MAGIC'?
BUILDSTDERR: 488 | RWLOCK_MAGIC) ||
BUILDSTDERR: | ^~~~
BUILDSTDERR: | STACK_MAGIC
BUILDSTDERR: lib/test_lockup.c:488:10: note: each undeclared identifier is reported only once for each function it appears in
BUILDSTDERR: In file included from :
BUILDSTDERR: ././include/linux/compiler_types.h:135:35: error: 'struct mutex' has no member named 'wait_lock'
BUILDSTDERR: 135 | #define __compiler_offsetof(a, b) __builtin_offsetof(a, b)
BUILDSTDERR: | ^~
BUILDSTDERR: ./include/linux/stddef.h:17:32: note: in expansion of macro '__compiler_offsetof'
BUILDSTDERR: 17 | #define offsetof(TYPE, MEMBER) __compiler_offsetof(TYPE, MEMBER)
BUILDSTDERR: | ^~~
BUILDSTDERR: lib/test_lockup.c:490:10: note: in expansion of macro 'offsetof'
BUILDSTDERR: 490 | offsetof(struct mutex, wait_lock.rlock.magic),
BUILDSTDERR: | ^~~~
BUILDSTDERR: ././include/linux/compiler_types.h:135:35: error: 'struct rw_semaphore' has no member named 'wait_lock'
BUILDSTDERR: 135 | #define __compiler_offsetof(a, b) __builtin_offsetof(a, b)
BUILDSTDERR: | ^~
BUILDSTDERR: ./include/linux/stddef.h:17:32: note: in expansion of macro '__compiler_offsetof'
BUILDSTDERR: 17 | #define offsetof(TYPE, MEMBER) __compiler_offsetof(TYPE, MEMBER)
BUILDSTDERR: | ^~~
BUILDSTDERR: lib/test_lockup.c:493:10: note: in expansion of macro 'offsetof'
BUILDSTDERR: 493 | offsetof(struct rw_semaphore, wait_lock.magic),
BUILDSTDERR: | ^~~~
BUILDSTDERR: make[1]: *** [scripts/Makefile.build:283: lib/test_lockup.o] Error 1
BUILDSTDERR: make: *** [Makefile:1784: lib] Error 2
BUILDSTDERR: make: *** Waiting for unfinished jobs

> Changes since v5.9.1-rt17:
>
> - Update the migrate-disable series by Peter Zijlstra to v3. Include
>   also fixes discussed in the thread.
>
> - UP builds did not boot since the replacement of the migrate-disable
>   code. Reported by Christian Egger. Fixed as a part of v3 by Peter
>   Zijlstra.
>
> - Rebase the printk code on top of the ring buffer designed for
>   printk which was merged in the v5.10 merge window. Patches by John
>   Ogness.
>
> Known issues
> - It has been pointed out that due to changes to the printk
Re: [ANNOUNCE] v4.13.10-rt3 (possible recursive locking warning)
On 10/27/2017 03:27 PM, Sebastian Andrzej Siewior wrote:
> Dear RT folks!
>
> I'm pleased to announce the v4.13.10-rt3 patch set.

Thanks!! Wonderful!
I'm seeing this (old Lenovo T510 running Fedora 26):

[ 54.942022]
[ 54.942023] WARNING: possible recursive locking detected
[ 54.942026] 4.13.10-200.rt3.1.fc26.ccrma.x86_64+rt #1 Not tainted
[ 54.942026]
[ 54.942028] csd-sound/1392 is trying to acquire lock:
[ 54.942029] (>wait_lock){-.}, at: [] rt_spin_lock_slowunlock+0x4d/0xa0
[ 54.942038] but task is already holding lock:
[ 54.942039] (>wait_lock){-.}, at: [] futex_lock_pi+0x269/0x4b0
[ 54.942044] other info that might help us debug this:
[ 54.942045] Possible unsafe locking scenario:
[ 54.942045] CPU0
[ 54.942045]
[ 54.942046] lock(>wait_lock);
[ 54.942046] lock(>wait_lock);
[ 54.942047] *** DEADLOCK ***
[ 54.942047] May be due to missing lock nesting notation
[ 54.942048] 1 lock held by csd-sound/1392:
[ 54.942049] #0: (>wait_lock){-.}, at: [] futex_lock_pi+0x269/0x4b0
[ 54.942051] stack backtrace:
[ 54.942053] CPU: 2 PID: 1392 Comm: csd-sound Not tainted 4.13.10-200.rt3.1.fc26.ccrma.x86_64+rt #1
[ 54.942054] Hardware name: LENOVO 4313CTO/4313CTO, BIOS 6MET64WW (1.27 ) 07/15/2010
[ 54.942055] Call Trace:
[ 54.942059] dump_stack+0x8e/0xd6
[ 54.942065] __lock_acquire+0x72f/0x13b0
[ 54.942071] ? sched_clock+0x9/0x10
[ 54.942074] ? futex_lock_pi+0x269/0x4b0
[ 54.942076] lock_acquire+0xa3/0x250
[ 54.942077] ? lock_acquire+0xa3/0x250
[ 54.942079] ? rt_spin_lock_slowunlock+0x4d/0xa0
[ 54.942080] ? reacquire_held_locks+0xf8/0x180
[ 54.942083] _raw_spin_lock_irqsave+0x4d/0x90
[ 54.942084] ? rt_spin_lock_slowunlock+0x4d/0xa0
[ 54.942085] rt_spin_lock_slowunlock+0x4d/0xa0
[ 54.942087] rt_spin_unlock+0x2a/0x40
[ 54.942089] futex_lock_pi+0x277/0x4b0
[ 54.942090] ? futex_wait_queue_me+0x100/0x170
[ 54.942092] ? futex_wait+0x227/0x250
[ 54.942096] do_futex+0x304/0xc20
[ 54.942099] ? wake_up_new_task+0x1ec/0x370
[ 54.942102] ? _do_fork+0x176/0x750
[ 54.942104] ? up_read+0x2a/0x30
[ 54.942106] SyS_futex+0x13b/0x180
[ 54.942110] ? trace_hardirqs_on_thunk+0x1a/0x1c
[ 54.942113] entry_SYSCALL_64_fastpath+0x1f/0xbe
[ 54.942116] RIP: 0033:0x7fe500f2d7b2
[ 54.942116] RSP: 002b:7ffd13017110 EFLAGS: 0246 ORIG_RAX: 00ca
[ 54.942117] RAX: ffda RBX: 7fe4e7df7700 RCX: 7fe500f2d7b2
[ 54.942118] RDX: 0001 RSI: 0086 RDI: 557e090dd3f0
[ 54.942119] RBP: 7ffd13017280 R08: R09: 0001
[ 54.942119] R10: R11: 0246 R12:
[ 54.942120] R13: 7ffd13017210 R14: 7fe4e7df79c0 R15:

Best,
-- Fernando

> Changes since v4.13.10-rt2:
>
> - A dcache related live lock could occur. The writer could get
>   preempted within the critical section and the reader would spin to
>   see the update completed. This update would never complete if the
>   writer was preempted by a reader with a higher priority. Reported by
>   Oleg Karfich.
>
> - The tpm_tis driver can cause latency spikes (~400us) after multiple
>   writes to the chip are followed by a read operation. This read causes
>   a flush of all the cached writes to the chip and is blocking the CPU
>   until the operation completes. Reported and patched by Haris
>   Okanovic.
>
> - The upgrade to v4.13-RT broke the zram driver. Patched by Mike
>   Galbraith.
>
> - Tom Zanussi's "tracing: Inter-event (e.g. latency) support" patchset
>   has been updated to v3.
>
> - The static SRCU notifier wasn't compiling with SRCU_TINY. Reported
>   by kbuild test robot.
Re: [ANNOUNCE] 4.1.3-rt3 - xmit queue timeout, oops, rcu stalls
On 07/25/2015 03:32 AM, Sebastian Andrzej Siewior wrote:
> Dear RT folks!
>
> I'm pleased to announce the v4.1.3-rt3 patch set.
> ...

I've had a few hangs with nothing left behind to debug... but today I find this:

(NOTE: I'm attaching a file with the details, I don't know if my mailer will mangle these lines)

Aug 5 10:46:18 localhost kernel: [ 2343.673560] WARNING: CPU: 3 PID: 43 at net/sched/sch_generic.c:303 dev_watchdog+0x26f/0x280()
Aug 5 10:46:18 localhost kernel: [ 2343.673561] NETDEV WATCHDOG: eth1 (e1000e): transmit queue 0 timed out

and then:

Aug 5 10:46:18 localhost kernel: [ 2343.673679] e1000e :04:00.0 eth1: Reset adapter unexpectedly
Aug 5 10:46:30 localhost kernel: [ 2355.706987] ata5.00: exception Emask 0x40 SAct 0x0 SErr 0x80800 action 0x6 frozen
Aug 5 10:46:30 localhost kernel: [ 2355.706990] ata5: SError: { HostInt 10B8B }
Aug 5 10:46:30 localhost kernel: [ 2355.707003] ata5.00: cmd a0/00:00:00:08:00/00:00:00:00:00/a0 tag 0 pio 16392 in
Aug 5 10:46:30 localhost kernel: [ 2355.707003] Get event status notification 4a 01 00 00 10 00 00 00 08 00
    res 40/00:03:00:00:00/00:00:00:00:00/a0 Emask 0x44 (timeout)
Aug 5 10:46:30 localhost kernel: [ 2355.707005] ata5.00: status: { DRDY }
Aug 5 10:46:30 localhost kernel: [ 2355.707007] ata5: hard resetting link

same one but later in the log:

Aug 5 10:46:18 localhost kernel: WARNING: CPU: 3 PID: 43 at net/sched/sch_generic.c:303 dev_watchdog+0x26f/0x280()
Aug 5 10:46:18 localhost kernel: NETDEV WATCHDOG: eth1 (e1000e): transmit queue 0 timed out

Things apparently keep working and then:

Aug 5 11:58:36 localhost kernel: [ 6678.122596] Network Receive[2409]: segfault at 28 ip 003c4c293ca9 sp 7fb6f64dbb58 error 6 in libc-2.18.so[3c4c20+1b4000]
Aug 5 11:58:36 localhost kernel: Network Receive[2409]: segfault at 28 ip 003c4c293ca9 sp 7fb6f64dbb58 error 6 in libc-2.18.so[3c4c20+1b4000]
Aug 5 11:58:36 localhost kernel: timekeeping watchdog: Marking clocksource 'tsc' as unstable, because the skew is too large:
Aug 5 11:58:36 localhost kernel: 'hpet' wd_now: 47ebf654 wd_last: c0debfe6 mask:
Aug 5 11:58:36 localhost kernel: 'tsc' cs_now: 154f6e564f7d cs_last: 7784d315c59 mask:
Aug 5 11:58:36 localhost systemd: Starting dnf makecache...
Aug 5 11:58:36 localhost kernel: [ 6678.123233] timekeeping watchdog: Marking clocksource 'tsc' as unstable, because the skew is too large:
Aug 5 11:58:36 localhost kernel: [ 6678.123237] 'hpet' wd_now: 47ebf654 wd_last: c0debfe6 mask:
Aug 5 11:58:36 localhost kernel: [ 6678.123238] 'tsc' cs_now: 154f6e564f7d cs_last: 7784d315c59 mask:
Aug 5 11:58:36 localhost kernel: [ 6678.146207] Switched to clocksource hpet
Aug 5 11:58:36 localhost kernel: Switched to clocksource hpet
Aug 5 11:58:36 localhost kernel: [ 6678.150087] BUG: unable to handle kernel NULL pointer dereference at 0ea0
Aug 5 11:58:36 localhost kernel: [ 6678.150097] IP: [] nfs40_discover_server_trunking+0x5e/0x110 [nfsv4]
Aug 5 11:58:36 localhost kernel: [ 6678.150098] PGD 7f3c83067 PUD 7f46fb067 PMD 0
Aug 5 11:58:36 localhost kernel: [ 6678.150099] Oops: [#1] PREEMPT SMP

And eventually (later) get a ton of these:

Aug 5 11:59:36 localhost kernel: [ 6738.107181] INFO: rcu_preempt detected stalls on CPUs/tasks: {} (detected by 3, t=60002 jiffies, g=37092, c=37091, q=0)
Aug 5 11:59:36 localhost kernel: [ 6738.107183] All QSes seen, last rcu_preempt kthread activity 1 (4301410925-4301410924), jiffies_till_next_fqs=3, root ->qsmask 0x0

So something is left in a not good state...
-- Fernando

messages.gz
Description: GNU Zip compressed data
Re: [ANNOUNCE] 4.0.4-rt1
On 06/09/2015 03:05 PM, Pavel Vasilyev wrote:
> 09.06.2015 19:45, Fernando Lopez-Lezcano wrote:
>> This is still happening, about once a day. John Dulaney helped me set up a
>> crash kernel dump (thanks!) so now I have a kernel core dump for this one,
>
> Asus, Fedora, CGROUPS, iptables, snd_ac97, radeon, raid1, kvm - this realtime system? :D

:-P
Yup. I have been using rt for many years - and packaging it - for (very) low latency sound processing. Linux + rt + jackd + rtirq + threaded irqs + jack clients, everything with the right priorities. Runs very nicely unless you hit an issue like the one I'm asking about[*].

Usually running snd_hdspm with RME hardware when in concert situations (this one with Asus mobo is my - quite old by now - desktop at work, but I'm also having the same problem in my Lenovo laptop).

-- Fernando

[*] for example a whole concert for a 24.8 3D sound system with a remote ethernet driven D/A and running all the time with 64 frame x 2 buffers at 48 kHz.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [ANNOUNCE] 4.0.4-rt1
On 05/28/2015 06:56 PM, Fernando Lopez-Lezcano wrote:
> Oh well. Second time the machine hangs in two days in the same way
> (otherwise very stable running 3.18.x-rty) (this is a bumblebee +
> bbswitch graphics laptop - argh, if I had known better...)
>
> May 28 18:49:21 localhost kernel: [ cut here ]
> May 28 18:49:21 localhost kernel: kernel BUG at mm/memcontrol.c:5848!
> May 28 18:49:21 localhost kernel: invalid opcode: [#1] PREEMPT SMP
> ...

This is still happening, about once a day. John Dulaney helped me set up a crash kernel dump (thanks!) so now I have a kernel core dump for this one, I'll try to post more details tomorrow. Let me know if there is anything in particular I should look at.

I'm attaching another backtrace on a different machine (older desktop), slightly different but same end result.

-- Fernando

Jun 9 03:49:19 localhost kernel: [ cut here ]
Jun 9 03:49:19 localhost kernel: kernel BUG at mm/memcontrol.c:5848!
Jun 9 03:49:19 localhost kernel: invalid opcode: [#1] PREEMPT SMP
Jun 9 03:49:19 localhost kernel: Modules linked in: bnep bluetooth rfkill fuse tun act_police cls_basic cls_flow cls_fw cls_u32 sch_tbf sch_prio sch_htb sch_hfsc sch_ingress sch_sfq xt_CHECKSUM ipt_rpfilter xt_statistic xt_CT nf_log_ipv4 nf_log_common xt_LOG xt_connlimit xt_realm xt_addrtype xt_comment xt_recent xt_nat ipt_MASQUERADE nf_nat_masquerade_ipv4 ipt_ECN ipt_CLUSTERIP ipt_ah xt_set ip_set nf_nat_tftp nf_nat_snmp_basic nf_conntrack_snmp nf_nat_sip nf_nat_pptp nf_nat_proto_gre nf_nat_irc nf_nat_h323 nf_nat_ftp nf_nat_amanda ts_kmp nf_conntrack_amanda nf_conntrack_sane ebtable_nat nf_conntrack_tftp ebtables nf_conntrack_sip nf_conntrack_proto_udplite nf_conntrack_proto_sctp nf_conntrack_pptp nf_conntrack_proto_gre nf_conntrack_netlink nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_irc nf_conntrack_h323
Jun 9 03:49:19 localhost kernel: nf_conntrack_ftp xt_TPROXY xt_time xt_TCPMSS xt_tcpmss xt_sctp xt_policy xt_pkttype xt_physdev br_netfilter bridge stp llc xt_owner xt_NFQUEUE xt_NFLOG nfnetlink_log xt_multiport xt_mark xt_mac xt_limit xt_length xt_iprange xt_helper xt_hashlimit xt_DSCP xt_dscp xt_dccp xt_connmark xt_CLASSIFY xt_AUDIT xt_state iptable_raw iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 iptable_mangle nfnetlink nfsv3 nfs_acl auth_rpcgss nfsv4 dns_resolver nfs lockd grace sunrpc fscache ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_conntrack nf_conntrack ip6table_filter ip6_tables w83627ehf hwmon_vid iTCO_wdt iTCO_vendor_support gpio_ich raid1 coretemp kvm_intel kvm hid_logitech_hidpp serio_raw snd_ice1712 snd_cs8427 snd_i2c snd_ice17xx_ak4xxx snd_ak4xxx_adda snd_mpu401_uart
Jun 9 03:49:19 localhost kernel: snd_rawmidi snd_ac97_codec ac97_bus snd_seq snd_seq_device snd_pcm snd_timer snd lpc_ich soundcore i2c_i801 mfd_core shpchp acpi_cpufreq binfmt_misc hid_logitech_dj ata_generic firewire_ohci radeon pata_acpi firewire_core crc_itu_t sata_sil24 i2c_algo_bit drm_kms_helper sky2 r8169 pata_marvell mii ttm drm
Jun 9 03:49:19 localhost kernel: CPU: 2 PID: 10424 Comm: prelink Not tainted 4.0.4-100.rt1.3.fc20.ccrma.x86_64+rt #1
Jun 9 03:49:19 localhost kernel: Hardware name: System manufacturer P5K/EPU/P5K/EPU, BIOS 0604 06/19/2008
Jun 9 03:49:19 localhost kernel: task: 8802225e1520 ti: 8801256c8000 task.ti: 8801256c8000
Jun 9 03:49:19 localhost kernel: RIP: 0010:[] [] mem_cgroup_swapout+0x102/0x110
Jun 9 03:49:19 localhost kernel: RSP: 0018:8801256cb288 EFLAGS: 00010202
Jun 9 03:49:19 localhost kernel: RAX: 0246 RBX: ea0008be7ec0 RCX:
Jun 9 03:49:19 localhost kernel: RDX: RSI: 8801256cb248 RDI: 820236d0
Jun 9 03:49:19 localhost kernel: RBP: 8801256cb298 R08: ea0008be7ee0 R09: 8801256cb498
Jun 9 03:49:19 localhost kernel: R10: 8801256cbfd8 R11: 0002 R12: 880227013800
Jun 9 03:49:19 localhost kernel: R13: 81c652b8 R14: 0001 R15: 81c652a0
Jun 9 03:49:19 localhost kernel: FS: 01ad8900(0063) GS:88022fd0() knlGS:
Jun 9 03:49:19 localhost kernel: CS: 0010 DS: ES: CR0: 80050033
Jun 9 03:49:19 localhost kernel: CR2: 02131878 CR3: 00014a4f3000 CR4: 07e0
Jun 9 03:49:19 localhost kernel: Stack:
Jun 9 03:49:19 localhost kernel: ea0008be7ec0 000c2a0d 8801256cb2d8 811b60c7
Jun 9 03:49:19 localhost kernel: 8801256cb738 ea0008be7ec0 8801256cb4b0
Jun 9 03:49:19 localhost kernel: ea0008be7ee0 81c652a0 8801256cb418 811b924f
Jun 9 03:49:19 localhost kernel: Call Trace:
Jun 9 03:49:19 localhost kernel: [] __remove_mapping+0x107/0x180
Jun 9 03:49:19 localhost kernel: [] shrink_page_list+0x7df/0xb50
Jun 9 03:49:19 localhost kernel
Re: [ANNOUNCE] 4.0.4-rt1
On 05/26/2015 12:41 PM, Fernando Lopez-Lezcano wrote:
> On 05/26/2015 08:43 AM, Clark Williams wrote:
>> On Tue, 26 May 2015 11:19:24 -0400 Steven Rostedt wrote:
>>> On Tue, 26 May 2015 08:48:02 -0500 Clark Williams wrote:
>>>> Change the WARN_ON to WARN_ON_NORT
>>>
>>> Do we have a WARN_ON_NORT? I see a WARN_ON_NONRT, but not a
>>> WARN_ON_NORT. Does this compile?
>>>
>>> -- Steve
>>
>> Sigh. Of course not. Reupdated patch (and yes this one compiles):
>
> Thanks!
> Seems to have fixed the problem (of course!)
> So far so good and nothing weird in the output of dmesg

Oh well. Second time the machine hangs in two days in the same way (otherwise very stable running 3.18.x-rty) (this is a bumblebee + bbswitch graphics laptop - argh, if I had known better...)

May 28 18:49:21 localhost kernel: [ cut here ]
May 28 18:49:21 localhost kernel: kernel BUG at mm/memcontrol.c:5848!
May 28 18:49:21 localhost kernel: invalid opcode: [#1] PREEMPT SMP
May 28 18:49:21 localhost kernel: Modules linked in: ccm rfcomm fuse xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun nf_conntrack_netbios_ns nf_conntrack_broadcast ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw bnep bbswitch(OE) vfat fat iTCO_wdt iTCO_vendor_support arc4 intel_rapl iosf_mbi coretemp kvm_intel kvm uvcvideo crct10dif_pclmul videobuf2_vmalloc crc32_pclmul crc32c_intel videobuf2_core videobuf2_memops ghash_clmulni_intel v4l2_common iwlmvm videodev media mac80211 btusb bluetooth iwlwifi
May 28 18:49:21 localhost kernel: snd_hda_codec_realtek snd_hda_codec_hdmi lpc_ich snd_hda_codec_generic cfg80211 mfd_core i2c_i801 snd_hda_intel snd_hda_controller snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm thinkpad_acpi snd_timer rfkill tpm_tis snd mei_me ie31200_edac tpm mei edac_core soundcore shpchp nfsd auth_rpcgss nfs_acl lockd grace sunrpc serio_raw i915 sdhci_pci i2c_algo_bit sdhci e1000e drm_kms_helper mmc_core ptp drm pps_core wmi video
May 28 18:49:21 localhost kernel: CPU: 4 PID: 134 Comm: kswapd0 Tainted: G OE 4.0.4-201.rt1.3.fc21.ccrma.x86_64+rt #1
May 28 18:49:21 localhost kernel: Hardware name: LENOVO 20BGCTO1WW/20BGCTO1WW, BIOS GNET65WW (2.13 ) 06/20/2014
May 28 18:49:21 localhost kernel: task: 88046650e9a0 ti: 88046657c000 task.ti: 88046657c000
May 28 18:49:21 localhost kernel: RIP: 0010:[] [] mem_cgroup_swapout+0x118/0x120
May 28 18:49:21 localhost kernel: RSP: 0018:88046657f998 EFLAGS: 00010202
May 28 18:49:21 localhost kernel: RAX: 0246 RBX: ea0011150e80 RCX:
May 28 18:49:21 localhost kernel: RDX: 0006a980 RSI: 0001 RDI: 88046d413800
May 28 18:49:21 localhost kernel: RBP: 88046657f9a8 R08: 81c68700 R09: 88046657fba8
May 28 18:49:21 localhost kernel: R10: 000a R11: 88046657ffd8 R12: 88046d413800
May 28 18:49:21 localhost kernel: R13: 81c68718 R14: 0001 R15: 88046657faa8
May 28 18:49:21 localhost kernel: FS: () GS:88046da0() knlGS:
May 28 18:49:21 localhost kernel: CS: 0010 DS: ES: CR0: 80050033
May 28 18:49:21 localhost kernel: CR2: 7efd07eb8000 CR3: 01c0e000 CR4: 001407e0
May 28 18:49:21 localhost kernel: Stack:
May 28 18:49:21 localhost kernel: ea0011150e80 00196218 88046657f9e8 811b8d8f
May 28 18:49:21 localhost kernel: 88046657fe48 ea0011150e80 88046657fbc0
May 28 18:49:21 localhost kernel: ea0011150ea0 88046657faa8 88046657fb28 811bba3f
May 28 18:49:21 localhost kernel: Call Trace:
May 28 18:49:21 localhost kernel: [] __remove_mapping+0x12f/0x1a0
May 28 18:49:21 localhost kernel: [] shrink_page_list+0x5ef/0xc30
May 28 18:49:21 localhost kernel: [] shrink_inactive_list+0x1e9/0x630
May 28 18:49:21 localhost kernel: [] shrink_lruvec+0x62c/0x830
May 28 18:49:21 localhost kernel: [] ? __switch_to+0x150/0x610
May 28 18:49:21 localhost kernel: [] shrink_zone+0xf4/0x2d0
May 28 18:49:21 localhost kernel: [] kswapd+0x587/0xa80
May 28 18:49:21 localhost kernel: [] ? mem_cgroup_shrink_node_zone+0x1f0/0x1f0
May 28 18:49:21 localhost kernel: [] kthread+0xca/0xe0
May 28 18:49:21 localhost kernel: [] ? kthread_worker_fn+0x180/0x180
May 28 18:49:21 localhost kernel: [] ret_from_fork+0x58/0x90
May 28 18:49:21 localhost kernel: [] ? kthread_worker_fn+0x180/0x180
May 28 18:49:21 localhost kernel: Code: a6 81 48 89 df e8 a9 ce fb ff 0f 0b 0f 1f 80 00 00 00 00 48 c7 c6 f3 b7 a6 81 48 89 df e8 91 ce fb ff 0f 0b 0f 1f 80 00 00 00 00 <0f> 0b 66 0f 1f 44 00 00
Re: [ANNOUNCE] 4.0.4-rt1
On 05/26/2015 08:43 AM, Clark Williams wrote:
> On Tue, 26 May 2015 11:19:24 -0400 Steven Rostedt wrote:
>> On Tue, 26 May 2015 08:48:02 -0500 Clark Williams wrote:
>>> Change the WARN_ON to WARN_ON_NORT
>>
>> Do we have a WARN_ON_NORT? I see a WARN_ON_NONRT, but not a WARN_ON_NORT.
>> Does this compile?
>>
>> -- Steve
>
> Sigh. Of course not.
>
> Reupdated patch (and yes this one compiles):

Thanks! Seems to have fixed the problem (of course!)
So far so good and nothing weird in the output of dmesg

-- Fernando

> From: Clark Williams
> Date: Thu, 21 May 2015 12:51:53 -0500
> Subject: [PATCH] [rt] i915: bogus warning from i915 when running on PREEMPT_RT
>
> The i915 driver has a 'WARN_ON(!in_interrupt())' in the display
> handler, which whines constantly on the RT kernel (since the interrupt
> is actually handled in a threaded handler and not actual interrupt
> context).
>
> Change the WARN_ON to WARN_ON_NONRT
>
> Signed-off-by: Clark Williams
> ---
>  drivers/gpu/drm/i915/intel_display.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index f75173c20f47..30b1d16caa0d 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -9745,7 +9745,7 @@ void intel_check_page_flip(struct drm_device *dev, int pipe)
>  	struct drm_crtc *crtc = dev_priv->pipe_to_crtc_mapping[pipe];
>  	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
>
> -	WARN_ON(!in_interrupt());
> +	WARN_ON_NONRT(!in_interrupt());
>
>  	if (crtc == NULL)
>  		return;

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
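For readers unfamiliar with the helper used in the patch above, here is a minimal user-space sketch of the idea. It assumes that the -rt tree defines WARN_ON_NONRT() as a no-op on RT builds and as a plain WARN_ON() otherwise; the two macro names ending in _RT_BUILD and _MAINLINE_BUILD, the warn_count counter and the simplified WARN_ON() are hypothetical stand-ins so that both variants can be exercised in one program. It is not the kernel's actual header.

```c
#include <stdio.h>

/* Counter standing in for the kernel log, so the effect is observable. */
static int warn_count;

/* Hypothetical user-space stand-in for the kernel's WARN_ON(). */
#define WARN_ON(cond)                                            \
	do {                                                     \
		if (cond) {                                      \
			warn_count++;                            \
			fprintf(stderr, "WARNING: %s\n", #cond); \
		}                                                \
	} while (0)

/* Assumed RT-build definition: the check is compiled out entirely. */
#define WARN_ON_NONRT_RT_BUILD(cond)        do { } while (0)

/* Assumed non-RT definition: behaves exactly like WARN_ON(). */
#define WARN_ON_NONRT_MAINLINE_BUILD(cond)  WARN_ON(cond)

/* Number of warnings emitted by the RT variant of the check. */
int warnings_on_rt(int in_interrupt)
{
	(void)in_interrupt;	/* unused: the RT variant drops the condition */
	warn_count = 0;
	WARN_ON_NONRT_RT_BUILD(!in_interrupt);
	return warn_count;
}

/* Number of warnings emitted by the non-RT variant of the check. */
int warnings_on_mainline(int in_interrupt)
{
	warn_count = 0;
	WARN_ON_NONRT_MAINLINE_BUILD(!in_interrupt);
	return warn_count;
}
```

With in_interrupt set to 0, which models the threaded i915 handler on RT, the RT variant stays silent while the non-RT variant warns once. That matches the behaviour the patch is after.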
Re: [ANNOUNCE] 4.0.4-rt1
On 05/19/2015 02:39 PM, Sebastian Andrzej Siewior wrote:
> Dear RT folks!
>
> I'm pleased to announce the v4.0.4-rt1 patch set.

Great!!

> Changes since v3.18.13-rt10
> - Rebase to v4.0.
> - David Hildenbrand's series of decouple of preempt_disable from
>   pagefault_disable is part of the series.
>
> While doing the v4.0 I stumbled upon a few things. Therefore I plan to
> reorder the -RT queue and merge patches where possible. Also I intend to
> drop PREEMPT_RTB and PREEMPT_RT_BASE unless there is need for it… ...

I had to do this to get it to build (looks like it is not rt specific, probably just a typo in mainline):

--- linux-4.0/sound/soc/intel/sst/sst.c~	2015-04-12 15:12:50.0 -0700
+++ linux-4.0/sound/soc/intel/sst/sst.c	2015-05-23 21:51:46.0 -0700
@@ -368,8 +368,8 @@
 	 * initialize by FW or driver when firmware is loaded
 	 */
 	spin_lock_irqsave(&ctx->ipc_spin_lock, irq_flags);
-	sst_shim_write64(shim, SST_IMRX, shim_regs->imrx),
-	sst_shim_write64(shim, SST_CSR, shim_regs->csr),
+	sst_shim_write64(shim, SST_IMRX, shim_regs->imrx);
+	sst_shim_write64(shim, SST_CSR, shim_regs->csr);
 	spin_unlock_irqrestore(&ctx->ipc_spin_lock, irq_flags);
 }

On a desktop with an i7-3770k it seems to run fine (but I have not had time to test for latency problems).

On my laptop, a lenovo w540, I get this continuously - so it is not really usable at this point:

May 24 13:51:41 localhost kernel: ------------[ cut here ]------------
May 24 13:51:41 localhost kernel: WARNING: CPU: 5 PID: 361 at drivers/gpu/drm/i915/intel_display.c:9748 intel_check_page_flip+0xaa/0xf0 [i915]()
May 24 13:51:41 localhost kernel: WARN_ON(!in_interrupt())
May 24 13:51:41 localhost kernel: Modules linked in:
May 24 13:51:41 localhost kernel: rfcomm fuse ccm xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 nf_conntrack_netbios_ns nf_conntrack_broadcast ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw bnep bbswitch(OE) vfat fat iTCO_wdt iTCO_vendor_support arc4 intel_rapl iosf_mbi coretemp kvm_intel kvm uvcvideo crct10dif_pclmul videobuf2_vmalloc videobuf2_core crc32_pclmul crc32c_intel videobuf2_memops ghash_clmulni_intel v4l2_common videodev media iwlmvm btusb serio_raw mac80211 bluetooth snd_hda_codec_realtek
May 24 13:51:41 localhost kernel: snd_hda_codec_hdmi snd_hda_codec_generic iwlwifi snd_hda_intel sdhci_pci snd_hda_controller cfg80211 sdhci snd_hda_codec mmc_core snd_hwdep lpc_ich snd_seq mei_me i2c_i801 mfd_core snd_seq_device mei snd_pcm thinkpad_acpi snd_timer ie31200_edac snd shpchp edac_core soundcore tpm_tis rfkill tpm nfsd auth_rpcgss nfs_acl lockd grace sunrpc i915 i2c_algo_bit e1000e drm_kms_helper ptp drm pps_core wmi video
May 24 13:51:41 localhost kernel: CPU: 5 PID: 361 Comm: irq/30-i915 Tainted: GW OE 4.0.4-201.rt1.2.fc21.ccrma.x86_64+rt #1
May 24 13:51:41 localhost kernel: Hardware name: LENOVO 20BGCTO1WW/20BGCTO1WW, BIOS GNET65WW (2.13 ) 06/20/2014
May 24 13:51:41 localhost kernel: 1f35af7b 8804651afc78 8179c0b9
May 24 13:51:41 localhost kernel: 8804651afcd0 8804651afcb8 8109ee1a
May 24 13:51:41 localhost kernel: 8804651afcb8 88046638c000 880469dd7800 0001
May 24 13:51:41 localhost kernel: Call Trace:
May 24 13:51:41 localhost kernel: [8179c0b9] dump_stack+0x4c/0x81
May 24 13:51:41 localhost kernel: [8109ee1a] warn_slowpath_common+0x8a/0xe0
May 24 13:51:41 localhost kernel: [8109eec5] warn_slowpath_fmt+0x55/0x70
May 24 13:51:41 localhost kernel: [a0186dda] intel_check_page_flip+0xaa/0xf0 [i915]
May 24 13:51:41 localhost kernel: [a0152018] ironlake_irq_handler+0x2e8/0x1000 [i915]
May 24 13:51:41 localhost kernel: [81014610] ? __switch_to+0x150/0x610
May 24 13:51:41 localhost kernel: [810fb040] ? irq_thread_fn+0x50/0x50
May 24 13:51:41 localhost kernel: [810fb067] irq_forced_thread_fn+0x27/0x80
May 24 13:51:41 localhost kernel: [810fb61f] irq_thread+0x12f/0x180
May 24 13:51:41 localhost kernel: [810fb0f0] ? wake_threads_waitq+0x30/0x30
May 24 13:51:41 localhost kernel: [810fb4f0] ? irq_thread_check_affinity+0x90/0x90
May 24 13:51:41 localhost kernel: [810bf4ba] kthread+0xca/0xe0
May 24 13:51:41 localhost kernel: [810bf3f0] ? kthread_worker_fn+0x180/0x180
May 24 13:51:41 localhost kernel: [817a2098] ret_from_fork+0x58/0x90
May 24 13:51:41 localhost kernel: [810bf3f0] ? kthread_worker_fn+0x180/0x180
May 24 13:51:41 localhost kernel: ---[ end trace 05fe ]---
May 24 13:51:41 localhost kernel: ------------[ cut here ]------------

Any patches I could try to fix this?

-- Fernando
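A short sketch of why the trailing commas in the sst.c hunk above can go unnoticed. In C, calls joined by the comma operator form a single expression statement and still run left to right, so the code behaves the same as long as every operand is an ordinary function call. The function names below are hypothetical stand-ins, not the driver's API. The likely reason the build broke here (an assumption, not confirmed in this thread) is that on -rt spin_unlock_irqrestore() is a `do { ... } while (0)` macro, and a statement cannot be an operand of the comma operator.

```c
#include <string.h>

/* Small call log so the execution order can be inspected. */
static char call_log[8];
static size_t call_pos;

static void record(char tag)
{
	/* Leave the last byte untouched so the log stays NUL-terminated. */
	if (call_pos < sizeof(call_log) - 1)
		call_log[call_pos++] = tag;
}

static void reset_log(void)
{
	memset(call_log, 0, sizeof(call_log));
	call_pos = 0;
}

/* Hypothetical stand-ins for the two register writes and the unlock. */
static void write_imrx(void) { record('i'); }
static void write_csr(void)  { record('c'); }
static void unlock(void)     { record('u'); }

/* Trailing commas: the three calls are ONE expression statement. */
const char *restore_with_commas(void)
{
	reset_log();
	write_imrx(),
	write_csr(),
	unlock();	/* would not compile if unlock were a do/while macro */
	return call_log;
}

/* Semicolons: three separate statements, as in the fixed hunk. */
const char *restore_with_semicolons(void)
{
	reset_log();
	write_imrx();
	write_csr();
	unlock();
	return call_log;
}
```

Both variants record the calls in the same order, which is consistent with the typo surviving in builds where the unlock is a real function.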
Re: [ANNOUNCE] 3.14-rt1
On 05/02/2014 04:37 AM, Sebastian Andrzej Siewior wrote:
> * Fernando Lopez-Lezcano | 2014-04-26 11:29:04 [-0700]:
>> Saw this a moment ago (3.14.1 + rt1, Fedora 19 laptop - I think I have
>> seen something similar in 3.12.x-r):
>
> Yes, you did: https://lkml.org/lkml/2014/3/7/163
> You did not test I've sent. Care to do so?

I did patch my kernel and (I think) I did not see the problem again. I did get some very occasional hangs that seemed to be video related, but I could not see what had caused them.

>> Apr 26 11:16:11 localhost kernel: [ 96.323248] ------------[ cut here ]------------
>> Apr 26 11:16:11 localhost kernel: [ 96.323262] WARNING: CPU: 0 PID: 2051 at lib/list_debug.c:59 __list_del_entry+0xa1/0xd0()
>> Apr 26 11:16:11 localhost kernel: [ 96.323264] list_del corruption. prev->next should be 8802101196a0, but was 0001
>> Apr 26 11:16:11 localhost kernel: [ 96.323266] Modules linked in:
>
> and please send backtrace information properly formatted. This is
> terrible hard to read.

Sorry about that, I will attach files in the future.

I re-patched 3.14.3-rt5 with a slightly tweaked version of your patch. Will see what happens and report back.

-- Fernando
Re: [ANNOUNCE] 3.14-rt1
On 04/11/2014 11:57 AM, Sebastian Andrzej Siewior wrote:
> Dear RT folks!
>
> I'm pleased to announce the v3.14-rt1 patch set.
>
> Changes since v3.12.15-rt25
> - I dropped the sparc64 patches I had in the queue. They did not apply
>   cleanly, the code in v3.14 changed in the MMU area. Here is where I
>   remembered that it was not working perfectly either.

Saw this a moment ago (3.14.1 + rt1, Fedora 19 laptop - I think I have seen something similar in 3.12.x-r):

Apr 26 11:16:11 localhost kernel: [ 96.323248] ------------[ cut here ]------------
Apr 26 11:16:11 localhost kernel: [ 96.323262] WARNING: CPU: 0 PID: 2051 at lib/list_debug.c:59 __list_del_entry+0xa1/0xd0()
Apr 26 11:16:11 localhost kernel: [ 96.323264] list_del corruption. prev->next should be 8802101196a0, but was 0001
Apr 26 11:16:11 localhost kernel: [ 96.323266] Modules linked in: fuse ipt_MASQUERADE xt_CHECKSUM tun ip6t_rpfilter ip6t_REJECT xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw rfcomm ip6table_filter bnep ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw iTCO_wdt iTCO_vendor_support coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel uvcvideo videobuf2_vmalloc microcode videobuf2_memops snd_hda_codec_hdmi videobuf2_core videodev media serio_raw btusb bluetooth intel_ips i2c_i801 6lowpan_iphc snd_hda_codec_conexant snd_hda_codec_generic arc4 iwldvm mac80211 iwlwifi lpc_ich sdhci_pci mfd_core sdhci cfg80211 mmc_core snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm e1000e snd_timer ptp mei_me pps_core mei shpchp thinkpad_acpi snd ppdev soundcore rfkill parport_pc parport acpi_cpufreq uinput firewire_ohci nouveau firewire_core crc_itu_t i2c_algo_bit drm_kms_helper ttm drm mxm_wmi i2c_core wmi video
Apr 26 11:16:11 localhost kernel: [ 96.323331] CPU: 0 PID: 2051 Comm: cinnamon Not tainted 3.14.1-200.rt1.1.fc19.ccrma.x86_64+rt #1
Apr 26 11:16:11 localhost kernel: [ 96.323332] Hardware name: LENOVO 4313CTO/4313CTO, BIOS 6MET64WW (1.27 ) 07/15/2010
Apr 26 11:16:11 localhost kernel: [ 96.323334] 8a5c11dc 8800ae715a88 81707fca
Apr 26 11:16:11 localhost kernel: [ 96.323336] 8800ae715ad0 8800ae715ac0 8108d03d 8802101196a0
Apr 26 11:16:11 localhost kernel: [ 96.323337] 880210119b50 880210119b50 880210119b40 88021a615648
Apr 26 11:16:11 localhost kernel: [ 96.323338] Call Trace:
Apr 26 11:16:11 localhost kernel: [ 96.323345] [81707fca] dump_stack+0x4d/0x82
Apr 26 11:16:11 localhost kernel: [ 96.323351] [8108d03d] warn_slowpath_common+0x7d/0xc0
Apr 26 11:16:11 localhost kernel: [ 96.323352] [8108d0dc] warn_slowpath_fmt+0x5c/0x80
Apr 26 11:16:11 localhost kernel: [ 96.323354] [8137c551] __list_del_entry+0xa1/0xd0
Apr 26 11:16:11 localhost kernel: [ 96.323355] [8137c58d] list_del+0xd/0x30
Apr 26 11:16:11 localhost kernel: [ 96.323393] [a0135593] nouveau_fence_signal+0x53/0x80 [nouveau]
Apr 26 11:16:11 localhost kernel: [ 96.323414] [a0135678] nouveau_fence_update+0x48/0xa0 [nouveau]
Apr 26 11:16:11 localhost kernel: [ 96.323435] [a0135f85] nouveau_fence_sync+0x45/0x80 [nouveau]
Apr 26 11:16:11 localhost kernel: [ 96.323456] [a013aea8] validate_list+0xd8/0x2e0 [nouveau]
Apr 26 11:16:11 localhost kernel: [ 96.323478] [a013c3d3] nouveau_gem_ioctl_pushbuf+0xaa3/0x13e0 [nouveau]
Apr 26 11:16:11 localhost kernel: [ 96.323500] [a002ad02] drm_ioctl+0x4f2/0x620 [drm]
Apr 26 11:16:11 localhost kernel: [ 96.323506] [810c1af4] ? migrate_enable+0x94/0x1c0
Apr 26 11:16:11 localhost kernel: [ 96.323527] [a0132cfe] nouveau_drm_ioctl+0x4e/0x90 [nouveau]
Apr 26 11:16:11 localhost kernel: [ 96.323530] [81203480] do_vfs_ioctl+0x2e0/0x4c0
Apr 26 11:16:11 localhost kernel: [ 96.323533] [812fd8d6] ? file_has_perm+0xa6/0xb0
Apr 26 11:16:11 localhost kernel: [ 96.323535] [812036e1] SyS_ioctl+0x81/0xa0
Apr 26 11:16:11 localhost kernel: [ 96.323538] [81716769] system_call_fastpath+0x16/0x1b
Apr 26 11:16:11 localhost kernel: [ 96.323569] ---[ end trace 0002 ]---

-- Fernando
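The "list_del corruption. prev->next should be ..., but was ..." line in the trace above comes from the kernel's debug variant of list deletion, which checks the neighbours of an entry before unlinking it. Below is a simplified user-space model of that check. The struct and the helper names list_init(), list_add_head() and checked_list_del() are illustrative, and the real lib/list_debug.c also checks for poisoned pointers, which is omitted here.

```c
#include <stdbool.h>
#include <stdio.h>

/* Minimal doubly linked circular list, modelled on the kernel's list_head. */
struct list_head {
	struct list_head *next, *prev;
};

void list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Insert item directly after head. */
void list_add_head(struct list_head *item, struct list_head *head)
{
	item->next = head->next;
	item->prev = head;
	head->next->prev = item;
	head->next = item;
}

/*
 * Unlink entry only if both neighbours still point back at it.
 * Returns true on success, false if the list looks corrupted
 * (the case that produces the warning quoted in the report).
 */
bool checked_list_del(struct list_head *entry)
{
	struct list_head *prev = entry->prev;
	struct list_head *next = entry->next;

	if (prev->next != entry) {
		fprintf(stderr,
			"list_del corruption. prev->next should be %p, but was %p\n",
			(void *)entry, (void *)prev->next);
		return false;
	}
	if (next->prev != entry) {
		fprintf(stderr,
			"list_del corruption. next->prev should be %p, but was %p\n",
			(void *)entry, (void *)next->prev);
		return false;
	}

	next->prev = prev;
	prev->next = next;
	return true;
}
```

A failed check like this usually means that two contexts modified the same list without sufficient locking, or that an entry was deleted twice. Which of these applies to the nouveau fence list is not established in this thread.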
Re: 3.12.9-rt13: BUG: soft lockup
On 02/14/2014 02:43 AM, Thomas Gleixner wrote:
On Thu, 13 Feb 2014, Fernando Lopez-Lezcano wrote:
On 02/13/2014 03:55 PM, Thomas Gleixner wrote:
On Thu, 13 Feb 2014, Fernando Lopez-Lezcano wrote:
On 02/13/2014 02:25 PM, Thomas Gleixner wrote:
On Wed, 12 Feb 2014, Fernando Lopez-Lezcano wrote:

[771508.546449] RIP: 0010:[810dc60a] [810dc60a] smp_call_function_many+0x2ca/0x330

Can you decode the exact location inside of smp_call_function_many via addr2line please?

# addr2line -e /usr/lib/debug/lib/modules/3.12.9-301.rt13.1.fc20.ccrma.x86_64+rt/vmlinux 810dc60e
/usr/src/debug/kernel-3.12.fc20.ccrma/linux-3.12.9-301.rt13.1.fc20.ccrma.x86_64/kernel/smp.c:108

So it's stuck in csd_lock_wait(), which means that the csd of the target cpu is not free.

Is the machine completely dead or can you still retrieve information from it?

After migrating to fc20/3.12.x-rtyy I started experiencing freezes in some workstations. This coincided with one of our students running high cpu load multi-core computations in them (he had been doing that before under 3.10.x-rtyy with no problems). In the morning I would find workstations unresponsive and catatonic. Probably his software was still eating up cpu as the machines were warm (ie: still under load). No pings back or keyboard/mouse/display response.

This was the only time I could get information from a machine while it was in the process of freezing up - but this might have been a different issue. I was ssh'd in and that terminal became unresponsive. I managed to ssh in again and looked at the logs. The machine was not completely frozen but it eventually became completely catatonic. For all I know this might be different from the locked machines syndrome as it left traces in the logs (I could forward you all the log entries if you want).

I could try to boot one of the machines into 3.12.x-rtyy, replicate the conditions and wait. What should I look for if I can catch this in the act?

-- Fernando
Re: 3.12.9-rt13: BUG: soft lockup
On 02/14/2014 02:43 AM, Thomas Gleixner wrote:
On Thu, 13 Feb 2014, Fernando Lopez-Lezcano wrote:
On 02/13/2014 03:55 PM, Thomas Gleixner wrote:
On Thu, 13 Feb 2014, Fernando Lopez-Lezcano wrote:
On 02/13/2014 02:25 PM, Thomas Gleixner wrote:
On Wed, 12 Feb 2014, Fernando Lopez-Lezcano wrote:

[771508.546449] RIP: 0010:[810dc60a] [810dc60a] smp_call_function_many+0x2ca/0x330

Can you decode the exact location inside of smp_call_function_many via addr2line please ?

# addr2line -e /usr/lib/debug/lib/modules/3.12.9-301.rt13.1.fc20.ccrma.x86_64+rt/vmlinux 810dc60e
/usr/src/debug/kernel-3.12.fc20.ccrma/linux-3.12.9-301.rt13.1.fc20.ccrma.x86_64/kernel/smp.c:108

So it's stuck in csd_lock_wait(), which means that the csd of the target cpu is not free.

Is the machine completely dead or can you still retrieve information from it?

After migrating to fc20/3.12.x-rtyy I started experiencing freezes on some workstations. This coincided with one of our students running high cpu load multi-core computations on them (he had been doing that before under 3.10.x-rtyy with no problems). In the morning I would find workstations unresponsive and catatonic. Probably his software was still eating up cpu as the machines were warm (i.e. still under load). No pings back or keyboard/mouse/display response.

This was the only time I could get information from a machine while it was in the process of freezing up - but this might have been a different issue. I was ssh'd in and that terminal became unresponsive. I managed to ssh in again and looked at the logs. The machine was not completely frozen but it eventually became completely catatonic. For all I know this might be different from the locked machines syndrome as it left traces in the logs (I could forward you all the log entries if you want).

I could try to boot one of the machines into 3.12.x-rtyy, replicate the conditions and wait. What should I look for if I can catch this in the act?

-- Fernando
Re: 3.12.9-rt13: BUG: soft lockup
On 02/13/2014 02:25 PM, Thomas Gleixner wrote:
On Wed, 12 Feb 2014, Fernando Lopez-Lezcano wrote:

[771508.546449] RIP: 0010:[810dc60a] [810dc60a] smp_call_function_many+0x2ca/0x330

Can you decode the exact location inside of smp_call_function_many via addr2line please ?

Hope this is useful (adding 0x2ce/0x330 as offsets does not make any difference, don't know if it should)...

# grep smp_call_function /var/log/messages|tail -1
Feb 12 14:18:21 cmn27 kernel: [771840.224419] RIP: 0010:[810dc60e] [810dc60e] smp_call_function_many+0x2ce/0x330
# addr2line -e /usr/lib/debug/lib/modules/3.12.10-300.rt15.1.fc20.ccrma.x86_64+rt/vmlinux 810dc60e
/usr/src/debug/kernel-3.12.fc20.ccrma/linux-3.12.10-300.rt15.1.fc20.ccrma.x86_64/kernel/rtmutex.c:1295

This is the only time I was able to catch some logs of the problem (if it is the same). I had to revert to 3.10.27-rt25 for the time being and that seems to be holding up well so far.

-- Fernando
Re: 3.12.9-rt13: BUG: soft lockup
On 02/13/2014 03:55 PM, Thomas Gleixner wrote:
On Thu, 13 Feb 2014, Fernando Lopez-Lezcano wrote:
On 02/13/2014 02:25 PM, Thomas Gleixner wrote:
On Wed, 12 Feb 2014, Fernando Lopez-Lezcano wrote:

[771508.546449] RIP: 0010:[810dc60a] [810dc60a] smp_call_function_many+0x2ca/0x330

Can you decode the exact location inside of smp_call_function_many via addr2line please ?

Hope this is useful (adding 0x2ce/0x330 as offsets does not make any difference, don't know if it should)...

# grep smp_call_function /var/log/messages|tail -1
Feb 12 14:18:21 cmn27 kernel: [771840.224419] RIP: 0010:[810dc60e] [810dc60e] smp_call_function_many+0x2ce/0x330
# addr2line -e /usr/lib/debug/lib/modules/3.12.10-300.rt15.1.fc20.ccrma.x86_64+rt/vmlinux 810dc60e
/usr/src/debug/kernel-3.12.fc20.ccrma/linux-3.12.10-300.rt15.1.fc20.ccrma.x86_64/kernel/rtmutex.c:1295

I can't see how the kernel decoder thinks it's smp_call_function_many but addr2line looks at rtmutex.c

That doesn't make any sense at all. Version mismatch?

Indeed, sorry for the mixup... here I go again, hopefully this one will make sense:

# addr2line -e /usr/lib/debug/lib/modules/3.12.9-301.rt13.1.fc20.ccrma.x86_64+rt/vmlinux 810dc60e
/usr/src/debug/kernel-3.12.fc20.ccrma/linux-3.12.9-301.rt13.1.fc20.ccrma.x86_64/kernel/smp.c:108

-- Fernando
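Editorial sketch of how the `symbol+0xOFFSET/0xSIZE` part of the RIP lines above relates to the bracketed address: the offset is the distance from the start of the function, so subtracting it from the address gives the function's start. The helper names `parse_oops_loc` and `symbol_base` are hypothetical, and the addresses are used exactly as they appear in this archive copy (eight hex digits), which may not be the full kernel address.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* One decoded "symbol+0xOFFSET/0xSIZE" token from an oops line. */
struct oops_loc {
    char symbol[64];
    unsigned long offset; /* bytes from the start of the function */
    unsigned long size;   /* total size of the function in bytes  */
};

/* Parse a token such as "smp_call_function_many+0x2ca/0x330".
 * Returns 0 on success, -1 if the token does not have that shape. */
int parse_oops_loc(const char *token, struct oops_loc *out)
{
    assert(token != NULL && out != NULL);
    /* %63[^+] reads the symbol up to '+'; %lx accepts the 0x prefix. */
    if (sscanf(token, "%63[^+]+%lx/%lx", out->symbol, &out->offset, &out->size) != 3)
        return -1;
    return out->offset <= out->size ? 0 : -1;
}

/* Start address of the function: reported address minus the offset. */
uint64_t symbol_base(uint64_t addr, const struct oops_loc *loc)
{
    return addr - loc->offset;
}
```

Applied to the two quoted lines, 810dc60a with offset 0x2ca and 810dc60e with offset 0x2ce both give 810dc340 as the start of smp_call_function_many as printed. The sketch only does this arithmetic; resolving an address to a source line is still addr2line's job, and as the exchange above shows, the answer is only meaningful when the vmlinux passed to `-e` belongs to the kernel that produced the log.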
3.12.9-rt13: BUG: soft lockup
Hi all,

I'm seeing these BUGs with 3.12.9-rt13 and finally caught the messages. I was getting frozen machines with no traces left behind, this could possibly be it (see below - I have to retest with rt15)

-- Fernando

[771508.546420] BUG: soft lockup - CPU#5 stuck for 23s! [SweepSinVsUsm:1421]
[771508.546431] Modules linked in: bnep bluetooth fuse tun act_police cls_basic cls_flow cls_fw cls_u32 sch_fq_codel sch_tbf sch_prio sch_htb sch_hfsc sch_ingress sch_sfq nfsv3 nfs_acl rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd sunrpc fscache bridge stp llc xt_CHECKSUM ipt_rpfilter xt_statistic xt_CT xt_LOG xt_connlimit xt_realm xt_addrtype xt_comment xt_recent xt_nat ipt_ULOG ipt_MASQUERADE ipt_ECN ipt_CLUSTERIP ipt_ah xt_set ip_set nf_nat_tftp nf_nat_snmp_basic nf_conntrack_snmp nf_nat_sip nf_nat_pptp nf_nat_proto_gre nf_nat_irc nf_nat_h323 nf_nat_ftp nf_nat_amanda ts_kmp nf_conntrack_amanda nf_conntrack_sane nf_conntrack_tftp nf_conntrack_sip nf_conntrack_proto_udplite nf_conntrack_proto_sctp nf_conntrack_pptp nf_conntrack_proto_gre nf_conntrack_netlink nf_conntrack_netbios_ns nf_conntrack_broadcast
[771508.546441] nf_conntrack_irc nf_conntrack_h323 nf_conntrack_ftp xt_TPROXY xt_time xt_TCPMSS xt_tcpmss xt_sctp xt_policy xt_pkttype xt_physdev xt_owner xt_NFQUEUE xt_NFLOG nfnetlink_log xt_multiport xt_mark xt_mac xt_limit xt_length xt_iprange xt_helper xt_hashlimit xt_DSCP xt_dscp xt_dccp xt_connmark ebtable_nat xt_CLASSIFY ebtables xt_AUDIT xt_state iptable_raw iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 iptable_mangle nfnetlink ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_conntrack nf_conntrack ip6table_filter ip6_tables hwmon_vid iTCO_wdt iTCO_vendor_support gpio_ich eeepc_wmi asus_wmi sparse_keymap rfkill snd_ice1712 snd_cs8427 x86_pkg_temp_thermal snd_i2c snd_ice17xx_ak4xxx snd_ak4xxx_adda coretemp snd_mpu401_uart snd_rawmidi snd_ac97_codec kvm ac97_bus crct10dif_pclmul
[771508.546446] crc32_pclmul snd_seq crc32c_intel snd_seq_device ghash_clmulni_intel snd_pcm microcode snd_page_alloc snd_timer snd serio_raw soundcore r8169 i2c_i801 lpc_ich mfd_core mii mei_me mei shpchp binfmt_misc usb_storage nouveau i2c_algo_bit drm_kms_helper ttm drm i2c_core mxm_wmi video wmi
[771508.546447] CPU: 5 PID: 1421 Comm: SweepSinVsUsm Tainted: GW 3.12.9-301.rt13.1.fc20.ccrma.x86_64+rt #1
[771508.546447] Hardware name: System manufacturer System Product Name/P8Z77-M, BIOS 1406 07/19/2012
[771508.546447] task: 8804af8b8000 ti: 8807ab41 task.ti: 8807ab41
[771508.546449] RIP: 0010:[810dc60a] [810dc60a] smp_call_function_many+0x2ca/0x330
[771508.546449] RSP: 0018:8807ab411cf0 EFLAGS: 0202
[771508.546450] RAX: 0001 RBX: RCX: 8807fe2e9918
[771508.546450] RDX: 0001 RSI: 0400 RDI:
[771508.546450] RBP: 8807ab411d50 R08: 8807fe4e6108 R09: 0010
[771508.546451] R10: 8807fe4e6108 R11: 0246 R12:
[771508.546451] R13: R14: 00df R15: 810432e4
[771508.546452] FS: 7faf34514700() GS:8807fe48() knlGS:
[771508.546452] CS: 0010 DS: ES: CR0: 80050033
[771508.546453] CR2: 7faebb93d000 CR3: 0006bafd8000 CR4: 001407e0
[771508.546453] Stack:
[771508.546454] 8807fe4e6188 0001 000660c0 8807ab411d60
[771508.546455] 8105adb0 8807fe5660c0 0202 880077268480
[771508.546456] 880077268780 7faebb93f000 7faebafea000 880077268480
[771508.546456] Call Trace:
[771508.546458] [8105adb0] ? leave_mm+0x80/0x80
[771508.546459] [8105af07] native_flush_tlb_others+0x37/0x40
[771508.546460] [8105b084] flush_tlb_mm_range+0xb4/0x280
[771508.546461] [81177173] tlb_flush_mmu.part.50+0x33/0x90
[771508.546462] [81177d15] tlb_finish_mmu+0x55/0x60
[771508.546463] [8117a072] zap_page_range+0x112/0x150
[771508.546465] [81176bc1] SyS_madvise+0x381/0x7b0
[771508.546466] [81696169] system_call_fastpath+0x16/0x1b
[771508.546475] Code: 4d ea 24 00 3b 05 9f e5 c2 00 89 c2 0f 8d c3 fd ff ff 48 98 49 8b 4d 00 48 03 0c c5 e0 64 d0 81 f6 41 20 01 74 cb 0f 1f 00 f3 90 f6 41 20 01 75 f8 eb be 0f b6 4d ac 48 8b 55 b8 44 89 ef 48 8b
3.10.20-rt17, BUG and Oops
Hi all,

Just got this on 3.10.20-rt17, ThinkPad T510 running Fedora 19 (I think it has happened a few times before). The machine is not completely dead, the mouse pointer moves around but otherwise display updates and keyboard response are nil.

-- Fernando

Nov 29 23:17:52 localhost kernel: [50532.638944] BUG: unable to handle kernel NULL pointer dereference at 02c7
Nov 29 23:17:52 localhost kernel: [50532.638951] IP: [81361e9a] advance_transaction+0x60/0x121
Nov 29 23:17:52 localhost kernel: [50532.638953] PGD 1db141067 PUD 228703067 PMD 0
Nov 29 23:17:52 localhost kernel: [50532.638955] Oops: [#1] PREEMPT SMP
Nov 29 23:17:52 localhost kernel: [50532.638983] Modules linked in: snd_hrtimer snd_seq_midi snd_seq_midi_event snd_seq_dummy snd_hdsp snd_rawmidi fuse xt_CHECKSUM tun nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE ip6t_REJECT xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security rfcomm iptable_raw bnep iTCO_wdt iTCO_vendor_support snd_hda_codec_hdmi acpi_cpufreq mperf coretemp kvm_intel kvm crc32_pclmul uvcvideo crc32c_intel ghash_clmulni_intel videobuf2_vmalloc videobuf2_memops videobuf2_core microcode videodev media serio_raw snd_hda_codec_conexant intel_ips btusb i2c_i801 bluetooth arc4 iwldvm mac80211 snd_hda_intel snd_hda_codec iwlwifi snd_hwdep sdhci_pci snd_seq sdhci snd_seq_device cfg80211 mmc_core snd_pcm lpc_ich mfd_core e1000e snd_page_alloc ptp mei_me snd_timer pps_core mei thinkpad_acpi snd soundcore rfkill shpchp uinput nouveau i2c_algo_bit firewire_ohci drm_kms_helper firewire_core crc_itu_t ttm drm i2c_core mxm_wmi video wmi
Nov 29 23:17:52 localhost kernel: [50532.639006] CPU: 0 PID: 45 Comm: irq/9-acpi Not tainted 3.10.20-200.rt17.1.fc19.ccrma.x86_64.rt #1
Nov 29 23:17:52 localhost kernel: [50532.639007] Hardware name: LENOVO 4313CTO/4313CTO, BIOS 6MET64WW (1.27 ) 07/15/2010
Nov 29 23:17:52 localhost kernel: [50532.639008] task: 880229bc8000 ti: 880229bac000 task.ti: 880229bac000
Nov 29 23:17:52 localhost kernel: [50532.639011] RIP: 0010:[81361e9a] [81361e9a] advance_transaction+0x60/0x121
Nov 29 23:17:52 localhost kernel: [50532.639012] RSP: 0018:880229badd50 EFLAGS: 00010246
Nov 29 23:17:52 localhost kernel: [50532.639013] RAX: 0081 RBX: 88011339f990 RCX: 0082
Nov 29 23:17:52 localhost kernel: [50532.639013] RDX: 0246 RSI: 0001 RDI: 880229b78eb0
Nov 29 23:17:52 localhost kernel: [50532.639014] RBP: 880229badd70 R08: R09:
Nov 29 23:17:52 localhost kernel: [50532.639015] R10: 0001 R11: 102a R12: 880229b78e00
Nov 29 23:17:52 localhost kernel: [50532.639016] R13: 0001 R14: 880229b78eb0 R15: 880229b78d36
Nov 29 23:17:52 localhost kernel: [50532.639017] FS: () GS:88023bc0() knlGS:
Nov 29 23:17:52 localhost kernel: [50532.639018] CS: 0010 DS: ES: CR0: 8005003b
Nov 29 23:17:52 localhost kernel: [50532.639019] CR2: 02c7 CR3: 0001e8198000 CR4: 07f0
Nov 29 23:17:52 localhost kernel: [50532.639020] DR0: DR1: DR2:
Nov 29 23:17:52 localhost kernel: [50532.639021] DR3: DR6: 0ff0 DR7: 0400
Nov 29 23:17:52 localhost kernel: [50532.639021] Stack:
Nov 29 23:17:52 localhost kernel: [50532.639024] 880229b78e00 0001 88022983b000 0001
Nov 29 23:17:52 localhost kernel: [50532.639026] 880229badd90 8136258e 880229bc0198 0011
Nov 29 23:17:52 localhost kernel: [50532.639028] 880229baddb8 8136c3a3 880229b8a660
Nov 29 23:17:52 localhost kernel: [50532.639028] Call Trace:
Nov 29 23:17:52 localhost kernel: [50532.639032] [8136258e] acpi_ec_gpe_handler+0x48/0xc9
Nov 29 23:17:52 localhost kernel: [50532.639036] [8136c3a3] acpi_ev_gpe_dispatch+0xb6/0x126
Nov 29 23:17:52 localhost kernel: [50532.639037] [8136c4d3] acpi_ev_gpe_detect+0xc0/0x111
Nov 29 23:17:52 localhost kernel: [50532.639043] [810f46b0] ? irq_thread_fn+0x50/0x50
Nov 29 23:17:52 localhost kernel: [50532.639044] [8136e3cf] acpi_ev_sci_xrupt_handler+0x1f/0x25
Nov 29 23:17:52 localhost kernel: [50532.639048] [8135b12f] acpi_irq+0x16/0x31
Nov 29 23:17:52 localhost kernel: [50532.639050] [810f46d3] irq_forced_thread_fn+0x23/0x70
Nov 29 23:17:52 localhost kernel: [50532.639051] [810f4c7f] irq_thread+0x10f/0x150
Nov 29 23:17:52 localhost kernel: [50532.639053] [] ? wake_threads_waitq+0x50/0x50
Nov 29 23:17:52 localhost kernel: [50532.639054] [] ? irq_thread_check_affinity+0x90/0x90
Nov 29 23:17:52 localhost kernel: [50532.639058] []
Re: [ANNOUNCE] 3.10.9-rt5
On 08/23/2013 10:56 AM, Sebastian Andrzej Siewior wrote:
* Fernando Lopez-Lezcano | 2013-08-23 10:18:08 [-0700]:

Please post a patch when/if you have it so I can retry the build... Thanks for taking a look at this!

Does this fix your trouble?

Yes, it does, thanks! Builds, installs and boots the x86_64 kernel (I did not test the i686 build, I don't have a 32-bit machine to test).

-- Fernando

diff --git a/drivers/misc/hwlat_detector.c b/drivers/misc/hwlat_detector.c
index 0bfa40d..6f61d5f 100644
--- a/drivers/misc/hwlat_detector.c
+++ b/drivers/misc/hwlat_detector.c
@@ -220,7 +220,7 @@ static struct sample *buffer_get_sample(struct sample *sample)
 #else
 #define time_type u64
 #define time_get() trace_clock_local()
-#define time_to_us(x) ((x) / 1000)
+#define time_to_us(x) div_u64(x, 1000)
 #define time_sub(a, b) ((a) - (b))
 #define init_time(a, b) a = b
 #define time_u64(a) a
Re: [ANNOUNCE] 3.10.9-rt5
On 08/23/2013 12:08 AM, Sebastian Andrzej Siewior wrote:
On 08/23/2013 07:50 AM, Fernando Lopez-Lezcano wrote:
On 08/22/2013 11:21 AM, Sebastian Andrzej Siewior wrote:

- hwlat improvements by Steven

Known issues: ...

Trying to build I get (in make modules):

ERROR: "__udivdi3" [drivers/misc/hwlat_detector.ko] undefined!
make[1]: *** [__modpost] Error 1
make: *** [modules] Error 2

Looks like someone forgot to use do_div() instead of /, which fails on 32bit if used on a 64bit dividend. Will fix later.

Please post a patch when/if you have it so I can retry the build... Thanks for taking a look at this!

-- Fernando
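Editorial sketch of the problem Sebastian describes: on a 32-bit build, a plain `/` on a 64-bit dividend makes the compiler call a libgcc helper (`__udivdi3` in the error above), which the module cannot resolve. The code below is a userspace illustration only and not the kernel's implementation of `do_div()` or `div_u64()`; `div_u64_sketch` and `time_to_us_sketch` are hypothetical names. It divides a 64-bit value by a 32-bit divisor with shift-and-subtract long division, so no 64-bit `/` appears in the source.

```c
#include <assert.h>
#include <stdint.h>

/* Divide a 64-bit dividend by a 32-bit divisor, one bit at a time,
 * using only shifts, comparisons and subtraction. */
uint64_t div_u64_sketch(uint64_t dividend, uint32_t divisor)
{
    uint64_t quotient = 0;
    uint64_t remainder = 0; /* always < divisor, so the shift cannot overflow */

    assert(divisor != 0);
    for (int bit = 63; bit >= 0; bit--) {
        /* Bring down the next bit of the dividend. */
        remainder = (remainder << 1) | ((dividend >> bit) & 1u);
        if (remainder >= divisor) {
            remainder -= divisor;
            quotient |= (uint64_t)1 << bit;
        }
    }
    return quotient;
}

/* Same intent as the time_to_us() macro in the thread: ns to us. */
uint64_t time_to_us_sketch(uint64_t ns)
{
    return div_u64_sketch(ns, 1000);
}
```

For example, `time_to_us_sketch(1234567)` yields 1234. A real helper would normally use an architecture-specific divide rather than a 64-step loop; the loop is chosen here only because it makes the absence of a 64-bit division obvious.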
Re: [ANNOUNCE] 3.10.9-rt5
On 08/22/2013 11:21 AM, Sebastian Andrzej Siewior wrote:

Dear RT folks! I'm pleased to announce the v3.10.9-rt5 patch set.

Thanks!,

Changes since v3.10.9-rt4
- swait fixes from Steven. It fixed the issues with CONFIG_RCU_NOCB_CPU where the system suddenly froze and RCU wasn't doing its job anymore
- hwlat improvements by Steven

Known issues: ...

Trying to build I get (in make modules):

ERROR: "__udivdi3" [drivers/misc/hwlat_detector.ko] undefined!
make[1]: *** [__modpost] Error 1
make: *** [modules] Error 2

(find attached the final configuration used for building)

-- Fernando

build.log.bz2 Description: application/bzip
Re: [ANNOUNCE] 3.10.6-rt3
On 08/19/2013 05:29 PM, Steven Rostedt wrote:
On Mon, 19 Aug 2013 10:23:44 -0700 Fernando Lopez-Lezcano wrote:

The problem is that bcache is using new semaphore functions which it just introduced which rt does not know about. The comment above their definition says that it is wrong to use them and completion is the right way to do it. So my question is, why don't we use completion but this nasty hack?

I think I'm going to send them an email about that.

In the meanwhile, any hope of a patch to be able to compile and test with my current configuration?

Can you boot without enabling CONFIG_BCACHE?

Just to confirm that the kernel builds, installs and boots fine without this option...

-- Fernando
Re: [ANNOUNCE] 3.10.6-rt3
On 08/19/2013 05:29 PM, Steven Rostedt wrote:
> On Mon, 19 Aug 2013 10:23:44 -0700 Fernando Lopez-Lezcano wrote:
> > > The problem is that bcache is using new semaphore functions which it
> > > just introduced which rt does not know about. The comment above their
> > > definition says that it is wrong to use them and completion is the
> > > right way to do it.
> > > So my question is, why don't we use completion but this nasty hack?
>
> I think I'm going to send them an email about that.
>
> > In the meanwhile, any hope of a patch to be able to compile and test
> > with my current configuration?
>
> Can you boot without enabling CONFIG_BCACHE?

I'm pretty sure I'll be able to do that. No real need in my personal case AFAICT. I'll try that next - it is just that I try very hard to keep the configuration of my rt kernels as close as possible to the defaults that Fedora uses (they get distributed as part of Planet CCRMA and there is no telling what usage cases they will hit - it would be confusing to have something that works on Fedora kernels and does not on equivalent RT patched kernels).

Thanks for the heads up!
-- Fernando
Re: [ANNOUNCE] 3.10.6-rt3
On 08/16/2013 12:01 AM, Sebastian Andrzej Siewior wrote:
> On 08/15/2013 09:22 PM, Steven Rostedt wrote:
> > On Thu, 15 Aug 2013 11:42:55 -0700 Fernando Lopez-Lezcano wrote:
> > > On 08/12/2013 09:34 AM, Sebastian Andrzej Siewior wrote:
> > > > Dear RT folks!
> > > >
> > > > I'm pleased to announce the v3.10.6-rt3 patch set.
> > >
> > > I'm getting this when trying to build:
> > >
> > > drivers/md/bcache/request.c: In function 'cached_dev_write_complete':
> > > drivers/md/bcache/request.c:1008:2: error: implicit declaration of function 'up_read_non_owner' [-Werror=implicit-function-declaration]
> > >   up_read_non_owner(&dc->writeback_lock);
> > >   ^
> > > drivers/md/bcache/request.c: In function 'request_write':
> > > drivers/md/bcache/request.c:1034:2: error: implicit declaration of function 'down_read_non_owner' [-Werror=implicit-function-declaration]
> > >   down_read_non_owner(&dc->writeback_lock);
> > >   ^
> > > cc1: some warnings being treated as errors
> >
> > Can you send us your config.
>
> The problem is that bcache is using new semaphore functions which it
> just introduced which rt does not know about. The comment above their
> definition says that it is wrong to use them and completion is the
> right way to do it.
> So my question is, why don't we use completion but this nasty hack?

In the meanwhile, any hope of a patch to be able to compile and test with my current configuration?

Thanks,
-- Fernando
Re: [ANNOUNCE] 3.10.6-rt3
On 08/12/2013 09:34 AM, Sebastian Andrzej Siewior wrote:
> Dear RT folks!
>
> I'm pleased to announce the v3.10.6-rt3 patch set.

I'm getting this when trying to build:

drivers/md/bcache/request.c: In function 'cached_dev_write_complete':
drivers/md/bcache/request.c:1008:2: error: implicit declaration of function 'up_read_non_owner' [-Werror=implicit-function-declaration]
  up_read_non_owner(&dc->writeback_lock);
  ^
drivers/md/bcache/request.c: In function 'request_write':
drivers/md/bcache/request.c:1034:2: error: implicit declaration of function 'down_read_non_owner' [-Werror=implicit-function-declaration]
  down_read_non_owner(&dc->writeback_lock);
  ^
cc1: some warnings being treated as errors

Does not look like *_read_non_owner exist in rwsem_rt.h...

-- Fernando

> Changes since v3.10.6-rt2
> - the queue can be imported with git quiltimport
> - powerpc compiles again. Thanks to Paul Gortmaker for the patch.
> - added three patches from v3.8 which fell off the wagon on their way
>   to 3.10. One of them enables RT-FULL on ARM :)
> - removed all cpsw patches from the queue. They made it upstream. My
>   nfsboot setup seems not to work, let's look at this later.
> - make arm/spear compile. Thanks to Felipe Balbi for the patch.
> - Add a patch from Corey Minyard to no longer use deprecated
>   CONFIG_NO_HZ.
> - add the one patch which I added to the last 3.8-rt to get list_bl to
>   work again on !SMP && !DEBUG_SPINLOCK
> - Spell "preemptible" properly in "Preemptible Kernel (Basic RT)" menu
>   item. Thanks to Uwe Kleine-König for the patch.
> - a patch from John Kacur to avoid a warning in the hpsa.
> - a patch for the ppc5200 where the compiler thinks a variable isn't
>   initialized and stops compiling due to -Werror
>
> Known issues:
>
> - SLAB support not working
> - The cpsw network driver shows some issues.
> - ARM & PPC don't fall apart once booted. More testing doesn't hurt.
> - bcache with CONFIG_DEBUG_LOCK_ALLOC enabled does not compile.
> ...
Re: [ANNOUNCE] 3.6.6-rt17
On 11/15/2012 10:11 AM, Thomas Gleixner wrote:
> On Wed, 14 Nov 2012, Fernando Lopez-Lezcano wrote:
> > On 11/12/2012 01:28 PM, Thomas Gleixner wrote:
> > > Dear RT Folks,
> > >
> > > I'm pleased to announce the 3.6.6-rt17 release. 3.6.6-rt16 is just a
> > > not announced update release to 3.6.6.
> >
> > Got this:
> >
> > net/nfc/llcp/llcp.c: In function 'nfc_llcp_register_device':
> > net/nfc/llcp/llcp.c:1185:24: error: expected expression before '{' token
> > net/nfc/llcp/llcp.c:1186:35: error: expected expression before '{' token
> >
> > when building with CONFIG_NFC / CONFIG_NFC_LLCP (builds fine when those
> > are not set)
>
> Grrr. Damned ignorants. Does that fix it for you ?

Yes, thanks! I had to tweak the patch but it does make the whole thing compile.

-- Fernando

> Subject: nfc: Use proper lock init functions
> From: Thomas Gleixner <t...@linutronix.de>
> Date: Thu, 15 Nov 2012 19:03:20 +0100
>
> Grmbl. Why insist people on using static initializers if there are
> proper init functions? Just because they can?
>
> Signed-off-by: Thomas Gleixner <t...@linutronix.de>
> ---
>  net/nfc/llcp/llcp.c |    4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> Index: linux-stable/net/nfc/llcp/llcp.c
> ===================================================================
> --- linux-stable.orig/net/nfc/llcp/llcp.c
> +++ linux-stable/net/nfc/llcp/llcp.c
> @@ -1182,8 +1182,8 @@ int nfc_llcp_register_device(struct nfc_
>  		goto err_rx_wq;
>  	}
>
> -	local->sockets.lock = __RW_LOCK_UNLOCKED(local->sockets.lock);
> -	local->connecting_sockets.lock = __RW_LOCK_UNLOCKED(local->connecting_sockets.lock);
> +	rwlock_init(&local->sockets.lock);
> +	rwlock_init(&local->connecting_sockets.lock);
>
>  	nfc_llcp_build_gb(local);
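As background on the compile error in this message: `expected expression before '{' token` is what GCC reports when a macro that expands to a brace-enclosed initializer is used on the right-hand side of a plain assignment. The build log suggests that this is what `__RW_LOCK_UNLOCKED()` expands to on an RT kernel, although the thread does not show the macro itself. The userspace sketch below uses hypothetical names (`sketch_rwlock`, `SKETCH_RW_LOCK_UNLOCKED`, `sketch_rwlock_init`, `sketch_local`) and is not the kernel's definition; it only shows why a static initializer works in a declaration and why run-time setup should go through an init function, which is the shape of the patch above.

```c
/* Hypothetical lock type standing in for the kernel's rwlock_t. */
struct sketch_rwlock {
    int readers;
    int writer;
};

/*
 * A brace-enclosed initializer: valid only where C allows an
 * initializer, i.e. in a declaration.
 */
#define SKETCH_RW_LOCK_UNLOCKED { 0, 0 }

/* Fine: the macro is used as the initializer of a declaration. */
static struct sketch_rwlock sketch_static_lock = SKETCH_RW_LOCK_UNLOCKED;

/* Run-time initialization for locks embedded in allocated structures. */
static void sketch_rwlock_init(struct sketch_rwlock *lock)
{
    lock->readers = 0;
    lock->writer = 0;
}

/* Hypothetical container, loosely modelled on the two socket lists. */
struct sketch_local {
    struct sketch_rwlock sockets_lock;
    struct sketch_rwlock connecting_sockets_lock;
};

static void sketch_local_setup(struct sketch_local *local)
{
    /*
     * local->sockets_lock = SKETCH_RW_LOCK_UNLOCKED;
     * would fail with "expected expression before '{' token",
     * because a bare brace list is not an expression.
     */
    sketch_rwlock_init(&local->sockets_lock);
    sketch_rwlock_init(&local->connecting_sockets_lock);
}
```

The init-function form also keeps working if the lock type later grows extra fields or debug state, which is presumably why the patch prefers it.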
Re: [ANNOUNCE] 3.6.6-rt17
On 11/12/2012 01:28 PM, Thomas Gleixner wrote:
> Dear RT Folks,
>
> I'm pleased to announce the 3.6.6-rt17 release. 3.6.6-rt16 is just a
> not announced update release to 3.6.6.

Got this:

net/nfc/llcp/llcp.c: In function 'nfc_llcp_register_device':
net/nfc/llcp/llcp.c:1185:24: error: expected expression before '{' token
net/nfc/llcp/llcp.c:1186:35: error: expected expression before '{' token

when building with CONFIG_NFC / CONFIG_NFC_LLCP (builds fine when those are not set)

-- Fernando

> Changes since 3.6.6-rt16:
>
>   * Finally make the NOHZ softirq pending detection work with the new
>     softirq scheme.
>
>   * Remove the WARN_ON from __raise_softirq_irqoff(). I got the
>     information I want for now.
Re: 2.6.24-rt1: timing problems (was [git pull] x86/hrtimer/acpi fixes)
On Mon, 2008-01-28 at 10:26 -0800, Fernando Lopez-Lezcano wrote:
> On Sun, 2008-01-27 at 05:46 +0100, Mike Galbraith wrote:
> > On Sat, 2008-01-26 at 17:59 -0800, Fernando Lopez-Lezcano wrote:
> >
> > > Hi Ingo... back to testing.
> > > History:
> > >
> > > 2.6.23.x + rt has not been very usable for audio applications.
> > > 2.6.24-rt1: same so far.
> > >
> > > Why: Jack keeps printing "delayed..." messages and has xruns which means
> > > that somehow the timing is delayed more than what jack would think
> > > reasonable. As in the case with an old timing bug, the problem
> > > disappears when booting the kernel with idle=poll. Other users of Planet
> > > CCRMA are able to replicate the behavior, which goes away with idle=poll
> > > or booting the machine with only one core. As a workaround I have been
> > > packaging 2.6.22.x but now I'm not able to use that as the old rt14
> > > patch, suitably tweaked results in a non working kernel.
> > >
> > > So it looks like, again, timing is getting skewed when the jack process
> > > jumps between cpus and thus jack sees timing jumps that are just not
> > > happening.
> > >
> > > This is with a build based on 2.6.24 using as a base the latest Fedora
> > > rawhide source package plus 2.6.24-rt1.
> >
> > Do you have a simple testcase? (one which doesn't entail installing
> > ccrma and becoming an audiophile)
>
> No, I don't at this point.
> I'll see if I can cook something simple today... (naively thinking that
> some short C code could test for the clock being actually monotonic
> across cpus).

Sorry, no luck so far in writing something simple that will fail. I tried testing for the results from repeated calls to clock_gettime (what jack uses for timing by default) to actually be monotonic, while a script uses taskset to force a cpu switch and of course got no errors.

2.6.24-rt1 with idle=poll works fine; without it I get multiple problems with the jack internal timing, or at least that is what it seems to me from the symptoms.

-- Fernando
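For readers who want to repeat the kind of check described in this message, here is a minimal sketch of such a test. It is not the program Fernando actually ran, which was never posted. It assumes `CLOCK_MONOTONIC` as the clock id (the message only says jack uses `clock_gettime`), and the function name `sketch_count_backward_steps` is hypothetical. It samples the clock repeatedly and counts how often a reading is earlier than the previous one.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/*
 * Read the clock `iterations` times and count readings that go
 * backwards relative to the previous reading. Returns the number of
 * backward steps, or -1 if clock_gettime() fails.
 *
 * Assumption: CLOCK_MONOTONIC is the clock of interest.
 */
static long sketch_count_backward_steps(long iterations)
{
    struct timespec prev, cur;
    long backward = 0;
    long i;

    if (clock_gettime(CLOCK_MONOTONIC, &prev) != 0)
        return -1;

    for (i = 0; i < iterations; i++) {
        if (clock_gettime(CLOCK_MONOTONIC, &cur) != 0)
            return -1;
        /* Compare seconds first, then nanoseconds within the second. */
        if (cur.tv_sec < prev.tv_sec ||
            (cur.tv_sec == prev.tv_sec && cur.tv_nsec < prev.tv_nsec))
            backward++;
        prev = cur;
    }
    return backward;
}
```

To exercise CPU migration, a caller would wrap this in a small `main()` and, as described above, use `taskset` from a script to move the running process between cores. As the message notes, a test of this kind reported no errors, so a clean result here does not rule out the timing problem jack was seeing.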
Re: 2.6.24-rt1: timing problems (was [git pull] x86/hrtimer/acpi fixes)
On Sun, 2008-01-27 at 05:46 +0100, Mike Galbraith wrote:
> On Sat, 2008-01-26 at 17:59 -0800, Fernando Lopez-Lezcano wrote:
>
> > Hi Ingo... back to testing.
> > History:
> >
> > 2.6.23.x + rt has not been very usable for audio applications.
> > 2.6.24-rt1: same so far.
> >
> > Why: Jack keeps printing "delayed..." messages and has xruns which means
> > that somehow the timing is delayed more than what jack would think
> > reasonable. As in the case with an old timing bug, the problem
> > disappears when booting the kernel with idle=poll. Other users of Planet
> > CCRMA are able to replicate the behavior, which goes away with idle=poll
> > or booting the machine with only one core. As a workaround I have been
> > packaging 2.6.22.x but now I'm not able to use that as the old rt14
> > patch, suitably tweaked results in a non working kernel.
> >
> > So it looks like, again, timing is getting skewed when the jack process
> > jumps between cpus and thus jack sees timing jumps that are just not
> > happening.
> >
> > This is with a build based on 2.6.24 using as a base the latest Fedora
> > rawhide source package plus 2.6.24-rt1.
>
> Do you have a simple testcase? (one which doesn't entail installing
> ccrma and becoming an audiophile)

No, I don't at this point.
I'll see if I can cook something simple today... (naively thinking that some short C code could test for the clock being actually monotonic across cpus).

-- Fernando
Re: 2.6.24-rt1: timing problems (was [git pull] x86/hrtimer/acpi fixes)
On Sat, 2007-12-08 at 10:17 +0100, Ingo Molnar wrote:
> * Fernando Lopez-Lezcano <[EMAIL PROTECTED]> wrote:
>
> > On Fri, 2007-12-07 at 20:59 +0100, Ingo Molnar wrote:
> > > * Fernando Lopez-Lezcano <[EMAIL PROTECTED]> wrote:
> > >
> > > > Nope, it doesn't, still getting "delay" and "xrun" messages galore.
> > > >
> > > > Attached: configuration and dmesg output booting with idle=poll,
> > > > reconfirmed that that makes the delay and xrun messages go away.
> > >
> > > could you try the rolled up patch of various fixlets, ontop of
> > > current -git? (it might even apply to -rc4) It includes some more
> > > stuff beyond the ones in the pull request. (still being
> > > tested/reviewed)
> >
> > I'll try but it will take me a while to figure git and do a package
> > build of it...
>
> if you want to try a vanilla kernel package then pick up the kernel
> package from Fedora rawhide - this fixlet should show up there within a
> couple of days, Dave Jones is doing a really nice job of keeping up with
> latest -git. (and the Fedora kernel has hrtimers and dynticks enabled.)

Hi Ingo... back to testing.
History:

2.6.23.x + rt has not been very usable for audio applications.
2.6.24-rt1: same so far.

Why: Jack keeps printing "delayed..." messages and has xruns which means that somehow the timing is delayed more than what jack would think reasonable. As in the case with an old timing bug, the problem disappears when booting the kernel with idle=poll. Other users of Planet CCRMA are able to replicate the behavior, which goes away with idle=poll or booting the machine with only one core. As a workaround I have been packaging 2.6.22.x but now I'm not able to use that as the old rt14 patch, suitably tweaked, results in a non-working kernel.

So it looks like, again, timing is getting skewed when the jack process jumps between cpus and thus jack sees timing jumps that are just not happening.

This is with a build based on 2.6.24 using as a base the latest Fedora rawhide source package plus 2.6.24-rt1.

-- Fernando
Re: [git pull] x86/hrtimer/acpi fixes
On Fri, 2007-12-07 at 20:59 +0100, Ingo Molnar wrote:
> * Fernando Lopez-Lezcano <[EMAIL PROTECTED]> wrote:
>
> > Nope, it doesn't, still getting "delay" and "xrun" messages galore.
> >
> > Attached: configuration and dmesg output booting with idle=poll,
> > reconfirmed that that makes the delay and xrun messages go away.
>
> could you try the rolled up patch of various fixlets, ontop of current
> -git? (it might even apply to -rc4) It includes some more stuff beyond
> the ones in the pull request. (still being tested/reviewed)

I'll try but it will take me a while to figure out git and do a package build of it...

-- Fernando
Re: [git pull] x86/hrtimer/acpi fixes
On Fri, 2007-12-07 at 19:59 +0100, Ingo Molnar wrote:
> * Fernando Lopez-Lezcano <[EMAIL PROTECTED]> wrote:
>
> > Ingo, I was about to post about timer problems in 2.6.23.9+rt12 when I
> > saw this. Would this be related / should I test / will this solve
> > everything? :-)
> >
> > What I'm seeing is jack "delays" that go away if I boot with
> > "idle=poll", just like it was happening a long time ago. Smells like
> > 'time of day' glitches when the process switches cpus (this is on a
> > dual core intel laptop).
>
> does it go away with hpet=disable as well? If yes then there could be a
> relation. If not then it's something else and we need to debug it.

Nope, it doesn't, still getting "delay" and "xrun" messages galore.

-- Fernando
Re: [git pull] x86/hrtimer/acpi fixes
On Fri, 2007-12-07 at 19:36 +0100, Ingo Molnar wrote:
> Linus, please pull the latest x86 git tree from:
>
>    git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86.git
>
> This contains 3 x86/hrtimer/hpet/ACPI fixes from Thomas: the ACPI fix
> has been ACK-ed by Venki. Build and boot tested on various boxes.

Ingo, I was about to post about timer problems in 2.6.23.9+rt12 when I
saw this. Would this be related / should I test / will this solve
everything? :-)

What I'm seeing is jack "delays" that go away if I boot with
"idle=poll", just like it was happening a long time ago. Smells like
'time of day' glitches when the process switches cpus (this is on a dual
core Intel laptop). Does not happen in 2.6.22.10 + rt9 - well, I do see
very occasional delay warnings there as well. I also see occasional
complete hangs but I don't have a way of knowing what triggers that.

-- Fernando

> -->
> Thomas Gleixner (3):
>       hrtimers: avoid overflow for large relative timeouts
>       clockevents: warn once when program_event() is called with negative expiry
>       ACPI: move timer broadcast before busmaster disable
>
>  drivers/acpi/processor_idle.c | 19 ++-
>  kernel/hrtimer.c              |  8
>  kernel/time/clockevents.c     |  5 +
>  3 files changed, 27 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
> index b1fbee3..2fe34cc 100644
> --- a/drivers/acpi/processor_idle.c
> +++ b/drivers/acpi/processor_idle.c
> @@ -531,6 +531,11 @@ static void acpi_processor_idle(void)
>
>          case ACPI_STATE_C3:
>                  /*
> +                 * Must be done before busmaster disable as we might
> +                 * need to access HPET !
> +                 */
> +                acpi_state_timer_broadcast(pr, cx, 1);
> +                /*
>                   * disable bus master
>                   * bm_check implies we need ARB_DIS
>                   * !bm_check implies we need cache flush
> @@ -557,7 +562,6 @@ static void acpi_processor_idle(void)
>                  /* Get start time (ticks) */
>                  t1 = inl(acpi_gbl_FADT.xpm_timer_block.address);
>                  /* Invoke C3 */
> -                acpi_state_timer_broadcast(pr, cx, 1);
>                  /* Tell the scheduler that we are going deep-idle: */
>                  sched_clock_idle_sleep_event();
>                  acpi_cstate_enter(cx);
> @@ -1401,9 +1405,6 @@ static int acpi_idle_enter_simple(struct cpuidle_device *dev,
>          if (acpi_idle_suspend)
>                  return(acpi_idle_enter_c1(dev, state));
>
> -        if (pr->flags.bm_check)
> -                acpi_idle_update_bm_rld(pr, cx);
> -
>          local_irq_disable();
>          current_thread_info()->status &= ~TS_POLLING;
>          /*
> @@ -1418,13 +1419,21 @@ static int acpi_idle_enter_simple(struct cpuidle_device *dev,
>                  return 0;
>          }
>
> +        /*
> +         * Must be done before busmaster disable as we might need to
> +         * access HPET !
> +         */
> +        acpi_state_timer_broadcast(pr, cx, 1);
> +
> +        if (pr->flags.bm_check)
> +                acpi_idle_update_bm_rld(pr, cx);
> +
>          if (cx->type == ACPI_STATE_C3)
>                  ACPI_FLUSH_CPU_CACHE();
>
>          t1 = inl(acpi_gbl_FADT.xpm_timer_block.address);
>          /* Tell the scheduler that we are going deep-idle: */
>          sched_clock_idle_sleep_event();
> -        acpi_state_timer_broadcast(pr, cx, 1);
>          acpi_idle_do_entry(cx);
>          t2 = inl(acpi_gbl_FADT.xpm_timer_block.address);
>
> diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
> index 22a2514..e65dd0b 100644
> --- a/kernel/hrtimer.c
> +++ b/kernel/hrtimer.c
> @@ -850,6 +850,14 @@ hrtimer_start(struct hrtimer *timer, ktime_t tim, const enum hrtimer_mode mode)
>  #ifdef CONFIG_TIME_LOW_RES
>                  tim = ktime_add(tim, base->resolution);
>  #endif
> +                /*
> +                 * Careful here: User space might have asked for a
> +                 * very long sleep, so the add above might result in a
> +                 * negative number, which enqueues the timer in front
> +                 * of the queue.
> +                 */
> +                if (tim.tv64 < 0)
> +                        tim.tv64 = KTIME_MAX;
>          }
>          timer->expires = tim;
>
> diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c
> index 822beeb..5fb139f 100644
> --- a/kernel/time/clockevents.c
> +++ b/kernel/time/clockevents.c
> @@ -78,6 +78,11 @@ int clockevents_program_event(struct clock_event_device *dev, ktime_t expires,
>          unsigned long long clc;
>          int64_t delta;
>
> +        if (unlikely(expires.tv64 < 0)) {
> +                WARN_ON_ONCE(1);
> +                return -ETIME;
> +        }
> +
>          delta = ktime_to_ns(ktime_sub(expires, now));
>
>          if (delta <= 0)
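The hrtimer and clockevents hunks in the pull request above both guard against an expiry time that has gone negative. The following is only a user-space sketch of that idea, not kernel code: `sketch_add_resolution`, `sketch_check_expiry` and `SKETCH_KTIME_MAX` are hypothetical names introduced here, and plain `int64_t` nanoseconds stand in for `ktime_t`. The kernel patch tests the sign after the add; the sketch tests before the add, because signed overflow is undefined in ISO C.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's KTIME_MAX: largest signed 64-bit value. */
#define SKETCH_KTIME_MAX INT64_MAX

/*
 * Add a non-negative resolution to an expiry time (both in nanoseconds),
 * saturating at SKETCH_KTIME_MAX instead of wrapping to a negative value.
 */
int64_t sketch_add_resolution(int64_t tim_ns, int64_t resolution_ns)
{
	assert(resolution_ns >= 0);
	if (tim_ns > SKETCH_KTIME_MAX - resolution_ns)
		return SKETCH_KTIME_MAX;	/* the add would overflow: clamp */
	return tim_ns + resolution_ns;
}

/*
 * Same shape as the new guard in clockevents_program_event():
 * a negative expiry is rejected with -ETIME, anything else is accepted.
 */
int sketch_check_expiry(int64_t expires_ns)
{
	if (expires_ns < 0)
		return -ETIME;
	return 0;
}
```

With the clamp in place, a very long relative sleep ends up at the far end of the timer queue rather than at the front, which is the failure the patch comment describes.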
Re: [git pull] x86/hrtimer/acpi fixes
On Fri, 2007-12-07 at 19:59 +0100, Ingo Molnar wrote:
> * Fernando Lopez-Lezcano [EMAIL PROTECTED] wrote:
>
> > Ingo, I was about to post about timer problems in 2.6.23.9+rt12 when
> > I saw this. Would this be related / should I test / will this solve
> > everything? :-)
> >
> > What I'm seeing is jack "delays" that go away if I boot with
> > "idle=poll", just like it was happening a long time ago. Smells like
> > 'time of day' glitches when the process switches cpus (this is on a
> > dual core intel laptop).
>
> does it go away with hpet=disable as well? If yes then there could be a
> relation. If not then it's something else and we need to debug it.

Nope, it doesn't, still getting delay and xrun messages galore.

-- Fernando
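The suspicion voiced in this thread is that timestamps glitch when a process moves between CPUs. One rough way to look for that from user space is to read the clock in a tight loop and count readings that go backwards. The sketch below is an assumption-laden illustration and not something used in the thread: both function names are hypothetical, it reads the wall clock through the C11 `timespec_get()` call, it does not pin or migrate the task between CPUs, and a backward step can also come from a legitimate clock adjustment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Count how many samples are earlier than the sample just before them. */
size_t count_backward_steps(const int64_t *ns, size_t n)
{
	size_t steps = 0;

	assert(ns != NULL || n == 0);
	for (size_t i = 1; i < n; i++)
		if (ns[i] < ns[i - 1])
			steps++;
	return steps;
}

/*
 * Fill buf with up to n wall-clock readings in nanoseconds.
 * Returns the number of readings actually taken.
 */
size_t sample_wall_clock(int64_t *buf, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		struct timespec ts;

		if (timespec_get(&ts, TIME_UTC) != TIME_UTC)
			break;	/* clock not available: stop sampling */
		buf[i] = (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
	}
	return i;
}
```

A caller would fill a buffer with `sample_wall_clock()` while the machine is under load and pass it to `count_backward_steps()`; a non-zero result on an otherwise idle clock is a hint worth following up, not proof of a kernel bug.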
Re: [git pull] x86/hrtimer/acpi fixes
On Fri, 2007-12-07 at 20:59 +0100, Ingo Molnar wrote:
> * Fernando Lopez-Lezcano [EMAIL PROTECTED] wrote:
>
> > Nope, it doesn't, still getting delay and xrun messages galore.
> >
> > Attached: configuration and dmesg output booting with idle=poll,
> > reconfirmed that that makes the delay and xrun messages go away.
>
> could you try the rolled up patch of various fixlets, on top of current
> -git? (it might even apply to -rc4) It includes some more stuff beyond
> the ones in the pull request. (still being tested/reviewed)

I'll try but it will take me a while to figure out git and do a package
build of it...

-- Fernando
2.6.23.9-rt12: BUGs
> I'll try rt12...
>
> Same problems in rt12, getting lots of "delay of xxx usecs exceeds
> estimated spare time of ; restart" in jackd (on my T61 Lenovo laptop
> running fc7). Does not happen with 2.6.22.10 + rt9. This is both with
> the internal snd-hda-intel card and a pcmcia rme hdsp multiface.

While trying out 2.6.23.9-rt12 I got the three attached bugs. Also
attached is the output of dmesg for a clean boot on the machine.

Jack displays timing problems, similar to when there were timing issues
with dual processor machines. Still investigating as time permits.

-- Fernando

apparently while suspending
---
Nov 27 20:06:01 localhost kernel: Stopping tasks ... done.
Nov 27 20:06:01 localhost kernel: Suspending console(s)
Nov 27 20:06:01 localhost kernel: sd 0:0:0:0: [sda] Synchronizing SCSI cache
Nov 27 20:06:01 localhost kernel: sd 0:0:0:0: [sda] Stopping disk
Nov 27 20:06:01 localhost kernel: ACPI: PCI interrupt for device :15:00.2 disabled
Nov 27 20:06:01 localhost kernel: eth%d: Going into suspend...
Nov 27 20:06:01 localhost kernel: ACPI: PCI interrupt for device :03:00.0 disabled
Nov 27 20:06:01 localhost pcscd: winscard_msg_srv.c:238:SHMProcessEventsContext() select returns with failure: Interrupted system call
Nov 27 20:06:01 localhost pcscd: winscard_svc.c:222:ContextThread() Error in SHMProcessEventsContext
Nov 27 20:06:01 localhost pcscd: winscard_msg_srv.c:238:SHMProcessEventsContext() select returns with failure: Interrupted system call
Nov 27 20:06:01 localhost kernel: ACPI: PCI interrupt for device :00:1f.2 disabled
Nov 27 20:06:01 localhost pcscd: winscard_svc.c:222:ContextThread() Error in SHMProcessEventsContext
Nov 27 20:06:01 localhost kernel: ACPI: PCI interrupt for device :00:1d.7 disabled
Nov 27 20:06:01 localhost kernel: ACPI: PCI interrupt for device :00:1d.2 disabled
Nov 27 20:06:01 localhost kernel: ACPI: PCI interrupt for device :00:1d.1 disabled
Nov 27 20:06:01 localhost kernel: ACPI: PCI interrupt for device :00:1d.0 disabled
Nov 27 20:06:01 localhost kernel: ACPI: PCI interrupt for device :00:1b.0 disabled
Nov 27 20:06:01 localhost kernel: ACPI: PCI interrupt for device :00:1a.7 disabled
Nov 27 20:06:01 localhost kernel: ACPI: PCI interrupt for device :00:1a.1 disabled
Nov 27 20:06:01 localhost kernel: ACPI: PCI interrupt for device :00:1a.0 disabled
Nov 27 20:06:01 localhost kernel: ACPI: PCI interrupt for device :00:19.0 disabled
Nov 27 20:06:01 localhost kernel: Disabling non-boot CPUs ...
Nov 27 20:06:01 localhost kernel: Breaking affinity for irq 218
Nov 27 20:06:01 localhost kernel: CPU 1 is now offline
Nov 27 20:06:01 localhost kernel: SMP alternatives: switching to UP code
Nov 27 20:06:01 localhost kernel: BUG: sleeping function called from invalid context pm-suspend(3740) at kernel/rtmutex.c:637
Nov 27 20:06:01 localhost gnome-power-manager: (nando) DBUS timed out, but recovering
Nov 27 20:06:01 localhost kernel: in_atomic():0 [], irqs_disabled():1
Nov 27 20:06:01 localhost kernel: [<c062d88d>] __rt_spin_lock+0x21/0x3d
Nov 27 20:06:01 localhost kernel: [<c0466b68>] free_pages_bulk+0x28/0x188
Nov 27 20:06:01 localhost kernel: [<c0466d10>] __drain_pages+0x48/0x69
Nov 27 20:06:01 localhost kernel: [<c0466d4f>] page_alloc_cpu_notify+0x1e/0x3d
Nov 27 20:06:01 localhost kernel: [<c062faa4>] notifier_call_chain+0x2a/0x47
Nov 27 20:06:01 localhost kernel: [<c0439520>] raw_notifier_call_chain+0x17/0x1a
Nov 27 20:06:01 localhost kernel: [<c044a805>] _cpu_down+0x184/0x242
Nov 27 20:06:01 localhost kernel: [<c044aa6c>] disable_nonboot_cpus+0x4e/0xd2
Nov 27 20:06:01 localhost kernel: [<c0533915>] acpi_sleep_prepare+0x41/0x48
Nov 27 20:06:01 localhost kernel: [<c044f213>] suspend_devices_and_enter+0x64/0x96
Nov 27 20:06:01 localhost kernel: [<c044f360>] enter_state+0x11b/0x193
Nov 27 20:06:01 localhost kernel: [<c044f466>] state_store+0x8e/0xa2
Nov 27 20:06:01 localhost kernel: [<c044f3d8>] state_store+0x0/0xa2
Nov 27 20:06:01 localhost kernel: [<c04bb067>] subsys_attr_store+0x27/0x2b
Nov 27 20:06:01 localhost kernel: [<c04bb2a9>] sysfs_write_file+0xa6/0xd9
Nov 27 20:06:01 localhost kernel: [<c04bb203>] sysfs_write_file+0x0/0xd9
Nov 27 20:06:01 localhost kernel: [<c04826eb>] vfs_write+0xa8/0x15a
Nov 27 20:06:01 localhost gnome-power-manager: (nando) Resuming computer
Nov 27 20:06:01 localhost kernel: [<c0482d14>] sys_write+0x41/0x67
Nov 27 20:06:01 localhost kernel: [<c040514a>] syscall_call+0x7/0xb
Nov 27 20:06:01 localhost kernel: [] xfrm_send_policy_notify+0x44f/0x4f4
Nov 27 20:06:01 localhost NetworkManager: <info> Waking up from sleep.
Nov 27 20:06:01 localhost kernel: ===
Nov 27 20:06:01 localhost NetworkManager: <info> Deactivating device eth1.
Nov 27 20:06:01 localhost kernel: CPU1 is down
Nov 27 20:06:01 localhost NetworkManager: <info> eth1: Device is fully-supported using driver 'e1000'.
Nov 27 20:06:01 localhost kernel: Intel machine check architecture supported.
Nov 27 20:06:01 localhost NetworkManager: nm_device_init(): waiting for device's worker thread to start
Nov 27 20:06:01 localhost kernel: Intel machine check reporting enabled on CPU#0.
Re: 2.6.22.14 + rt? vs 2.6.23.9-rt12
On Tue, 2007-11-27 at 17:02 -0800, Fernando Lopez-Lezcano wrote:
> Hi Ingo... any hope of an updated realtime patch for 2.6.22.14? I'm
> having problems with 2.6.23.1 + rt11 (I spent the morning rediffing
> against 2.6.23.9 and just _now_ pressed reload in my browser and there it
> is..., rt12 for 2.6.23.9!, argh! :-) and wanted to compare with 2.6.22.x
> and the latest I managed to repatch and run successfully is 2.6.22.10. I
> did 2.6.22.14 in the afternoon but I obviously bungled it somewhere as
> the boot... takes... a... long... time... I can send my .14 patch off
> the list if you want/need it.
>
> [in my 2.6.23.1-rt11 tests I am getting "delayed..." messages from
> jackd, smells like a problem with internal timing in the kernel]
>
> I'll try rt12...

Same problems in rt12, getting lots of "delay of xxx usecs exceeds
estimated spare time of ; restart" in jackd (on my T61 Lenovo laptop
running fc7). Does not happen with 2.6.22.10 + rt9. This is both with
the internal snd-hda-intel card and a pcmcia rme hdsp multiface.

-- Fernando
2.6.22.14 + rt?
Hi Ingo... any hope of an updated realtime patch for 2.6.22.14? I'm
having problems with 2.6.23.1 + rt11 (I spent the morning rediffing
against 2.6.23.9 and just _now_ pressed reload in my browser and there it
is..., rt12 for 2.6.23.9!, argh! :-) and wanted to compare with 2.6.22.x
and the latest I managed to repatch and run successfully is 2.6.22.10. I
did 2.6.22.14 in the afternoon but I obviously bungled it somewhere as
the boot... takes... a... long... time... I can send my .14 patch off
the list if you want/need it.

[in my 2.6.23.1-rt11 tests I am getting "delayed..." messages from
jackd, smells like a problem with internal timing in the kernel]

I'll try rt12...

-- Fernando
Re: [PlanetCCRMA] Re: 2.6.22.6 + rt9: suspend/hibernate not working
On Thu, 2007-09-06 at 12:55 -0700, Fernando Lopez-Lezcano wrote:
> On Thu, 2007-09-06 at 11:42 -0700, Fernando Lopez-Lezcano wrote:
> > On Tue, 2007-09-04 at 17:15 -0700, Daniel Walker wrote:
> > > On Tue, 2007-09-04 at 17:12 -0700, Fernando Lopez-Lezcano wrote:
> > > > Hi Ingo... I'm getting reports from some of my Planet CCRMA users (which
> > > > I confirmed) that the latest rt kernel I released has broken suspend
> > > > (tested on fc6 & fc7, stock Fedora kernel works fine - the rt
> > > > configuration files are virtual clones as far as possible of the
> > > > standard Fedora kernel config files).
> > > >
> > > > I don't know where to start debugging this. When suspend is initiated it
> > > > freezes with a "Stopping tasks ... " message in the text console - a
> > > > hard power cycle is the only way to get the machine back to normal.
> > > >
> > > > kernel/power/process.c seems to contain that string in the
> > > > freeze_processes function so it looks like the freezer is not freezing
> > > > tasks as no "done" message is ever printed.
> > > >
> > > > What could we do to help?
> > >
> > > If you have high resolution timers enabled you could try disabling it,
> > > and see if the problem persists.
> >
> > The problem is still there ("Stopping tasks ... " and nothing
> > afterwards).
>
> Looks like it was a known problem (sorry about the noise), see:
> http://lkml.org/lkml/2007/8/25/117
>
> It does fix the problem here as well.
> Ingo: is this still the right fix for 2.6.22.6 + rt9?

I'm seeing this while going into suspend:

...
Disabling non-boot CPUs ...
Breaking affinity for irq 218
CPU 1 is now offline
SMP alternatives: switching to UP code
BUG: sleeping function called from invalid context pm-suspend(3676) at kernel/rtmutex.c:636
in_atomic():0 [], irqs_disabled():1
 [<c061c13d>] __rt_spin_lock+0x21/0x3d
 [<c0461b2c>] free_pages_bulk+0x28/0x188
 [<c042653d>] migration_call+0x3a5/0x3be
 [<c0461cd4>] __drain_pages+0x48/0x69
 [<c0461d0a>] page_alloc_cpu_notify+0x15/0x2b
 [<c061e083>] notifier_call_chain+0x2a/0x47
 [<c0435fe8>] raw_notifier_call_chain+0x17/0x1a
 [<c0446869>] _cpu_down+0x17a/0x238
 [<c042b4e5>] printk+0x1f/0x92
 [<c0446ad0>] disable_nonboot_cpus+0x4e/0xd2
 [<c044b673>] enter_state+0x116/0x1d6
 [<c044b831>] state_store+0xc9/0xe0
 [<c044b768>] state_store+0x0/0xe0
 [<c04b44cf>] subsys_attr_store+0x27/0x2b
 [<c04b45db>] sysfs_write_file+0x9a/0xbd
 [<c04b4541>] sysfs_write_file+0x0/0xbd
 [<c047bc97>] vfs_write+0xa8/0x15a
 [<c047c2c0>] sys_write+0x41/0x67
 [<c0404ef6>] syscall_call+0x7/0xb
 ===
CPU1 is down
PM: Entering mem sleep
thinkpad_acpi thinkpad_acpi: LATE suspend
...

I'm attaching the whole compressed dmesg output to put it in context.

-- Fernando

dmesg.1.bz2
Description: application/bzip
Re: 2.6.22.6 + rt9: suspend/hibernate not working
On Thu, 2007-09-06 at 11:42 -0700, Fernando Lopez-Lezcano wrote:
> On Tue, 2007-09-04 at 17:15 -0700, Daniel Walker wrote:
> > On Tue, 2007-09-04 at 17:12 -0700, Fernando Lopez-Lezcano wrote:
> > > Hi Ingo... I'm getting reports from some of my Planet CCRMA users (which
> > > I confirmed) that the latest rt kernel I released has broken suspend
> > > (tested on fc6 & fc7, stock Fedora kernel works fine - the rt
> > > configuration files are virtual clones as far as possible of the
> > > standard Fedora kernel config files).
> > >
> > > I don't know where to start debugging this. When suspend is initiated it
> > > freezes with a "Stopping tasks ... " message in the text console - a
> > > hard power cycle is the only way to get the machine back to normal.
> > >
> > > kernel/power/process.c seems to contain that string in the
> > > freeze_processes function so it looks like the freezer is not freezing
> > > tasks as no "done" message is ever printed.
> > >
> > > What could we do to help?
> >
> > If you have high resolution timers enabled you could try disabling it,
> > and see if the problem persists.
>
> The problem is still there ("Stopping tasks ... " and nothing
> afterwards).

Looks like it was a known problem (sorry about the noise), see:
http://lkml.org/lkml/2007/8/25/117

It does fix the problem here as well.
Ingo: is this still the right fix for 2.6.22.6 + rt9?

-- Fernando
2.6.22.6 + rt9: suspend/hibernate not working
Hi Ingo... I'm getting reports from some of my Planet CCRMA users (which
I confirmed) that the latest rt kernel I released has broken suspend
(tested on fc6 & fc7, stock Fedora kernel works fine - the rt
configuration files are virtual clones as far as possible of the
standard Fedora kernel config files).

I don't know where to start debugging this. When suspend is initiated it
freezes with a "Stopping tasks ... " message in the text console - a
hard power cycle is the only way to get the machine back to normal.

kernel/power/process.c seems to contain that string in the
freeze_processes function so it looks like the freezer is not freezing
tasks as no "done" message is ever printed.

What could we do to help?

-- Fernando
Re: [Fwd: [PlanetCCRMA] atl1 driver; sleeping function]
On Tue, 2007-07-31 at 10:51 +0200, Ingo Molnar wrote:
> * Fernando Lopez-Lezcano <[EMAIL PROTECTED]> wrote:
>
> > Hi Ingo, I'm forwarding this report from a Planet CCRMA user, this is
> > happening to him with 2.6.21.6-rt21...
>
> thanks!

Thanks for the patch! Looks like it fixed the problem Matt was having...

-- Fernando

Forwarded Message
From: Matt Barbe
To: Fernando Lopez-Lezcano
Cc: [EMAIL PROTECTED]
Subject: Re: [PlanetCCRMA] atl1 driver; sleeping function
Date: Tue, 31 Jul 2007 22:50:28 -0400

The newly patched atl1 driver seems to be working fine. I tried it also
in rt21.3 (that's the latest src.rpm in
http://ccrma.stanford.edu/planetccrma/mirror/all/linux/SRPMS/), and it
also worked fine -- I need kernel-rt-devel because I do use apps that
need nvidia drivers, and those are working fine in rt21.3 as well. I can
keep you up to date if anything negative happens.

Thanks again,
Matt

> > BUG: sleeping function called from invalid context IRQ-219(2243) at kernel/rtmutex.c:613
> > in_atomic():0 [], irqs_disabled():1
> >  [<c0405f88>] dump_trace+0x64/0x105
> >  [<c0406041>] show_trace_log_lvl+0x18/0x2c
> >  [<c040664e>] show_trace+0xf/0x11
> >  [<c04066cf>] dump_stack+0x12/0x14
> >  [<c060511d>] __rt_spin_lock+0x21/0x3d
> >  [<f8a20e0c>] atl1_xmit_frame+0x66f/0x6c6 [atl1]
> >  [<c05a3d96>] dev_hard_start_xmit+0x1c6/0x225
> >  [<c05b29bd>] __qdisc_run+0xb7/0x1cf
>
> could you try the patch below, does it fix the problem? The atl1 driver
> uses raw irq flags in combination with a spinlock that is a sleeping
> lock on -rt. (this is valid code on upstream, fortunately the -rt fix is
> also a cleanup and a small code reduction enhancement on upstream, so
> there's no problem pushing such fixes upstream.)
>
>         Ingo
>
> --->
> Subject: [patch] drivers/net/atl1/atl1_main.c: use spin_trylock_irqsave()
> From: Ingo Molnar <[EMAIL PROTECTED]>
>
> use the simpler spin_trylock_irqsave() API to get the adapter lock.
>
> [ this is also a fix for -rt where adapter->lock is a sleeping lock. ]
>
> Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
> ---
>  drivers/net/atl1/atl1_main.c |    4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
>
> Index: linux-rt-rebase.q/drivers/net/atl1/atl1_main.c
> ===
> --- linux-rt-rebase.q.orig/drivers/net/atl1/atl1_main.c
> +++ linux-rt-rebase.q/drivers/net/atl1/atl1_main.c
> @@ -1704,10 +1704,8 @@ static int atl1_xmit_frame(struct sk_buf
>                  }
>          }
>
> -        local_irq_save(flags);
> -        if (!spin_trylock(&adapter->lock)) {
> +        if (!spin_trylock_irqsave(&adapter->lock, flags)) {
>                  /* Can't get lock - tell upper layer to requeue */
> -                local_irq_restore(flags);
>                  dev_printk(KERN_DEBUG, &adapter->pdev->dev, "tx locked\n");
>                  return NETDEV_TX_LOCKED;
>          }
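The problem described in the patch above (a raw irq-off section wrapped around a lock that sleeps on -rt) can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not kernel code: every `model_*` name is hypothetical, "interrupts" are a plain integer, the second `inner` lock is invented to stand in for whatever sleeping lock is taken inside the section, and the model assumes that on -rt the combined trylock/irqsave call saves the flags without actually turning interrupts off.

```c
#include <stdio.h>

/* Userspace model only; none of these names are real kernel symbols. */
static int model_irqs_disabled; /* 1 while "interrupts" are off */
static int model_bug_count;     /* hits of the "invalid context" check */

struct model_lock { int held; };

static void model_local_irq_save(int *flags)
{
	*flags = model_irqs_disabled;
	model_irqs_disabled = 1;
}

static void model_local_irq_restore(int flags)
{
	model_irqs_disabled = flags;
}

/* Non-blocking attempt: never sleeps, so no context check here. */
static int model_trylock(struct model_lock *l)
{
	if (l->held)
		return 0;
	l->held = 1;
	return 1;
}

/* Blocking acquire of a sleeping lock: not allowed with irqs off. */
static void model_sleeping_lock(struct model_lock *l)
{
	if (model_irqs_disabled) {
		model_bug_count++;
		printf("BUG: sleeping function called from invalid context\n");
	}
	l->held = 1;
}

static void model_unlock(struct model_lock *l)
{
	l->held = 0;
}

/* Assumed -rt behaviour: flags are saved but irqs stay enabled. */
static int model_trylock_irqsave(struct model_lock *l, int *flags)
{
	*flags = model_irqs_disabled;
	return model_trylock(l);
}

/* Runs one modelled transmit; returns how many times the check fired. */
int model_xmit(int use_combined_api)
{
	struct model_lock adapter = { 0 }, inner = { 0 };
	int flags, got;

	model_irqs_disabled = 0;
	model_bug_count = 0;

	if (use_combined_api) {
		got = model_trylock_irqsave(&adapter, &flags);
	} else {
		model_local_irq_save(&flags);
		got = model_trylock(&adapter);
	}
	if (got) {
		model_sleeping_lock(&inner); /* second lock inside the section */
		model_unlock(&inner);
		model_unlock(&adapter);
	}
	model_local_irq_restore(flags);
	return model_bug_count;
}
```

In this model, `model_xmit(0)` follows the open-coded sequence and trips the check once, while `model_xmit(1)` follows the combined call and does not trip it.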
[Fwd: [PlanetCCRMA] atl1 driver; sleeping function]
Hi Ingo, I'm forwarding this report from a Planet CCRMA user, this is happening to him with 2.6.21.6-rt21...
-- Fernando

Forwarded Message
From: Matt Barber
To: [EMAIL PROTECTED]
Subject: [PlanetCCRMA] atl1 driver; sleeping function
Date: Mon, 30 Jul 2007 06:09:58 -0400

Hello,

I'm getting a set of BUG messages in my dmesg with the newest ccrma kernel. This is a new box, so I haven't tried the older ccrma kernels, but the bugs aren't there with Fedora stock. They look like this (probably at least a hundred more by now):

BUG: sleeping function called from invalid context IRQ-219(2243) at kernel/rtmutex.c:613
in_atomic():0 [], irqs_disabled():1
 [<c0405f88>] dump_trace+0x64/0x105
 [<c0406041>] show_trace_log_lvl+0x18/0x2c
 [<c040664e>] show_trace+0xf/0x11
 [<c04066cf>] dump_stack+0x12/0x14
 [<c060511d>] __rt_spin_lock+0x21/0x3d
 [<f8a20e0c>] atl1_xmit_frame+0x66f/0x6c6 [atl1]
 [<c05a3d96>] dev_hard_start_xmit+0x1c6/0x225
 [<c05b29bd>] __qdisc_run+0xb7/0x1cf
 [<c05a5661>] dev_queue_xmit+0x14a/0x239
 [<c05c4a40>] ip_output+0x207/0x243
 [<c05c41ea>] ip_queue_xmit+0x3b2/0x402
 [<c05d26d7>] tcp_transmit_skb+0x6e5/0x713
 [<c05d289a>] tcp_send_ack+0xeb/0xef
 [<c05d1617>] tcp_rcv_established+0x52a/0x7ff
 [<c05d7234>] tcp_v4_do_rcv+0x1bf/0x494
 [<c05d9955>] tcp_v4_rcv+0x863/0x8d6
 [<c05bff3a>] ip_local_deliver+0x18f/0x23d
 [<c05bfd72>] ip_rcv+0x41d/0x456
 [<c05a3991>] netif_receive_skb+0x2cc/0x35e
 [<c05a524a>] process_backlog+0x76/0xc9
 [<c05a5419>] net_rx_action+0xa7/0x1a5
 [<c042e276>] ___do_softirq+0xfe/0x214
 [<c042e6a6>] do_softirq_from_hardirq+0x48/0x61
 [<c0459204>] do_irqd+0x21a/0x282
 [<c043ad18>] kthread+0xb0/0xd8
 [<c0405bbf>] kernel_thread_helper+0x7/0x10
 ===
printk: 6 messages suppressed.
network driver disabled raw interrupts: atl1_xmit_frame+0x0/0x6c6 [atl1]
BUG: sleeping function called from invalid context firefox-bin(17517) at kernel/rtmutex.c:613
in_atomic():0 [], irqs_disabled():1
 [<c0405f88>] dump_trace+0x64/0x105
 [<c0406041>] show_trace_log_lvl+0x18/0x2c
 [<c040664e>] show_trace+0xf/0x11
 [<c04066cf>] dump_stack+0x12/0x14
 [<c060511d>] __rt_spin_lock+0x21/0x3d
 [<f8a20e0c>] atl1_xmit_frame+0x66f/0x6c6 [atl1]
 [<c05a3d96>] dev_hard_start_xmit+0x1c6/0x225
 [<c05b29bd>] __qdisc_run+0xb7/0x1cf
 [<c05a5661>] dev_queue_xmit+0x14a/0x239
 [<c05c4a40>] ip_output+0x207/0x243
 [<c05c41ea>] ip_queue_xmit+0x3b2/0x402
 [<c05d26d7>] tcp_transmit_skb+0x6e5/0x713
 [<c05d41ad>] tcp_push_one+0xb3/0xd8
 [<c05c9f92>] tcp_sendmsg+0x7c8/0x9f9
 [<c05e2ce1>] inet_sendmsg+0x3b/0x45
 [<c059a86a>] sock_sendmsg+0xd0/0xeb
 [<c059b1bf>] sys_sendto+0x11b/0x13b
 [<c059b216>] sys_send+0x37/0x3b
 [<c059bb9e>] sys_socketcall+0x14a/0x261
 [<c0404f7c>] syscall_call+0x7/0xb
 [<b7fd8410>] 0xb7fd8410
 ===
network driver disabled raw interrupts: atl1_xmit_frame+0x0/0x6c6 [atl1]
network driver disabled raw interrupts: atl1_xmit_frame+0x0/0x6c6 [atl1]
network driver disabled raw interrupts: atl1_xmit_frame+0x0/0x6c6 [atl1]
BUG: sleeping function called from invalid context IRQ-219(2243) at kernel/rtmutex.c:613
in_atomic():0 [], irqs_disabled():1
 [<c0405f88>] dump_trace+0x64/0x105
 [<c0406041>] show_trace_log_lvl+0x18/0x2c
 [<c040664e>] show_trace+0xf/0x11
 [<c04066cf>] dump_stack+0x12/0x14
 [<c060511d>] __rt_spin_lock+0x21/0x3d
 [<f8a20e0c>] atl1_xmit_frame+0x66f/0x6c6 [atl1]
 [<c05a3d96>] dev_hard_start_xmit+0x1c6/0x225
 [<c05b29bd>] __qdisc_run+0xb7/0x1cf
 [<c05a5661>] dev_queue_xmit+0x14a/0x239
 [<c05c4a40>] ip_output+0x207/0x243
 [<c05c41ea>] ip_queue_xmit+0x3b2/0x402
 [<c05d26d7>] tcp_transmit_skb+0x6e5/0x713
 [<c05d289a>] tcp_send_ack+0xeb/0xef
 [<c05d1617>] tcp_rcv_established+0x52a/0x7ff
 [<c05d7234>] tcp_v4_do_rcv+0x1bf/0x494
 [<c05d9955>] tcp_v4_rcv+0x863/0x8d6
 [<c05bff3a>] ip_local_deliver+0x18f/0x23d
 [<c05bfd72>] ip_rcv+0x41d/0x456
 [<c05a3991>] netif_receive_skb+0x2cc/0x35e
 [<c05a524a>] process_backlog+0x76/0xc9
 [<c05a5419>] net_rx_action+0xa7/0x1a5
 [<c042e276>] ___do_softirq+0xfe/0x214
 [<c042e6a6>] do_softirq_from_hardirq+0x48/0x61
 [<c0459204>] do_irqd+0x21a/0x282
 [<c043ad18>] kthread+0xb0/0xd8
 [<c0405bbf>] kernel_thread_helper+0x7/0x10
 ===
printk: 14 messages suppressed.
network driver disabled raw interrupts: atl1_xmit_frame+0x0/0x6c6 [atl1]
BUG: sleeping function called from invalid context firefox-bin(17517) at kernel/rtmutex.c:613
in_atomic():0 [], irqs_disabled():1
 [<c0405f88>] dump_trace+0x64/0x105
 [<c0406041>] show_trace_log_lvl+0x18/0x2c
 [<c040664e>] show_trace+0xf/0x11
 [<c04066cf>] dump_stack+0x12/0x14
 [<c060511d>] __rt_spin_lock+0x21/0x3d
 [<f8a20e0c>] atl1_xmit_frame+0x66f/0x6c6 [atl1]
 [<c05a3d96>] dev_hard_start_xmit+0x1c6/0x225
 [<c05b29bd>] __qdisc_run+0xb7/0x1cf
 [<c05a5661>] dev_queue_xmit+0x14a/0x239
 [<c05c4a40>] ip_output+0x207/0x243
 [<c05c41ea>] ip_queue_xmit+0x3b2/0x402
 [<c05d26d7>] tcp_transmit_skb+0x6e5/0x713
 [<c05d41ad>] tcp_push_one+0xb3/0xd8
 [<c05c9f92>] tcp_sendmsg+0x7c8/0x9f9
 [<c05e2ce1>] inet_sendmsg+0x3b/0x45
 [<c059a86a>] sock_sendmsg+0xd0/0xeb
 [<c059b1bf>] sys_sendto+0x11b/0x13b
 [<c059b216>] sys_send+0x37/0x3b
 [<c059bb9e>] sys_socketcall+0x14a/0x261
 [<c0404f7c>] syscall_call+0x7/0xb
 [<b7fd8410>] 0xb7fd8410
 ===
BUG: sleeping function called from invalid context IRQ-219(2243) at kernel/rtmutex.c:613
in_atomic():0 [], irqs_disabled():1
 [<c0405f88>] dump_trace+0x64/0x105
 [<c0406041>] show_trace_log_lvl+0x18/0x2c
 [<c040664e>] show_trace+0xf/0x11
 [<c04066cf>] dump_stack+0x12/0x14
 [<c060511d>] __rt_spin_lock+0x21/0x3d
 [<f8a20e0c>] atl1_xmit_frame+0x66f/0x6c6 [atl1]
 [<c05a3d96>] dev_hard_start_xmit+0x1c6/0x225
 [<c05b29bd>] __qdisc_run+0xb7/0x1cf
 [<c05a5661>] dev_queue_xmit+0x14a/0x239
 [<c05c4a40>] ip_output+0x207/0x243
 [<c05c41ea>] ip_queue_xmit+0x3b2/0x402
 [<c05d26d7>] tcp_transmit_skb+0x6e5/0x713
 [] __tcp_push_pending_frames+0x6ec/0x7af
 [] tcp_rcv_established+0x107/0x7ff
 [<c05d7234>] tcp_v4_do_rcv+0x1bf/0x494
 [<c05d9955>] tcp_v4_rcv+0x863/0x8d6
 []
Re: v2.6.22.1-rt5
On Tue, 2007-07-24 at 22:34 +0200, Ingo Molnar wrote:
> * Fernando Lopez-Lezcano <[EMAIL PROTECTED]> wrote:
>
> > > apparently you caught that 3 seconds window where the .23-rc1-rt1
> > > release script moved old patches into the older/ directory :-)
> >
> > Yup, good timing... :-) Hard to do again...
> > (BTW, will you keep 2.6.22.x patches going for a while?)
>
> yeah, that's the plan: to keep .22-rt updated until .23 is released.
> (Thomas agrees with that approach too)

Thank you thank you to all involved!
That's very good news...
-- Fernando
Re: v2.6.22.1-rt5
On Tue, 2007-07-24 at 22:05 +0200, Ingo Molnar wrote:
> * Fernando Lopez-Lezcano <[EMAIL PROTECTED]> wrote:
>
> > On Tue, 2007-07-24 at 12:34 -0700, Fernando Lopez-Lezcano wrote:
> > > On Tue, 2007-07-24 at 09:39 +0200, Ingo Molnar wrote:
> > > > * Rui Nuno Capela <[EMAIL PROTECTED]> wrote:
> > > >
> > > > > Maybe I was too quick, but `make all` on is failing here:
> > > >
> > > > does -rt6 work better?
> > >
> > > Hmmm, -rt6 seems to be gone... was about to download it and it
> > > disappeared.
> >
> > Never mind, I see it migrated back to the main page.
>
> apparently you caught that 3 seconds window where the .23-rc1-rt1
> release script moved old patches into the older/ directory :-)

Yup, good timing... :-) Hard to do again...
(BTW, will you keep 2.6.22.x patches going for a while?)
-- Fernando
Re: v2.6.22.1-rt5
On Tue, 2007-07-24 at 12:34 -0700, Fernando Lopez-Lezcano wrote:
> On Tue, 2007-07-24 at 09:39 +0200, Ingo Molnar wrote:
> > * Rui Nuno Capela <[EMAIL PROTECTED]> wrote:
> >
> > > Maybe I was too quick, but `make all` on is failing here:
> >
> > does -rt6 work better?
>
> Hmmm, -rt6 seems to be gone... was about to download it and it
> disappeared.

Never mind, I see it migrated back to the main page.
-- Fernando
Re: v2.6.22.1-rt5
On Tue, 2007-07-24 at 09:39 +0200, Ingo Molnar wrote:
> * Rui Nuno Capela <[EMAIL PROTECTED]> wrote:
>
> > Maybe I was too quick, but `make all` on is failing here:
>
> does -rt6 work better?

Hmmm, -rt6 seems to be gone... was about to download it and it disappeared.
-- Fernando
Re: [patch] slub crashes with recent -git
On Thu, 2007-07-19 at 21:42 +0200, Ingo Molnar wrote:
> Linus, Christoph,
>
> recent slub commits in -git cause this bootup crash:
>
> Freeing unused kernel memory: 324k freed
> Write protecting the kernel read-only data: 1294k

Just curious, are the crashes even possible in 2.6.22.1? (I see the same patchable code snippet in the source). Just wondering if I should also apply this to 2.6.21.1-rt4...
-- Fernando

> [ cut here ]
> kernel BUG at mm/slub.c:2401!
> invalid opcode: [#1]
> PREEMPT SMP
> Modules linked in:
> CPU:0
> EIP:0060:[<c017dac3>]Not tainted VLI
> EFLAGS: 00010046 (2.6.22 #1)
> EIP is at ksize+0x13/0x42
> eax: ebx: ecx: 0020 edx:
> esi: f76a4000 edi: 0004 ebp: f7b11e74 esp: f7b11e74
> ds: 007b es: 007b fs: 00d8 gs: 0033 ss: 0068
> Process udevd (pid: 824, ti=f7b11000 task=f7ca5000 task.ti=f7b11000)
> Stack: f7b11e94 c016c28b f768cb00 0020 f7ca5000 0004 f76a4000 fff4
>        f7b11eb4 c03cf158 0002 f76a4000 0020 f768cb00 f76a4000 f7b11ed8
>        f7b11ed0 c03cfbc6 f768cb00 f768cb00 c046bf80 f768cb00 000c f7b11f6c
> Call Trace:
> [<c0105e3e>] show_trace_log_lvl+0x19/0x2e
> [<c0105ef0>] show_stack_log_lvl+0x9d/0xa5
> [<c010628f>] show_registers+0x1f5/0x334
> [<c01064e6>] die+0x118/0x1fc
> [<c0426e7f>] do_trap+0x8e/0xa8
> [<c0106ac3>] do_invalid_op+0x88/0x92
> [<c0426a92>] error_code+0x72/0x78
> [<c016c28b>] krealloc+0x27/0x6d
> [<c03cf158>] netlink_realloc_groups+0x61/0xd9
> [<c03cfbc6>] netlink_bind+0x4f/0x121
> [<c03afe8d>] sys_bind+0x67/0x86
> [<c03b11e3>] sys_socketcall+0x8f/0x244
> [<c0104ef2>] sysenter_past_esp+0x6b/0xb5
> ===
> Code: 40 02 00 75 03 8b 52 0c 8b 02 5d 84 c0 b8 00 00 00 00 0f 49 d0 89 d0
> c3 55 31 d2 83 f8 10 89 e5 74 34 e8 bc ff ff ff 85 c0 75 04 <0f> 0b eb fe 8b
> 40 10 85 c0 75 04 0f 0b eb fe 8b 10 f6 c6 0c 74
>
> i had to apply the patch below to make the kernel boot again.
>
> Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
>
> Index: linux/mm/slub.c
> ===
> --- linux.orig/mm/slub.c
> +++ linux/mm/slub.c
> @@ -2394,7 +2394,7 @@ size_t ksize(const void *object)
>  	struct page *page;
>  	struct kmem_cache *s;
>
> -	if (object == ZERO_SIZE_PTR)
> +	if (object == ZERO_SIZE_PTR || !object)
>  		return 0;
>
>  	page = get_object_page(object);
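The call trace above reaches ksize() from krealloc(), and the one-line patch makes ksize() return 0 for a NULL object as well as for the zero-size sentinel. A rough userspace sketch of that guard follows. It is an illustration only: the `model_*` names and the size header are hypothetical, the sentinel value `((void *)16)` is an assumption about ZERO_SIZE_PTR in that kernel, and the model reports the requested size, whereas the real ksize() reports the size of the underlying slab object.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Assumed sentinel, mirroring ZERO_SIZE_PTR; not taken from the thread. */
#define MODEL_ZERO_SIZE_PTR ((void *)16)

/* Hypothetical bookkeeping: the requested size sits just before the object. */
struct model_hdr { size_t size; };

void *model_kmalloc(size_t size)
{
	struct model_hdr *h;

	if (size == 0)
		return MODEL_ZERO_SIZE_PTR;
	h = malloc(sizeof(*h) + size);
	if (!h)
		return NULL;
	h->size = size;
	return h + 1;
}

size_t model_ksize(const void *object)
{
	/* The guard from the patch: NULL is handled like the sentinel. */
	if (object == MODEL_ZERO_SIZE_PTR || !object)
		return 0;
	return ((const struct model_hdr *)object - 1)->size;
}

void model_kfree(void *object)
{
	if (object == MODEL_ZERO_SIZE_PTR || !object)
		return;
	free((struct model_hdr *)object - 1);
}

/* realloc-style helper; p may be NULL on first use, as in the trace. */
void *model_krealloc(void *p, size_t new_size)
{
	size_t old = model_ksize(p);
	void *n;

	if (p && old >= new_size)
		return p;
	n = model_kmalloc(new_size);
	if (!n)
		return NULL; /* allocation failed, p is left untouched */
	if (old)
		memcpy(n, p, old);
	model_kfree(p);
	return n;
}
```

Without the `!object` test, the first `model_krealloc(NULL, n)` call would read a header in front of a NULL pointer, which is the userspace analogue of the BUG the kernel hit.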
Re: v2.6.21.5-rt19 (sched_getaffinity?)
On Wed, 2007-07-18 at 09:18 +0200, Ingo Molnar wrote:
> * Fernando Lopez-Lezcano <[EMAIL PROTECTED]> wrote:
>
> > > does lockdep pinpoint anything?
> >
> > Lots of stuff, and at the end the lock report for the problem.
> > Hopefully some of this will help... I have attached the whole bootup
> > sequence as logged in /var/log/messages.
>
> yeah, it pinpointed the bug. It seems to be an interaction between
> RCU-preempt (Paul Cc:-ed) and sched_mc_power_savings_store():
> detach_destroy_domains() uses synchronize_sched() which uses
> getaffinity, which takes sched_hotcpu_mutex, and
> arch_reinit_sched_domains does it too - see the lockdep report below.
> I've added a quick workaround below as well, which should keep your box
> from hanging.

I can confirm that flash9 does not hang with the patch. Thanks!!! I presume the same would apply to 2.6.21.x and, say, rt21. I'll test.

But (of course, there's always a but somewhere) I just experienced a complete hang - 2.6.22.1-rt4 with the little patch. This time there was something in the logs, maybe it will help? This was when finishing the install of an additional kernel module rpm package (ipw3945 drivers).

-- Fernando

Jul 18 10:48:15 localhost kernel: BUG: sleeping function called from invalid context modprobe(5001) at kernel/rtmutex.c:636
Jul 18 10:48:15 localhost kernel: in_atomic():1 [0001], irqs_disabled():0
Jul 18 10:48:15 localhost kernel: [<c0405f34>] show_trace_log_lvl+0x1a/0x2f
Jul 18 10:48:15 localhost kernel: [<c0406a09>] show_trace+0x12/0x14
Jul 18 10:48:15 localhost kernel: [<c0406a71>] dump_stack+0x16/0x18
Jul 18 10:48:15 localhost kernel: [<c0423bfc>] __might_sleep+0xeb/0xf2
Jul 18 10:48:15 localhost kernel: [<c0617242>] __rt_spin_lock+0x24/0x40
Jul 18 10:48:15 localhost kernel: [<c0617266>] rt_spin_lock+0x8/0xa
Jul 18 10:48:15 localhost kernel: [<c04621c9>] get_zone_pcp+0x23/0x33
Jul 18 10:48:15 localhost kernel: [<c0462702>] free_hot_cold_page+0xcf/0x148
Jul 18 10:48:15 localhost kernel: [<c04627b2>] free_hot_page+0xa/0xc
Jul 18 10:48:15 localhost kernel: [<c0462a82>] __free_pages+0x25/0x30
Jul 18 10:48:15 localhost kernel: [<c0462ab6>] free_pages+0x29/0x2b
Jul 18 10:48:15 localhost kernel: [<c047abf3>] quicklist_trim+0xd0/0xf5
Jul 18 10:48:15 localhost kernel: [<c041f5d9>] check_pgt_cache+0x1e/0x20
Jul 18 10:48:15 localhost kernel: [<c046aedf>] free_pgtables+0x52/0x147
Jul 18 10:48:15 localhost kernel: [<c046cdf7>] unmap_region+0xe6/0x135
Jul 18 10:48:15 localhost kernel: [<c046d764>] do_munmap+0x153/0x1b4
Jul 18 10:48:15 localhost kernel: [<c046f3de>] do_mremap+0x413/0x4c3
Jul 18 10:48:15 localhost kernel: [<c046f4c4>] sys_mremap+0x36/0x56
Jul 18 10:48:15 localhost kernel: [<c0404fca>] syscall_call+0x7/0xb
Jul 18 10:48:15 localhost kernel: ===
Jul 18 10:48:16 localhost kernel: BUG: sleeping function called from invalid context head(5652) at kernel/rtmutex.c:636
Jul 18 10:48:16 localhost kernel: in_atomic():1 [0001], irqs_disabled():0
Jul 18 10:48:16 localhost kernel: [<c0405f34>] show_trace_log_lvl+0x1a/0x2f
Jul 18 10:48:16 localhost kernel: [<c0406a09>] show_trace+0x12/0x14
Jul 18 10:48:16 localhost kernel: [<c0406a71>] dump_stack+0x16/0x18
Jul 18 10:48:16 localhost kernel: [<c0423bfc>] __might_sleep+0xeb/0xf2
Jul 18 10:48:16 localhost kernel: [<c0617242>] __rt_spin_lock+0x24/0x40
Jul 18 10:48:16 localhost kernel: [<c0617266>] rt_spin_lock+0x8/0xa
Jul 18 10:48:16 localhost kernel: [<c04621c9>] get_zone_pcp+0x23/0x33
Jul 18 10:48:16 localhost kernel: [<c0462702>] free_hot_cold_page+0xcf/0x148
Jul 18 10:48:16 localhost kernel: [<c04627b2>] free_hot_page+0xa/0xc
Jul 18 10:48:16 localhost kernel: [<c0462a82>] __free_pages+0x25/0x30
Jul 18 10:48:16 localhost kernel: [<c0462ab6>] free_pages+0x29/0x2b
Jul 18 10:48:16 localhost kernel: [<c047abf3>] quicklist_trim+0xd0/0xf5
Jul 18 10:48:16 localhost kernel: [<c041f5d9>] check_pgt_cache+0x1e/0x20
Jul 18 10:48:16 localhost kernel: [<c046aedf>] free_pgtables+0x52/0x147
Jul 18 10:48:16 localhost kernel: [<c046cdf7>] unmap_region+0xe6/0x135
Jul 18 10:48:16 localhost kernel: [<c046d764>] do_munmap+0x153/0x1b4
Jul 18 10:48:16 localhost kernel: [<c046d7f5>] sys_munmap+0x30/0x3f
Jul 18 10:48:16 localhost kernel: [<c0404fca>] syscall_call+0x7/0xb
Jul 18 10:48:16 localhost kernel: ===
Jul 18 10:50:22 localhost syslogd 1.4.2: restart.
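The lock nesting Ingo describes in the quoted text of this message (a path that already holds sched_hotcpu_mutex reaching getaffinity, which takes the same mutex) can be sketched as a single-task userspace model. All `model_*` names are hypothetical, the model only knows about one task, and the non-nested variant is just one possible ordering chosen for illustration; it is not claimed to be the workaround that was posted.

```c
#include <stdio.h>

/* Userspace model only; these are not kernel symbols. */
enum { MODEL_OK = 0, MODEL_SELF_DEADLOCK = -1 };

struct model_mutex {
	int locked;
	int owner; /* task id of the holder, valid while locked */
};

/* A real non-recursive mutex would block forever here; the model reports it. */
int model_mutex_lock(struct model_mutex *m, int task)
{
	if (m->locked && m->owner == task)
		return MODEL_SELF_DEADLOCK;
	m->locked = 1;
	m->owner = task;
	return MODEL_OK;
}

void model_mutex_unlock(struct model_mutex *m)
{
	m->locked = 0;
}

/* Stands in for the getaffinity step that takes the hotplug mutex. */
int model_getaffinity(struct model_mutex *hotcpu, int task)
{
	int rc = model_mutex_lock(hotcpu, task);

	if (rc != MODEL_OK)
		return rc;
	model_mutex_unlock(hotcpu);
	return MODEL_OK;
}

/* Stands in for synchronize_sched(), described as using getaffinity. */
int model_synchronize_sched(struct model_mutex *hotcpu, int task)
{
	return model_getaffinity(hotcpu, task);
}

/*
 * Stands in for the domain rebuild. With nested != 0 the mutex is still
 * held when the synchronize step runs, as in the reported call chain.
 */
int model_reinit_domains(int nested)
{
	struct model_mutex hotcpu = { 0, 0 };
	const int task = 1;
	int rc;

	model_mutex_lock(&hotcpu, task);
	if (nested) {
		rc = model_synchronize_sched(&hotcpu, task);
		model_mutex_unlock(&hotcpu);
	} else {
		model_mutex_unlock(&hotcpu);
		rc = model_synchronize_sched(&hotcpu, task);
	}
	if (rc == MODEL_SELF_DEADLOCK)
		printf("task %d would block on a mutex it already holds\n", task);
	return rc;
}
```

In the model, `model_reinit_domains(1)` reports the self-deadlock and `model_reinit_domains(0)` completes; in the real kernel the nested case shows up as a task stuck in D state rather than as an error code.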
Re: v2.6.21.5-rt19 (sched_getaffinity?)
On Tue, 2007-07-17 at 22:12 +0200, Ingo Molnar wrote:
> * Fernando Lopez-Lezcano <[EMAIL PROTECTED]> wrote:
>
> > On Tue, 2007-07-17 at 21:32 +0200, Ingo Molnar wrote:
> > > * Fernando Lopez-Lezcano <[EMAIL PROTECTED]> wrote:
> > >
> > > > I do get flash 9 (I know, not the best example) and tomboy to hang as
> > > > reported by one of my Planet CCRMA users - flash 9 tested working on
> > > > stock fedora 7 kernel - and both seem to hang in the same system call:
> > > >
> > > > sched_getaffinity(3528, 32, <unfinished ...>
> > > >
> > > > Full output of strace attached for both cases.
> > >
> > > hm, that's weird. Is it completely unkillable at that time? Could you do
> > > a few things: enable CONFIG_PROVE_LOCKING (lockdep), and also try to get
> > > a full task state dump via:
> > >
> > >   echo t > /proc/sysrq-trigger
> >
> > Trace attached... the process stays in D state no matter what.
>
> hm, seems to be related to:
>
>  Jul 17 12:51:18 localhost kernel: sched-powersa D [f0aaf930] 0005 6584 3420 3407
>
> which blocks the cpu-hotplug mutex:
>
>  Jul 17 12:51:18 localhost kernel: Call Trace:
>  Jul 17 12:51:18 localhost kernel:  [<c0603f46>] schedule+0xe0/0xfa
>  Jul 17 12:51:18 localhost kernel:  [<c0604d0d>] rt_mutex_slowlock+0x164/0x20b
>  Jul 17 12:51:18 localhost kernel:  [<c0604a5c>] rt_mutex_lock+0x3c/0x3f
>  Jul 17 12:51:18 localhost kernel:  [<c0423bb4>] sched_getaffinity+0x14/0x94
>  Jul 17 12:51:18 localhost kernel:  [<c045a647>] __synchronize_sched+0xd/0x5a
>  Jul 17 12:51:18 localhost kernel:  [<c0423732>] arch_reinit_sched_domains+0x18/0x33
>  Jul 17 12:51:18 localhost kernel:  [<c0423789>] sched_power_savings_store+0x3c/0x49
>  Jul 17 12:51:18 localhost kernel:  [<c0552cd4>] sysdev_class_store+0x1e/0x22
>  Jul 17 12:51:18 localhost kernel:  [<c04b195b>] sysfs_write_file+0xa3/0xc6
>  Jul 17 12:51:18 localhost kernel:  [<c047a64a>] vfs_write+0xa8/0x154
>  Jul 17 12:51:18 localhost kernel:  [<c047ac65>] sys_write+0x41/0x67
>  Jul 17 12:51:18 localhost kernel:  [<c0404f7c>] syscall_call+0x7/0xb
>
> and firefox blocks on the same mutex too:
>
>  Jul 17 12:51:18 localhost kernel: firefox-bin D [efc44670] 0012 6368 4388 1
>  Jul 17 12:51:18 localhost kernel: Call Trace:
>  Jul 17 12:51:18 localhost kernel:  [<c0603f46>] schedule+0xe0/0xfa
>  Jul 17 12:51:18 localhost kernel:  [<c0604d0d>] rt_mutex_slowlock+0x164/0x20b
>  Jul 17 12:51:18 localhost kernel:  [<c0604a5c>] rt_mutex_lock+0x3c/0x3f
>  Jul 17 12:51:18 localhost kernel:  [<c0423bb4>] sched_getaffinity+0x14/0x94
>  Jul 17 12:51:18 localhost kernel:  [<c0423c53>] sys_sched_getaffinity+0x1f/0x41
>  Jul 17 12:51:18 localhost kernel:  [<c0404f7c>] syscall_call+0x7/0xb
>  Jul 17 12:51:18 localhost kernel:  [<b7f0f410>] 0xb7f0f410
>
> does lockdep pinpoint anything?

Lots of stuff, and at the end the lock report for the problem.
Hopefully some of this will help... I have attached the whole bootup
sequence as logged in /var/log/messages.

-- Fernando

trace3.txt.gz
Description: GNU Zip compressed data
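Ingo's reading of the dump (two tasks in D state, both stuck under rt_mutex_lock) can be reproduced mechanically. The following is only a rough sketch: the function names blocked_tasks and common_frames are made up for illustration, and the regular expressions assume the log layout quoted in this thread, which may differ on other kernels.

```python
import re
from collections import defaultdict

# Task header line from a sysrq-t dump, e.g.
# "... kernel: firefox-bin D [efc44670] ..." (layout assumed from this thread).
TASK_RE = re.compile(r"kernel:\s+(\S+)\s+([RSDTZX])\s+\[")
# Stack frame such as "[<c0604a5c>] rt_mutex_lock+0x3c/0x3f"; the address
# part is optional because some archives strip it.
FRAME_RE = re.compile(r"\[<?[0-9a-f]*>?\]\s+([A-Za-z_]\w*)\+0x")


def blocked_tasks(log_text):
    """Return {task_name: [function, ...]} for tasks shown in D state."""
    tasks = defaultdict(list)
    current = None
    for line in log_text.splitlines():
        header = TASK_RE.search(line)
        if header:
            # Only track frames that belong to an uninterruptible task.
            current = header.group(1) if header.group(2) == "D" else None
            continue
        frame = FRAME_RE.search(line)
        if frame and current:
            tasks[current].append(frame.group(1))
    return dict(tasks)


def common_frames(tasks):
    """Functions that appear in the stack of every blocked task."""
    stacks = [set(frames) for frames in tasks.values()]
    return set.intersection(*stacks) if stacks else set()
```

Fed the two traces quoted above, common_frames() should contain rt_mutex_lock and sched_getaffinity, which is the overlap Ingo points at. It says nothing about which task actually owns the mutex.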
Re: v2.6.21.5-rt19 (sched_getaffinity?)
On Tue, 2007-07-17 at 22:12 +0200, Ingo Molnar wrote:
> * Fernando Lopez-Lezcano <[EMAIL PROTECTED]> wrote:
>
> > On Tue, 2007-07-17 at 21:32 +0200, Ingo Molnar wrote:
> > > * Fernando Lopez-Lezcano <[EMAIL PROTECTED]> wrote:
> > >
> > > > I do get flash 9 (I know, not the best example) and tomboy to hang as
> > > > reported by one of my Planet CCRMA users - flash 9 tested working on
> > > > stock fedora 7 kernel - and both seem to hang in the same system call:
> > > >
> > > > sched_getaffinity(3528, 32, <unfinished ...>
> > > >
> > > > Full output of strace attached for both cases.
> > >
> > > hm, that's weird. Is it completely unkillable at that time? Could you do
> > > a few things: enable CONFIG_PROVE_LOCKING (lockdep), and also try to get
> > > a full task state dump via:
> > >
> > >   echo t > /proc/sysrq-trigger
> >
> > Trace attached... the process stays in D state no matter what.

Just in case, it repeats under 2.6.22.1-rt4 (< rt4 did not boot into my
t61 laptop, this one at least does that). I'm including the (probably
redundant) dump. I have to build a new kernel with prove locking...

-- Fernando

> hm, seems to be related to:
>
>  Jul 17 12:51:18 localhost kernel: sched-powersa D [f0aaf930] 0005 6584 3420 3407
>
> which blocks the cpu-hotplug mutex:
>
>  Jul 17 12:51:18 localhost kernel: Call Trace:
>  Jul 17 12:51:18 localhost kernel:  [<c0603f46>] schedule+0xe0/0xfa
>  Jul 17 12:51:18 localhost kernel:  [<c0604d0d>] rt_mutex_slowlock+0x164/0x20b
>  Jul 17 12:51:18 localhost kernel:  [<c0604a5c>] rt_mutex_lock+0x3c/0x3f
>  Jul 17 12:51:18 localhost kernel:  [<c0423bb4>] sched_getaffinity+0x14/0x94
>  Jul 17 12:51:18 localhost kernel:  [<c045a647>] __synchronize_sched+0xd/0x5a
>  Jul 17 12:51:18 localhost kernel:  [<c0423732>] arch_reinit_sched_domains+0x18/0x33
>  Jul 17 12:51:18 localhost kernel:  [<c0423789>] sched_power_savings_store+0x3c/0x49
>  Jul 17 12:51:18 localhost kernel:  [<c0552cd4>] sysdev_class_store+0x1e/0x22
>  Jul 17 12:51:18 localhost kernel:  [<c04b195b>] sysfs_write_file+0xa3/0xc6
>  Jul 17 12:51:18 localhost kernel:  [<c047a64a>] vfs_write+0xa8/0x154
>  Jul 17 12:51:18 localhost kernel:  [<c047ac65>] sys_write+0x41/0x67
>  Jul 17 12:51:18 localhost kernel:  [<c0404f7c>] syscall_call+0x7/0xb
>
> and firefox blocks on the same mutex too:
>
>  Jul 17 12:51:18 localhost kernel: firefox-bin D [efc44670] 0012 6368 4388 1
>  Jul 17 12:51:18 localhost kernel: Call Trace:
>  Jul 17 12:51:18 localhost kernel:  [<c0603f46>] schedule+0xe0/0xfa
>  Jul 17 12:51:18 localhost kernel:  [<c0604d0d>] rt_mutex_slowlock+0x164/0x20b
>  Jul 17 12:51:18 localhost kernel:  [<c0604a5c>] rt_mutex_lock+0x3c/0x3f
>  Jul 17 12:51:18 localhost kernel:  [<c0423bb4>] sched_getaffinity+0x14/0x94
>  Jul 17 12:51:18 localhost kernel:  [<c0423c53>] sys_sched_getaffinity+0x1f/0x41
>  Jul 17 12:51:18 localhost kernel:  [<c0404f7c>] syscall_call+0x7/0xb
>  Jul 17 12:51:18 localhost kernel:  [<b7f0f410>] 0xb7f0f410
>
> does lockdep pinpoint anything?
>
>       Ingo

trace2.txt.gz
Description: GNU Zip compressed data
Re: v2.6.21.5-rt19 (sched_getaffinity?)
On Tue, 2007-07-17 at 21:32 +0200, Ingo Molnar wrote:
> * Fernando Lopez-Lezcano <[EMAIL PROTECTED]> wrote:
>
> > I do get flash 9 (I know, not the best example) and tomboy to hang as
> > reported by one of my Planet CCRMA users - flash 9 tested working on
> > stock fedora 7 kernel - and both seem to hang in the same system call:
> >
> > sched_getaffinity(3528, 32, <unfinished ...>
> >
> > Full output of strace attached for both cases.
>
> hm, that's weird. Is it completely unkillable at that time? Could you do
> a few things: enable CONFIG_PROVE_LOCKING (lockdep), and also try to get
> a full task state dump via:
>
>   echo t > /proc/sysrq-trigger

Trace attached... the process stays in D state no matter what.

-- Fernando

trace1.txt.gz
Description: GNU Zip compressed data
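For anyone who wants to check for stuck tasks without triggering a full sysrq dump, here is a minimal sketch that scans /proc for processes in D (uninterruptible sleep) state. The helper names parse_state and uninterruptible_tasks are hypothetical, and the sketch only inspects each process's main thread, so a single blocked thread inside a multi-threaded process could be missed.

```python
import os


def parse_state(status_text):
    """Return (name, state_letter) from the text of /proc/<pid>/status."""
    name, state = None, None
    for line in status_text.splitlines():
        if line.startswith("Name:"):
            name = line.split(":", 1)[1].strip()
        elif line.startswith("State:"):
            # e.g. "State:\tD (disk sleep)" yields "D"
            state = line.split(":", 1)[1].split()[0]
    return name, state


def uninterruptible_tasks(proc_root="/proc"):
    """List (pid, name) for processes whose main thread is in D state."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "status")) as fh:
                name, state = parse_state(fh.read())
        except OSError:
            continue  # the process exited while we were scanning
        if state == "D":
            found.append((int(entry), name))
    return found
```

Calling uninterruptible_tasks() a few seconds apart and comparing the results helps separate a genuinely stuck task from one that is merely waiting on disk I/O at that instant.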
Re: v2.6.21.5-rt19 (sched_getaffinity?)
On Tue, 2007-07-17 at 21:32 +0200, Ingo Molnar wrote:
> * Fernando Lopez-Lezcano <[EMAIL PROTECTED]> wrote:
>
> > I do get flash 9 (I know, not the best example) and tomboy to hang as
> > reported by one of my Planet CCRMA users - flash 9 tested working on
> > stock fedora 7 kernel - and both seem to hang in the same system call:
> >
> > sched_getaffinity(3528, 32, <unfinished ...>
> >
> > Full output of strace attached for both cases.
>
> hm, that's weird. Is it completely unkillable at that time? Could you do
> a few things: enable CONFIG_PROVE_LOCKING (lockdep), and also try to get
> a full task state dump via:
>
>   echo t > /proc/sysrq-trigger
>
> thanks,

kill -9 does nothing. If there's another way to kill something let me
know :-)

I'll try to get the dump asap.

Hope you had a good time over the long weekend, you certainly deserve
some rest (and congrats on the scheduler inclusion in mainline!)

-- Fernando
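Since the hang shows up inside the sched_getaffinity() system call, a small probe that issues the same call from a throwaway child process might help confirm whether a given kernel is affected, without needing flash or tomboy. This is a sketch only and was not part of the original report: probe_getaffinity and CHILD are invented names, and it relies on Python's os.sched_getaffinity(), which is available on Linux.

```python
import subprocess
import sys

# Code run in the child: ask the kernel for this process's CPU affinity mask.
CHILD = "import os; print(len(os.sched_getaffinity(0)))"


def probe_getaffinity(timeout_s=10.0):
    """Call sched_getaffinity in a child; return (status, cpu_count)."""
    proc = subprocess.Popen([sys.executable, "-c", CHILD],
                            stdout=subprocess.PIPE, text=True)
    try:
        out, _ = proc.communicate(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # Send SIGKILL but do not wait for the child: a task stuck in
        # D state ignores the signal, and waiting could block us too.
        proc.kill()
        return "hang", None
    if proc.returncode != 0:
        return "error", None  # e.g. the call is unavailable on this platform
    return "ok", int(out.strip())
```

On a healthy system this returns ("ok", n) with n the number of CPUs the child may run on. A "hang" result only shows that the call did not return within the timeout; it does not identify the lock involved.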
Re: v2.6.22.1-rt3
On Fri, 2007-07-13 at 13:22 +0200, Thomas Gleixner wrote:
> we are pleased to announce the v2.6.22.1-rt3 kernel
>
> Attention!
>
> Ingo is off for a long weekend and therefore the download location for
> this release is:
>
>    http://www.tglx.de/projects/preempt-rt/2.6.22.1
>
> more info about the -rt patchset can be found in the RT wiki:
>
>    http://rt.wiki.kernel.org
>
> This release is a bugfix release:
>
> - update of the x86_64 -hrt queue (resolve boot problems)
> - gtod vsyscall fix from Gregory Haskins

Same problem as reported yesterday in 2.6.22.1-rt2 in a T61 laptop, boot
hangs, last BUG printed is similar to this (numbers changed since
yesterday, of course, functions listed appear to be the same). No serial
port available to dump everything...

This was copied from the screen yesterday:

BUG: spinlock lockup on CPU#1, swapper/0, c318da88
 [<c0405f34>] show_trace_log_lvl+0x1a/0x2f
 [<c0406a09>] show_trace+0x12/0x14
 [<c0406a71>] dump_stack+0x16/0x18
 [<c0617a91>] _raw_spin_lock+0xc1/0xe2
 [<c061743f>] __spin_lock_irq+0x14/0x16
 [<c061541d>] __sched_text_start+0xd5/0xaef
 [<c061600e>] schedule+0xe0/0xfa
 [<c0616c15>] rt_spin_lock_slowlock+0xcf/0x14f
 [<c061724b>] __rt_spin_lock+0x3d/0x40
 [<c0617256>] rt_spin_lock+0x8/0xa
 [<c052f95c>] acpi_idle_enter_c3+0x12d/0x232
 [<c059af51>] cpuidle_idle_call+0x56/0x79
 [<c04033a5>] cpu_idle+0x9d/0xda
 [<c0419e21>] start_secondary+0x34e/0x356
 [<>] 0x0

Same .config as before.

-- Fernando
Re: v2.6.21.5-rt19
On Sun, 2007-07-08 at 15:36 -0700, Fernando Lopez-Lezcano wrote:
> On Sat, 2007-07-07 at 11:24 +0200, Ingo Molnar wrote:
> > * Fernando Lopez-Lezcano <[EMAIL PROTECTED]> wrote:
> >
> > > > > Changes since 2.6.21.5-rt18:
> > > > >
> > > > > - Fixed a nasty and hard to track down slowness / boot problem on SMP
> > > > >   machines with CONFIG_NOHZ enabled. The problem was caused by the timer
> > > > >   wheel base lock held during the get_next_timer_interrupt() call in the
> > > > >   idle path, which eventually led to a bogus PI boosting of the idle task
> > > > >   and in consequence a stale wrong scheduler selection for the affected
> > > > >   idle task.
> > > > >
> > > > >   Kudos to Carsten Emde, who patiently and meticulously isolated the
> > > > >   problem and provided the traces, which allowed to identify the root
> > > > >   cause.
> > > > >
> > > > >   Problem solution: Prevent idle task boosting
> > > >
> > > > Maybe someone remember me whining about troubles with 2.6.21-rt2..18
> > > > on my Core2 T7200 laptop (fujitsu-siemens amilo i1520).
> > > >
> > > > Although I'm still with my fingers crossed, I can tell the good
> > > > news are that 2.6.21.5-rt19 (and -rt20) does behave far better now
> > > > on the very same box.
> > >
> > > Yes, it works much better indeed...
> > >
> > > Ingo: is there a place where I can read about the changes in different
> > > rtxx releases? What is new/better/fixed in rt20? (I see scheduler
> > > stuff in a diff from rt19 to rt20 but I don't really know what it
> > > means).
> >
> > and rt18 was a -rt-only NOHZ fix, that bug got introduced in rt11 when
> > CFS was merged.
> >
> > i _think_ Rui might have seen two separate problems. Perhaps by the time
> > we fixed the first problem (which Rui saw since -rt2) we introduced the
> > other one via -rt11 - which then got fixed in -rt19.
>
> Ahh, CFS is now part of rt, I was obviously not paying attention... I'm
> really trying to provide a "stable" rt kernel for audio usage and
> including another subsystem into rt is - IMHO - not going to help.
> What's the chance of splitting things?
>
> > btw., we'd love to get more feedback regarding CFS. CFS is a completely
> > new scheduler for Linux.
>
> Then I'd rather have it separate from rt.

Please? I would like to provide the least amount of new functionality
that is really necessary in my audio kernels. Audio related requirements
include the rt patch but not a new scheduler.

> > It has a design centered around keeping
> > application latencies down, so it is ultimately real-time friendly, and
> > it should also make things work better for desktop-ish and audio-ish
> > stuff as well. (even under SCHED_OTHER)
>
> Maybe this is CFS related? (tail of a thread in the Planet CCRMA mailing
> list):
>
> On Sun, 2007-07-08 at 15:26 -0400, Hector Centeno wrote:
> > Ok, so just to confirm, that 2.6.21-0182.rt19.1.fc7.ccrmart works fine
> > on my desktop but on my laptop it makes Firefox and Tomboy to crash.
> > On the same laptop using 2.6.21-0182.rt17.1.fc7.ccrmart there is no
> > problem.

It looks to my untrained eye like it is CFS related. I'm attaching the
last part of the strace of firefox while it tries to load a flash site.
The firefox process is left in an unkillable (not even by -9) state.
What else could I provide to debug the problem? (this is in a T61 laptop
with the Intel 7700 processor).

-- Fernando
Re: v2.6.21.5-rt19
On Sat, 2007-07-07 at 11:24 +0200, Ingo Molnar wrote:
> * Fernando Lopez-Lezcano <[EMAIL PROTECTED]> wrote:
>
> > > > Changes since 2.6.21.5-rt18:
> > > >
> > > > - Fixed a nasty and hard to track down slowness / boot problem on SMP
> > > >   machines with CONFIG_NOHZ enabled. The problem was caused by the timer
> > > >   wheel base lock held during the get_next_timer_interrupt() call in the
> > > >   idle path, which eventually led to a bogus PI boosting of the idle task
> > > >   and in consequence a stale wrong scheduler selection for the affected
> > > >   idle task.
> > > >
> > > >   Kudos to Carsten Emde, who patiently and meticulously isolated the
> > > >   problem and provided the traces, which allowed to identify the root
> > > >   cause.
> > > >
> > > >   Problem solution: Prevent idle task boosting
> > >
> > > Maybe someone remember me whining about troubles with 2.6.21-rt2..18
> > > on my Core2 T7200 laptop (fujitsu-siemens amilo i1520).
> > >
> > > Although I'm still with my fingers crossed, I can tell the good
> > > news are that 2.6.21.5-rt19 (and -rt20) does behave far better now
> > > on the very same box.
> >
> > Yes, it works much better indeed...
> >
> > Ingo: is there a place where I can read about the changes in different
> > rtxx releases? What is new/better/fixed in rt20? (I see scheduler
> > stuff in a diff from rt19 to rt20 but I don't really know what it
> > means).
>
> and rt18 was a -rt-only NOHZ fix, that bug got introduced in rt11 when
> CFS was merged.
>
> i _think_ Rui might have seen two separate problems. Perhaps by the time
> we fixed the first problem (which Rui saw since -rt2) we introduced the
> other one via -rt11 - which then got fixed in -rt19.

Ahh, CFS is now part of rt, I was obviously not paying attention... I'm
really trying to provide a "stable" rt kernel for audio usage and
including another subsystem into rt is - IMHO - not going to help.
What's the chance of splitting things?

> btw., we'd love to get more feedback regarding CFS. CFS is a completely
> new scheduler for Linux.

Then I'd rather have it separate from rt.

> It has a design centered around keeping
> application latencies down, so it is ultimately real-time friendly, and
> it should also make things work better for desktop-ish and audio-ish
> stuff as well. (even under SCHED_OTHER)

Maybe this is CFS related? (tail of a thread in the Planet CCRMA mailing
list):

On Sun, 2007-07-08 at 15:26 -0400, Hector Centeno wrote:
> Ok, so just to confirm, that 2.6.21-0182.rt19.1.fc7.ccrmart works fine
> on my desktop but on my laptop it makes Firefox and Tomboy to crash.
> On the same laptop using 2.6.21-0182.rt17.1.fc7.ccrmart there is no
> problem.
>
> Cheers,
>
> Hector
>
>
> On 7/7/07, Hector Centeno <[EMAIL PROTECTED]> wrote:
>         Hi Fernando,
>
>         I do have Flash installed but for me Firefox crashes when trying to
>         access gmail (which AFAIK doesn't use Flash, does it?). Right now
>         Firefox is frozen and I'm typing this email using Konqueror (in
>         Gnome). This is ps' output:
>
>         hector  3595  1.1  2.2 194352 46336 ?  D  16:25  0:03 /usr/lib/firefox-2.0.0.4/firefox-bin
>
>         I think the problem is not present in my Desktop but I have to double
>         check. In the same laptop using the stock fedora kernel both Tomboy
>         and Firefox work fine. My laptop has a centrino duo processor, 2 gigs
>         of ram and the Intel GMA950 graphics chip.
>
>         Hector

I managed to completely hang firefox (fc7) with flash 9 installed
(unkillable even with -9). Does not seem to happen with flash 7. Have
not tried yet with gmail and flash uninstalled. I'll try to strace it to
see when/why it hangs.

-- Fernando

> So it would be nice if you could keep an extra eye on any scheduling
> artifacts or regressions, and make sure your favorite workload is still
> handled by the Linux scheduler in the utmost best way. I'd like to hear
> about any sort of "scheduling behavior / interactivity" regression you
> might see, relative to the vanilla kernel. Or if you can see no such
> problems then a line of "it works as well as the previous scheduler" is
> important info to us too. Thanks!