slab error in verify_redzone_free(): cache `radix_tree_node': memory outside object was overwritten

2012-11-14 Thread Soeren Sonnenburg
Hi there!

I am on a Core i7 system with an Intel bl67 board and it keeps oopsing on
me. On 3.2.33 I get the slab errors below; on 3.6.6 I get RCU errors (though
the RCU stress test didn't show anything) or traces that include cpuidle / apic.

Does anyone have an idea what that could be? The system is just running
a plain console and some disk i/o is going on all the time.

Thanks,
Soeren

slab error in verify_redzone_free(): cache `radix_tree_node': memory outside object was overwritten
Pid: 0, comm: swapper/3 Not tainted 3.2.33 #1
Call Trace:
   [] ? __slab_error.isra.53+0x1b/0x30
 [] ? cache_free_debugcheck+0x27e/0x280
 [] ? __rcu_process_callbacks+0x174/0x390
 [] ? kmem_cache_free+0x5b/0x1e0
 [] ? __rcu_process_callbacks+0x174/0x390
 [] ? __do_softirq+0x95/0x120
 [] ? lapic_next_event+0x18/0x20
 [] ? clockevents_program_event+0x6f/0x110
 [] ? call_softirq+0x1c/0x30
 [] ? do_softirq+0x65/0xa0
 [] ? irq_exit+0x8e/0xb0
 [] ? smp_apic_timer_interrupt+0x68/0xa0
 [] ? apic_timer_interrupt+0x6e/0x80
   [] ? intel_idle+0xed/0x160
 [] ? intel_idle+0xcb/0x160
 [] ? cpuidle_idle_call+0x8b/0x100
 [] ? cpu_idle+0x6a/0xf0
8801ba9366d8: redzone 1:0xd84156c5635688c0, redzone 2:0xf14156c5635688c0.
slab error in verify_redzone_free(): cache `radix_tree_node': memory outside object was overwritten
Pid: 16, comm: ksoftirqd/3 Not tainted 3.2.33 #1
Call Trace:
 [] ? __slab_error.isra.53+0x1b/0x30
 [] ? cache_free_debugcheck+0x27e/0x280
 [] ? __rcu_process_callbacks+0x174/0x390
 [] ? kmem_cache_free+0x5b/0x1e0
 [] ? __rcu_process_callbacks+0x174/0x390
 [] ? __do_softirq+0x95/0x120
 [] ? run_ksoftirqd+0x10a/0x230
 [] ? __do_softirq+0x120/0x120
 [] ? __do_softirq+0x120/0x120
 [] ? kthread+0x7e/0x90
 [] ? kernel_thread_helper+0x4/0x10
 [] ? kthread_worker_fn+0x180/0x180
 [] ? gs_change+0x13/0x13
8801ba936248: redzone 1:0xd84156c5635688c0, redzone 2:0x964156c5635688c0.
slab error in verify_redzone_free(): cache `radix_tree_node': memory outside object was overwritten
Pid: 0, comm: swapper/3 Not tainted 3.2.33 #1
Call Trace:
   [] ? __slab_error.isra.53+0x1b/0x30
 [] ? cache_free_debugcheck+0x27e/0x280
 [] ? __rcu_process_callbacks+0x174/0x390
 [] ? kmem_cache_free+0x5b/0x1e0
 [] ? __rcu_process_callbacks+0x174/0x390
 [] ? __do_softirq+0x95/0x120
 [] ? lapic_next_event+0x18/0x20
 [] ? clockevents_program_event+0x6f/0x110
 [] ? call_softirq+0x1c/0x30
 [] ? do_softirq+0x65/0xa0
 [] ? irq_exit+0x8e/0xb0
 [] ? smp_apic_timer_interrupt+0x68/0xa0
 [] ? apic_timer_interrupt+0x6e/0x80
   [] ? intel_idle+0xed/0x160
 [] ? intel_idle+0xcb/0x160
 [] ? cpuidle_idle_call+0x8b/0x100
 [] ? cpu_idle+0x6a/0xf0
8801ba9366d8: redzone 1:0xd84156c5635688c0, redzone 2:0xf04156c5635688c0.
slab error in verify_redzone_free(): cache `radix_tree_node': memory outside object was overwritten
Pid: 0, comm: swapper/3 Not tainted 3.2.33 #1
Call Trace:
   [] ? __slab_error.isra.53+0x1b/0x30
 [] ? cache_free_debugcheck+0x27e/0x280
 [] ? __rcu_process_callbacks+0x174/0x390
 [] ? kmem_cache_free+0x5b/0x1e0
 [] ? __rcu_process_callbacks+0x174/0x390
 [] ? __do_softirq+0x95/0x120
 [] ? lapic_next_event+0x18/0x20
 [] ? clockevents_program_event+0x6f/0x110
 [] ? call_softirq+0x1c/0x30
 [] ? do_softirq+0x65/0xa0
 [] ? irq_exit+0x8e/0xb0
 [] ? smp_apic_timer_interrupt+0x68/0xa0
 [] ? apic_timer_interrupt+0x6e/0x80
   [] ? intel_idle+0xed/0x160
 [] ? intel_idle+0xcb/0x160
 [] ? cpuidle_idle_call+0x8b/0x100
 [] ? cpu_idle+0x6a/0xf0
8801ba936248: redzone 1:0xd84156c5635688c0, redzone 2:0x964156c5635688c0.
Slab corruption: radix_tree_node start=8801ba936b70, len=560
Redzone: 0x9f911029d74e35b/0x9f911029d74e35b.
Last user: [](__rcu_process_callbacks+0x174/0x390)
090: 6b 6b 6b 6b 6b 6b 6b 00 6b 6b 6b 6b 6b 6b 6b 00  kkk.kkk.
0a0: 6b 6b 6b 6b 6b 6b 6b 00 6b 6b 6b 6b 6b 6b 6b 00  kkk.kkk.
0b0: 6b 6b 6b 6b 6b 6b 6b 00 6b 6b 6b 6b 6b 6b 6b 00  kkk.kkk.
0c0: 6b 6b 6b 6b 6b 6b 6b 00 6b 6b 6b 6b 6b 6b 6b 00  kkk.kkk.
Prev obj: start=8801ba936928, len=560
Redzone: 0xfd4156c5635688c0/0xd84156c5635688c0.
Last user: [](radix_tree_preload+0x66/0xf0)
000: 01 00 00 00 00 00 00 3c 00 00 00 00 00 00 00 b8  ...<
010: 00 00 00 00 00 00 00 19 00 00 00 00 00 00 00 00  
Next obj: start=8801ba936db8, len=560
Redzone: 0xd84156c5635688c0/0xd84156c5635688c0.
Last user: [](radix_tree_preload+0x66/0xf0)
000: 01 00 00 00 40 00 00 00 00 00 00 00 00 00 00 00  @...
010: 00 00 00 00 00 00 00 00 08 96 1a 0c 00 ea ff ff  
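
To see which bytes of a reported redzone were overwritten, the values can be
compared with the slab debugging constants from include/linux/poison.h
(RED_ACTIVE for an allocated object, RED_INACTIVE for a free one, 0x6b as the
fill byte for freed objects). What follows is only a minimal Python sketch
under that assumption; the helper names differing_bytes and
non_poison_offsets are made up for illustration and are not part of any
kernel tool.

```python
# Redzone and poison constants as defined in include/linux/poison.h.
RED_INACTIVE = 0x09F911029D74E35B  # redzone value while the object is free
RED_ACTIVE = 0xD84156C5635688C0    # redzone value while the object is allocated
POISON_FREE = 0x6B                 # fill byte written into freed objects


def differing_bytes(observed, expected=RED_ACTIVE):
    """Return (byte_index, observed_byte, expected_byte) for each differing byte.

    Byte index 0 is the least significant byte of the 64-bit word.
    (Hypothetical helper, for illustration only.)
    """
    diffs = []
    for i in range(8):
        o = (observed >> (8 * i)) & 0xFF
        e = (expected >> (8 * i)) & 0xFF
        if o != e:
            diffs.append((i, o, e))
    return diffs


def non_poison_offsets(hex_bytes, poison=POISON_FREE):
    """Return the offsets in one hex-dump row whose byte is not the poison value."""
    return [i for i, b in enumerate(bytes.fromhex(hex_bytes)) if b != poison]


if __name__ == "__main__":
    # "redzone 2" values copied from the report above.
    for value in (0xF14156C5635688C0, 0x964156C5635688C0, 0xF04156C5635688C0):
        print(hex(value), differing_bytes(value))
    # One row of the "Slab corruption" dump from the report above.
    print(non_poison_offsets("6b 6b 6b 6b 6b 6b 6b 00 6b 6b 6b 6b 6b 6b 6b 00"))
```

For the values in this report, both checks point at the most significant byte
of a 64-bit word: byte 7 of each redzone, and offsets 7 and 15 of a dump row.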

-- 
For the one fact about the future of which we can be certain is that it
will be utterly fantastic. -- Arthur C. Clarke, 1962



b43 regression: extremely slow wireless with BCM4322 802.11a/b/g/n

2012-08-27 Thread Soeren Sonnenburg
Hi,

Since kernel version 3.4, wireless has been extremely slow (download
rates of about 90k/sec when downloading a new kernel from kernel.org
instead of >1MB/s).

There is a thread on archlinux
https://bbs.archlinux.org/viewtopic.php?pid=1146085 where people observe
the very same. Kernel 3.3.X was the last known good one.

*Sometimes* download rates are OK but most of the time they are not.

$ lspci | grep Broad

04:00.0 Network controller: Broadcom Corporation BCM4322 802.11a/b/g/n Wireless LAN Controller (rev 01)


Any ideas?

Soeren


3.4.4 - kernel BUG at mm/slab.c:3073

2012-07-09 Thread Soeren Sonnenburg
Happens on an intel dh67bl - config attached, I would happily report
more details / try out things.

------------[ cut here ]------------
kernel BUG at mm/slab.c:3073!
invalid opcode:  [#1] PREEMPT SMP 
CPU 1 
Modules linked in:

Pid: 570, comm: kswapd0 Not tainted 3.4.4 #1  /DH67BL
RIP: 0010:[]  [] cache_free_debugcheck+0x1e6/0x280
RSP: 0018:88040de0f980  EFLAGS: 00010086
RAX: 0010 RBX: 88041f044580 RCX: 0800
RDX: 73ff8800ba9348a8 RSI: 8800ba9348a8 RDI: 88041f044580
RBP: 8800ba9348a8 R08: 8800b62c5d40 R09: 
R10:  R11:  R12: 8800ba934000
R13: 81173cbd R14: 09f911029d74e35b R15: 09f911029d74e35b
FS:  () GS:88041fa4() knlGS:
CS:  0010 DS:  ES:  CR0: 8005003b
CR2: ff600400 CR3: 0185f000 CR4: 000407e0
DR0:  DR1:  DR2: 
DR3:  DR6: 0ff0 DR7: 0400
Process kswapd0 (pid: 570, threadinfo 88040de0e000, task 88040f2ae090)
Stack:
 001f0001 88041f044580 0286 88040f0cc2b0
 8800ba9348b0 81173cbd 88040de0fb00 81141436
 00015430 ea0003f119f8 8800ba9348b0 88041ec03938
Call Trace:
 [] ? free_buffer_head+0x1d/0x70
 [] ? kmem_cache_free+0x66/0x200
 [] ? free_buffer_head+0x1d/0x70
 [] ? try_to_free_buffers+0x79/0xc0
 [] ? shrink_page_list+0x760/0x8d0
 [] ? get_parent_ip+0x9/0x20
 [] ? sub_preempt_count+0x87/0xb0
 [] ? update_isolated_counts+0x12e/0x170
 [] ? shrink_inactive_list+0x293/0x4c0
 [] ? shrink_mem_cgroup_zone+0x3f2/0x560
 [] ? get_parent_ip+0x9/0x20
 [] ? balance_pgdat+0x5ea/0x880
 [] ? get_parent_ip+0x9/0x20
 [] ? get_parent_ip+0x9/0x20
 [] ? kswapd+0x19e/0x310
 [] ? wake_up_bit+0x40/0x40
 [] ? balance_pgdat+0x880/0x880
 [] ? balance_pgdat+0x880/0x880
 [] ? kthread+0x9e/0xb0
 [] ? kernel_thread_helper+0x4/0x10
 [] ? kthread_freezable_should_stop+0x60/0x60
 [] ? gs_change+0x13/0x13
Code: 74 9d 02 11 f9 09 48 89 ee 4c 89 7c 05 f8 48 89 df 49 be 5b e3 74 9d 02 
11 f9 09 e8 d5 ee ff ff 4c 89 30 8b 43 14 e9 e5 fe ff ff <0f> 0b eb fe 0f 0b eb 
fe 0f 0b eb fe 48 8b 40 30 e9 b5 fe ff ff 
RIP  [] cache_free_debugcheck+0x1e6/0x280
 RSP 
---[ end trace fc9ef30b19cc00e3 ]---


config.gz
Description: GNU Zip compressed data



google chrome / chromium hangs on 3.5.0-rc6 but not 3.4.x - threading issues?

2012-07-08 Thread Soeren Sonnenburg
Hi,

there seems to be some weird interaction between the latest (current
git!) Linux kernel and recent chrome/chromium releases (>= 2X.X).

Basically, chrome fails to open various web sites like https://github.com
and just hangs (waiting). This problem does not occur with older kernel
versions (e.g. Linux 3.4.4).

Running chrome single-threaded works fine. A gdb backtrace of such a
hang and some more analysis can be found under

https://code.google.com/p/chromium/issues/detail?id=136258
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=679827

It should be noted that chrome 18.X still works with the new
kernel. Also, klogd seems to have trouble with the newer kernel, using
100% of the CPU, which it does not do with the older ones. I don't
know whether the two problems are related.

Soeren


Re: 2.6.25-current-git hangs on boot

2008-02-24 Thread Soeren Sonnenburg
On Sun, 2008-02-24 at 12:18 +0100, Rafael J. Wysocki wrote:
> On Sunday, 24 of February 2008, Soeren Sonnenburg wrote:
> > On Sat, 2008-02-23 at 20:00 +0100, Oliver Pinter wrote:
> > > the pci=nommconf kernel parameter helped it?
> > 
> > yes indeed, this switch reliably helps to overcome the hang at *this
> > stage* (I tried booting both with the switch and w/o).
> > 
> > however with 50% chance I still see a hang directly after
> > 
> > cpuidle: using governor ladder
> 
> Do you have CONFIG_CPU_IDLE set?  If you have, please try to unset it and
> retest.

Yes I had. When disabling that option booting 10 times in a row worked
without problems.
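
Whether an option such as CONFIG_CPU_IDLE is set can be checked in the build
tree's .config (or in /proc/config.gz, if the running kernel exposes it). The
following is a minimal Python sketch, assuming the usual .config text format;
config_state is a hypothetical helper name and the sample text is made up for
illustration.

```python
import re


def config_state(config_text, option):
    """Return the value of a kernel config option ('y', 'm', a literal),
    or None when the option is absent or marked '# ... is not set'.
    (Hypothetical helper, for illustration only.)
    """
    pattern = re.compile(r"^%s=(.*)$" % re.escape(option), re.MULTILINE)
    match = pattern.search(config_text)
    return match.group(1) if match else None


# Made-up sample in .config format, not taken from the report.
SAMPLE_CONFIG = """\
CONFIG_SMP=y
# CONFIG_CPU_IDLE is not set
CONFIG_GROUP_SCHED=y
"""

if __name__ == "__main__":
    for name in ("CONFIG_SMP", "CONFIG_CPU_IDLE"):
        print(name, config_state(SAMPLE_CONFIG, name))
```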

Soeren
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.25-current-git hangs on boot

2008-02-23 Thread Soeren Sonnenburg
On Sat, 2008-02-23 at 20:00 +0100, Oliver Pinter wrote:
> the pci=nommconf kernel parameter helped it?

yes indeed, this switch reliably helps to overcome the hang at *this
stage* (I tried booting both with the switch and w/o).

however with 50% chance I still see a hang directly after

cpuidle: using governor ladder

note that I've never seen these hangs on 2.6.24* ...

Soeren

> On 2/23/08, Soeren Sonnenburg <[EMAIL PROTECTED]> wrote:
> > Hi,
> >
> > trying out newest git, I see a hang with
> > ACPI: SSDT 7feb9c10, 02ae (r1 APPLE CPU0Ist 3000 intl 20050309)
> > ACPI: SSDT 7feb9910, 02c3 (r1 APPLE CPU0Cst 3001 intl 20050309)
> > ...
> > ACPI: Processor [CPU0] (supports 8 throttling states)
> > ACPI: SSDT 7feb9f10, 0087 (r1 APPLE CPU1Ist 3000 intl 20050309)
> > ACPI: SSDT 7feb8f10, 0085 (r1 APPLE CPU1Cst 3001 intl 20050309)
> > ...
> > ACPI: ACPI0007:01 is registered as cooling_device1
> > ACPI: Processor [CPU1] (supports 8 throttling states)
> >
> > as the last message...
> >
> > Any ideas?
> > Soeren
> >
> 
> 


2.6.25-current-git hangs on boot

2008-02-23 Thread Soeren Sonnenburg
Hi,

trying out newest git, I see a hang with 
ACPI: SSDT 7feb9c10, 02ae (r1 APPLE CPU0Ist 3000 intl 20050309)
ACPI: SSDT 7feb9910, 02c3 (r1 APPLE CPU0Cst 3001 intl 20050309)
...
ACPI: Processor [CPU0] (supports 8 throttling states)
ACPI: SSDT 7feb9f10, 0087 (r1 APPLE CPU1Ist 3000 intl 20050309)
ACPI: SSDT 7feb8f10, 0085 (r1 APPLE CPU1Cst 3001 intl 20050309)
...
ACPI: ACPI0007:01 is registered as cooling_device1
ACPI: Processor [CPU1] (supports 8 throttling states)

as the last message...

Any ideas?
Soeren


2.6.25-git-current several Section mismatch in reference from ...

2008-02-22 Thread Soeren Sonnenburg
compiling with

make CONFIG_DEBUG_SECTION_MISMATCH=y

I see the following warnings:

  CC  kernel/stacktrace.o
  CC  kernel/irq/handle.o
  LD  mm/built-in.o
WARNING: mm/built-in.o(.meminit.text+0x89e): Section mismatch in reference from 
the function free_area_init_core() to the function .init.text:setup_usemap()
The function __meminit free_area_init_core() references
a function __init setup_usemap().
If setup_usemap is only used by free_area_init_core then
annotate setup_usemap with a matching annotation.

WARNING: mm/built-in.o(.data+0x10a4): Section mismatch in reference from the 
variable cpu_callback_nb.23582 to the function .devinit.text:cpu_callback()
The variable cpu_callback_nb.23582 references
the function __devinit cpu_callback()
If the reference is valid then annotate the
variable with __init* (see linux/init.h) or name the variable:
*driver, *_template, *_timer, *_sht, *_ops, *_probe, *_probe_one, *_console, 

  CC  ipc/util.o
  CC  fs/mpage.o
[...]
  LDS arch/x86/kernel/vmlinux.lds
  CC  kernel/power/swsusp.o
  CC  fs/nfsctl.o
  LD  arch/x86/kernel/built-in.o
WARNING: arch/x86/kernel/built-in.o(.text+0x13277): Section mismatch in 
reference from the function cpu_exit_clear() to the function 
.cpuinit.text:cpu_uninit()
The function cpu_exit_clear() references
the function __cpuinit cpu_uninit().
This is often because cpu_exit_clear lacks a __cpuinit 
annotation or the annotation of cpu_uninit is wrong.

  LD  security/built-in.o
[...]
  CC  kernel/latencytop.o
  CC [M]  fs/cifs/dir.o
  CC  block/genhd.o
  LD  kernel/built-in.o
fs/cifs/dir.c: In function ‘cifs_ci_compare’:
fs/cifs/dir.c:594: warning: passing argument 1 of ‘__constant_memcpy’ discards 
qualifiers from pointer target type
fs/cifs/dir.c:594: warning: passing argument 1 of ‘__memcpy’ discards 
qualifiers from pointer target type
fs/cifs/dir.c:594: warning: passing argument 1 of ‘__constant_memcpy’ discards 
qualifiers from pointer target type
fs/cifs/dir.c:594: warning: passing argument 1 of ‘__memcpy’ discards 
qualifiers from pointer target type
WARNING: kernel/built-in.o(.text+0x30269): Section mismatch in reference from 
the function take_cpu_down() to the variable .cpuinit.data:cpu_chain
The function take_cpu_down() references
the variable __cpuinitdata cpu_chain.
This is often because take_cpu_down lacks a __cpuinitdata 
annotation or the annotation of cpu_chain is wrong.

WARNING: kernel/built-in.o(.text+0x302f7): Section mismatch in reference from 
the function _cpu_down() to the variable .cpuinit.data:cpu_chain
The function _cpu_down() references
the variable __cpuinitdata cpu_chain.
This is often because _cpu_down lacks a __cpuinitdata 
annotation or the annotation of cpu_chain is wrong.

WARNING: kernel/built-in.o(.text+0x3036d): Section mismatch in reference from 
the function _cpu_down() to the variable .cpuinit.data:cpu_chain
The function _cpu_down() references
the variable __cpuinitdata cpu_chain.
This is often because _cpu_down lacks a __cpuinitdata 
annotation or the annotation of cpu_chain is wrong.

WARNING: kernel/built-in.o(.text+0x303cf): Section mismatch in reference from 
the function _cpu_down() to the variable .cpuinit.data:cpu_chain
The function _cpu_down() references
the variable __cpuinitdata cpu_chain.
This is often because _cpu_down lacks a __cpuinitdata 
annotation or the annotation of cpu_chain is wrong.

WARNING: kernel/built-in.o(.text+0x303fb): Section mismatch in reference from 
the function _cpu_down() to the variable .cpuinit.data:cpu_chain
The function _cpu_down() references
the variable __cpuinitdata cpu_chain.
This is often because _cpu_down lacks a __cpuinitdata 
annotation or the annotation of cpu_chain is wrong.

WARNING: kernel/built-in.o(.text+0x3050e): Section mismatch in reference from 
the function unregister_cpu_notifier() to the variable .cpuinit.data:cpu_chain
The function unregister_cpu_notifier() references
the variable __cpuinitdata cpu_chain.
This is often because unregister_cpu_notifier lacks a __cpuinitdata 
annotation or the annotation of cpu_chain is wrong.

WARNING: kernel/built-in.o(.data+0x270): Section mismatch in reference from the 
variable profile_cpu_callback_nb.19213 to the function 
.devinit.text:profile_cpu_callback()
The variable profile_cpu_callback_nb.19213 references
the function __devinit profile_cpu_callback()
If the reference is valid then annotate the
variable with __init* (see linux/init.h) or name the variable:
*driver, *_template, *_timer, *_sht, *_ops, *_probe, *_probe_one, *_console, 

WARNING: kernel/built-in.o(.data+0x1b48): Section mismatch in reference from 
the variable workqueue_cpu_callback_nb.13434 to the function 
.devinit.text:workqueue_cpu_callback()
The variable workqueue_cpu_callback_nb.13434 references
the function __devinit workqueue_cpu_callback()
If the reference is valid then annotate the
variable with __init* (see linux/init.h) or name the variable:
*driver, *_template, *_timer,
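
When a build prints many of these warnings, it can help to group them by the
symbol being referenced. The following is a minimal Python sketch, assuming
the message wording shown above; the regular expression and the function
name parse_mismatches are illustrative and not part of the kernel build
system.

```python
import re
from collections import Counter

# Matches the first sentence of a "Section mismatch" warning as worded above.
MISMATCH = re.compile(
    r"WARNING: (?P<obj>\S+)\((?P<section>[^)+]+)\+0x[0-9a-f]+\): "
    r"Section mismatch in reference from the (?P<src_kind>function|variable) "
    r"(?P<src>\S+) to the (?P<dst_kind>function|variable) (?P<dst>\S+)"
)


def parse_mismatches(log_text):
    """Return one dict per warning found in the build log.
    (Hypothetical helper, for illustration only.)
    """
    # Collapse whitespace so warnings wrapped across lines still match.
    flat = " ".join(log_text.split())
    return [m.groupdict() for m in MISMATCH.finditer(flat)]


# Two warnings copied from the build output above, including the line wraps.
SAMPLE_LOG = """\
WARNING: kernel/built-in.o(.text+0x30269): Section mismatch in reference from
the function take_cpu_down() to the variable .cpuinit.data:cpu_chain
The function take_cpu_down() references
WARNING: kernel/built-in.o(.text+0x302f7): Section mismatch in reference from
the function _cpu_down() to the variable .cpuinit.data:cpu_chain
"""

if __name__ == "__main__":
    found = parse_mismatches(SAMPLE_LOG)
    # Count how often each target symbol is referenced.
    for target, count in Counter(m["dst"] for m in found).items():
        print(count, target)
```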

Re: 2.6.25-rc2 regression - hang on suspend

2008-02-22 Thread Soeren Sonnenburg
On Fri, 2008-02-22 at 16:56 +0100, Rafael J. Wysocki wrote:
> On Friday, 22 of February 2008, Soeren Sonnenburg wrote:
> > On Fri, 2008-02-22 at 00:06 +0100, Rafael J. Wysocki wrote: 
> > > On Thursday, 21 of February 2008, Soeren Sonnenburg wrote:
> > > > On Thu, 2008-02-21 at 01:31 +0100, Rafael J. Wysocki wrote:
> > > > > On Wednesday, 20 of February 2008, Soeren Sonnenburg wrote:
> > > > > > On Wed, 2008-02-20 at 00:50 +0100, Rafael J. Wysocki wrote:
> > [...] 
> > Also when compiling these many kernels via make -j4 I noted that I could
> > hardly move the mouse / use the keyboard, but saw random jumps and
> > key-repetitions...
> 
> This last bit is most likely a scheduler issue.  Do you have CONFIG_GROUP_SCHED
> set by chance?  If you do, please try to unset it and see if that helps.

Yes I had. Disabling this helped a lot -- the kernel seems to behave
normally with this option unset.

Soeren


Re: 2.6.25-rc2 regression - hang on suspend

2008-02-21 Thread Soeren Sonnenburg
On Fri, 2008-02-22 at 00:06 +0100, Rafael J. Wysocki wrote: 
> On Thursday, 21 of February 2008, Soeren Sonnenburg wrote:
> > On Thu, 2008-02-21 at 01:31 +0100, Rafael J. Wysocki wrote:
> > > On Wednesday, 20 of February 2008, Soeren Sonnenburg wrote:
> > > > On Wed, 2008-02-20 at 00:50 +0100, Rafael J. Wysocki wrote:
[...] 
> > Using echo none >/sys/power/pm_test and then
> > echo mem >/sys/power/state I see it hang on ata1 errors again. Waiting
> > about 10-30 seconds it progresses further and finally arrives at 
> > 
> > CPU0 attaching NULL sched-domain
> > CPU1 attaching NULL sched-domain
> > 
> > then hangs.
> 
> Please see if compiling the kernel with CONFIG_SMP unset makes suspend
> work.

*Argh*, this bug is not behaving nicely :( Whatever happened,
git-current now suspends correctly with and without CONFIG_SMP, and in all
my CONFIG_PREEMPT_RCU=y and CONFIG_CLASSIC_RCU=y attempts. Also, no more
SATA errors.

However it is not reliably waking up (at least when all of the above
except CLASSIC_RCU is on). Sometimes the display remains black on the
console, but X still works and sometimes it hangs completely on resume.

Also when compiling these many kernels via make -j4 I noted that I could
hardly move the mouse / use the keyboard, but saw random jumps and
key-repetitions...

Soeren


Re: 2.6.25-rc2 regression - hang on suspend

2008-02-21 Thread Soeren Sonnenburg
On Thu, 2008-02-21 at 01:31 +0100, Rafael J. Wysocki wrote:
> On Wednesday, 20 of February 2008, Soeren Sonnenburg wrote:
> > On Wed, 2008-02-20 at 00:50 +0100, Rafael J. Wysocki wrote:
> > > On Tuesday, 19 of February 2008, Soeren Sonnenburg wrote:
> > > > On Tue, 2008-02-19 at 22:06 +0100, Rafael J. Wysocki wrote:
> > > > > On Tuesday, 19 of February 2008, Soeren Sonnenburg wrote:
> > > > > > Hi,
> > > > > 
> > > > > Hi,
> > > > > 
> > > > > > since 2.6.25-rc1 (first version I tried) and still in rc2 (and git), I
> > > > > > see a hang on s2ram already when trying to suspend.
> > > > > 
> > > > > Does it work with 2.6.24?
> > > > 
> > > > yes.
> > > 
> > > Please take the current mainline (there are a couple of nasty bugs fixed in
> > > it), configure it with CONFIG_PM_DEBUG set, boot it with "no_console_suspend",
> > > run
> > > 
> > > # echo 8 > /proc/sys/kernel/printk
> > > # echo devices > /sys/power/pm_test
> > > # echo mem > /sys/power/state
> > > 
> > > If it hangs, it should leave a stack trace before and I need that trace to see
> > > what's going on.  If it doesn't hang, I'll tell you what to do next.
> > 
> > I tried with 2.6.24.2 with CONFIG_PM_DEBUG set, following your steps and
> > yes it works flawlessly (though the display did not come back I could
> > suspend/resume multiple times without problems, and finally s2ram -f -p
> > brought the display back).
> 
> Hm, there's no /sys/power/pm_test in 2.6.24.2 (and the "current mainline"
> means the latest -git kernel possible or the current top of the Linus' tree),
> so in fact you tested 2.6.24 again, that is known to work ...

Great :( 

Anyway testing linus' git-current, I see that it does not hang. However
after the echo mem >/sys/power/state I am seeing:
[...]
PM: Finishing wakeup.
Restarting tasks ... done.
ata1: port is slow to respond, please be patient (Status 0x80)
ata1: device not ready (errno=-16), forcing hardreset


and this a couple of times. Using echo none >/sys/power/pm_test and then
echo mem >/sys/power/state I see it hang on ata1 errors again. Waiting
about 10-30 seconds it progresses further and finally arrives at 

CPU0 attaching NULL sched-domain
CPU1 attaching NULL sched-domain

then hangs.
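
The test sequence used in this thread (raise the console log level, pick a
pm_test level, then request suspend-to-RAM) can be wrapped in a small helper.
The following is a minimal Python sketch, assuming a kernel built with
CONFIG_PM_DEBUG so that /sys/power/pm_test exists; the function name and the
root parameter are hypothetical, and root is there only so the helper can be
tried against a scratch directory.

```python
import os

# Levels accepted by /sys/power/pm_test according to the kernel PM debugging docs.
PM_TEST_LEVELS = ("none", "core", "processors", "platform", "devices", "freezer")


def write_value(path, value):
    """Write a single value followed by a newline, like `echo value > path`."""
    with open(path, "w") as handle:
        handle.write(value + "\n")


def suspend_with_pm_test(level, root="/"):
    """Select a pm_test level and request suspend-to-RAM.

    On a real system `root` stays "/" and the caller must be root.
    (Hypothetical helper, for illustration only.)
    """
    if level not in PM_TEST_LEVELS:
        raise ValueError("unknown pm_test level: %r" % level)
    # Same three writes as the echo commands quoted in this thread.
    write_value(os.path.join(root, "proc/sys/kernel/printk"), "8")
    write_value(os.path.join(root, "sys/power/pm_test"), level)
    write_value(os.path.join(root, "sys/power/state"), "mem")
```

Calling suspend_with_pm_test("devices") corresponds to the first test
requested above, and suspend_with_pm_test("none") to a full suspend.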

So what next?
Soeren


Re: 2.6.25-rc2 regression - hang on suspend

2008-02-19 Thread Soeren Sonnenburg
On Wed, 2008-02-20 at 00:50 +0100, Rafael J. Wysocki wrote:
> On Tuesday, 19 of February 2008, Soeren Sonnenburg wrote:
> > On Tue, 2008-02-19 at 22:06 +0100, Rafael J. Wysocki wrote:
> > > On Tuesday, 19 of February 2008, Soeren Sonnenburg wrote:
> > > > Hi,
> > > 
> > > Hi,
> > > 
> > > > since 2.6.25-rc1 (first version I tried) and still in rc2 (and git), I
> > > > see a hang on s2ram already when trying to suspend.
> > > 
> > > Does it work with 2.6.24?
> > 
> > yes.
> 
> Please take the current mainline (there are a couple of nasty bugs fixed in
> it), configure it with CONFIG_PM_DEBUG set, boot it with "no_console_suspend",
> run
> 
> # echo 8 > /proc/sys/kernel/printk
> # echo devices > /sys/power/pm_test
> # echo mem > /sys/power/state
> 
> If it hangs, it should leave a stack trace before and I need that trace to see
> what's going on.  If it doesn't hang, I'll tell you what to do next.

I tried with 2.6.24.2 with CONFIG_PM_DEBUG set, following your steps and
yes it works flawlessly (though the display did not come back I could
suspend/resume multiple times without problems, and finally s2ram -f -p
brought the display back).

So what next?

Soeren


Re: 2.6.25-rc2 regression - hang on suspend

2008-02-19 Thread Soeren Sonnenburg
On Tue, 2008-02-19 at 22:06 +0100, Rafael J. Wysocki wrote:
> On Tuesday, 19 of February 2008, Soeren Sonnenburg wrote:
> > Hi,
> 
> Hi,
> 
> > since 2.6.25-rc1 (first version I tried) and still in rc2 (and git), I
> > see a hang on s2ram already when trying to suspend.
> 
> Does it work with 2.6.24?

yes.

Soeren


2.6.25-rc2 regression - hang on suspend

2008-02-19 Thread Soeren Sonnenburg
Hi,

since 2.6.25-rc1 (first version I tried) and still in rc2 (and git), I
see a hang on s2ram already when trying to suspend.

This is on a macbookpro 1,1 - what steps should I take next to help
isolate the problem?

Soeren


Re: 24rc8: unregister_netdevice: waiting for ... to become free. Usage count = 1?

2008-01-29 Thread Soeren Sonnenburg
On Tue, 2008-01-22 at 22:44 -0800, David Miller wrote:
> From: Soeren Sonnenburg <[EMAIL PROTECTED]>
> Date: Wed, 23 Jan 2008 07:42:21 +0100
> 
> > Dear all,
> > 
> > since some 2.6.24rc version I suddenly experience such messages on
> > console when trying to shutdown a vpn connection:
> > 
> > unregister_netdevice: waiting for tun0 to become free. Usage count = 1
> > 
> > or when removing an usb wlan dongle (although it was ifconfig wlan0
> > down'd before)
> 
> Current GIT already has a fix for this, attached below:

hmmhhh, I am still seeing this problem on 2.6.24 with at least the
madwifi driver...

Soeren


Re: 24rc8: unregister_netdevice: waiting for ... to become free. Usage count = 1?

2008-01-22 Thread Soeren Sonnenburg
On Tue, 2008-01-22 at 22:44 -0800, David Miller wrote:
> From: Soeren Sonnenburg <[EMAIL PROTECTED]>
> Date: Wed, 23 Jan 2008 07:42:21 +0100
> 
> > Dear all,
> > 
> > since some 2.6.24rc version I suddenly experience such messages on
> > console when trying to shutdown a vpn connection:
> > 
> > unregister_netdevice: waiting for tun0 to become free. Usage count = 1
> > 
> > or when removing an usb wlan dongle (although it was ifconfig wlan0
> > down'd before)
> 
> Current GIT already has a fix for this, attached below:

Thank you very much for pointing this out!

git pull ; make ; ...
Soeren


24rc8: unregister_netdevice: waiting for ... to become free. Usage count = 1?

2008-01-22 Thread Soeren Sonnenburg
Dear all,

since some 2.6.24rc version I suddenly see messages like this on the
console when trying to shut down a vpn connection:

unregister_netdevice: waiting for tun0 to become free. Usage count = 1

or when removing a USB wlan dongle (although it was ifconfig wlan0
down'd before)

unregister_netdevice: waiting for wlan0 to become free. Usage count = 1

Then only when all potential connections going over that iface are gone
these messages disappear (sometimes this does not happen and the kernel
then hangs on reboot...)
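
To see which interfaces are still being waited on, the messages can be
pulled out of a saved kernel log. The following is a minimal Python sketch,
assuming the message wording shown above; stuck_devices is a hypothetical
helper name.

```python
import re

# Matches the kernel message quoted above.
WAITING = re.compile(
    r"unregister_netdevice: waiting for (?P<dev>\S+) to become free\. "
    r"Usage count = (?P<count>-?\d+)"
)


def stuck_devices(log_text):
    """Map each device name to the last usage count reported for it.
    (Hypothetical helper, for illustration only.)
    """
    return {m.group("dev"): int(m.group("count")) for m in WAITING.finditer(log_text)}


# The two messages from this report.
SAMPLE_LOG = """\
unregister_netdevice: waiting for tun0 to become free. Usage count = 1
unregister_netdevice: waiting for wlan0 to become free. Usage count = 1
"""

if __name__ == "__main__":
    for dev, count in stuck_devices(SAMPLE_LOG).items():
        print(dev, count)
```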

Is this intended?

Soeren


Re: 2.6.24-rc8 oops ext3_clear_inode+0x25/0xa0

2008-01-19 Thread Soeren Sonnenburg
On Sat, 2008-01-19 at 22:00 -0600, Eric Sandeen wrote:
> Soeren Sonnenburg wrote:
> > Dear all,
> > 
> > I've just got this oops (causing the machine to hang finally)...
> > 
> > Any ideas?
> > Soeren
> 
> I've seen an awful lot of oopses out there on this path,
> kswapd->shrink_icache_memory; some get a little further and oops in
> ext3_discard_reservation.
> 
> A few were chalked up to bad memory, but others were not.  Do you happen
> to use suspend/resume?

Indeed, I suspended/resumed this machine a couple of times before seeing
this... And indeed it sometimes (on high activity, i.e. network/cpu/disk
load as happens when backups are done) oopses/freezes - but only when I
have suspended at least once... 

So I am quite confident it is not the memory - but yes if something
corrupts memory on a suspend/resume cycle on this macbookpro1,1 then the
effect could be the same :(

> Thanks to kerneloops.org... :)
> 
> All code
> 
>    0: 12 11              adc    (%ecx),%dl
>    2: f8                 clc
>    3: ff 66 90           jmp    *0xff90(%esi)
>    6: 83 ec 0c           sub    $0xc,%esp
>    9: 89 1c 24           mov    %ebx,(%esp)
>    c: 8d 98 60 ff ff ff  lea    0xff60(%eax),%ebx
>   12: 89 74 24 04        mov    %esi,0x4(%esp)
>   16: 89 c6              mov    %eax,%esi
>   18: 89 7c 24 08        mov    %edi,0x8(%esp)
>   1c: 8b 53 70           mov    0x70(%ebx),%edx
>   1f: 8b 7b 54           mov    0x54(%ebx),%edi
>   22: 85 d2              test   %edx,%edx
>   24: 74 16              je     0x3c
>   26: 83 fa ff           cmp    $0x,%edx
>   29: 74 11              je     0x3c
>   2b:*f0 ff 0a           lock decl (%edx) <-- trapping instruction
> 
> Looks like it blew up in (inlined) posix_acl_release(), I think
> EXT3_I(inode)->i_acl passed to it was 66e88e66, in %edx.
> 
> I think %edi is the i_block_alloc_info, 0f01c883, which also looks
> crunchy.  Use after free perhaps?
> 
> > BUG: unable to handle kernel paging request at virtual address 66e88e66
> 
> Nice symmetric number, anyway.  :)
> 
> I've seen enough of these now, something real seems to be going on but I
> don't know what yet.

And I unfortunately have no idea how to trace this down further/how to
help you with this...

Soeren

> -Eric
> 
> > printing eip: c01fac85 *pde =  
> > Oops: 0002 [#1] PREEMPT SMP 
> > Modules linked in: hci_usb hidp rfcomm l2cap bluetooth tun cpufreq_stats 
> > coretemp xfrm_user xfrm4_tunnel tunnel4 ipcomp esp4 ah4 aes_generic hfsplus 
> > binfmt_misc fuse ebtable_broute bridge llc ebtable_nat ebtable_filter 
> > ebtables eeprom applesmc hwmon input_polldev snd_hda_intel snd_pcm_oss 
> > snd_mixer_oss snd_pcm snd_timer appletouch evdev i2c_i801 snd soundcore 
> > snd_page_alloc sky2 video intel_agp output agpgart
> > 
> > Pid: 205, comm: kswapd0 Not tainted (2.6.24-rc8-sonne #7)
> > EIP: 0060:[] EFLAGS: 00010213 CPU: 1
> > EIP is at ext3_clear_inode+0x25/0xa0
> > EAX: c008f0a0 EBX: c008f000 ECX:  EDX: 66e88e66
> > ESI: c008f0a0 EDI: 0f01c883 EBP: 004d ESP: f7d29ebc
> >  DS: 007b ES: 007b FS: 00d8 GS:  SS: 0068
> > Process kswapd0 (pid: 205, ti=f7d28000 task=f7fdf540 task.ti=f7d28000)
> > Stack: c008f0a0  f7d29ef8 c0192d62 004d c008f0a0 c008f0a8 
> > c019309a 
> >e98b9ac8 0080 0080 f7d29ef8 c01932ec  0080 
> > c008f2b0 
> >ea1bdcd8 0002d438 013f c04ac24c 00d0 c0166e4c 2e0b 
> >  
> > Call Trace:
> >  [] clear_inode+0x62/0x140
> >  [] dispose_list+0x1a/0xe0
> >  [] shrink_icache_memory+0x18c/0x250
> >  [] shrink_slab+0x12c/0x1a0
> >  [] kswapd+0x32d/0x4d0
> >  [] autoremove_wake_function+0x0/0x40
> >  [] complete+0x3d/0x60
> >  [] kswapd+0x0/0x4d0
> >  [] kthread+0x42/0x70
> >  [] kthread+0x0/0x70
> >  [] kernel_thread_helper+0x7/0x14
> >  ===
> > Code: 12 11 f8 ff 66 90 83 ec 0c 89 1c 24 8d 98 60 ff ff ff 89 74 24 04 89 
> > c6 89 7c 24 08 8b 53 70 8b 7b 54 85 d2 74 16 83 fa ff 74 11  ff 0a 0f 
> > 94 c0 84 c0 75 51 c7 43 70 ff ff ff ff 8b 53 74 85 
> > EIP: [] ext3_clear_inode+0x25/0xa0 SS:ESP 0068:f7d29ebc
> > ---[ end trace 8dd028de7ae6e34e ]---
> > 
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> > the body of a message to [EMAIL PROTECTED]
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > Please read the FAQ at  http://www.tux.org/lkml/
> > 
> 


2.6.24-rc8 oops ext3_clear_inode+0x25/0xa0

2008-01-18 Thread Soeren Sonnenburg
Dear all,

I've just got this oops (causing the machine to hang finally)...

Any ideas?
Soeren

BUG: unable to handle kernel paging request at virtual address 66e88e66
printing eip: c01fac85 *pde =  
Oops: 0002 [#1] PREEMPT SMP 
Modules linked in: hci_usb hidp rfcomm l2cap bluetooth tun cpufreq_stats 
coretemp xfrm_user xfrm4_tunnel tunnel4 ipcomp esp4 ah4 aes_generic hfsplus 
binfmt_misc fuse ebtable_broute bridge llc ebtable_nat ebtable_filter ebtables 
eeprom applesmc hwmon input_polldev snd_hda_intel snd_pcm_oss snd_mixer_oss 
snd_pcm snd_timer appletouch evdev i2c_i801 snd soundcore snd_page_alloc sky2 
video intel_agp output agpgart

Pid: 205, comm: kswapd0 Not tainted (2.6.24-rc8-sonne #7)
EIP: 0060:[] EFLAGS: 00010213 CPU: 1
EIP is at ext3_clear_inode+0x25/0xa0
EAX: c008f0a0 EBX: c008f000 ECX:  EDX: 66e88e66
ESI: c008f0a0 EDI: 0f01c883 EBP: 004d ESP: f7d29ebc
 DS: 007b ES: 007b FS: 00d8 GS:  SS: 0068
Process kswapd0 (pid: 205, ti=f7d28000 task=f7fdf540 task.ti=f7d28000)
Stack: c008f0a0  f7d29ef8 c0192d62 004d c008f0a0 c008f0a8 c019309a 
   e98b9ac8 0080 0080 f7d29ef8 c01932ec  0080 c008f2b0 
   ea1bdcd8 0002d438 013f c04ac24c 00d0 c0166e4c 2e0b  
Call Trace:
 [] clear_inode+0x62/0x140
 [] dispose_list+0x1a/0xe0
 [] shrink_icache_memory+0x18c/0x250
 [] shrink_slab+0x12c/0x1a0
 [] kswapd+0x32d/0x4d0
 [] autoremove_wake_function+0x0/0x40
 [] complete+0x3d/0x60
 [] kswapd+0x0/0x4d0
 [] kthread+0x42/0x70
 [] kthread+0x0/0x70
 [] kernel_thread_helper+0x7/0x14
 ===
Code: 12 11 f8 ff 66 90 83 ec 0c 89 1c 24 8d 98 60 ff ff ff 89 74 24 04 89 c6 
89 7c 24 08 8b 53 70 8b 7b 54 85 d2 74 16 83 fa ff 74 11  ff 0a 0f 94 c0 84 
c0 75 51 c7 43 70 ff ff ff ff 8b 53 74 85 
EIP: [] ext3_clear_inode+0x25/0xa0 SS:ESP 0068:f7d29ebc
---[ end trace 8dd028de7ae6e34e ]---



Re: 2.6.24-rc4: bluetooth device gone after suspend to ram

2007-12-12 Thread Soeren Sonnenburg
On Wed, 2007-12-12 at 17:08 +0100, Marcel Holtmann wrote:
> Hi Oliver,
> 
> > > I noticed that on my macbook pro1,1 the bluetooth device is gone after
> > > suspend to ram, i.e.
> > 
> > Is this a regression?
> > Does it work if you unload hci_usb before you suspend?
> > If so, please recompile with CONFIG_USB_DEBUG and provide
> > dmesg.

No it was always like this.

> sometimes ACPI is involved and will killswitch the Bluetooth device on
> suspend. Sometimes the distros do a manual killswitch. And in case of
> Bluetooth a killswitch means physically removing the power from the
> device.
> 
> In case of a MacBook it can happen that this device goes back into HID
> mode and thus you need to call hid2hci.

indeed after calling hid2hci the device is visible again via 
hciconfig -a

Thanks!!!
Soeren


Re: 2.6.24-rc5 "videobuf_read_start" [drivers/media/video/videobuf-dvb.ko] undefined!

2007-12-12 Thread Soeren Sonnenburg
On Wed, 2007-12-12 at 00:20 -0500, Shane wrote:
> In 2.6.24-rc5+, I hit this problem with videobuf_read_start
> not being exported. Patch attached, only compile tested.
> 
>   CHK include/linux/version.h
>   CHK include/linux/utsrelease.h
>   CALL    scripts/checksyscalls.sh
>   CHK include/linux/compile.h
>   CC [M]  drivers/media/video/videobuf-core.o
>   Building modules, stage 2.
> Kernel: arch/x86/boot/bzImage is ready  (#1)
>   MODPOST 202 modules
> ERROR: "videobuf_read_start" [drivers/media/video/videobuf-dvb.ko] undefined!
> make[1]: *** [__modpost] Error 1
> make: *** [modules] Error 2

FWIW, I've seen the same thing and Shane's patch fixes things for me.

Soeren


2.6.24-rc4: bluetooth device gone after suspend to ram

2007-12-12 Thread Soeren Sonnenburg
Dear all,

I noticed that on my macbook pro1,1 the bluetooth device is gone after
suspend to ram, i.e.

/usr/sbin/hciconfig -a

normally lists hci0:Type: USB ...

but after suspend does nothing.

Here it does not help to remove the modules and to reload them. Also the
driver reloads without giving any error msg, but still nothing. As
others on mactel-users had the same problem (on mac mini/mbp2,2) it may
be worth investigating:

relevant dmesg:

Bluetooth: Core ver 2.11
NET: Registered protocol family 31
Bluetooth: HCI device and connection manager initialized
Bluetooth: HCI socket layer initialized
Bluetooth: L2CAP ver 2.9
Bluetooth: L2CAP socket layer initialized
Bluetooth: HIDP (Human Interface Emulation) ver 1.2
Bluetooth: RFCOMM socket layer initialized
Bluetooth: RFCOMM TTY layer initialized
Bluetooth: RFCOMM ver 1.8
Bluetooth: HCI USB driver ver 2.9

Any ideas?
Soeren


Re: 2.6.23: no more free evdev devices - evdev leak?

2007-11-11 Thread Soeren Sonnenburg
On Tue, 2007-11-06 at 15:52 +0100, Jiri Kosina wrote:
> On Tue, 6 Nov 2007, Dmitry Torokhov wrote:
> 
> > Could you please try sticking a printk in 
> > hidinput_disconnect(drivers/hid/hid-input.c) to verify that 
> > input_unregister_device is in fact being called?
> 
> Also, is 2.6.23 the only kernel you are experiencing this with please?

So far yes. I am now on 2.6.24rc2 and although the input device numbers
steadily increase s2ram does reliably work even when done >10 times in a
row and no problems with evdev devices not being available.

However resume is slower as some ata device (I think my cdrom) is timing
out. So it could be related to a too fast resume from suspend.

So the problem may re-appear if Matthew Garrett's fix for
[PATCH] Don't fail ata device revalidation for bad _GTF methods

is committed.

Is there still a need to do tests with the printk's in 2.6.23?

Soeren


2.6.23: no more free evdev devices - evdev leak?

2007-10-31 Thread Soeren Sonnenburg
Dear all,

whenever I do a suspend resume cycle the input device's numbers are
increased until I finally run out of evdev devices. Is this a kernel
problem or some userspace program (udev/...) creating new devices all
the time?

here is the dmesg:

Soeren

input: Power Button (FF) as /devices/virtual/input/input0
input: Lid Switch as /devices/virtual/input/input1
input: Power Button (CM) as /devices/virtual/input/input2
input: Sleep Button (CM) as /devices/virtual/input/input3
input: Macintosh mouse button emulation as /devices/virtual/input/input4
input: Apple Computer Apple Internal Keyboard / Trackpad as 
/devices/pci:00/:00:1d.0/usb2/2-2/2-2:1.0/input/input5
input: USB HID v1.11 Keyboard [Apple Computer Apple Internal Keyboard / 
Trackpad] on usb-:00:1d.0-2
input: Apple Computer Apple Internal Keyboard / Trackpad as 
/devices/pci:00/:00:1d.0/usb2/2-2/2-2:1.2/input/input6
input: USB HID v1.11 Device [Apple Computer Apple Internal Keyboard / Trackpad] 
on usb-:00:1d.0-2
input: HID 05ac:1000 as 
/devices/pci:00/:00:1d.3/usb5/5-1/5-1:1.0/input/input7
input: USB HID v1.11 Keyboard [HID 05ac:1000] on usb-:00:1d.3-1
input: HID 05ac:1000 as 
/devices/pci:00/:00:1d.3/usb5/5-1/5-1:1.1/input/input8
input: USB HID v1.11 Mouse [HID 05ac:1000] on usb-:00:1d.3-1
input: appletouch as 
/devices/pci:00/:00:1d.0/usb2/2-2/2-2:1.1/input/input9
input: Video Bus as /devices/virtual/input/input10
input: applesmc as /devices/platform/applesmc.768/input/input11
input: appletouch disconnected
input: Apple Computer Apple Internal Keyboard / Trackpad as 
/devices/pci:00/:00:1d.0/usb2/2-2/2-2:1.0/input/input12
input: USB HID v1.11 Keyboard [Apple Computer Apple Internal Keyboard / 
Trackpad] on usb-:00:1d.0-2
input: appletouch as 
/devices/pci:00/:00:1d.0/usb2/2-2/2-2:1.1/input/input13
input: Apple Computer Apple Internal Keyboard / Trackpad as 
/devices/pci:00/:00:1d.0/usb2/2-2/2-2:1.2/input/input14
input: USB HID v1.11 Device [Apple Computer Apple Internal Keyboard / Trackpad] 
on usb-:00:1d.0-2
input: HID 05ac:1000 as 
/devices/pci:00/:00:1d.3/usb5/5-1/5-1:1.0/input/input15
input: USB HID v1.11 Keyboard [HID 05ac:1000] on usb-:00:1d.3-1
input: HID 05ac:1000 as 
/devices/pci:00/:00:1d.3/usb5/5-1/5-1:1.1/input/input16
input: USB HID v1.11 Mouse [HID 05ac:1000] on usb-:00:1d.3-1
input: appletouch disconnected
input: Apple Computer Apple Internal Keyboard / Trackpad as 
/devices/pci:00/:00:1d.0/usb2/2-2/2-2:1.0/input/input17
input: USB HID v1.11 Keyboard [Apple Computer Apple Internal Keyboard / 
Trackpad] on usb-:00:1d.0-2
input: appletouch as 
/devices/pci:00/:00:1d.0/usb2/2-2/2-2:1.1/input/input18
input: Apple Computer Apple Internal Keyboard / Trackpad as 
/devices/pci:00/:00:1d.0/usb2/2-2/2-2:1.2/input/input19
input: USB HID v1.11 Device [Apple Computer Apple Internal Keyboard / Trackpad] 
on usb-:00:1d.0-2
input: HID 05ac:1000 as 
/devices/pci:00/:00:1d.3/usb5/5-1/5-1:1.0/input/input20
input: USB HID v1.11 Keyboard [HID 05ac:1000] on usb-:00:1d.3-1
input: HID 05ac:1000 as 
/devices/pci:00/:00:1d.3/usb5/5-1/5-1:1.1/input/input21
input: USB HID v1.11 Mouse [HID 05ac:1000] on usb-:00:1d.3-1
input: appletouch disconnected
input: Apple Computer Apple Internal Keyboard / Trackpad as 
/devices/pci:00/:00:1d.0/usb2/2-2/2-2:1.0/input/input22
input: USB HID v1.11 Keyboard [Apple Computer Apple Internal Keyboard / 
Trackpad] on usb-:00:1d.0-2
input: appletouch as 
/devices/pci:00/:00:1d.0/usb2/2-2/2-2:1.1/input/input23
input: Apple Computer Apple Internal Keyboard / Trackpad as 
/devices/pci:00/:00:1d.0/usb2/2-2/2-2:1.2/input/input24
input: USB HID v1.11 Device [Apple Computer Apple Internal Keyboard / Trackpad] 
on usb-:00:1d.0-2
input: HID 05ac:1000 as 
/devices/pci:00/:00:1d.3/usb5/5-1/5-1:1.0/input/input25
input: USB HID v1.11 Keyboard [HID 05ac:1000] on usb-:00:1d.3-1
input: HID 05ac:1000 as 
/devices/pci:00/:00:1d.3/usb5/5-1/5-1:1.1/input/input26
input: USB HID v1.11 Mouse [HID 05ac:1000] on usb-:00:1d.3-1
input: appletouch disconnected
input: Apple Computer Apple Internal Keyboard / Trackpad as 
/devices/pci:00/:00:1d.0/usb2/2-2/2-2:1.0/input/input27
input: USB HID v1.11 Keyboard [Apple Computer Apple Internal Keyboard / 
Trackpad] on usb-:00:1d.0-2
input: appletouch as 
/devices/pci:00/:00:1d.0/usb2/2-2/2-2:1.1/input/input28
input: Apple Computer Apple Internal Keyboard / Trackpad as 
/devices/pci:00/:00:1d.0/usb2/2-2/2-2:1.2/input/input29
input: USB HID v1.11 Device [Apple Computer Apple Internal Keyboard / Trackpad] 
on usb-:00:1d.0-2
input: HID 05ac:1000 as 
/devices/pci:00/:00:1d.3/usb5/5-1/5-1:1.0/input/input30
input: USB HID v1.11 Keyboard [HID 05ac:1000] on usb-:00:1d.3-1
input: HID 05ac:1000 as 
/devices/pci:00/:00:1d.3/usb5/5-1/5-1:1.1/input/input31
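
A rough way to see the growth in a log like the one above is to count how
often each device name is handed a new inputN number. The Python sketch below
is only an illustration: the function name input_registrations and the regular
expression are assumptions based on the log format shown here, not anything
the kernel or udev provides. It tolerates the sysfs path being wrapped onto
the next line, as in this archived copy.

```python
import re
from collections import defaultdict

# Matches "input: <name> as <sysfs path ending in inputN>". The path may sit
# on the following line because the archived log is hard-wrapped.
REGISTRATION = re.compile(r"^input: (.+?) as\s+(\S*/input(\d+))\s*$", re.MULTILINE)


def input_registrations(dmesg_text):
    """Map each device name to the list of inputN numbers it was assigned."""
    seen = defaultdict(list)
    for name, _path, number in REGISTRATION.findall(dmesg_text):
        seen[name.strip()].append(int(number))
    return dict(seen)
```

A device whose list keeps getting longer after every suspend/resume cycle
(here appletouch, the internal keyboard and the HID 05ac:1000 device) is
being re-registered under a fresh number each time instead of reusing its
old one.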

Re: [PATCH 3/3] faster workaround

2007-10-24 Thread Soeren Sonnenburg
On Tue, 2007-10-23 at 17:08 +0900, Tejun Heo wrote:
> Jeff Garzik wrote:
> > Alan Cox wrote:
> >>> 2) Once we identified, over time, the set of drives affected by this
> >>> 3112 quirk (aka drives that didn't fully comply to SATA spec), the
> >>> debugging of corruption cases largely shifted to the standard
> >>> routine: update the BIOS, replace the
> >>> cables/RAM/power/mainboard/slot/etc. to be certain of problem location.
> >>
> >> Except for the continued series of later SI + Nvidia chipset (mostly)
> >> pattern which seems unanswered but also being later chips I assume
> >> unrelated to this problem.
> > 
> > > The SIL_FLAG_MOD15WRITE flag in sil_port_info[] is set according
> > to the best info we have from SiI, which indicates that 3114 and 3512 do
> > not have the same problem as the 3112.
> 
> I don't think this data corruption problem w/ sil3114 is related to
> m15w.  m15w workaround slows down things quite a bit and is likely to
> hide problems on PCI bus side.  There are reports of data corruption
> with 3114 on nvidia (most common), via and now amd chipsets.  There's
> one on intel too but IIRC wasn't too definite.

Err, wait: the motherboard I have is also VIA based. However, the
m15w workaround just slowed everything down; the problem still
appeared.

Also, to be sure that it really is a problem specific to the particular
Seagate drives on the sil3114, I created a 200G file of zeros on one of the
known-to-work Seagates, and indeed it was all zeros...

> According to a user, freebsd didn't have data corruption problem on the
> same hardware.  I copied PCI FIFO setup code (ours is broken BTW) but it
> didn't fix the problem.
> 
> I'll try to reproduce the problem locally and hunt it down.

Thanks in advance...
Soeren


Re: sata sil3114 vs. certain seagate drives results in filesystem corruptions

2007-10-23 Thread Soeren Sonnenburg
On Mon, 2007-10-22 at 12:59 +0200, Bernd Schubert wrote: 
> On Monday 22 October 2007 12:36:32 Soeren Sonnenburg wrote:
> > On Mon, 2007-10-22 at 11:48 +0200, Bernd Schubert wrote:
> > > Hello,
> > >
> > > On Monday 22 October 2007 04:12:44 Tejun Heo wrote:
> > > > Hello,
> > > > [...]
> > > >
> > > > > Now when I write large files of zeros to root(sda&sdb) and read the
> > > > > file back in it contains a few nonzero entries:
> > > > >
> > > > > # dd if=/dev/zero of=/foo bs=1M count=2000
> > > > > # hexdump /foo
> > > > > 000        
> > > > > *
> > > > > 1GB random parts, within large blocks of zeroes>
> > > > >
> > > > > I can reliably trigger this on the md0 / devmapper-root setup when I
> > > > > write about 2GB of data (note that this machine has 1.5G of memory -
> > > > > and still 1GB is often enough to see this problem). Here it does not
> > > > > matter where in the filesystem I do these writes.
> > >
> > > That's almost the same test as I'm always doing. Only I do not write only
> > > 2GB,
> >
> > Well when I read your mail I thought that I could be seeing exactly the
> > same bug... it still may be. However ``my'' problem does not go away
> > with the mod15fix ...
> 
> Yeah, pity it did not fix it :( I will try to port Tejuns patch 
> (http://home-tj.org/wiki/index.php/Sil_m15w#Patches) to 2.6.23 today or 
> tomorrow. If you are testing anyway, could you then also try this?

Hmmhh, dmesg said the m15 fix was turned on (at least it appeared for
the 2 drives in question in dmesg), so I fear it is something different.
On the other hand this is a 'production' machine so I am not too eager
to try very experimental things...

> > > but as much as it fits onto the disk. On reading back this file, the
> > > filesystem will report errors somewhere between 50GB and 230GB (disk size
> > > is 250GB).
> >
> > Wow, I really see lots of corruptions (well every 1-2 GB a couple of
> > bytes are corrupted). Are you getting similarly many in the 50G - 230G
> > region?
> >
> > > > Thanks.  I'll try to reproduce the problem here.  What's your
> > > > motherboard?
> > >
> > > All tested S2882 boards here.
> >
> > I assume all equipped with lots of memory and mostly empty pci slots?
> 
> Yes, all pci-slots are free and the systems do have between 4 and 16GB memory 
> (ecc, monitored with edac). Well, those are cluster systems (actually tyan 
> names those B2882).
> Do you think the configuration is related? Here it also happens with odirect, 
> we tested this to minimize memory effects.

Mine is just an a7v8x with a VIA KT400 chipset... really old, but several
of the pci slots are filled, so the problem may be more likely to happen
here... On the other hand I never tried writing 50-250G on
the drives I considered OK. Will do. Also, it could be helpful to check whether
we both see patterns in the corruptions, like corruptions always being 512
bytes long or so (IIRC in my case they were only up to 64 bytes).

Soeren


Re: sata sil3114 vs. certain seagate drives results in filesystem corruptions

2007-10-22 Thread Soeren Sonnenburg
On Mon, 2007-10-22 at 13:02 +0200, Bernd Schubert wrote:
> On Monday 22 October 2007 12:36:32 Soeren Sonnenburg wrote:
> > > but as much as it fits onto the disk. On reading back this file, the
> > > filesystem will report errors somewhere between 50GB and 230GB (disk size
> > > is 250GB).
> >
> > Wow, I really see lots of corruptions (well every 1-2 GB a couple of
> > bytes are corrupted). Are you getting similarly many in the 50G - 230G
> > region?
> 
> I never tested what is corrupted. Well, a diff over 250GB would take quite a 
> lot of time...

Actually hexdump does not display duplicate lines, so if your file is
really all zeros it will only display a single line + the count, however
I think it is not so optimized...

Soeren


Re: sata sil3114 vs. certain seagate drives results in filesystem corruptions

2007-10-22 Thread Soeren Sonnenburg
On Mon, 2007-10-22 at 11:48 +0200, Bernd Schubert wrote:
> Hello,
> 
> On Monday 22 October 2007 04:12:44 Tejun Heo wrote:
> > Hello,
> > [...]
> > > Now when I write large files of zeros to root(sda&sdb) and read the file
> > > back in it contains a few nonzero entries:
> > >
> > > # dd if=/dev/zero of=/foo bs=1M count=2000
> > > # hexdump /foo
> > > 000        
> > > *
> > > 1GB random parts, within large blocks of zeroes>
> > >
> > > I can reliably trigger this on the md0 / devmapper-root setup when I
> > > write about 2GB of data (note that this machine has 1.5G of memory - and
> > > still 1GB is often enough to see this problem). Here it does not matter
> > > where in the filesystem I do these writes.
> 
> That's almost the same test as I'm always doing. Only I do not write only 2GB, 

Well when I read your mail I thought that I could be seeing exactly the
same bug... it still may be. However ``my'' problem does not go away
with the mod15fix ...

> but as much as it fits onto the disk. On reading back this file, the 
> filesystem will report errors somewhere between 50GB and 230GB (disk size is 
> 250GB).

Wow, I really see lots of corruptions (well every 1-2 GB a couple of
bytes are corrupted). Are you getting similarly many in the 50G - 230G
region?

> > Thanks.  I'll try to reproduce the problem here.  What's your motherboard?
> 
> All tested S2882 boards here.

I assume all equipped with lots of memory and mostly empty pci slots?

Soeren


Re: sata sil3114 vs. certain seagate drives results in filesystem corruptions

2007-10-21 Thread Soeren Sonnenburg
On Mon, 2007-10-22 at 11:12 +0900, Tejun Heo wrote:
> Hello,
> 
> Soeren Sonnenburg wrote:
> > I finally managed to find a *reproducible* setup and way to trigger
> > random corruptions using a sata sil 3114 controller connected to 4
> > seagate drives
> > 
> > port 1: ST3400832AS sda
> > port 2: ST3400620AS sdb
> > port 3: ST3750640AS sdc
> > port 4: ST3750640AS sdd
> > 
> > sda & sdb form md0 via a raid1 setup followed by an additional
> > devicemapper layer ( root ). sdc and sdd are separate and also have an
> > additional device mapper layer ( public ) and ( backups ).
> > 
> > Now when I write large files of zeros to root(sda&sdb) and read the file
> > back in it contains a few nonzero entries:
> > 
> > # dd if=/dev/zero of=/foo bs=1M count=2000
> > # hexdump /foo
> > 000        
> > *
> > 1GB random parts, within large blocks of zeroes> 
> > 
> > I can reliably trigger this on the md0 / devmapper-root setup when I
> > write about 2GB of data (note that this machine has 1.5G of memory - and
> > still 1GB is often enough to see this problem). Here it does not matter
> > where in the filesystem I do these writes.
> 
> Thanks.  I'll try to reproduce the problem here.  What's your motherboard?

It is an asus a7v8x with an AMD Athlon(TM) XP 3000+ and admittedly
almost completely filled pci slots (4 dvb cards, 1 with the sil3114; 1
empty; in the agp slot a radeon 9200). Nevertheless I would not expect
the power supply to be the problem (it was replaced recently by a 500W
one), and there is enough cooling (it is winter in Germany + several fans).

> > Now that sata_promise is converted to the new EH, I simply gave it a go, i.e.
> > I attached ST3400832AS and ST3400620AS to the promise controller and
> > rebooted and redid the experiments from above.
> > 
> > No data corruptions whatsoever. I even ran the dd on all three devmapped
> > mount points simultaneously with a size of 30GB each, still no
> > corruption. However the error messages I've seen a year ago are back for
> > the ST3400832AS and ST3400620AS attached to the promise controller (see
> > below).
> [--snip--]
> > ata1.00: exception Emask 0x10 SAct 0x0 SErr 0x100 action 0x2
> > ata1.00: port_status 0x2020
> > ata1.00: cmd 25/00:00:c0:b6:74/00:01:20:00:00/e0 tag 0 cdb 0x0 data 131072 
> > in
> >  res 51/0c:00:c0:b6:74/0c:01:20:00:00/e0 Emask 0x10 (ATA bus error)
> > ata1: soft resetting port
> 
> Yeah, still the same.  Your drives don't like the way promise controller
> speaks to them (e.g. promise generates signals which are ) but now that
> sata_promise has proper EH, it can recover from those errors.  As long
> as nothing worse happens, it should be okay.

These errors only appear when I generate some stress (like with the dd).
The machine is now up 2 days 8hrs and no further such warnings in the
log.
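
For reading the "cmd" line in the exception quoted above, the Python sketch
below splits the taskfile dump into its fields. It assumes the field order
command/feature:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device,
which is how I understand libata prints it; decode_taskfile is a hypothetical
helper name, not part of any kernel tool.

```python
def decode_taskfile(text):
    """Decode a libata 'cmd' or 'res' field dump into command, LBA and sector count."""
    command, low, high, device = text.split("/")
    feature, nsect, lbal, lbam, lbah = (int(x, 16) for x in low.split(":"))
    hob_feature, hob_nsect, hob_lbal, hob_lbam, hob_lbah = (
        int(x, 16) for x in high.split(":")
    )
    # LBA48: the "hob" (high order byte) registers carry bits 24..47.
    lba = (
        hob_lbah << 40 | hob_lbam << 32 | hob_lbal << 24
        | lbah << 16 | lbam << 8 | lbal
    )
    sectors = hob_nsect << 8 | nsect
    return {
        "command": int(command, 16),
        "lba": lba,
        "sectors": sectors,
        "device": int(device, 16),
    }
```

Applied to 25/00:00:c0:b6:74/00:01:20:00:00/e0 this gives command 0x25, LBA
0x2074b6c0 and 256 sectors; 256 sectors of 512 bytes are 131072 bytes, which
agrees with the "data 131072 in" on the same log line.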

Soeren


sata sil3114 vs. certain seagate drives results in filesystem corruptions

2007-10-19 Thread Soeren Sonnenburg
Dear all,

I finally managed to find a *reproducible* setup and way to trigger
random corruptions using a sata sil 3114 controller connected to 4
seagate drives

port 1: ST3400832AS sda
port 2: ST3400620AS sdb
port 3: ST3750640AS sdc
port 4: ST3750640AS sdd

sda & sdb form md0 via a raid1 setup followed by an additional
devicemapper layer ( root ). sdc and sdd are separate and also have an
additional device mapper layer ( public ) and ( backups ).

Now when I write large files of zeros to root(sda&sdb) and read the file
back in it contains a few nonzero entries:

# dd if=/dev/zero of=/foo bs=1M count=2000
# hexdump /foo
000        
*
1GB random parts, within large blocks of zeroes> 

I can reliably trigger this on the md0 / devmapper-root setup when I
write about 2GB of data (note that this machine has 1.5G of memory - and
still 1GB is often enough to see this problem). Here it does not matter
where in the filesystem I do these writes.
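
Instead of eyeballing the hexdump output, the non-zero parts of such a file
can be located with a few lines of Python. This is a minimal sketch under my
own assumptions: nonzero_runs is a hypothetical helper, and the file is
expected to contain nothing but the zeros written by dd.

```python
def nonzero_runs(path, chunk_size=1 << 20):
    """Return (offset, length) for every run of non-zero bytes in the file."""
    runs = []
    start = None  # offset where the current non-zero run began, or None
    offset = 0    # file offset of the first byte of the current chunk
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            # Fast path: skip chunks that are entirely zero.
            if start is None and chunk.count(0) == len(chunk):
                offset += len(chunk)
                continue
            for i, byte in enumerate(chunk):
                if byte and start is None:
                    start = offset + i
                elif not byte and start is not None:
                    runs.append((start, offset + i - start))
                    start = None
            offset += len(chunk)
    if start is not None:
        # The file ended in the middle of a non-zero run.
        runs.append((start, offset - start))
    return runs
```

An empty list for /foo means the file really is all zeros; otherwise the
offsets and lengths show where and how much was corrupted.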

As a test I did the same on sdc / devmapper-public and
sdd / devmapper-backups with even 30G of zeros. Nothing: no errors,
everything is perfectly OK.

So I thought that this is also the sil mod15write problem
http://home-tj.org/wiki/index.php/Sil_m15w and applied patches 1 & 2
from http://lkml.org/lkml/2007/10/11/115 (adding my two disks) and
rebooted. Now there was some MOD15 stuff in dmesg for the two disks but
still apart from the disks being even slower it was of no use - the
corruption problem was still there (I then also tried patch 3 from Bernd
but that immediately caused oopses fs/errors). So it looks like the
problem I am having is different...

Now I remembered that this machine also has two idle promise pdc20376
sata ports where I first tried the ST3400832AS (sda) and ST3400620AS
(sdb) on about a year ago
http://lists.openwall.net/linux-kernel/2006/08/27/106 . At that time I
just saw random error messages and then finally hangs - quoting Tejun
Heo:

 "I see.  your drive is reporting error for some reason and libata is 
failing to recover."

Now that sata_promise is converted to the new EH, I simply gave it a go, i.e.
I attached ST3400832AS and ST3400620AS to the promise controller and
rebooted and redid the experiments from above.

No data corruptions whatsoever. I even ran the dd on all three devmapped
mount points simultaneously with a size of 30GB each, still no
corruption. However the error messages I've seen a year ago are back for
the ST3400832AS and ST3400620AS attached to the promise controller (see
below).

Please find all the details below:

- uname

Linux 2.6.23.1 #3 PREEMPT Fri Oct 19 20:39:45 CEST 2007 i686 GNU/Linux

- lspci

00:0e.0 RAID bus controller: Silicon Image, Inc. SiI 3114 [SATALink/SATARaid] 
Serial ATA Controller (rev 02)
00:0e.0 0104: 1095:3114 (rev 02)

00:08.0 RAID bus controller: Promise Technology, Inc. PDC20376 (FastTrak 376) 
(rev 02)
00:08.0 0104: 105a:3376 (rev 02)

- proc interrupts

 17:4434549   IO-APIC-fasteoi   sata_promise, sata_sil, ohci1394

- dmesg

sata_sil :00:0e.0: version 2.3
ACPI: PCI Interrupt :00:0e.0[A] -> GSI 17 (level, low) -> IRQ 17
sata_sil :00:0e.0: Applying R_ERR on DMA activate FIS errata fix
scsi3 : sata_sil
scsi4 : sata_sil
scsi5 : sata_sil
scsi6 : sata_sil
ata4: SATA max UDMA/100 cmd 0xf882e080 ctl 0xf882e08a bmdma 0xf882e000 irq 17
ata5: SATA max UDMA/100 cmd 0xf882e0c0 ctl 0xf882e0ca bmdma 0xf882e008 irq 17
ata6: SATA max UDMA/100 cmd 0xf882e280 ctl 0xf882e28a bmdma 0xf882e200 irq 17
ata7: SATA max UDMA/100 cmd 0xf882e2c0 ctl 0xf882e2ca bmdma 0xf882e208 irq 17
ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
ata4.00: ATA-7: ST3400832AS, 3.01, max UDMA/133
ata4.00: 781422768 sectors, multi 16: LBA48 NCQ (depth 0/32)
ata4.00: configured for UDMA/100
ata5: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
ata5.00: ATA-7: ST3400620AS, 3.AAE, max UDMA/133
ata5.00: 781422768 sectors, multi 16: LBA48 NCQ (depth 0/32)
ata5.00: configured for UDMA/100
ata6: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
ata6.00: ATA-7: ST3750640AS, 3.AAE, max UDMA/133
ata6.00: 1465149168 sectors, multi 16: LBA48 NCQ (depth 0/32)
ata6.00: configured for UDMA/100
ata7: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
ata7.00: ATA-7: ST3750640AS, 3.AAC, max UDMA/133
ata7.00: 1465149168 sectors, multi 16: LBA48 NCQ (depth 0/32)
ata7.00: configured for UDMA/100
scsi 3:0:0:0: Direct-Access ATA  ST3400832AS  3.01 PQ: 0 ANSI: 5
sd 3:0:0:0: [sda] 781422768 512-byte hardware sectors (400088 MB)
sd 3:0:0:0: [sda] Write Protect is off
sd 3:0:0:0: [sda] Mode Sense: 00 3a 00 00
sd 3:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support 
DPO or FUA
sd 3:0:0:0: [sda] 781422768 512-byte hardware sectors (400088 MB)
sd 3:0:0:0: [sda] Write Protect is off
sd 3:0:0:0: [sda] Mode Sense: 00 3a 00 00
sd 3:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support 
DPO or FUA
 sda: unknown partition table
sd 3:0:0:0: [sda] Attached SCSI disk
sd 3:0:0:0:

Re: 2.6.22.6: kernel BUG at fs/locks.c:171

2007-10-09 Thread Soeren Sonnenburg

On Tue, 2007-10-09 at 15:09 +0200, Tomasz Chmielewski wrote:
> Soeren Sonnenburg wrote:
> 
> >> Fixing recursive fault but reboot is needed!
> > 
> > Hmmhh, so now I rebooted and again tried to
> > 
> > $ make
> > 
> > the new kernel which again triggered this(?) BUG:
> 
> I had a similar issue with 2.6.22.9, but as I had a proprietary nvidia 
> module loaded, I didn't report it. X was not enabled, though.
> 
> At this moment, the machine was spawning quite a bit of bash / awk etc. 
> processes with large variables (50 MB or so), and used memory and CPU a lot.
> 
> Normally, it's my desktop machine, and it's rarely on for more than ~12 
> hours, but this time, I left it on for a couple of days.
> 
> After this happened, these bash / awk processes died. After I restarted 
> the script again, I lost ssh access to the machine, and I saw no more 

I am afraid you are seeing some kind of hardware failure/bad driver
behavior, just the symptom is the same.

I am saying this as I have an uptime of 22 days with that very same
machine now. And all I changed was unloading the asus p7131 dvb-t driver
(saa71xx).

Soeren


Re: 2.6.22.6: kernel BUG at fs/locks.c:171

2007-09-24 Thread Soeren Sonnenburg

On Fri, 2007-09-14 at 07:22 +1000, Nick Piggin wrote:
> On Friday 14 September 2007 16:02, Soeren Sonnenburg wrote:
> > On Thu, 2007-09-13 at 09:51 +1000, Nick Piggin wrote:
> > > On Thursday 13 September 2007 19:20, Soeren Sonnenburg wrote:
> > > > Dear all,
> > > >
> > > > I've just seen this in dmesg on a AMD K7 / kernel 2.6.22.6 machine
> > > > (config attached).
> > > >
> > > > Any ideas / which further information needed ?
> > >
> > > Thanks for the report. Is it reproducible? It seems like the
> > > locks_free_lock call that's oopsing is coming from __posix_lock_file.
> > > The actual function looks fine, but the lock being freed could have
> > > been corrupted if there was slab corruption, or a hardware corruption.
> > >
> > > You could: try running memtest86+ overnight. And try the following
> > > patch and turn on slab debugging then try to reproduce the problem.
> >
> > OK so far I've run memtest86+ 1.40 from freedos for 8 hrs (v1.70 hung on
> > startup) - nothing.
> 
> Thanks.
> 
> > Could this corruption be caused by a pci card/driver? I am asking as I
> > am using a new dvb-t card (asus p7131) and the oops happened after 5 or
> > 6 days of uptime just about a day after watching some movie (very bad
> > reception/lots of errors).
> 
> It could be caused by that, definitely. slab debugging plus my earlier
> patch may help to narrow it down. (or stress testing with / without the
> dvb card in action).

OK, it is the dvb card. I have 1 week of uptime now without any errors.
Only change is the dvb driver (saa7146) not loaded.

:(
Soeren


Re: [3/3] 2.6.23-rc6: known regressions v2

2007-09-18 Thread Soeren Sonnenburg
On Mon, 2007-09-17 at 23:03 +0200, Thomas Gleixner wrote:
> On Mon, 2007-09-17 at 22:51 +0200, Arkadiusz Miskiewicz wrote:
> > > > clockevents: fix resume logic
> > >
> > > Linus pulled a series of patches which are addressing this issue into
> > > his tree yesterday. Can you please retest against current git ?
> > 
> > Looks like the problem is fixed in current git for me. Thanks!
> 
> Thanks for testing.

I can confirm that the problem is fixed for me too in current git when
using the free avivo driver (with the binary-only fglrx driver the
machine just hangs on resume or sometimes even when going to sleep). So
it might be worth deliberately keeping it that way, so that people move
from fglrx to the free avivo/radeonhd drivers. Anyway, s2ram with avivo
takes 2-3 sec to suspend and 4-5 to resume (when fglrx worked it was more
in the 10-20 sec range).

Soeren.
-- 
Sometimes, there's a moment as you're waking, when you become aware of
the real world around you, but you're still dreaming.


Re: 2.6.22.6: kernel BUG at fs/locks.c:171

2007-09-17 Thread Soeren Sonnenburg
On Sun, 2007-09-16 at 18:15 +1000, Nick Piggin wrote:
> On Saturday 15 September 2007 20:22, Soeren Sonnenburg wrote:
> > On Sat, 2007-09-15 at 09:47 +0000, Soeren Sonnenburg wrote:
> 
> > > Memtest did not find anything after 16 passes so I finally stopped it
> > > applied your patch and used
> > >
> > > CONFIG_DEBUG_SLAB=y
> > > CONFIG_DEBUG_SLAB_LEAK=y
> > >
> > > and booted into the new kernel.
> > >
> > > A few hours later the machine hung (due to nmi watchdog rebooted), so I
[...]
> > > swap_dup: Bad swap file entry 28c8af9d
> 
> Hmm, this is another telltale symptom of either bad hardware
> or a memory scribbling bug.

Since this morning, the machine has been running with the dvb driver for
that card unloaded...

Anyway, you convinced me that the saa7134_dvb driver (driving the asus
p7131) is at fault. As the driver seems huge, I wonder a) whether there
are other config debug options that could aid in debugging and b) what
the names of the io functions are that may cause this...
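
For readers unfamiliar with what CONFIG_DEBUG_SLAB buys here: among other
checks, it places guard words (red zones) on either side of each slab
object and verifies them when the object is freed, so a stray write just
past an object is reported instead of silently corrupting a neighbour. The
following is only a userspace toy of that idea. The guard value and the
toy_* helpers are made up for illustration and are not the kernel's
implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical guard pattern; the kernel uses its own values. */
#define TOY_REDZONE 0x5A5A5A5A5A5A5A5AULL

/* Layout of one allocation: [redzone 1][object bytes][redzone 2] */
void *toy_alloc(size_t size)
{
	uint64_t rz = TOY_REDZONE;
	unsigned char *raw = malloc(size + 2 * sizeof(rz));

	if (!raw)
		return NULL;
	memcpy(raw, &rz, sizeof(rz));                    /* leading guard */
	memcpy(raw + sizeof(rz) + size, &rz, sizeof(rz)); /* trailing guard */
	return raw + sizeof(rz);
}

/*
 * Returns a bitmask: 0 = both guards intact, bit 0 set = leading guard
 * damaged, bit 1 set = trailing guard damaged.
 */
int toy_check(const void *obj, size_t size)
{
	const unsigned char *p = obj;
	uint64_t rz1, rz2;
	int bad = 0;

	assert(obj != NULL);
	memcpy(&rz1, p - sizeof(rz1), sizeof(rz1));
	memcpy(&rz2, p + size, sizeof(rz2));
	if (rz1 != TOY_REDZONE)
		bad |= 1;
	if (rz2 != TOY_REDZONE)
		bad |= 2;
	return bad;
}

/* Check the guards, release the memory, and report what the check found. */
int toy_free(void *obj, size_t size)
{
	int bad = toy_check(obj, size);

	free((unsigned char *)obj - sizeof(uint64_t));
	return bad;
}
```

A write one byte past a 16-byte object from toy_alloc(16) makes
toy_check() return 2. The catch, as with the real option, is that the
damage is only noticed at the next check, not at the moment of the write,
and a device writing by DMA to an unrelated page will not trip it at all.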

Thanks a lot!
Soeren
-- 
Sometimes, there's a moment as you're waking, when you become aware of
the real world around you, but you're still dreaming.


Re: 2.6.22.6: kernel BUG at fs/locks.c:171

2007-09-15 Thread Soeren Sonnenburg
On Sat, 2007-09-15 at 09:47 +0000, Soeren Sonnenburg wrote:
> On Fri, 2007-09-14 at 07:22 +1000, Nick Piggin wrote:
> > On Friday 14 September 2007 16:02, Soeren Sonnenburg wrote:
> > > On Thu, 2007-09-13 at 09:51 +1000, Nick Piggin wrote:
> > > > On Thursday 13 September 2007 19:20, Soeren Sonnenburg wrote:
> > > > > Dear all,
> > > > >
> > > > > I've just seen this in dmesg on a AMD K7 / kernel 2.6.22.6 machine
> > > > > (config attached).
> > > > >
> > > > > Any ideas / which further information needed ?
> > > >
> > > > Thanks for the report. Is it reproducible? It seems like the
> > > > locks_free_lock call that's oopsing is coming from __posix_lock_file.
> > > > The actual function looks fine, but the lock being freed could have
> > > > been corrupted if there was slab corruption, or a hardware corruption.
> > > >
> > > > You could: try running memtest86+ overnight. And try the following
> > > > patch and turn on slab debugging then try to reproduce the problem.
> > >
> > > OK so far I've run memtest86+ 1.40 from freedos for 8 hrs (v1.70 hung on
> > > startup) - nothing.
> > 
> > Thanks.
> > 
> > > Could this corruption be caused by a pci card/driver? I am asking as I
> > > am using a new dvb-t card (asus p7131) and the oops happened after 5 or
> > > 6 days of uptime just about a day after watching some movie (very bad
> > > reception/lots of errors).
> > 
> > It could be caused by that, definitely. slab debugging plus my earlier
> > patch may help to narrow it down. (or stress testing with / without the
> > dvb card in action).
> > 
> > 
> > > However this machine used to have uptimes of months before the dvb card
> > > was in there and the kernel version upgrade (don't know which version
> > > that was...).
> > >
> > > Anyway I am not sure if this is reproducible, but I will keep memtest
> > > running today and then proceed as you said...
> > 
> > OK. Don't put too much effort into memtest if it hasn't caught anything
> > by now -- it's really only exercising your CPU and memory, so even if it
> > is your video hardware, it probably won't find the problem.
> 
> Memtest did not find anything after 16 passes so I finally stopped it
> applied your patch and used
> 
> CONFIG_DEBUG_SLAB=y
> CONFIG_DEBUG_SLAB_LEAK=y
> 
> and booted into the new kernel.
> 
> A few hours later the machine hung (due to nmi watchdog rebooted), so I
> restarted and disabled the watchdog and while compiling a kernel with a
> ``more minimal'' config I got this (not sure whether this is related/the
> cause .../ note that I don't use a swapfile/partition).
> 
> I would need more guidance on what to try now...
> 
> Thanks!
> Soeren
> 
> swap_dup: Bad swap file entry 28c8af9d
> VM: killing process cc1
> Eeek! page_mapcount(page) went negative! (-1)
>   page pfn = 36233
>   page->flags = 4834
>   page->count = 2
>   page->mapping = c1cfed14
>   vma->vm_ops = run_init_process+0x3feff000/0x14
> [ cut here ]
> kernel BUG at mm/rmap.c:628!
> invalid opcode:  [#1]
> Modules linked in: ipt_iprange ipt_REDIRECT capi kernelcapi capifs ipt_REJECT 
> xt_tcpudp xt_state xt_limit ipt_LOG ipt_MASQUERADE iptable_mangle iptable_nat 
> nf_conntrack_ipv4 iptable_filter ip_tables x_tables b44 ohci1394 ieee1394 
> nf_nat_ftp nf_nat nf_conntrack_ftp nf_conntrack lcd tda827x saa7134_dvb 
> dvb_pll video_buf_dvb tda1004x tuner ves1820 usb_storage usblp budget_ci 
> budget_core saa7134 compat_ioctl32 dvb_ttpci dvb_core saa7146_vv video_buf 
> saa7146 ttpci_eeprom ir_kbd_i2c videodev v4l2_common v4l1_compat ir_common 
> via_agp agpgart
> CPU:0
> EIP:0060:[]Not tainted VLI
> EFLAGS: 00010246   (2.6.22.6 #2)
> EIP is at page_remove_rmap+0xd4/0x101
> eax:    ebx: c16c4660   ecx:    edx: 
> esi: d4570b30   edi: d6560a78   ebp: b740   esp: d6265eac
> ds: 007b   es: 007b   fs:   gs:   ss: 0068
> Process cc1 (pid: 26095, ti=d6264000 task=d67af5b0 task.ti=d6264000)
> Stack: c0422e26 c1cfed14 c16c4660 b729e000 c013f5b8 36233cce  
> d4570b30 
>d6265f20  0001 f4ffcb70 f483a3b8 c04f44b8  
>  
>f4ffcb70 00303ff4 b7c18000  d6265f20 f4a8c510 f483a3b8 
> 0009 
> Call Trace:
>  [] unmap_vmas+0x23f/0x404
>  [] exit_mmap+0x5f/0xc9
>  [] mmput+0x1b/0x5e
>  [] do_exit+0x1a0/0x606
>  [] do_pag

Re: 2.6.22.6: kernel BUG at fs/locks.c:171

2007-09-15 Thread Soeren Sonnenburg
On Fri, 2007-09-14 at 07:22 +1000, Nick Piggin wrote:
> On Friday 14 September 2007 16:02, Soeren Sonnenburg wrote:
> > On Thu, 2007-09-13 at 09:51 +1000, Nick Piggin wrote:
> > > On Thursday 13 September 2007 19:20, Soeren Sonnenburg wrote:
> > > > Dear all,
> > > >
> > > > I've just seen this in dmesg on a AMD K7 / kernel 2.6.22.6 machine
> > > > (config attached).
> > > >
> > > > Any ideas / which further information needed ?
> > >
> > > Thanks for the report. Is it reproducible? It seems like the
> > > locks_free_lock call that's oopsing is coming from __posix_lock_file.
> > > The actual function looks fine, but the lock being freed could have
> > > been corrupted if there was slab corruption, or a hardware corruption.
> > >
> > > You could: try running memtest86+ overnight. And try the following
> > > patch and turn on slab debugging then try to reproduce the problem.
> >
> > OK so far I've run memtest86+ 1.40 from freedos for 8 hrs (v1.70 hung on
> > startup) - nothing.
> 
> Thanks.
> 
> > Could this corruption be caused by a pci card/driver? I am asking as I
> > am using a new dvb-t card (asus p7131) and the oops happened after 5 or
> > 6 days of uptime just about a day after watching some movie (very bad
> > reception/lots of errors).
> 
> It could be caused by that, definitely. slab debugging plus my earlier
> patch may help to narrow it down. (or stress testing with / without the
> dvb card in action).
> 
> 
> > However this machine used to have uptimes of months before the dvb card
> > was in there and the kernel version upgrade (don't know which version
> > that was...).
> >
> > Anyway I am not sure if this is reproducible, but I will keep memtest
> > running today and then proceed as you said...
> 
> OK. Don't put too much effort into memtest if it hasn't caught anything
> by now -- it's really only exercising your CPU and memory, so even if it
> is your video hardware, it probably won't find the problem.

Memtest did not find anything after 16 passes so I finally stopped it
applied your patch and used

CONFIG_DEBUG_SLAB=y
CONFIG_DEBUG_SLAB_LEAK=y

and booted into the new kernel.

A few hours later the machine hung (due to nmi watchdog rebooted), so I
restarted and disabled the watchdog and while compiling a kernel with a
``more minimal'' config I got this (not sure whether this is related/the
cause .../ note that I don't use a swapfile/partition).

I would need more guidance on what to try now...

Thanks!
Soeren

swap_dup: Bad swap file entry 28c8af9d
VM: killing process cc1
Eeek! page_mapcount(page) went negative! (-1)
  page pfn = 36233
  page->flags = 4834
  page->count = 2
  page->mapping = c1cfed14
  vma->vm_ops = run_init_process+0x3feff000/0x14
[ cut here ]
kernel BUG at mm/rmap.c:628!
invalid opcode:  [#1]
Modules linked in: ipt_iprange ipt_REDIRECT capi kernelcapi capifs ipt_REJECT 
xt_tcpudp xt_state xt_limit ipt_LOG ipt_MASQUERADE iptable_mangle iptable_nat 
nf_conntrack_ipv4 iptable_filter ip_tables x_tables b44 ohci1394 ieee1394 
nf_nat_ftp nf_nat nf_conntrack_ftp nf_conntrack lcd tda827x saa7134_dvb dvb_pll 
video_buf_dvb tda1004x tuner ves1820 usb_storage usblp budget_ci budget_core 
saa7134 compat_ioctl32 dvb_ttpci dvb_core saa7146_vv video_buf saa7146 
ttpci_eeprom ir_kbd_i2c videodev v4l2_common v4l1_compat ir_common via_agp 
agpgart
CPU:0
EIP:0060:[]Not tainted VLI
EFLAGS: 00010246   (2.6.22.6 #2)
EIP is at page_remove_rmap+0xd4/0x101
eax:    ebx: c16c4660   ecx:    edx: 
esi: d4570b30   edi: d6560a78   ebp: b740   esp: d6265eac
ds: 007b   es: 007b   fs:   gs:   ss: 0068
Process cc1 (pid: 26095, ti=d6264000 task=d67af5b0 task.ti=d6264000)
Stack: c0422e26 c1cfed14 c16c4660 b729e000 c013f5b8 36233cce  d4570b30 
   d6265f20  0001 f4ffcb70 f483a3b8 c04f44b8   
   f4ffcb70 00303ff4 b7c18000  d6265f20 f4a8c510 f483a3b8 0009 
Call Trace:
 [] unmap_vmas+0x23f/0x404
 [] exit_mmap+0x5f/0xc9
 [] mmput+0x1b/0x5e
 [] do_exit+0x1a0/0x606
 [] do_page_fault+0x49c/0x518
 [] __do_softirq+0x35/0x75
 [] do_page_fault+0x0/0x518
 [] error_code+0x6a/0x70
 ===
Code: c0 74 0d 8b 50 08 b8 56 2e 42 c0 e8 ac f4 fe ff 8b 46 48 85 c0 74 14 8b 
40 10 85 c0 74 0d 8b 50 2c b8 75 2e 42 c0 e8 91 f4 fe ff <0f> 0b eb fe 8b 53 10 
8b 03 83 e2 01 c1 e8 1e f7 da 83 c2 04 69 
EIP: [] page_remove_rmap+0xd4/0x101 SS:ESP 0068:d6265eac
Fixing recursive fault but reboot is needed!


-- 
Sometimes, there's a moment as you're waking, when you become aware of
the real world around you, but you're still dreaming.


Re: 2.6.22.6: kernel BUG at fs/locks.c:171

2007-09-13 Thread Soeren Sonnenburg
On Thu, 2007-09-13 at 09:51 +1000, Nick Piggin wrote:
> On Thursday 13 September 2007 19:20, Soeren Sonnenburg wrote:
> > Dear all,
> >
> > I've just seen this in dmesg on a AMD K7 / kernel 2.6.22.6 machine
> > (config attached).
> >
> > Any ideas / which further information needed ?
> 
> Thanks for the report. Is it reproducible? It seems like the
> locks_free_lock call that's oopsing is coming from __posix_lock_file.
> The actual function looks fine, but the lock being freed could have
> been corrupted if there was slab corruption, or a hardware corruption.
> 
> You could: try running memtest86+ overnight. And try the following
> patch and turn on slab debugging then try to reproduce the problem.

OK so far I've run memtest86+ 1.40 from freedos for 8 hrs (v1.70 hung on
startup) - nothing.

Could this corruption be caused by a pci card/driver? I am asking as I
am using a new dvb-t card (asus p7131) and the oops happened after 5 or
6 days of uptime just about a day after watching some movie (very bad
reception/lots of errors). 

However this machine used to have uptimes of months before the dvb card
was in there and the kernel version upgrade (don't know which version
that was...).

Anyway I am not sure if this is reproducible, but I will keep memtest
running today and then proceed as you said...

Thanks,
Soeren
-- 
Sometimes, there's a moment as you're waking, when you become aware of
the real world around you, but you're still dreaming.


2.6.22.6: kernel BUG at fs/locks.c:171

2007-09-13 Thread Soeren Sonnenburg
Dear all,

I've just seen this in dmesg on a AMD K7 / kernel 2.6.22.6 machine
(config attached).

Any ideas / which further information needed ?

Soeren

[ cut here ]
kernel BUG at fs/locks.c:171!
invalid opcode:  [#1]
Modules linked in: ipt_iprange ipt_REDIRECT capi kernelcapi capifs ipt_REJECT 
xt_tcpudp xt_state xt_limit ipt_LOG ipt_MASQUERADE iptable_mangle iptable_nat 
nf_conntrack_ipv4 iptable_filter ip_tables x_tables b44 ohci1394 ieee1394 
nf_nat_ftp nf_nat nf_conntrack_ftp nf_conntrack lcd tda827x saa7134_dvb dvb_pll 
video_buf_dvb tuner tda1004x ves1820 usb_storage usblp saa7134 compat_ioctl32 
budget_ci budget_core dvb_ttpci dvb_core saa7146_vv video_buf saa7146 
ttpci_eeprom via_agp ir_kbd_i2c videodev v4l2_common v4l1_compat ir_common 
agpgart
CPU:0
EIP:0060:[]Not tainted VLI
EFLAGS: 00010206   (2.6.22.6 #1)
EIP is at locks_free_lock+0xb/0x3b
eax: e1d07f9c   ebx: e1d07f80   ecx: f5f5e2f0   edx: 
esi:    edi:    ebp:    esp: da3d7f04
ds: 007b   es: 007b   fs:   gs: 0033  ss: 0068
Process mrtg-load (pid: 19688, ti=da3d6000 task=f5e3a030 task.ti=da3d6000)
Stack:  c015972b 0002 c04889c8 c012b920 f5f5e290 c048541c f0ed3ca0 
   01485414  e1d07f80  f0f39f58 44ef35f1 f62fc2ac  
    f5f5e290  d23106c0 c015a891  0007 0004 
Call Trace:
 [] __posix_lock_file+0x44e/0x47f
 [] getnstimeofday+0x2b/0xaf
 [] fcntl_setlk+0xff/0x1f6
 [] do_setitimer+0xfa/0x226
 [] sys_fcntl64+0x74/0x85
 [] syscall_call+0x7/0xb
 ===
Code: 74 1b 8b 15 30 93 48 c0 8d 43 04 89 53 04 89 42 04 a3 30 93 48 c0 c7 40 
04 30 93 48 c0 5b 5e c3 53 89 c3 8d 40 1c 39 43 1c 74 04 <0f> 0b eb fe 8d 43 0c 
39 43 0c 74 04 0f 0b eb fe 8d 43 04 39 43 
EIP: [] locks_free_lock+0xb/0x3b SS:ESP 0068:da3d7f04
BUG: unable to handle kernel paging request at virtual address 9ee420b0
 printing eip:
c014ab7d
*pde = 
Oops: 0002 [#2]
Modules linked in: ipt_iprange ipt_REDIRECT capi kernelcapi capifs ipt_REJECT 
xt_tcpudp xt_state xt_limit ipt_LOG ipt_MASQUERADE iptable_mangle iptable_nat 
nf_conntrack_ipv4 iptable_filter ip_tables x_tables b44 ohci1394 ieee1394 
nf_nat_ftp nf_nat nf_conntrack_ftp nf_conntrack lcd tda827x saa7134_dvb dvb_pll 
video_buf_dvb tuner tda1004x ves1820 usb_storage usblp saa7134 compat_ioctl32 
budget_ci budget_core dvb_ttpci dvb_core saa7146_vv video_buf saa7146 
ttpci_eeprom via_agp ir_kbd_i2c videodev v4l2_common v4l1_compat ir_common 
agpgart
CPU:0
EIP:0060:[]Not tainted VLI
EFLAGS: 00010082   (2.6.22.6 #1)
EIP is at free_block+0x61/0xfb
eax: a75b2c19   ebx: c1cf6c10   ecx: e1d070c4   edx: 9ee420ac
esi: e1d07000   edi: dfde6960   ebp: dfde7620   esp: dfd87f44
ds: 007b   es: 007b   fs:   gs:   ss: 0068
Process events/0 (pid: 4, ti=dfd86000 task=dfdc4a50 task.ti=dfd86000)
Stack: 0012  0018  c1cf6c10 c1cf6c10 0018 c1cf6c00 
   dfde7620 c014ac86  dfde6960 dfde7620 c0521d20  c014b869 
     dfde69e0 c0521d20 c014b827 c0125955 dfdc4b5c 8f0c99c0 
Call Trace:
 [] drain_array+0x6f/0x89
 [] cache_reap+0x42/0xde
 [] cache_reap+0x0/0xde
 [] run_workqueue+0x6b/0xdf
 [] worker_thread+0x0/0xbd
 [] worker_thread+0xb2/0xbd
 [] autoremove_wake_function+0x0/0x35
 [] kthread+0x36/0x5a
 [] kthread+0x0/0x5a
 [] kernel_thread_helper+0x7/0x10
 ===
Code: 8b 02 25 00 40 02 00 3d 00 40 02 00 75 03 8b 52 0c 8b 02 84 c0 78 04 0f 
0b eb fe 8b 72 1c 8b 54 24 28 8b 46 04 8b 7c 95 4c 8b 16 <89> 42 04 89 10 2b 4e 
0c c7 06 00 01 10 00 c7 46 04 00 02 20 00 
EIP: [] free_block+0x61/0xfb SS:ESP 0068:dfd87f44
[ cut here ]
kernel BUG at fs/locks.c:171!
invalid opcode:  [#3]
Modules linked in: ipt_iprange ipt_REDIRECT capi kernelcapi capifs ipt_REJECT 
xt_tcpudp xt_state xt_limit ipt_LOG ipt_MASQUERADE iptable_mangle iptable_nat 
nf_conntrack_ipv4 iptable_filter ip_tables x_tables b44 ohci1394 ieee1394 
nf_nat_ftp nf_nat nf_conntrack_ftp nf_conntrack lcd tda827x saa7134_dvb dvb_pll 
video_buf_dvb tuner tda1004x ves1820 usb_storage usblp saa7134 compat_ioctl32 
budget_ci budget_core dvb_ttpci dvb_core saa7146_vv video_buf saa7146 
ttpci_eeprom via_agp ir_kbd_i2c videodev v4l2_common v4l1_compat ir_common 
agpgart
CPU:0
EIP:0060:[]Not tainted VLI
EFLAGS: 00010287   (2.6.22.6 #1)
EIP is at locks_free_lock+0xb/0x3b
eax: e1d07f40   ebx: e1d07f24   ecx: dfde7620   edx: c16bebc0
esi:    edi:    ebp: f5f5e0c4   esp: f1309efc
ds: 007b   es: 007b   fs:   gs: 0033  ss: 0068
Process nmbd (pid: 3522, ti=f1308000 task=f12ba590 task.ti=f1308000)
Stack:  c015972b f10b8d4c c1f0d380 02e58f5c f5f5e3a4 07e8  
   010b8d4c f5f5e120 e1d07f24 0001 00a8  f5f5eca0  
    f5f5e3a4  f635a260 c015a13f  000e 000a 
Call Trace:
 [] __posix_lock_file+0x44e/0x47f
 [] fcntl_setlk64+0xf

Re: [4/4] 2.6.23-rc4: known regressions

2007-08-30 Thread Soeren Sonnenburg
On Wed, 2007-08-29 at 17:27 +0200, Michal Piotrowski wrote:

> Power management
> 
> Subject : something broke resume from s2ram on mbp c1d (??? :))
> References  : http://lkml.org/lkml/2007/8/28/67
> Last known good : 2.6.23-rc3
> Submitter   : Soeren Sonnenburg <[EMAIL PROTECTED]>
> Caused-By   : ?
> Handled-By  : Rafael J. Wysocki <[EMAIL PROTECTED]>
> Status  : unknown

> Subject : resume from ram much slower
> References  : http://lkml.org/lkml/2007/8/10/275
> Last known good : 2.6.23-rc1 ?
> Submitter   : Arkadiusz Miskiewicz <[EMAIL PROTECTED]>
> Caused-By   : ?
> Handled-By  : Rafael J. Wysocki <[EMAIL PROTECTED]>
> Status  : problem is being debugged


I am not sure whether the problem I am having is not the very same as
the one Arkadiusz is seeing. At least I've found resume from s2ram to be
working a couple of times. Only sometimes it took long to resume, that
is >30 seconds (around 5 - which I already consider long - is normal). 

Anyway, this is with the closed source fglrx kernel module, as without
it the machine freezes on resume when X is running...

well and fglrx seems to cause this ...

BUG: scheduling while atomic: Xorg/0x0002/3408
 [] schedule+0x5d2/0x6d0
 [] __wake_up+0x38/0x50
 [] irqmgr_wrap_shutdown+0xe1/0x150 [fglrx]
 [] firegl_release_helper+0x55f/0x7d0 [fglrx]
 [] firegl_takedown+0x5b/0xc40 [fglrx]
 [] firegl_release+0x12f/0x190 [fglrx]
 [] ip_firegl_release+0xf/0x20 [fglrx]
 [] __fput+0x91/0x160
 [] filp_close+0x49/0x80
 [] put_files_struct+0x9c/0xc0
 [] do_exit+0x12e/0x7c0
 [] IRQMGR_WorkerThreadRoutine+0x29/0x30 [fglrx]
 [] kasThreadRoutineHelper+0x0/0x20 [fglrx]
 [] kasThreadRoutineHelper+0x0/0x20 [fglrx]
 [] IRQMGR_CallbackWrapper+0xe/0x20 [fglrx]
 [] kasThreadRoutineHelper+0x0/0x20 [fglrx]
 [] kernel_thread_helper+0xd/0x18

well...
Soeren
-- 
Sometimes, there's a moment as you're waking, when you become aware of
the real world around you, but you're still dreaming.


Re: Linux 2.6.23-rc4

2007-08-28 Thread Soeren Sonnenburg
On Tue, 2007-08-28 at 12:33 +0200, Rafael J. Wysocki wrote:
> On Tuesday, 28 August 2007 12:09, Soeren Sonnenburg wrote:
[...]
> > I hope I find time to do a bisect soon...
> 
> Is this a regression from 2.6.23-rc3, or from an earlier kernel?

it is a regression from rc3, all kernels I tested up to 23-rc3 (plus
some unknown git revision) were working ok.

Soeren
-- 
Sometimes, there's a moment as you're waking, when you become aware of
the real world around you, but you're still dreaming.


Re: Linux 2.6.23-rc4

2007-08-28 Thread Soeren Sonnenburg
On Mon, 2007-08-27 at 18:58 -0700, Linus Torvalds wrote:
> Ok, I lost it, and let two weeks pass between -rc releases. My bad.
> 
> As a result, -rc4 is a bit bigger than it would/should have been, but 
> hopefully it's all good, and we've fixed most regressions. There's some 
> arch updates (MIPS, power, sparc64, s390) and an ACPI update, but the 
> rest of it is mainly lots of small fixes (mostly to various random 
> drivers). With some scheduler and networking noise.
> 
> I think the shortlog is _just_ too big to be posted on the kernel mailing 
> list, but since it can mostly be described with the one word "boring", 
> it's not a huge loss. As usual, just do
> 
>   git shortlog v2.6.23-rc3..v2.6.23-rc4
> 
> if you have the git trees to get the all the details on extraneous 
> semicolons, missed or duplicate include files, kzalloc conversions, new 
> PCI ID's etc etc.

argh, something broke resume from s2ram on my mbp c1d... sometimes the
machine resumes when opening the lid and then pressing the power button,
but sometimes it is just dead (and it used to resume fine most of the
time when *only* opening the lid).

I hope I find time to do a bisect soon...

Soeren
-- 
Sometimes, there's a moment as you're waking, when you become aware of
the real world around you, but you're still dreaming.


Bad pte = e900b50d, process = ???

2007-08-20 Thread Soeren Sonnenburg
Dear all,

I suddenly got flooded with

Bad pte = e900b50d, process = ???, vm_flags = 100173, vaddr = bfc87ee2
 [] vm_normal_page+0x3e/0x53
 [] follow_page+0x90/0x147
 [] get_user_pages+0x20f/0x261
 [] access_process_vm+0x7e/0x163
 [] vma_merge+0x171/0x17f
 [] proc_pid_cmdline+0x57/0xe7
 [] proc_info_read+0x4a/0x9c
 [] proc_info_read+0x0/0x9c
 [] vfs_read+0xa6/0x128
 [] sys_read+0x41/0x67
 [] syscall_call+0x7/0xb
 ===

until the machine became unresponsive...

Has anyone seen this before/what are the reasons for this ?

This is on kernel 2.6.22.1...

Soeren
-- 
Sometimes, there's a moment as you're waking, when you become aware of
the real world around you, but you're still dreaming.


Re: [PATCH] appletouch powersaving - please apply for 2.6.23-rc1 - take #4

2007-07-19 Thread Soeren Sonnenburg
On Wed, 2007-07-18 at 15:57 -0700, Andrew Morton wrote:
> On Tue, 17 Jul 2007 09:10:18 +0200
> Soeren Sonnenburg <[EMAIL PROTECTED]> wrote:
> 
> > the attached minimally intrusive patch is based on Matthew Garrett's
> > 'Make appletouch shut up when it has nothing to say' patch (e.g.
> > http://lkml.org/lkml/2007/5/13/117). Matthew's description follows; the
> > paragraph after it lists my additional changes.
> > 
> > The appletouch geyser3 devices found in the Intel Macs (and possibly some
> > later PPC ones?) send a constant stream of packets after the first touch. This
> > results in the kernel waking up around once every couple of milliseconds 
> > to process them, making it almost impossible to spend any significant 
> > period of time in C3 state on a dynamic HZ kernel. Sending the mode 
> > initialization code makes the device shut up until it's touched again. 
> > This patch does so after receiving 10 packets with no interesting 
> > content.
> > 
> > In addition it now empties the work queue via cancel_work_sync on module
> > exit, keeps all error checking and only reports BTN_LEFT presses if bit
> > 1 in the status byte (last byte in packet) is set. This fixes the random
> > left clicks issue. Furthermore it invalidates touchpad data before the
> > mode switch, which fixes the touchpad runs amok issue.
> 
> Please feed this through scripts/checkpatch.pl and consider addressing
> all the things which it reports.

So I did. The updated patch is attached. It differs in

dev->valid = 0; (note the spaces around =)

and in the removal of debug code (which never got triggered, but which
checkpatch.pl complained about).

Soeren
-- 
Sometimes, there's a moment as you're waking, when you become aware of
the real world around you, but you're still dreaming.
The attached patch is based on Matthew Garrett's 'Make appletouch shut
up when it has nothing to say' patch (e.g.
http://lkml.org/lkml/2007/5/13/117). Matthew's description follows; the
paragraph after it lists my additional changes.

The appletouch geyser3 devices found in the Intel Macs (and possibly some
later PPC ones?) send a constant stream of packets after the first touch. 
This results in the kernel waking up around once every couple of
milliseconds to process them, making it almost impossible to spend any
significant period of time in C3 state on a dynamic HZ kernel.  Sending the
mode initialization code makes the device shut up until it's touched again.
This patch does so after receiving 10 packets with no interesting content.

In addition it now empties the work queue via cancel_work_sync on module
exit, keeps all error checking and only reports BTN_LEFT presses if bit 1
in the status byte (last byte in packet) is set.  This fixes the random
left clicks issue.  Furthermore it invalidates touchpad data before the
mode switch, which fixes the touchpad runs amok issue.
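
The two behaviours described above (re-initialise after 10 idle packets,
and take the button state from the status byte only) can be sketched in
plain userspace C. This is an illustration, not the driver code: the
toy_* names and the reinit_requests counter are hypothetical stand-ins
for scheduling the work item, and it assumes that "bit 1" means the
lowest bit of the status byte (mask 0x01).

```c
#include <assert.h>
#include <stddef.h>

#define TOY_IDLE_LIMIT  10    /* from the description: 10 empty packets */
#define TOY_BUTTON_MASK 0x01  /* assumption: "bit 1" is the lowest bit */

struct toy_pad {
	int idlecount;        /* consecutive packets with nothing in them */
	int reinit_requests;  /* stand-in for scheduling the mode re-init */
};

/*
 * Process one packet. 'touched' says whether the sensor data showed a
 * finger. Returns 1 if the left button should be reported as pressed.
 */
int toy_handle_packet(struct toy_pad *pad, const unsigned char *data,
		      size_t len, int touched)
{
	int button;

	assert(pad != NULL && data != NULL && len > 0);

	/* Only the button bit of the last byte counts, not the whole byte. */
	button = (data[len - 1] & TOY_BUTTON_MASK) != 0;

	if (touched || button) {
		pad->idlecount = 0;
	} else if (++pad->idlecount == TOY_IDLE_LIMIT) {
		/* Idle for long enough: ask for the mode switch again. */
		pad->idlecount = 0;
		pad->reinit_requests++;
	}
	return button;
}
```

With this masking, a status byte of 0x02 no longer counts as a click,
whereas testing the whole byte for non-zero would have reported one. That
is the kind of spurious left click the change is meant to avoid.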

Credits:
Sven Anders found out that one should only check for bit 1 for BTN_LEFT.
Matthew Garrett did the initial 'Make appletouch shut up when it has
nothing to say' so I am adding him to the signed-off lines (hope that is
the correct way).

Signed-off-by: Soeren Sonnenburg <[EMAIL PROTECTED]>
Signed-off-by: Matthew Garrett <[EMAIL PROTECTED]>
Cc: Nicolas Boichat <[EMAIL PROTECTED]>
Cc: Michael Hanselmann <[EMAIL PROTECTED]>
Cc: Peter Osterlund <[EMAIL PROTECTED]>
Cc: Frank Arnold <[EMAIL PROTECTED]>
Cc: Stelian Pop <[EMAIL PROTECTED]>
Cc: Johannes Berg <[EMAIL PROTECTED]>
Cc: Greg Kroah-Hartman <[EMAIL PROTECTED]>
Cc: Dmitry Torokhov <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>

diff --git a/drivers/input/mouse/appletouch.c b/drivers/input/mouse/appletouch.c
index e321526..f336b7b 100644
--- a/drivers/input/mouse/appletouch.c
+++ b/drivers/input/mouse/appletouch.c
@@ -155,6 +155,8 @@ struct atp {
 	int			xy_acc[ATP_XSENSORS + ATP_YSENSORS];
 	int			overflowwarn;	/* overflow warning printed? */
 	int			datalen;	/* size of an USB urb transfer */
+	int			idlecount;  /* number of empty packets */
+	struct work_struct  work;
 };
 
 #define dbg_dump(msg, tab) \
@@ -208,6 +210,55 @@ static inline int atp_is_geyser_3(struct atp *dev)
 		(productId == GEYSER4_JIS_PRODUCT_ID);
 }
 
+/*
+ * By default Geyser 3 device sends standard USB HID mouse
+ * packets (Report ID 2). This code changes device mode, so it
+ * sends raw sensor reports (Report ID 5).
+ */
+static int atp_geyser3_init(struct usb_device *udev)
+{
+	char data[8];
+	int size;
+
+	size = usb_control_msg(udev, usb_rcvctrlpipe(udev, 0),
+			ATP_GEYSER3_MODE_READ_REQUEST_ID,
+			USB_DIR_IN 

Re: [PATCH] appletouch powersaving - please apply for 2.6.23-rc1 take #3

2007-07-17 Thread Soeren Sonnenburg
On Tue, 2007-07-17 at 21:48 -0400, Dmitry Torokhov wrote:
> On Tuesday 17 July 2007 14:16, Soeren Sonnenburg wrote:
> > On Tue, 2007-07-17 at 11:01 -0400, Dmitry Torokhov wrote:
> [...]
> > > How many boxes did you try this patch on?
> > 
> > > Mine plus 1 other. However, please note that Matthew's patch (which
> > > is what this patch is based on) has been in the mactel-patches
> > > repository for quite some time now and that the not-yet-cleaned-up
> > > variant of this patch was posted to mactel-devel...
> > 
> > So the modeswitch part should work...
> > 
> 
> OK, can I please get signed-off-bys for the latest version so I can
> apply it?

They are the same as in the initial patch:

Signed-off-by: Soeren Sonnenburg <[EMAIL PROTECTED]>
Signed-off-by: Matthew Garrett <[EMAIL PROTECTED]>

Soeren
-- 
Sometimes, there's a moment as you're waking, when you become aware of
the real world around you, but you're still dreaming.


Re: [PATCH] appletouch powersaving - please apply for 2.6.23-rc1 take #3

2007-07-17 Thread Soeren Sonnenburg
On Tue, 2007-07-17 at 11:01 -0400, Dmitry Torokhov wrote:
> Hi,
> 
> On 7/17/07, Soeren Sonnenburg <[EMAIL PROTECTED]> wrote:
> >
> >  err_free_buffer:
> > @@ -656,6 +699,7 @@ static void atp_disconnect(struct usb_interface *iface)
> >
> >usb_set_intfdata(iface, NULL);
> >if (dev) {
> > +   cancel_work_sync(&dev->work);
> >usb_kill_urb(dev->urb);
> >input_unregister_device(dev->input);
> >usb_buffer_free(dev->udev, dev->datalen,
> >
> 
> This should go into atp_close() and I think you need to do
> cancel_work_sync after calling usb_kill_urb(), otherwise you risk it
> being submitted while you are trying to kill the urb.

Good catch. I modified the patch accordingly and attached it.

> How many boxes did you try this patch on?

Mine plus 1 other. However, please note that Matthew's patch (which is
what this patch is based on) has been in the mactel-patches repository
for quite some time now and that the not-yet-cleaned-up variant of this
patch was posted to mactel-devel...

So the modeswitch part should work...

Soeren
-- 
Sometimes, there's a moment as you're waking, when you become aware of
the real world around you, but you're still dreaming.
diff --git a/drivers/input/mouse/appletouch.c b/drivers/input/mouse/appletouch.c
index e321526..7f180f7 100644
--- a/drivers/input/mouse/appletouch.c
+++ b/drivers/input/mouse/appletouch.c
@@ -155,6 +155,8 @@ struct atp {
 	int			xy_acc[ATP_XSENSORS + ATP_YSENSORS];
 	int			overflowwarn;	/* overflow warning printed? */
 	int			datalen;	/* size of an USB urb transfer */
+	int			idlecount;  /* number of empty packets */
+	struct work_struct  work;
 };
 
 #define dbg_dump(msg, tab) \
@@ -208,6 +210,63 @@ static inline int atp_is_geyser_3(struct atp *dev)
 		(productId == GEYSER4_JIS_PRODUCT_ID);
 }
 
+/*
+ * By default Geyser 3 device sends standard USB HID mouse
+ * packets (Report ID 2). This code changes device mode, so it
+ * sends raw sensor reports (Report ID 5).
+ */
+static int atp_geyser3_init(struct usb_device *udev)
+{
+	char data[8];
+	int size;
+	int i;
+
+	size = usb_control_msg(udev, usb_rcvctrlpipe(udev, 0),
+			ATP_GEYSER3_MODE_READ_REQUEST_ID,
+			USB_DIR_IN | USB_TYPE_CLASS | USB_RECIP_INTERFACE,
+			ATP_GEYSER3_MODE_REQUEST_VALUE,
+			ATP_GEYSER3_MODE_REQUEST_INDEX, &data, 8, 5000);
+
+	if (size != 8) {
+		printk("appletouch atp_geyser3_init READ error\n");
+		for (i=0; i<8; i++)
+			printk("appletouch[%d]: %d\n", i, (int) data[i]);
+
+		err("Could not do mode read request from device"
+		" (Geyser 3 mode)");
+		return -EIO;
+	}
+
+	/* Apply the mode switch */
+	data[0] = ATP_GEYSER3_MODE_VENDOR_VALUE;
+
+	size = usb_control_msg(udev, usb_sndctrlpipe(udev, 0),
+			ATP_GEYSER3_MODE_WRITE_REQUEST_ID,
+			USB_DIR_OUT | USB_TYPE_CLASS | USB_RECIP_INTERFACE,
+			ATP_GEYSER3_MODE_REQUEST_VALUE,
+			ATP_GEYSER3_MODE_REQUEST_INDEX, &data, 8, 5000);
+
+	if (size != 8) {
+		printk("appletouch atp_geyser3_init WRITE error\n");
+		for (i=0; i<8; i++)
+			printk("appletouch[%d]: %d\n", i, (int) data[i]);
+		err("Could not do mode write request to device"
+		" (Geyser 3 mode)");
+		return -EIO;
+	}
+	return 0;
+}
+
+/* Reinitialise the device if it's a geyser 3 */
+static void atp_reinit(struct work_struct *work)
+{
+	struct atp *dev = container_of(work, struct atp, work);
+	struct usb_device *udev = dev->udev;
+
+	dev->idlecount = 0;
+	atp_geyser3_init(udev);
+}
+
 static int atp_calculate_abs(int *xy_sensors, int nb_sensors, int fact,
 			 int *z, int *fingers)
 {
@@ -449,11 +508,21 @@ static void atp_complete(struct urb* urb)
 
 		/* reset the accumulator on release */
 		memset(dev->xy_acc, 0, sizeof(dev->xy_acc));
-	}
 
-	input_report_key(dev->input, BTN_LEFT,
-			 !!dev->data[dev->datalen - 1]);
+		/* Geyser 3 will continue to send packets continually after
+		   the first touch unless reinitialised. Do so if it's been
+		   idle for a while in order to avoid waking the kernel up
+		   several hundred times a second */
+		if (atp_is_geyser_3(dev)) {
+			dev->idlecount++;
+			if (dev->idlecount == 10) {
+				dev->valid=0;
+				schedule_work (&dev->work);
+			}
+		}
+	}
 
+	input_report_key(dev->input, BTN_LEFT, dev->data[dev->datalen-1] & 1);
 	input_sync(dev->input);
 
 exit:
@@ -480,6 +549,7 @@ static void atp_close(struct input_dev *input)
 	struct atp *dev = input_get_drvdata(input);
 
 	usb_kill_urb(dev->urb);
+	cancel_work_sync(&dev->work);
 	dev->open = 0;
 }
 
@@ -528,40 +598,10 @@ static int atp_probe(struct usb_interface *iface, const struct usb_device_id *id
 		dev->datalen = 81;
 
 	if (atp_is_geyser_3(dev)) {
-		/*
-		 * By default Geyser 3 device send

Re: [PATCH] appletouch powersaving - please apply for 2.6.23-rc1

2007-07-17 Thread Soeren Sonnenburg
On Tue, 2007-07-17 at 15:03 +0200, Johannes Berg wrote:
> Hi,
> 
> Good stuff :)
> 
> > +   int idlecount;  /* number of empty packets */
> 
> should probably use tabs here.

fixed.

> > +   size = usb_control_msg(udev, usb_sndctrlpipe(udev, 0),
> > +  ATP_GEYSER3_MODE_WRITE_REQUEST_ID,
> > +  USB_DIR_OUT | USB_TYPE_CLASS | 
> > USB_RECIP_INTERFACE,
> > +  ATP_GEYSER3_MODE_REQUEST_VALUE,
> > +  ATP_GEYSER3_MODE_REQUEST_INDEX, &data, 8, 5000);
> > +   
> 
> trailing whitespace.

fixed
 
> > +   input_report_key(dev->input, BTN_LEFT, dev->data[dev->datalen-1] & 1);
> > +
> 
> 
> > @@ -449,10 +511,19 @@ static void atp_complete(struct urb* urb)
> >  
> > /* reset the accumulator on release */
> > memset(dev->xy_acc, 0, sizeof(dev->xy_acc));
> > -   }
> >  
> > -   input_report_key(dev->input, BTN_LEFT,
> > -!!dev->data[dev->datalen - 1]);
> 
> Any hint as to why you move this? The different test, yes, ok, you
> explained that, but moving it?

OK, Sven Anders also asked about the move... The reason was that while
I was trying to figure out what was going wrong, I memset everything,
including dev->data, to zero, which required the move...

Anyway, as there is no goto/return in between, I fail to see how this
would make any difference. So I moved the code back down to where it was.

The new patch containing these cleanups is attached.

Best,
Soeren
-- 
Sometimes, there's a moment as you're waking, when you become aware of
the real world around you, but you're still dreaming.
diff --git a/drivers/input/mouse/appletouch.c b/drivers/input/mouse/appletouch.c
index e321526..34c1eca 100644
--- a/drivers/input/mouse/appletouch.c
+++ b/drivers/input/mouse/appletouch.c
@@ -155,6 +155,8 @@ struct atp {
 	int			xy_acc[ATP_XSENSORS + ATP_YSENSORS];
 	int			overflowwarn;	/* overflow warning printed? */
 	int			datalen;	/* size of an USB urb transfer */
+	int			idlecount;  /* number of empty packets */
+	struct work_struct  work;
 };
 
 #define dbg_dump(msg, tab) \
@@ -208,6 +210,63 @@ static inline int atp_is_geyser_3(struct atp *dev)
 		(productId == GEYSER4_JIS_PRODUCT_ID);
 }
 
+/*
+ * By default Geyser 3 device sends standard USB HID mouse
+ * packets (Report ID 2). This code changes device mode, so it
+ * sends raw sensor reports (Report ID 5).
+ */
+static int atp_geyser3_init(struct usb_device *udev)
+{
+	char data[8];
+	int size;
+	int i;
+
+	size = usb_control_msg(udev, usb_rcvctrlpipe(udev, 0),
+			ATP_GEYSER3_MODE_READ_REQUEST_ID,
+			USB_DIR_IN | USB_TYPE_CLASS | USB_RECIP_INTERFACE,
+			ATP_GEYSER3_MODE_REQUEST_VALUE,
+			ATP_GEYSER3_MODE_REQUEST_INDEX, &data, 8, 5000);
+
+	if (size != 8) {
+		printk("appletouch atp_geyser3_init READ error\n");
+		for (i=0; i<8; i++)
+			printk("appletouch[%d]: %d\n", i, (int) data[i]);
+
+		err("Could not do mode read request from device"
+		" (Geyser 3 mode)");
+		return -EIO;
+	}
+
+	/* Apply the mode switch */
+	data[0] = ATP_GEYSER3_MODE_VENDOR_VALUE;
+
+	size = usb_control_msg(udev, usb_sndctrlpipe(udev, 0),
+			ATP_GEYSER3_MODE_WRITE_REQUEST_ID,
+			USB_DIR_OUT | USB_TYPE_CLASS | USB_RECIP_INTERFACE,
+			ATP_GEYSER3_MODE_REQUEST_VALUE,
+			ATP_GEYSER3_MODE_REQUEST_INDEX, &data, 8, 5000);
+
+	if (size != 8) {
+		printk("appletouch atp_geyser3_init WRITE error\n");
+		for (i=0; i<8; i++)
+			printk("appletouch[%d]: %d\n", i, (int) data[i]);
+		err("Could not do mode write request to device"
+		" (Geyser 3 mode)");
+		return -EIO;
+	}
+	return 0;
+}
+
+/* Reinitialise the device if it's a geyser 3 */
+static void atp_reinit(struct work_struct *work)
+{
+	struct atp *dev = container_of(work, struct atp, work);
+	struct usb_device *udev = dev->udev;
+
+	dev->idlecount = 0;
+	atp_geyser3_init(udev);
+}
+
 static int atp_calculate_abs(int *xy_sensors, int nb_sensors, int fact,
 			 int *z, int *fingers)
 {
@@ -449,11 +508,21 @@ static void atp_complete(struct urb* urb)
 
 		/* reset the accumulator on release */
 		memset(dev->xy_acc, 0, sizeof(dev->xy_acc));
-	}
 
-	input_report_key(dev->input, BTN_LEFT,
-			 !!dev->data[dev->datalen - 1]);
+		/* Geyser 3 will continue to send packets continually after
+		   the first touch unless reinitialised. Do so if it's been
+		   idle for a while in order to avoid waking the kernel up
+		   several hundred times a second */
+		if (atp_is_geyser_3(dev)) {
+			dev->idlecount++;
+			if (dev->idlecount == 10) {
+				dev->valid=0;
+				schedule_work (&dev->work);
+			}
+		}
+	}
 
+	input_report_key(dev->input, BTN_LEFT, dev->data[dev->datalen-1] & 1);
 	input_sync(dev->input);
 
 exit:
@@ -528,40 +597,10 @@ static int atp_probe(struct usb_interface *iface, const struct usb_device_id *id
 		dev->datalen = 81;
 
 	if (atp_is_geyser_3(dev)) {
-		/*
-		 * By default Geyser 3 device sends standard USB HID mouse
-		 * packets (Report ID 2). This code changes device m

[PATCH] appletouch powersaving - please apply for 2.6.23-rc1

2007-07-17 Thread Soeren Sonnenburg
Hi,

the attached minimally intrusive patch is based on Matthew Garrett's
'Make appletouch shut up when it has nothing to say' patch (e.g.
http://lkml.org/lkml/2007/5/13/117). Matthew's description follows; the
second paragraph lists my additional changes.

The appletouch geyser3 devices found in the Intel Macs (and possibly some later 
PPC ones?) send a constant stream of packets after the first touch. This 
results in the kernel waking up around once every couple of milliseconds 
to process them, making it almost impossible to spend any significant 
period of time in C3 state on a dynamic HZ kernel. Sending the mode 
initialization code makes the device shut up until it's touched again. 
This patch does so after receiving 10 packets with no interesting 
content.
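The idle-detection rule described above can be sketched in plain
userspace C. This is only a minimal model under stated assumptions, not
the driver code: struct idle_tracker, idle_tracker_packet() and
IDLE_PACKET_LIMIT are hypothetical names, and resetting the counter on a
touch packet is a simplification that the patch itself does not show.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical threshold, matching the "10 packets" in the description. */
#define IDLE_PACKET_LIMIT 10

/* Hypothetical userspace stand-in for the driver's per-device state. */
struct idle_tracker {
	int idlecount;       /* consecutive packets without touch data */
	int reinit_requests; /* how often a re-initialisation was requested */
};

/*
 * Feed one packet into the tracker. `touched` says whether the packet
 * carried finger data. Returns true when the caller should re-send the
 * mode initialisation so that the device goes quiet again.
 */
static bool idle_tracker_packet(struct idle_tracker *t, bool touched)
{
	assert(t != NULL);

	if (touched) {
		/* Assumption: activity restarts the idle count. */
		t->idlecount = 0;
		return false;
	}

	t->idlecount++;
	if (t->idlecount == IDLE_PACKET_LIMIT) {
		t->idlecount = 0;
		t->reinit_requests++;
		return true;
	}
	return false;
}
```

In the driver, the `true` case corresponds to the point where the patch
calls schedule_work(): the USB control messages may sleep, so they cannot
be sent directly from the URB completion handler and are deferred to
process context instead.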

In addition, it now empties the work queue via cancel_work_sync on module
exit, keeps all error checking, and only reports BTN_LEFT presses if bit
1 (the lowest bit, i.e. the `& 1` test) in the status byte (the last byte
in the packet) is set. This fixes the random left clicks issue.
Furthermore, it invalidates touchpad data before the mode switch, which
fixes the touchpad-runs-amok issue.
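To illustrate the BTN_LEFT change, here is a small userspace sketch that
compares the old test (any non-zero status byte) with the new one (lowest
bit only). The function names and the example status values are
hypothetical and only serve the comparison; the two expressions are the
ones that appear in the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Old test: any non-zero status byte is reported as a click. */
static int button_pressed_any_bit(const unsigned char *pkt, size_t len)
{
	assert(pkt != NULL && len > 0);
	return !!pkt[len - 1];
}

/* New test: only the lowest bit of the status byte counts as a click. */
static int button_pressed_low_bit(const unsigned char *pkt, size_t len)
{
	assert(pkt != NULL && len > 0);
	return pkt[len - 1] & 1;
}
```

With a made-up status byte of 0x04 (some other flag set, button not
pressed), the old test reports a click and the new one does not, which
would be consistent with the random left clicks described above.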

Credits:
Sven Anders found out that one should only check for bit 1 for BTN_LEFT.
Matthew Garrett did the initial 'Make appletouch shut up when it has
nothing to say' so I am adding him to the signed-off lines (hope that is
the correct way).

Patch follows inline and attached.

Soeren.

Signed-off-by: Soeren Sonnenburg <[EMAIL PROTECTED]>
Signed-off-by: Matthew Garrett <[EMAIL PROTECTED]>

diff --git a/drivers/input/mouse/appletouch.c b/drivers/input/mouse/appletouch.c
index e321526..0426054 100644
--- a/drivers/input/mouse/appletouch.c
+++ b/drivers/input/mouse/appletouch.c
@@ -155,6 +155,8 @@ struct atp {
 	int			xy_acc[ATP_XSENSORS + ATP_YSENSORS];
 	int			overflowwarn;	/* overflow warning printed? */
 	int			datalen;	/* size of an USB urb transfer */
+	int			idlecount;  /* number of empty packets */
+	struct work_struct  work;
 };
 
 #define dbg_dump(msg, tab) \
@@ -208,6 +210,64 @@ static inline int atp_is_geyser_3(struct atp *dev)
 		(productId == GEYSER4_JIS_PRODUCT_ID);
 }
 
+/*
+ * By default Geyser 3 device sends standard USB HID mouse
+ * packets (Report ID 2). This code changes device mode, so it
+ * sends raw sensor reports (Report ID 5).
+ */
+static int atp_geyser3_init(struct usb_device *udev)
+{
+	char data[8];
+	int size;
+	int i;
+
+	size = usb_control_msg(udev, usb_rcvctrlpipe(udev, 0),
+			ATP_GEYSER3_MODE_READ_REQUEST_ID,
+			USB_DIR_IN | USB_TYPE_CLASS | USB_RECIP_INTERFACE,
+			ATP_GEYSER3_MODE_REQUEST_VALUE,
+			ATP_GEYSER3_MODE_REQUEST_INDEX, &data, 8, 5000);
+
+
+	if (size != 8) {
+		printk("appletouch atp_geyser3_init READ error\n");
+		for (i=0; i<8; i++)
+			printk("appletouch[%d]: %d\n", i, (int) data[i]);
+
+		err("Could not do mode read request from device"
+		" (Geyser 3 mode)");
+		return -EIO;
+	}
+
+	/* Apply the mode switch */
+	data[0] = ATP_GEYSER3_MODE_VENDOR_VALUE;
+
+	size = usb_control_msg(udev, usb_sndctrlpipe(udev, 0),
+			ATP_GEYSER3_MODE_WRITE_REQUEST_ID,
+			USB_DIR_OUT | USB_TYPE_CLASS | USB_RECIP_INTERFACE,
+			ATP_GEYSER3_MODE_REQUEST_VALUE,
+			ATP_GEYSER3_MODE_REQUEST_INDEX, &data, 8, 5000);
+	
+	if (size != 8) {
+		printk("appletouch atp_geyser3_init WRITE error\n");
+		for (i=0; i<8; i++)
+			printk("appletouch[%d]: %d\n", i, (int) data[i]);
+		err("Could not do mode write request to device"
+		" (Geyser 3 mode)");
+		return -EIO;
+	}
+	return 0;
+}
+
+/* Reinitialise the device if it's a geyser 3 */
+static void atp_reinit(struct work_struct *work)
+{
+	struct atp *dev = container_of(work, struct atp, work);
+	struct usb_device *udev = dev->udev;
+
+	dev->idlecount = 0;
+	atp_geyser3_init(udev);
+}
+
 static int atp_calculate_abs(int *xy_sensors, int nb_sensors, int fact,
 			 int *z, int *fingers)
 {
@@ -418,6 +478,8 @@ static void atp_complete(struct urb* urb)
 	y = atp_calculate_abs(dev->xy_acc + ATP_XSENSORS, ATP_YSENSORS,
 			      ATP_YFACT, &y_z, &y_f);
 
+	input_report_key(dev->input, BTN_LEFT, dev->data[dev->datalen-1] & 1);
+
 	if (x && y) {
 		if (dev->x_old

Re: ata1: soft resetting port

2007-07-04 Thread Soeren Sonnenburg
On Thu, 2007-07-05 at 03:01 +0900, Tejun Heo wrote:
> Soeren Sonnenburg wrote:
> > On Tue, 2007-07-03 at 15:40 +0900, Tejun Heo wrote:
> >> Soeren Sonnenburg wrote:
> >>> Dear List,
> >>>
> >>> since the switch to 
> >>>
> >>> CONFIG_ATA=y
> >>> CONFIG_ATA_ACPI=y
> >>> CONFIG_ATA_PIIX=y,
> >>>
> >>> the ATA_PIIX driver manages both, internal sata disk as well as cd/dvd
> >>> rom. However I am being flooded with the error messages below (well they
> >>> appear from time to time, dominating dmesg). 
> >>>
> >>> This happens on kernel 2.6.22-rc5, I am copying relevant parts from dmesg:
> >> Does 2.6.22-rc7 fare better?
> > 
> > Yes indeed. The only thing I've seen in the last two days was the
> > following on resume:
> > 
> > pci_express :00:1c.2:pcie03: resuming
> > sr 0:0:0:0: resuming
> > sd 2:0:1:0: resuming
> > sd 2:0:1:0: [sda] Starting disk
> > ata1.00: configured for UDMA/33
> > ata3.01: revalidation failed (errno=-2)
> > ata3: failed to recover some devices, retrying in 5 secs
> > ata3.01: configured for UDMA/133
> 
> Hmmm... That's NODEV_HINT being triggered after resume.  Probably the
> device isn't ready to respond yet at that point.  How reproducible is
> the problem?

quite reproducible:

$ dmesg | grep 'revalidation failed' | wc -l
4

Soeren
-- 
Sometimes, there's a moment as you're waking, when you become aware of
the real world around you, but you're still dreaming.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: ata1: soft resetting port

2007-07-04 Thread Soeren Sonnenburg
On Tue, 2007-07-03 at 15:40 +0900, Tejun Heo wrote:
> Soeren Sonnenburg wrote:
> > Dear List,
> > 
> > since the switch to 
> > 
> > CONFIG_ATA=y
> > CONFIG_ATA_ACPI=y
> > CONFIG_ATA_PIIX=y,
> > 
> > the ATA_PIIX driver manages both, internal sata disk as well as cd/dvd
> > rom. However I am being flooded with the error messages below (well they
> > appear from time to time, dominating dmesg). 
> > 
> > This happens on kernel 2.6.22-rc5, I am copying relevant parts from dmesg:
> 
> Does 2.6.22-rc7 fare better?

Yes indeed. The only thing I've seen in the last two days was the
following on resume:

pci_express :00:1c.2:pcie03: resuming
sr 0:0:0:0: resuming
sd 2:0:1:0: resuming
sd 2:0:1:0: [sda] Starting disk
ata1.00: configured for UDMA/33
ata3.01: revalidation failed (errno=-2)
ata3: failed to recover some devices, retrying in 5 secs
ata3.01: configured for UDMA/133

Soeren
-- 
Sometimes, there's a moment as you're waking, when you become aware of
the real world around you, but you're still dreaming.


Re: [PATCH] hwmon/coretemp: Fix a broken error path - microcode update fix

2007-06-25 Thread Soeren Sonnenburg
On Mon, 2007-06-25 at 20:20 +0200, Rudolf Marek wrote:
> > Hi Rudolf,
> > 
> > just one more update:
> > 
> > When I put my machine into s2ram and make it resume, one of the coretemp
> > sensors gets lost. Ahh and I am already rmmod coretemp / loading
> > microcode after resume / insmod coretemp...
> 
> Hello, If I understand correctly you unload the driver before suspend.
> Resume, update microcode, load the driver correct?

No, I did not unload the driver before suspend; I rmmod it after resume,
before I load the microcode.

> Please can you check dmesg if for example one core complains about bad
> microcode version?

Indeed, core1 still complains. I just now loaded the new microcode
(rmmod of coretemp before and insmod after) and voila, coretemp displays
both CPUs again. So I guess the microcode reloading is done too soon
after resume...

Soeren
-- 
Sometimes, there's a moment as you're waking, when you become aware of
the real world around you, but you're still dreaming.


Re: [PATCH] hwmon/coretemp: Fix a broken error path - microcode update fix

2007-06-25 Thread Soeren Sonnenburg
On Thu, 2007-06-21 at 22:57 +0200, Rudolf Marek wrote:
> Hello Soeren,
[...]
> Soeren pointed at some T60, T60p BIOS update and luckily, there is a
> easy way how to extract the microcode update and even convert it into
> the .txt format as microcode update utility
> (http://www.urbanmyth.org/microcode/) expects.
> Attached scripts generates the mcode.txt file which may be used by the
> update utility. Please can you give a try?

Hi Rudolf,

just one more update:

When I put my machine into s2ram and make it resume, one of the coretemp
sensors gets lost. Ahh and I am already rmmod coretemp / loading
microcode after resume / insmod coretemp...

Any ideas on that?

Soeren
-- 
Sometimes, there's a moment as you're waking, when you become aware of
the real world around you, but you're still dreaming.


Re: [PATCH] hwmon/coretemp: Fix a broken error path - microcode update fix

2007-06-21 Thread Soeren Sonnenburg
On Thu, 2007-06-21 at 22:57 +0200, Rudolf Marek wrote:
> Hello Soeren,
> 
> Sorry for the delay.
> 
> I'm ccing all lists maybe some other people are interested. There is known
> errata AE18 which prevents coretemp from working correctly on some mobile
> Core processors (family 6 model e). My driver refuses to load and now
> thanks to soeren will not crash ;) However what to do when no microcode
> update (no new BIOS) is available?
> 
> Soeren pointed at some T60, T60p BIOS update and luckily, there is a easy
> way how to extract the microcode update and even convert it into the .txt
> format as microcode update utility (http://www.urbanmyth.org/microcode/)
> expects.
> Attached scripts generates the mcode.txt file which may be used by the
> update utility. Please can you give a try?

great! it works:

sensors excerpt :

coretemp-isa-
Adapter: ISA adapter
temp1:   +62°C  (high =  +100°C) 

coretemp-isa-0001
Adapter: ISA adapter
temp1:   +64°C  (high =  +100°C)

> It seems that there is microcode update for CPUID 06E8 version 0x39 just as
> my driver is checking. So if your CPUID is 06e8 too you should get the
> coretemp driver working.

How do I find that out? I mean, the sensors seem to work, but how do I
know which CPUID and version I have?

> If so I will post a patch and document the script in documentation directory
> (or at least some general instructions how to do that)
> 
> Please tell me your stepping:
> cat /proc/cpuinfo  | grep stepping

$ cat /proc/cpuinfo  | grep stepping
stepping: 8
stepping: 8
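As a rough illustration of how a signature like 06E8 relates to the
"family 6 model e" and "stepping 8" values in this thread: the processor
signature (EAX of CPUID leaf 1) packs family, model and stepping into
fixed bit fields. The sketch below only decodes such a value; struct
cpu_signature and decode_cpu_signature() are hypothetical names, and it
does not read the microcode revision.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the decoded fields. */
struct cpu_signature {
	unsigned family;
	unsigned model;
	unsigned stepping;
};

/*
 * Decode a processor signature as returned in EAX by CPUID leaf 1:
 * bits 3:0 stepping, 7:4 model, 11:8 family, 19:16 extended model,
 * 27:20 extended family.
 */
static struct cpu_signature decode_cpu_signature(uint32_t eax)
{
	struct cpu_signature sig;
	unsigned base_family = (eax >> 8) & 0xf;
	unsigned base_model = (eax >> 4) & 0xf;
	unsigned ext_family = (eax >> 20) & 0xff;
	unsigned ext_model = (eax >> 16) & 0xf;

	sig.stepping = eax & 0xf;

	/* The extended family only applies when the base family is 0xf. */
	sig.family = base_family;
	if (base_family == 0xf)
		sig.family += ext_family;

	/* The extended model applies to families 6 and 0xf. */
	sig.model = base_model;
	if (base_family == 0x6 || base_family == 0xf)
		sig.model |= ext_model << 4;

	return sig;
}
```

Decoding 0x06E8 this way gives family 6, model 0xe and stepping 8. On
Linux the same three values are listed in /proc/cpuinfo as "cpu family",
"model" (in decimal) and "stepping".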

Thank you *very* much!
Soeren
-- 
Sometimes, there's a moment as you're waking, when you become aware of
the real world around you, but you're still dreaming.


ata1: soft resetting port

2007-06-19 Thread Soeren Sonnenburg
Dear List,

since the switch to 

CONFIG_ATA=y
CONFIG_ATA_ACPI=y
CONFIG_ATA_PIIX=y,

the ATA_PIIX driver manages both the internal SATA disk and the CD/DVD
ROM. However, I am being flooded with the error messages below (well, they
appear from time to time, dominating dmesg).

This happens on kernel 2.6.22-rc5, I am copying relevant parts from dmesg:

libata version 2.21 loaded.
ata_piix :00:1f.1: version 2.11
ata1: PATA max UDMA/133 cmd 0x000101f0 ctl 0x000103f6 bmdma 0x000140c0 irq 14
ata2: PATA max UDMA/133 cmd 0x00010170 ctl 0x00010376 bmdma 0x000140c8 irq 15
ata1.00: ATAPI: HL-DT-ST DVDRW GWA4080M, AA26, max UDMA/33
ata1.00: configured for UDMA/33
ATA: abnormal status 0x7F on port 0x00010177
scsi 0:0:0:0: CD-ROMHL-DT-ST DVDRW GWA4080M   AA26 PQ: 0 ANSI: 5
sr0: scsi3-mmc drive: 24x/24x writer cd/rw xa/form2 cdda tray
sr 0:0:0:0: Attached scsi CD-ROM sr0
sr 0:0:0:0: Attached scsi generic sg0 type 5
ata_piix :00:1f.2: MAP [ P0 P2 XX XX ]
ata_piix :00:1f.2: invalid MAP value 0
PCI: Setting latency timer of device :00:1f.2 to 64
scsi2 : ata_piix
scsi3 : ata_piix
ata3: SATA max UDMA/133 cmd 0x000140d8 ctl 0x000140f6 bmdma 0x00014020 irq 0
ata4: SATA max UDMA/133 cmd 0x000140d0 ctl 0x000140f2 bmdma 0x00014028 irq 0
ata3.01: ata_hpa_resize 1: sectors = 234441648, hpa_sectors = 234441648
ata3.01: ATA-7: ST9120821AS, 7.01, max UDMA/133
ata3.01: 234441648 sectors, multi 16: LBA48 NCQ (depth 0/32)
ata3.01: ata_hpa_resize 1: sectors = 234441648, hpa_sectors = 234441648
ata3.01: configured for UDMA/133
ATA: abnormal status 0x7F on port 0x000140d7


the actual errors:


ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata1.00: cmd a0/00:00:00:00:20/00:00:00:00:00/a0 tag 0 cdb 0x0 data 0 
 res 51/24:03:00:00:20/00:00:00:00:00/a0 Emask 0x1 (device error)
ata1.00: configured for UDMA/33
ata1: EH complete
ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2
ata1.00: (BMDMA stat 0x5)
ata1.00: cmd a0/01:00:00:00:00/00:00:00:00:00/a0 tag 0 cdb 0x25 data 8 in
 res 00/24:03:00:00:00/00:00:00:00:00/a0 Emask 0x2 (HSM violation)
ata1: soft resetting port
ata1.00: configured for UDMA/33
ata1: EH complete
ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2
ata1.00: cmd a0/00:00:00:00:20/00:00:00:00:00/a0 tag 0 cdb 0x0 data 0 
 res 00/24:03:00:00:20/00:00:00:00:00/a0 Emask 0x2 (HSM violation)
ata1: soft resetting port
ata1.00: configured for UDMA/33
ata1: EH complete
ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2
ata1.00: (BMDMA stat 0x5)
ata1.00: cmd a0/01:00:00:00:00/00:00:00:00:00/a0 tag 0 cdb 0x25 data 8 in
 res 00/24:03:00:00:00/00:00:00:00:00/a0 Emask 0x2 (HSM violation)
ata1: soft resetting port
ata1.00: configured for UDMA/33
ata1: EH complete
ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2
ata1.00: (BMDMA stat 0x5)
ata1.00: cmd a0/01:00:00:00:00/00:00:00:00:00/a0 tag 0 cdb 0x43 data 12 in
 res 00/24:03:00:00:00/00:00:00:00:00/a0 Emask 0x2 (HSM violation)
ata1: soft resetting port
ata1.00: configured for UDMA/33
ata1: EH complete
ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata1.00: cmd a0/00:00:00:00:20/00:00:00:00:00/a0 tag 0 cdb 0x0 data 0 
 res 51/24:03:00:00:20/00:00:00:00:00/a0 Emask 0x1 (device error)
ata1.00: configured for UDMA/33
ata1: EH complete
ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata1.00: (BMDMA stat 0x5)
ata1.00: cmd a0/01:00:00:00:00/00:00:00:00:00/a0 tag 0 cdb 0x25 data 8 in
 res 51/24:03:00:00:00/00:00:00:00:00/a0 Emask 0x1 (device error)
ata1.00: configured for UDMA/33
ata1: EH complete
ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2
ata1.00: (BMDMA stat 0x5)
ata1.00: cmd a0/01:00:00:00:00/00:00:00:00:00/a0 tag 0 cdb 0x43 data 12 in
 res 00/24:03:00:00:00/00:00:00:00:00/a0 Emask 0x2 (HSM violation)
ata1: soft resetting port

-- 
Sometimes, there's a moment as you're waking, when you become aware of
the real world around you, but you're still dreaming.


Re: git-current: slub breaks s2ram with fglrx...

2007-06-17 Thread Soeren Sonnenburg
On Sun, 2007-06-17 at 17:33 +0200, Michal Piotrowski wrote:
> Hi Soeren,

Hi Michal,

[...]
> > I have one question: What if I don't load fglrx.ko but still use the
> > proprietary binary driver for xorg.
> 
> AFAIK it will not work.

Well, I *know* that X works without fglrx loaded (no 3D acceleration and
no Xv extensions though). I would happily use the no-fglrx setup if
s2ram worked (and were more stable than with fglrx loaded). But currently
it just reboots on resume, so ...

> Shouldn't s2ram work with that ? And
> if not is this considered a kernel bug or still a ati-binary driver
> bug ?

Soeren
-- 
Sometimes, there's a moment as you're waking, when you become aware of
the real world around you, but you're still dreaming.


Re: git-current: slub breaks s2ram with fglrx...

2007-06-17 Thread Soeren Sonnenburg
On Sun, 2007-06-17 at 13:04 +0200, Rafael J. Wysocki wrote:
> On Sunday, 17 June 2007 11:49, Soeren Sonnenburg wrote:
> > OK,
[...]
> > slab & fglrx works
> > slub & console works
> > slub & X11+fglrx creates hangs on suspend (black screen - no further
> > idea ...)
> > 
> > It should be noted that without the proprietary  fglrx.ko module loaded
> > the machine just reboots on s2ram (though IIRC without fglrx.ko loaded
> > it worked with 2.6.19 or so...)
> > 
> > Is this a kernel thing or yet another bug in fglrx ?
> 
> I have no idea.
> 
> I guess we should let Christoph know about it (CC added), but I'm afraid
> we won't be able to figure out what's wrong without access to the driver's
> source code ...

I have one question: What if I don't load fglrx.ko but still use the
proprietary binary driver for xorg. Shouldn't s2ram work with that ? And
if not is this considered a kernel bug or still a ati-binary driver
bug ?

Soeren
-- 
Sometimes, there's a moment as you're waking, when you become aware of
the real world around you, but you're still dreaming.


git-current: slub breaks s2ram with fglrx...

2007-06-17 Thread Soeren Sonnenburg
OK,

I've found the second root cause of my

'2.6.22-rc regression: s2ram fails to suspend + fails to resume w/ Xorg':

The first one was just the wrong coretemp patch (already fixed by Jean).

The second one happens only with Xorg/fglrx loaded and slub enabled.
After a useless git bisect session, I've traced the other 's2ram does
not work' problem down to slub... FYI: I am now on git-current, which
should be rc5:

slab & fglrx works
slub & console works
slub & X11+fglrx creates hangs on suspend (black screen - no further
idea ...)

It should be noted that without the proprietary fglrx.ko module loaded,
the machine just reboots on s2ram (though IIRC it worked without
fglrx.ko loaded with 2.6.19 or so...)

Is this a kernel thing or yet another bug in fglrx ?

Soeren 
-- 
Sometimes, there's a moment as you're waking, when you become aware of
the real world around you, but you're still dreaming.


Re: [PATCH] hwmon/coretemp: Fix a broken error path

2007-06-17 Thread Soeren Sonnenburg
On Sat, 2007-06-16 at 23:17 +0200, Jean Delvare wrote:
> Hi Soeren,

Hi Jean,

[...]

> Thanks for reporting. Indeed this patch is broken, sorry for
> overlooking it. I tested it but my hardware is such that the faulty
> error path was never taken. Please test the following patch (on top of
> git-current) and confirm it fixes your problem:
> 
> Signed-off-by: Jean Delvare <[EMAIL PROTECTED]>
> Cc: Rudolf Marek <[EMAIL PROTECTED]>
> ---
>  drivers/hwmon/coretemp.c |1 +
>  1 file changed, 1 insertion(+)
> 
> --- linux-2.6.22-rc4.orig/drivers/hwmon/coretemp.c	2007-06-05 10:25:54.0 +0200
> +++ linux-2.6.22-rc4/drivers/hwmon/coretemp.c	2007-06-16 23:14:06.0 +0200
> @@ -185,6 +185,7 @@ static int __devinit coretemp_probe(stru
>   /* check for microcode update */
>   rdmsr_on_cpu(data->id, MSR_IA32_UCODE_REV, &eax, &edx);
>   if (edx < 0x39) {
> + err = -ENODEV;
>   dev_err(&pdev->dev,
>   "Errata AE18 not fixed, update BIOS or "
>   "microcode of the CPU!\n");

this patch indeed fixes the problem. Thanks!
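For readers following along, the class of bug Jean's one-liner addresses
can be sketched in userspace C: a probe-style function jumps to a shared
cleanup label without setting an error code, so it frees its data and
still returns 0. This is only a model under assumptions: fake_probe(),
struct fake_data and the set_err switch are hypothetical, and only the
0x39 threshold and -ENODEV are taken from the quoted patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the driver's per-device data. */
struct fake_data {
	int id;
};

/*
 * Probe-style function with a shared cleanup label. `ucode_rev` stands
 * in for the value read from the MSR; `set_err` toggles the one-line fix.
 */
static int fake_probe(unsigned ucode_rev, int set_err, struct fake_data **out)
{
	struct fake_data *data;
	int err = 0;

	assert(out != NULL);
	*out = NULL;

	data = malloc(sizeof(*data));
	if (!data)
		return -ENOMEM;
	data->id = 0;

	if (ucode_rev < 0x39) {
		if (set_err)
			err = -ENODEV;	/* the line the fix adds */
		goto exit_free;
	}

	*out = data;
	return 0;

exit_free:
	free(data);
	return err;	/* without the fix: 0, i.e. "success" with no data */
}
```

Without the fix the caller sees success although nothing usable was set
up, so a later remove path would presumably work on data that no longer
exists, which would fit the coretemp_remove oops reported in this thread.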

Unfortunately, the coretemp sensors are simply never there when that
patch is applied... which I guess was the intention... As Apple
probably won't update the BIOS, does anyone know how to get newer
microcode for a Core Duo CPU (not Core 2 Duo!)?

Soeren
-- 
Sometimes, there's a moment as you're waking, when you become aware of
the real world around you, but you're still dreaming.


git-current: latest coretemp changes break s2ram

2007-06-16 Thread Soeren Sonnenburg
This commit makes coretemp fail on my MacBook Pro.

1) rmmod oopses (see below)
2) it breaks s2ram

Soeren

commit 67f363b1f6a31cf5027a97372f64bcced4f05ba6
Author: Rudolf Marek <[EMAIL PROTECTED]>
Date:   Sun May 27 22:17:43 2007 +0200

hwmon/coretemp: Add more safety checks

Add detection of AE18 Errata of Core processor and warns
users that the absolute readings might be wrong for Core2 processor.

Signed-off-by: Rudolf Marek <[EMAIL PROTECTED]>
Signed-off-by: Jean Delvare <[EMAIL PROTECTED]>

[...]
PM: Adding info for platform:coretemp.0
coretemp coretemp.0: Errata AE18 not fixed, update BIOS or microcode of the CPU!
PM: Adding info for platform:coretemp.1
coretemp coretemp.1: Errata AE18 not fixed, update BIOS or microcode of the CPU!
[...]
BUG: unable to handle kernel NULL pointer dereference at virtual address 

 printing eip:
f887c09f
*pde = 
Oops:  [#1]
PREEMPT SMP 
Modules linked in: ohci1394 ieee1394 hfsplus binfmt_misc fuse eeprom coretemp 
applesmc hwmon snd_hda_intel snd_pcm_oss snd_mixer_oss snd_pcm firewire_ohci 
firewire_core snd_timer i2c_i801 appletouch sky2 crc_itu_t snd soundcore 
snd_page_alloc intel_agp agpgart evdev
CPU:0
EIP:0060:[]Not tainted VLI
EFLAGS: 00010286   (2.6.22-rc4-sonne #20)
EIP is at coretemp_remove+0x1f/0x60 [coretemp]
eax: f78c2800   ebx: f78c2890   ecx:    edx: f887d23c
esi: f78c2808   edi:    ebp: c0496760   esp: f7c07ef4
ds: 007b   es: 007b   fs: 00d8  gs: 0033  ss: 0068
Process rmmod (pid: 2646, ti=f7c06000 task=dfc07440 task.ti=f7c06000)
Stack: f78c2808 f887d23c f78c2890 c02b714c c02b580b f78c2808 f78c2808 c02b5c5e 
   f78c2808 c02b50c0  c02b3508 f78c2800 f78c2800 f7a298f0 f7c06000 
   c02b73f0 f78c2800 f7a298b0 c02b7828 f7a298f0 f887c69a  f887d380 
Call Trace:
 [] platform_drv_remove+0xc/0x10
 [] __device_release_driver+0x6b/0xa0
 [] device_release_driver+0x1e/0x40
 [] bus_remove_device+0x50/0x80
 [] device_del+0x138/0x230
 [] platform_device_del+0x10/0x70
 [] platform_device_unregister+0x8/0x10
 [] coretemp_exit+0x3a/0x7d [coretemp]
 [] sys_delete_module+0x148/0x1b0
 [] do_page_fault+0x333/0x620
 [] do_munmap+0x197/0x1f0
 [] sysenter_past_esp+0x5f/0x85
 ===
Code: 87 f8 5e 5f 5d e9 72 6a b4 c7 66 90 83 ec 0c 89 1c 24 89 c3 89 74 24 04 
8d 70 08 81 c3 90 00 00 00 89 7c 24 08 8b be 20 01 00 00 <8b> 07 e8 5a 7f fc ff 
89 d8 ba e0 c6 87 f8 e8 3e 20 94 c7 31 c0 
EIP: [] coretemp_remove+0x1f/0x60 [coretemp] SS:ESP 0068:f7c07ef4

-- 
Sometimes, there's a moment as you're waking, when you become aware of
the real world around you, but you're still dreaming.


2.6.22-rc regression: s2ram fails to suspend + fails to resume w/ Xorg

2007-06-10 Thread Soeren Sonnenburg
Dear all,

I noticed 2 regressions relative to 2.6.21.X on my MacBook Pro:

1. on git-current something broke s2ram completely, i.e. s2ram does not
even suspend anymore but hangs (blinking cursor on console)

2. While on -rc3 s2ram puts the machine to sleep and even makes it
reliably come back under console, it fails miserably *to resume* when I
am in X. As this is an ATI binary-only X driver (no binary module
loaded!), I am not sure whether this report is worth anything. But
yes, it works even with the fglrx module loaded in 2.6.21.X ...

any ideas / things to try ?

Soeren.
-- 
Sometimes, there's a moment as you're waking, when you become aware of
the real world around you, but you're still dreaming.


config.gz
Description: GNU Zip compressed data


Re: [Bug 8498], [Bug 8510], and Re: Can't s2ram 22-rc2

2007-05-21 Thread Soeren Sonnenburg
On Mon, 2007-05-21 at 10:51 -0400, Alan Stern wrote:
> On Sat, 19 May 2007, Greg KH wrote:

> It turns out that the patch I originally wrote to fix this is in
> conflict with one of Raphael's patches (make freezeable workqueues
> singlethread) already added to 2.6.22-rc2.  So here's an updated
> version for that kernel.
> 
> Andrey, Soeren, and Avuton: Please try this patch with 2.6.22-rc2 or 
> later and see if it fixes your problems.
> 
> Greg, if this works then I'll send it in the proper form for a patch, 
> and you can use it to replace
> 
>   usb-make-the-autosuspend-workqueue-thread-freezable.patch


Works perfectly, i.e. the machine survived 3 s2ram cycles and rebooted
cleanly :-))

apply! apply!

Soeren

> Alan Stern
> 
> 
> Index: 2.6.22-rc2/drivers/usb/core/usb.c
> ===
> --- 2.6.22-rc2.orig/drivers/usb/core/usb.c
> +++ 2.6.22-rc2/drivers/usb/core/usb.c
> @@ -205,7 +205,7 @@ struct device_type usb_device_type = {
>  
>  static int ksuspend_usb_init(void)
>  {
> - ksuspend_usb_wq = create_singlethread_workqueue("ksuspend_usbd");
> + ksuspend_usb_wq = create_freezeable_workqueue("ksuspend_usbd");
>   if (!ksuspend_usb_wq)
>   return -ENOMEM;
>   return 0;
> 
-- 
Sometimes, there's a moment as you're waking, when you become aware of
the real world around you, but you're still dreaming.


Re: Can't s2ram 22-rc2 [Was: 2.6.22-rc2 BUG: at slub_def.h:77]

2007-05-21 Thread Soeren Sonnenburg
On Mon, 2007-05-21 at 10:38 +0200, Jiri Slaby wrote:
> Played with Cc list.
> 
> Soeren Sonnenburg napsal(a):
> > On Mon, 2007-05-21 at 10:09 +0200, Jiri Slaby wrote:
> >> Soeren Sonnenburg napsal(a):
> >>> a regression (well I switched to slub) happens on boot...
> > 
> >>> BUG: at include/linux/slub_def.h:77 kmalloc_index()
> >>>  [] get_slab+0x1c8/0x250
> > [...]
> >> Could you try this patch:
> >> http://lkml.org/lkml/2007/5/19/171
> > 
> > yes, that does fix the issue (at least I don't see the BUG: anymore on
> > boot). however I can no longer s2ram on 22rc2. 
> 
> With what kind of error or what happens?

pci_device_suspend(): usb_hcd_pci_suspend+0x0/0x170() returns -16
suspend_device(): pci_device_suspend+0x0/0x60() returns -16
Could not suspend device :00:1d.0: error -16
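
For readers decoding the return values above: the kernel reports failures as
negative errno values, so -16 corresponds to EBUSY on Linux. A minimal
userspace sketch follows. The helper names kernel_ret_to_symbol and
print_kernel_ret are hypothetical, and only the codes that show up in these
logs (-16, -5, -2) are mapped:

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: map a negative kernel-style return value to its
 * errno symbol. Only the codes seen in these logs are covered.
 */
const char *kernel_ret_to_symbol(int ret)
{
    assert(ret <= 0);   /* kernel convention: 0 or a negative errno */

    switch (-ret) {
    case 0:      return "OK";
    case ENOENT: return "ENOENT";   /* 2 on Linux */
    case EIO:    return "EIO";      /* 5 on Linux */
    case EBUSY:  return "EBUSY";    /* 16 on Linux */
    default:     return "unknown";
    }
}

/* Print the symbol together with the C library's description. */
void print_kernel_ret(int ret)
{
    printf("%d -> %s (%s)\n", ret, kernel_ret_to_symbol(ret),
           strerror(-ret));
}
```

Calling print_kernel_ret(-16) shows that the USB host controller's suspend
callback reported the device as busy, which is what makes the whole suspend
abort.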

00:1d.0 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI #1 (rev 
02)

lspci/dmesg are in this mail:
http://marc.info/?l=linux-kernel&m=117973679209855&w=2

Soeren
-- 
Sometimes, there's a moment as you're waking, when you become aware of
the real world around you, but you're still dreaming.


regression 2.6.22-rc2 suspend to ram broken ( usb_hcd_pci_suspend+0x0/0x170() returns -16 )

2007-05-21 Thread Soeren Sonnenburg
I suddenly can no longer s2ram with 2.6.22-rc2. As this seems to be caused
by usb_hcd_pci_suspend, I am CC'ing linux usb-devel.

I am attaching a dmesg and lspci.

Soeren
-- 
Sometimes, there's a moment as you're waking, when you become aware of
the real world around you, but you're still dreaming.
Linux version 2.6.22-rc2-sonne ([EMAIL PROTECTED]) (gcc version 4.1.3 20070518 
(prerelease) (Debian 4.1.2-8)) #12 SMP PREEMPT Mon May 21 10:21:19 CEST 2007
BIOS-provided physical RAM map:
 BIOS-e820:  - 0009fc00 (usable)
 BIOS-e820: 0009fc00 - 000a (reserved)
 BIOS-e820: 000ede00 - 0010 (reserved)
 BIOS-e820: 0010 - 7efce000 (usable)
 BIOS-e820: 7efce000 - 7f1cf000 (ACPI NVS)
 BIOS-e820: 7f1cf000 - 7febf000 (ACPI data)
 BIOS-e820: 7febf000 - 7feef000 (ACPI NVS)
 BIOS-e820: 7feef000 - 7ff0 (ACPI data)
 BIOS-e820: 7ff0 - 8000 (reserved)
 BIOS-e820: e000 - f000 (reserved)
 BIOS-e820: fec0 - fec01000 (reserved)
 BIOS-e820: fed14000 - fed1a000 (reserved)
 BIOS-e820: fed1c000 - fed2 (reserved)
 BIOS-e820: fee0 - fee01000 (reserved)
 BIOS-e820: ffe0 - 0001 (reserved)
1135MB HIGHMEM available.
896MB LOWMEM available.
Entering add_active_range(0, 0, 520142) 0 entries of 256 used
Zone PFN ranges:
  DMA 0 -> 4096
  Normal   4096 ->   229376
  HighMem229376 ->   520142
early_node_map[1] active PFN ranges
0:0 ->   520142
On node 0 totalpages: 520142
  DMA zone: 32 pages used for memmap
  DMA zone: 0 pages reserved
  DMA zone: 4064 pages, LIFO batch:0
  Normal zone: 1760 pages used for memmap
  Normal zone: 223520 pages, LIFO batch:31
  HighMem zone: 2271 pages used for memmap
  HighMem zone: 288495 pages, LIFO batch:31
DMI 2.4 present.
ACPI: RSDP 000FE020, 0024 (r2 APPLE )
ACPI: XSDT 7FEFD120, 0074 (r1 APPLE   Apple00   55   113)
ACPI: FACP 7FEFB000, 00F4 (r3 APPLE   Apple00   55 Loki   5F)
ACPI: DSDT 7FEF, 48C0 (r1 APPLE  MacBookP10001 INTL 20050309)
ACPI: FACS 7FEC1000, 0040
ACPI: HPET 7FEFA000, 0038 (r1 APPLE   Apple001 Loki   5F)
ACPI: APIC 7FEF9000, 0068 (r1 APPLE   Apple001 Loki   5F)
ACPI: MCFG 7FEF8000, 003C (r1 APPLE   Apple001 Loki   5F)
ACPI: ASF! 7FEF7000, 009C (r32 APPLE   Apple001 Loki   5F)
ACPI: SBST 7FEF6000, 0030 (r1 APPLE   Apple001 Loki   5F)
ACPI: ECDT 7FEF5000, 0053 (r1 APPLE   Apple001 Loki   5F)
ACPI: SSDT 7FEBC000, 064F (r1 APPLE   SataPri 1000 INTL 20050309)
ACPI: SSDT 7FEBB000, 069C (r1 APPLE   SataSec 1000 INTL 20050309)
ACPI: SSDT 7FEEF000, 04DC (r1 APPLE CpuPm 3000 INTL 20050309)
ACPI: PM-Timer IO Port: 0x408
ACPI: Local APIC address 0xfee0
ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
Processor #0 6:14 APIC version 20
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled)
Processor #1 6:14 APIC version 20
ACPI: LAPIC_NMI (acpi_id[0x00] high edge lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x01] high edge lint[0x1])
ACPI: IOAPIC (id[0x01] address[0xfec0] gsi_base[0])
IOAPIC[0]: apic_id 1, version 32, address 0xfec0, GSI 0-23
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
ACPI: IRQ0 used by override.
ACPI: IRQ2 used by override.
ACPI: IRQ9 used by override.
Enabling APIC mode:  Flat.  Using 1 I/O APICs
ACPI: HPET id: 0x8086a201 base: 0xfed0
Using ACPI (MADT) for SMP configuration information
Allocating PCI resources starting at 8800 (gap: 8000:6000)
Built 1 zonelists.  Total pages: 516079
Kernel command line: root=/dev/sda3 resume=/dev/sda5 rw S
mapped APIC to d000 (fee0)
mapped IOAPIC to c000 (fec0)
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Initializing CPU#0
PID hash table entries: 4096 (order: 12, 16384 bytes)
Detected 2307.138 MHz processor.
Console: colour VGA+ 80x25
Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
Memory: 2055424k/2080568k available (2822k kernel code, 23792k reserved, 947k 
data, 252k init, 1163064k highmem)
virtual kernel memory layout:
fixmap  : 0xfff9d000 - 0xf000   ( 392 kB)
pkmap   : 0xff80 - 0xffc0   (4096 kB)
vmalloc : 0xf880 - 0xff7fe000   ( 111 MB)
lowmem  : 0xc000 - 0xf800   ( 896 MB)
  .init : 0xc04b6000 - 0xc04f5000   ( 252 kB)
  .data : 0xc03c1a0d - 0xc04ae818   ( 947 kB)
  .text : 0xc010 - 0xc03c1a0d   (2822 kB)
Checking if this processor honours the WP bit even in supervisor mode... Ok.
SLUB: Genslabs=23, HWalign=64, Order=0-1, MinObjects=4, Processors=2, Nodes=1
hpet0: at MMIO 0xfed0, IRQs 2, 8, 0
hpet0: 3 64-bit timers, 14

Re: 2.6.22-rc2 BUG: at include/linux/slub_def.h:77 kmalloc_index()

2007-05-21 Thread Soeren Sonnenburg
On Mon, 2007-05-21 at 10:09 +0200, Jiri Slaby wrote:
> Soeren Sonnenburg napsal(a):
> > a regression (well I switched to slub) happens on boot...

> > BUG: at include/linux/slub_def.h:77 kmalloc_index()
> >  [] get_slab+0x1c8/0x250
[...]
> Could you try this patch:
> http://lkml.org/lkml/2007/5/19/171

yes, that does fix the issue (at least I don't see the BUG: anymore on
boot). however I can no longer s2ram on 22rc2. 

Soeren
-- 
Sometimes, there's a moment as you're waking, when you become aware of
the real world around you, but you're still dreaming.


applesmc - sensors patch missing from 2.6.22-rc2

2007-05-21 Thread Soeren Sonnenburg
I wonder why the attached patch (written by Nicolas Boichat) is missing
from applesmc; without it, sensors simply won't work ...

( but these ones have been applied and are rather useless without the
attached one
http://www.mail-archive.com/[EMAIL PROTECTED]/msg12374.html
http://www.mail-archive.com/[EMAIL PROTECTED]/msg12371.html )


I have been using this for some weeks on 2.6.21 now without problems.

$ sensors
applesmc-isa-0300
Adapter: ISA adapter
temp1:   +28°C
temp2:   +40°C
temp3:   +66°C
temp4:   +58°C
temp5:   +65°C
temp6:   +62°C
temp7:   +69°C
temp8:   +49°C
temp9:   +45°C
temp10:  +58°C
temp11:  +40°C
temp12:  +39°C
fan1: 1060 RPM (safe = 1200 RPM, min = 1000 RPM, max = 6000 RPM)
fan2: 1060 RPM (safe = 1200 RPM, min = 1000 RPM, max = 6000 RPM)

Soeren
-- 
Sometimes, there's a moment as you're waking, when you become aware of
the real world around you, but you're still dreaming.
diff --git a/drivers/hwmon/applesmc.c b/drivers/hwmon/applesmc.c
index 0c16067..366f4a1 100644
--- a/drivers/hwmon/applesmc.c
+++ b/drivers/hwmon/applesmc.c
@@ -491,6 +491,12 @@ out:
 
 /* Sysfs Files */
 
+static ssize_t applesmc_name_show(struct device *dev,
+   struct device_attribute *attr, char *buf)
+{
+	return snprintf(buf, PAGE_SIZE, "applesmc\n");
+}
+
 static ssize_t applesmc_position_show(struct device *dev,
    struct device_attribute *attr, char *buf)
 {
@@ -913,6 +919,8 @@ static struct led_classdev applesmc_backlight = {
 	.brightness_set		= applesmc_brightness_set,
 };
 
+static DEVICE_ATTR(name, 0444, applesmc_name_show, NULL);
+
 static DEVICE_ATTR(position, 0444, applesmc_position_show, NULL);
 static DEVICE_ATTR(calibrate, 0644,
 			applesmc_calibrate_show, applesmc_calibrate_store);
@@ -1197,6 +1205,8 @@ static int __init applesmc_init(void)
 		goto out_driver;
 	}
 
+	ret = sysfs_create_file(&pdev->dev.kobj, &dev_attr_name.attr);
+
 	/* Create key enumeration sysfs files */
 	ret = sysfs_create_group(&pdev->dev.kobj, &key_enumeration_group);
 	if (ret)


2.6.22-rc2 BUG: at include/linux/slub_def.h:77 kmalloc_index()

2007-05-20 Thread Soeren Sonnenburg
a regression (well I switched to slub) happens on boot...

...
hub 4-0:1.0: USB hub found
hub 4-0:1.0: 2 ports detected
ACPI: PCI Interrupt :00:1d.3[D] -> GSI 16 (level, low) -> IRQ 16
PCI: Setting latency timer of device :00:1d.3 to 64
uhci_hcd :00:1d.3: UHCI Host Controller
uhci_hcd :00:1d.3: new USB bus registered, assigned bus number 5
uhci_hcd :00:1d.3: irq 16, io base 0x4040
usb usb5: configuration #1 chosen from 1 choice
hub 5-0:1.0: USB hub found
hub 5-0:1.0: 2 ports detected
usb 1-4: new high speed USB device using ehci_hcd and address 3
BUG: at include/linux/slub_def.h:77 kmalloc_index()
 [] get_slab+0x1c8/0x250
 [] usb_get_configuration+0x854/0xfa0
 [] __kmalloc_track_caller+0x1c/0x80
 [] __kzalloc+0x1a/0x50
 [] usb_get_configuration+0x854/0xfa0
 [] usb_start_wait_urb+0x70/0xb0
 [] hub_port_init+0x90/0x5f0
 [] usb_new_device+0x14/0x100
 [] hub_thread+0x5ab/0xc30
 [] autoremove_wake_function+0x0/0x50
 [] hub_thread+0x0/0xc30
 [] kthread+0x42/0x70
 [] kthread+0x0/0x70
 [] kernel_thread_helper+0x7/0x14
 ===
usb 1-4: configuration #1 chosen from 1 choice
usb 2-2: new full speed USB device using uhci_hcd and address 2
usb 2-2: configuration #1 chosen from 1 choice
usb 4-2: new full speed USB device using uhci_hcd and address 2
...
-- 
Sometimes, there's a moment as you're waking, when you become aware of
the real world around you, but you're still dreaming.


Re: [PATCH] Make appletouch shut up when it has nothing to say

2007-05-15 Thread Soeren Sonnenburg
On Sun, 2007-05-13 at 18:58 -0700, Pete Zaitcev wrote:
> On Sun, 13 May 2007 20:57:25 +0100, Matthew Garrett <[EMAIL PROTECTED]> wrote:
> 
> > Ok, I've tidied this up a little. [...]
> 
> Looks fine here... well, almost. Did you try rmmod (I don't even know if
> it's applicable, sorry)? Usually, when schedule_work is involved, you want
> to make sure that a scheduled work won't be run when the module is gone.
> More often, a device removal is the issue, but as I take it, such is not
> possible for a built-in device :-) . In most cases, all it takes is a
> strategically placed flush_scheduled_work().

I have been using this patch for some days now and I realized that, from
time to time, the touchpad runs amok, i.e. I am more or less unable to
control the mouse when that happens.

Then a rmmod appletouch (+ reload) fixes this, as well as a sleep/resume
cycle.

As I had to rmmod appletouch a lot and did not see crashes I think it
works... though this problem persists...

Soeren
-- 
Sometimes, there's a moment as you're waking, when you become aware of
the real world around you, but you're still dreaming.


Re: converting appletouch to usb autosuspend...

2007-05-13 Thread Soeren Sonnenburg
On Sat, 2007-05-12 at 22:47 +0100, Alistair John Strachan wrote:
> On Saturday 12 May 2007 19:51:26 Soeren Sonnenburg wrote:
> > Dear all,
[...]
> > While we are at it usb related powerhogs on this macbook pro are
> > uhci_hcd (usb keyboard) and usb_hcd_poll_rh_status (rh_timer_func)
> > too...
> 
> I've found that hci_usb also hogs power on the Macbook; blacklisting this 
> module cuts down HZ considerably. I also found appletouch consumed ticks, 

I guess without loading appletouch ? Then there really is something in
there that needs to be fixed..

> just as you did.

What did you use instead of hci_usb then ? usbkbd ? This won't give you
the special keys etc...

> uhci_hcd then drops to noise; my Macbook's sitting on 10W with the backlight 
> on minimum, which is about what it can manage in OS X on maximum life 
> settings.

That's quite some improvement...

Soeren.
-- 
Sometimes, there's a moment as you're waking, when you become aware of
the real world around you, but you're still dreaming.


Re: [PATCH] Make appletouch shut up when it has nothing to say

2007-05-13 Thread Soeren Sonnenburg
On Sun, 2007-05-13 at 20:57 +0100, Matthew Garrett wrote:
> Ok, I've tidied this up a little. I've separated the actual mode init 
> code into a separate function in order to avoid code duplication, and no 
> longer creating a new workqueue. The only other change is something that 
> I /think/ is actually a bug in the driver to begin with, but I'd like 
> some more feedback on that first - the first packet sent after the mode 
> change has 0x20 in the final byte. This seems to be interpreted as a 
> left mouse button press. As a result, moving the touchpad sends a false 
> press after every reinitialisation, or (approximately) every time the 
> pointer is moved. As far as I can tell this also happens with the 
> existing code, but is probably not noticeable there because it won't 
> appear again after the first touch on the pad. Just skipping that case 
> seems to work fine.

This patch indeed fixes the problem and I have yet to observe problems
with it... However I don't know whether a re-init is the intended way of
dealing with it...

Soeren
-- 
Sometimes, there's a moment as you're waking, when you become aware of
the real world around you, but you're still dreaming.


converting appletouch to usb autosuspend...

2007-05-12 Thread Soeren Sonnenburg
Dear all,

Using the great powertop ( http://www.linuxpowertop.org/ ), I've realized
that loading the appletouch driver (and touching the pad once) consumes
about 0.3 W even when not touching the pad. As rmmod'ing appletouch
fixes this, I wonder why the driver does not do this on its own. So my
question is: what does one have to do to convert a driver (such as
appletouch) to make use of usb autosuspend, apart from setting
 .supports_autosuspend = 1

...

While we are at it usb related powerhogs on this macbook pro are
uhci_hcd (usb keyboard) and usb_hcd_poll_rh_status (rh_timer_func)
too...

Soeren.
-- 
Sometimes, there's a moment as you're waking, when you become aware of
the real world around you, but you're still dreaming.


Re: Linux 2.6.21

2007-04-26 Thread Soeren Sonnenburg
On Thu, 2007-04-26 at 06:08 +0200, Adrian Bunk wrote:
[...]
> What I will NOT do:
> Waste my time with tracking 2.6.22-rc regressions.

Adrian, please reconsider. Without you, the issues I've reported (most
likely to the wrong people) would have been missed too. Also keep in
mind that it takes 2 to tango. If reporters can nail down issues to
single config options/patches, or go further by adding printk's, the
chances that things get fixed increase a lot.

Soeren.
-- 
Sometimes, there's a moment as you're waking, when you become aware of
the real world around you, but you're still dreaming.


evdev* devices change major/minors after suspend/resume (udev?)

2007-04-11 Thread Soeren Sonnenburg
Dear all,

I wonder how one (from the userspace side) should deal with the case
that evdev devices disappear after resume and reappear under a different
name/different major/minor (i.e. evdev1 can become evdev4).

The application in mind scans all evdev devices, opens the ones matching
certain properties and then processes events on them.

Now the problem is that after resume devices disappear, so the
application rescans+reopens the evdev's BUT some of the evdevs are just
not available right after resume but appear at some point later in time.

Is there a way to wait till all evdev's are recreated / ask for udev to
settle ? Or what is the proposed way of dealing with that ?
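
One possible userspace workaround, sketched under assumptions rather than
offered as the way udev intends this to be handled: since udev maintains
stable symlinks under /dev/input/by-id, an application could wait for the
stable path to come back instead of rescanning the numbered nodes right
away. The helper names is_event_node and wait_for_path are hypothetical, and
the timeout and polling interval are arbitrary values chosen by the caller:

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <ctype.h>
#include <string.h>
#include <sys/stat.h>
#include <time.h>

/* Return 1 if name looks like an evdev node name ("event" + digits). */
int is_event_node(const char *name)
{
    const char *p;

    if (strncmp(name, "event", 5) != 0)
        return 0;
    p = name + 5;
    if (*p == '\0')
        return 0;               /* "event" alone is not a node */
    for (; *p; p++)
        if (!isdigit((unsigned char)*p))
            return 0;
    return 1;
}

/*
 * Poll until path exists or roughly timeout_ms have elapsed.
 * Returns 0 once the path is there, -1 on timeout.
 */
int wait_for_path(const char *path, int timeout_ms, int step_ms)
{
    struct stat st;
    struct timespec ts;
    int waited = 0;

    assert(step_ms > 0);
    ts.tv_sec = step_ms / 1000;
    ts.tv_nsec = (long)(step_ms % 1000) * 1000000L;

    for (;;) {
        if (stat(path, &st) == 0)   /* follows the by-id symlink */
            return 0;
        if (waited >= timeout_ms)
            return -1;
        nanosleep(&ts, NULL);
        waited += step_ms;
    }
}
```

After resume, the application would call wait_for_path() on the by-id
symlink it cares about (with, say, a few seconds of timeout) and only then
reopen the device. Polling is the simplest choice here; watching the
directory for newly created nodes would avoid the fixed polling interval at
the cost of more code.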

Concretely, it is this evdev that may be missing. Just FYI, this also
seems to cause trouble in Xorg: sometimes the appletouch mouse is
not yet back...

/dev/input/by-id/usb-Apple_Computer_Apple_Internal_Keyboard_._Trackpad-event-kbd
 -> ../event5

Any hints welcome,
Soeren
-- 
Sometimes, there's a moment as you're waking, when you become aware of
the real world around you, but you're still dreaming.


Re: Linux 2.6.21-rc6 - regressions update

2007-04-08 Thread Soeren Sonnenburg
On Fri, 2007-04-06 at 16:04 -0700, Linus Torvalds wrote:

[HPET issues resolved]
> > 3) Subject: SATA breakage on resume
> > References : http://lkml.org/lkml/2007/3/7/233
> > Submitter  : Thomas Gleixner <[EMAIL PROTECTED]>
> >  Soeren Sonnenburg <[EMAIL PROTECTED]>
> > Status : unknown
> > 
> > I am still seeing these messages after a suspend/resume cycle (though
> > all devices work even after multiple suspend/resume cycles)
> > 
> > ATA: abnormal status 0x80 on port 0x000140df
> 
> This seems to be normal, and related to some unknown timing issue. If the 
> thing works for you apart from the message, I'd just ignore it..

Argh! Now after intensive use over the last 2 days, I realized that the
internal harddisk works OK, but the dvd-drive did not after the 7th
suspend/resume cycle - the device was suddenly gone (I could not even
eject the disc I just inserted), more verbose dmesg follows:

ata1: port is slow to respond, please be patient (Status 0x80)
ata1: port failed to respond (30 secs, Status 0x80)
ATA: abnormal status 0x80 on port 0x000101f7
ATA: abnormal status 0x80 on port 0x000101f7
ATA: abnormal status 0x80 on port 0x000101f7
ATA: abnormal status 0x80 on port 0x000101f7
ata1.00: qc timeout (cmd 0xa1)
ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
ata1.00: revalidation failed (errno=-5)
ata1: failed to recover some devices, retrying in 5 secs
ata1: port is slow to respond, please be patient (Status 0x80)
ata1: port failed to respond (30 secs, Status 0x80)
ATA: abnormal status 0x80 on port 0x000101f7
ATA: abnormal status 0x80 on port 0x000101f7
ATA: abnormal status 0x80 on port 0x000101f7
ATA: abnormal status 0x80 on port 0x000101f7
ata1.00: qc timeout (cmd 0xa1)
ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
ata1.00: revalidation failed (errno=-5)
ata1.00: limiting speed to UDMA/33:PIO3
ata1: failed to recover some devices, retrying in 5 secs
ata1: port is slow to respond, please be patient (Status 0x80)
ata1: port failed to respond (30 secs, Status 0x80)
ATA: abnormal status 0x80 on port 0x000101f7
last message repeated 4 times
ata1.00: qc timeout (cmd 0xa1)
ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
ata1.00: revalidation failed (errno=-5)
ata1.00: disabled

Soeren
-- 
For the one fact about the future of which we can be certain is that it
will be utterly fantastic. -- Arthur C. Clarke, 1962


Re: [patch, take #3] high-res timers: resume fix

2007-04-07 Thread Soeren Sonnenburg
On Sat, 2007-04-07 at 12:05 +0200, Ingo Molnar wrote:
> * Rafael J. Wysocki <[EMAIL PROTECTED]> wrote:
> 
> > Hm, I'm probably missing something obvious, but where is it going to 
> > be called from?
> 
> doh! :) Find new patch below :-/ Soeren, please test this one.

OK, I did about 5 suspend/resume cycles with

CONFIG_HPET_TIMER=y
CONFIG_HPET_EMULATE_RTC=y
CONFIG_HPET=y
CONFIG_HPET_MMAP=y

and no oops / no problem ...

So I guess the fix take #3 is good :-)

One thing not directly related to this patch (but probably to all the
timer stuff) that I noticed with -rc6 is that it takes 10 seconds to
suspend (it was ~2 seconds before).

Soeren

>   Ingo
> 
> >
> Subject: [patch] high-res timers: resume fix
> From: Ingo Molnar <[EMAIL PROTECTED]>
> 
> Soeren Sonnenburg reported that upon resume he is getting
> this backtrace:
> 
>  [] smp_apic_timer_interrupt+0x57/0x90
>  [] retrigger_next_event+0x0/0xb0
>  [] apic_timer_interrupt+0x28/0x30
>  [] retrigger_next_event+0x0/0xb0
>  [] __kfifo_put+0x8/0x90
>  [] on_each_cpu+0x35/0x60
>  [] clock_was_set+0x18/0x20
>  [] timekeeping_resume+0x7c/0xa0
>  [] __sysdev_resume+0x11/0x80
>  [] sysdev_resume+0x47/0x80
>  [] device_power_up+0x5/0x10
> 
> it turns out that on resume we mistakenly re-enable interrupts.
> Do the timer retrigger only on the current CPU.
> 
> Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
> Acked-by: Thomas Gleixner <[EMAIL PROTECTED]>
> ---
>  include/linux/hrtimer.h |    3 +++
>  kernel/hrtimer.c        |   12 ++++++++++++
>  kernel/timer.c          |    2 +-
>  3 files changed, 16 insertions(+), 1 deletion(-)
> 
> Index: linux/include/linux/hrtimer.h
> ===
> --- linux.orig/include/linux/hrtimer.h
> +++ linux/include/linux/hrtimer.h
> @@ -206,6 +206,7 @@ struct hrtimer_cpu_base {
>  struct clock_event_device;
>  
>  extern void clock_was_set(void);
> +extern void hres_timers_resume(void);
>  extern void hrtimer_interrupt(struct clock_event_device *dev);
>  
>  /*
> @@ -236,6 +237,8 @@ static inline ktime_t hrtimer_cb_get_tim
>   */
>  static inline void clock_was_set(void) { }
>  
> +static inline void hres_timers_resume(void) { }
> +
>  /*
>   * In non high resolution mode the time reference is taken from
>   * the base softirq time variable.
> Index: linux/kernel/hrtimer.c
> ===
> --- linux.orig/kernel/hrtimer.c
> +++ linux/kernel/hrtimer.c
> @@ -459,6 +459,18 @@ void clock_was_set(void)
>  }
>  
>  /*
> + * During resume we might have to reprogram the high resolution timer
> + * interrupt (on the local CPU):
> + */
> +void hres_timers_resume(void)
> +{
> + WARN_ON_ONCE(num_online_cpus() > 1);
> +
> + /* Retrigger the CPU local events: */
> + retrigger_next_event(NULL);
> +}
> +
> +/*
>   * Check, whether the timer is on the callback pending list
>   */
>  static inline int hrtimer_cb_pending(const struct hrtimer *timer)
> Index: linux/kernel/timer.c
> ===
> --- linux.orig/kernel/timer.c
> +++ linux/kernel/timer.c
> @@ -1016,7 +1016,7 @@ static int timekeeping_resume(struct sys
>   clockevents_notify(CLOCK_EVT_NOTIFY_RESUME, NULL);
>  
>   /* Resume hrtimers */
> - clock_was_set();
> + hres_timers_resume();
>  
>   return 0;
>  }
> 
-- 
Sometimes, there's a moment as you're waking, when you become aware of
the real world around you, but you're still dreaming.


Re: Linux 2.6.21-rc6 - regressions update

2007-04-06 Thread Soeren Sonnenburg
On Thu, 2007-04-05 at 19:50 -0700, Linus Torvalds wrote:
> Ok,
>  I don't think there really is anything very interesting here, but we're 
> hopefully whittling down the list of regressions, and fixing various 
> random other small issues while at it.
> 
> Some smallish MIPS updates, networking (and network driver) fixes, removal 
> of a long obsolete framebuffer driver, etc etc. The shortlog really tells 
> the story.
> 
> We should be getting close to a 2.6.21 release, so please update any 
> regression reports you've done,

regression update for 21-rc6:

1) all s2ram and NO_HZ related things seem to be resolved on my macbook
pro, also 
CONFIG_HPET_TIMER=y
CONFIG_HPET_EMULATE_RTC=y

don't break resume anymore.

2) However I am still having problems with
+CONFIG_HIGH_RES_TIMERS=y
+CONFIG_HPET=y
+CONFIG_HPET_MMAP=y
although the machine resumes, I've managed to get the attached oops.

3) Subject: SATA breakage on resume
References : http://lkml.org/lkml/2007/3/7/233
Submitter  : Thomas Gleixner <[EMAIL PROTECTED]>
 Soeren Sonnenburg <[EMAIL PROTECTED]>
Status : unknown

I am still seeing these messages after a suspend/resume cycle (though
all devices work even after multiple suspend/resume cycles)

ATA: abnormal status 0x80 on port 0x000140df
ata3.01: revalidation failed (errno=-2)
ata3: failed to recover some devices, retrying in 5 secs
ata1.00: configured for UDMA/33
ATA: abnormal status 0x7F on port 0x000140df
ATA: abnormal status 0x7F on port 0x000140df
ata3.01: configured for UDMA/133

So that's been a big step forward...
Soeren
-- 
For the one fact about the future of which we can be certain is that it
will be utterly fantastic. -- Arthur C. Clarke, 1962
CPU1 is down
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
Back to C!
irq 9: nobody cared (try booting with the "irqpoll" option)
 [] __report_bad_irq+0x24/0x90
 [] note_interrupt+0x22f/0x260
 [] acpi_irq+0xb/0x14
 [] handle_IRQ_event+0x25/0x60
 [] handle_level_irq+0xe0/0x110
 [] do_IRQ+0x3b/0x80
 [] smp_apic_timer_interrupt+0x57/0x90
 [] common_interrupt+0x23/0x28
 [] __kfifo_put+0x78/0x90
 [] __do_softirq+0x6d/0xf0
 [] do_softirq+0x37/0x40
 [] irq_exit+0x7a/0x90
 [] smp_apic_timer_interrupt+0x57/0x90
 [] retrigger_next_event+0x0/0xb0
 [] apic_timer_interrupt+0x28/0x30
 [] retrigger_next_event+0x0/0xb0
 [] __kfifo_put+0x8/0x90
 [] on_each_cpu+0x35/0x60
 [] clock_was_set+0x18/0x20
 [] timekeeping_resume+0x7c/0xa0
 [] __sysdev_resume+0x11/0x80
 [] sysdev_resume+0x47/0x80
 [] device_power_up+0x5/0x10
 [] suspend_enter+0x56/0x60
 [] enter_state+0x11a/0x1c0
 [] state_store+0xbd/0xd0
 [] state_store+0x0/0xd0
 [] subsys_attr_store+0x29/0x40
 [] sysfs_write_file+0xb2/0x110
 [] vfs_write+0xa6/0x140
 [] sysfs_write_file+0x0/0x110
 [] sys_write+0x41/0x70
 [] sysenter_past_esp+0x5f/0x85
 ===
handlers:
[] (acpi_irq+0x0/0x14)
Disabling IRQ #9
Enabling non-boot CPUs ...
SMP alternatives: switching to SMP code


Re: s2ram still broken with CONFIG_NO_HZ / HPET (macbook pro)

2007-03-13 Thread Soeren Sonnenburg
On Mon, 2007-03-12 at 19:59 +0900, Tejun Heo wrote:
> Soeren Sonnenburg wrote:
> > Elsewise I still see the
> > 
> > ATA: abnormal status 0x80 on port 0x000140df
> > ATA: abnormal status 0x80 on port 0x000140df
> > ata1.00: configured for UDMA/33
> > ata3.01: revalidation failed (errno=-2)
> > ata3: failed to recover some devices, retrying in 5 secs
> > ATA: abnormal status 0x7F on port 0x000140df
> > ATA: abnormal status 0x7F on port 0x000140df
> > ata3.01: configured for UDMA/133
> 
> I can't tell much about HPET timer, only the ATA messages.
> 
> Abnormal messages can be ignored.  Hmmm... revalidation failed without
> explaining why.  Can you apply the attached patch and see whether the
> added message gets printed?

Well it is there:

ATA: abnormal status 0x80 on port 0x000140df
ata1.00: configured for UDMA/33
ata3.01: NODEV after polling detection
ata3.01: revalidation failed (errno=-2)
ata3: failed to recover some devices, retrying in 5 secs
ATA: abnormal status 0x7F on port 0x000140df
ATA: abnormal status 0x7F on port 0x000140df
ata3.01: configured for UDMA/133
SCSI device sda: 234441648 512-byte hdwr sectors (120034 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: write cache: enabled, read cache: enabled, doesn't
support DPO or FUA

just FYI a macbook pro has a sata disc + a dvd drive registered
as /dev/sda and /dev/scd0 driven by ata_piix:

dmesg | grep ata_piix 
ata_piix :00:1f.1: version 2.10ac1
scsi0 : ata_piix
scsi1 : ata_piix
ata_piix :00:1f.2: MAP [ P0 P2 XX XX ]
ata_piix :00:1f.2: invalid MAP value 0
scsi2 : ata_piix
scsi3 : ata_piix

Soeren
-- 
Sometimes, there's a moment as you're waking, when you become aware of
the real world around you, but you're still dreaming.


Re: Linux v2.6.21-rc3

2007-03-07 Thread Soeren Sonnenburg
On Wed, 2007-03-07 at 15:22 +0100, Thomas Gleixner wrote:
> On Tue, 2007-03-06 at 20:59 -0800, Linus Torvalds wrote:
> > We've finally hopefully started to put a dent in the regressions, 
> > especially the suspend/resume problems introduced since 2.6.20.
> 
> Still having SATA breakage on resume:
> 
> Caught that one (from screen)
> 
> ATA: abnormal status 0x7F on port 0x000118cf
> irq 21: nobody cared (try booting ..)
> ...
> Disabling IRQ #21
> 
> 
> During normal boot I see the "ATA: abnormal status 0x7F on port
> 0x000118cf" once, but there the system behaves normal
> 
>   tglx

maybe that is also causing the hang I am still seeing with the full
config... :(
(no display, no usb device activation, but I tend to think the mbp wants
to access the hdd...)

SCSI device sda: write cache: enabled, read cache: enabled, doesn't
support DPO or FUA
ata1: port is slow to respond, please be patient (Status 0x80)
ata1: port failed to respond (30 secs, Status 0x80)
ATA: abnormal status 0x80 on port 0x000101f7
ATA: abnormal status 0x80 on port 0x000101f7
ATA: abnormal status 0x80 on port 0x000101f7
ATA: abnormal status 0x80 on port 0x000101f7
ATA: abnormal status 0x80 on port 0x000101f7
ata1.00: qc timeout (cmd 0xa1)
ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
ata1.00: revalidation failed (errno=-5)
ata1: failed to recover some devices, retrying in 5 secs
ata1: port is slow to respond, please be patient (Status 0x80)
ata1: port failed to respond (30 secs, Status 0x80)
ATA: abnormal status 0x80 on port 0x000101f7
ATA: abnormal status 0x80 on port 0x000101f7
ATA: abnormal status 0x80 on port 0x000101f7
ATA: abnormal status 0x80 on port 0x000101f7
ATA: abnormal status 0x80 on port 0x000101f7
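
For anyone decoding the status bytes in these logs: the ATA status register is a bit field, so 0x80 is BSY alone (the device never left the busy state), while 0x7F has every bit except BSY set, which is commonly what an empty port reads back. The following is only a user-space sketch for reading such values, not libata code; the bit values follow the ATA specification, and the enum and function names are made up for illustration.

```c
#include <stdint.h>

/* ATA status register bits (values per the ATA specification). */
#define ATA_BUSY 0x80u /* BSY: device is busy, other bits are not valid */
#define ATA_DRDY 0x40u /* DRDY: device ready to accept commands */
#define ATA_DF   0x20u /* DF: device fault */
#define ATA_DRQ  0x08u /* DRQ: data request */
#define ATA_ERR  0x01u /* ERR: previous command ended in error */

/* Hypothetical classification, for illustration only. */
enum ata_state {
    ATA_STATE_ABSENT, /* 0x7F / 0xFF: probably nothing attached */
    ATA_STATE_BUSY,
    ATA_STATE_FAULT,
    ATA_STATE_READY,
    ATA_STATE_OTHER
};

enum ata_state ata_classify_status(uint8_t status)
{
    /* Assumption: an all-ones style pattern means a floating bus. */
    if (status == 0x7F || status == 0xFF)
        return ATA_STATE_ABSENT;
    if (status & ATA_BUSY)
        return ATA_STATE_BUSY;
    if (status & (ATA_ERR | ATA_DF))
        return ATA_STATE_FAULT;
    if (status & ATA_DRDY)
        return ATA_STATE_READY;
    return ATA_STATE_OTHER;
}
```

With this reading, the repeated 0x80 above classifies as busy (matching the "port is slow to respond" messages), and 0x7F classifies as absent.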

Soeren
-- 
Sometimes, there's a moment as you're waking, when you become aware of
the real world around you, but you're still dreaming.


Re: [5/6] 2.6.21-rc2: known regressions

2007-03-06 Thread Soeren Sonnenburg
On Tue, 2007-03-06 at 12:46 +0200, Michael S. Tsirkin wrote:
> > Quoting Ingo Molnar <[EMAIL PROTECTED]>:
> > Subject: Re: [5/6] 2.6.21-rc2: known regressions
> > 
> > 
> > * Michael S. Tsirkin <[EMAIL PROTECTED]> wrote:
> > 
> > > > Quoting Linus Torvalds <[EMAIL PROTECTED]>:
> > > >
> > > > Ok, it does indeed solve the problem for me.
> > > 
> > > Not yet for me unfortunately, although this seems to help.
> > > Is this the patch I should have applied?
> > > http://lkml.org/lkml/2007/3/5/445
> > > 
> > > With this applied, on resume I get *some* screen output soon after 
> > > resume (e.g. with s2ram I get several characters on VGA, X starts 
> > > drawing some windows) but then the crescent symbol starts blinking 
> > > again and the system hangs.
> > 
> > could you try this via s2ram on a text console, to see whether the 
> > kernel spits out any warning before it locks up?
> 
> Yes, that's what I did. Unfortunately only a couple of characters were
> shown before it locked up.
> 
> I still need to check what does this do in the NO_HZ configuration.
> 
> BTW, Ingo, can you suspend/resume any number of times with this patch?

Well, I could at least twice in a row in my minimalistic setup
(config attached); however, using the full kernel config it still
hangs ... (config also attached in case you want to compare it).

Soeren
-- 
Sometimes, there's a moment as you're waking, when you become aware of
the real world around you, but you're still dreaming.


linux-config-2.6.21-isolated-s2ram-debug.bz2
Description: application/bzip


linux-config-2.6.21.bz2
Description: application/bzip


Re: [5/6] 2.6.21-rc2: known regressions

2007-03-05 Thread Soeren Sonnenburg
On Tue, 2007-03-06 at 07:49 +0100, Soeren Sonnenburg wrote:
> On Tue, 2007-03-06 at 01:25 +0100, Thomas Gleixner wrote:
> > On Mon, 2007-03-05 at 15:45 -0800, Linus Torvalds wrote:
> > > 
> > > On Tue, 6 Mar 2007, Thomas Gleixner wrote:
> > > > > 
> > > > > Subject: macbook pro suspend to ram broken  (clockevents)
> > > > > References : http://lkml.org/lkml/2007/3/4/110
> > > > > Submitter  : Soeren Sonnenburg <[EMAIL PROTECTED]>
> > > > > Caused-By  : Thomas Gleixner <[EMAIL PROTECTED]>
> > > > >  commit e9e2cdb412412326c4827fc78ba27f410d837e6e
> > > > > Status : unknown
> > Does this make the problem go away ?
> 
> yes. works in my isolated test-setup. I'm now going to test this in
> git-HEAD with the full config.

*argh* when using the full config on HEAD something still causes a hang
on resume, but I have no time to track this (probably new) problem down
atm :(

Soeren
-- 
Sometimes, there's a moment as you're waking, when you become aware of
the real world around you, but you're still dreaming.


Re: [5/6] 2.6.21-rc2: known regressions

2007-03-05 Thread Soeren Sonnenburg
On Tue, 2007-03-06 at 01:25 +0100, Thomas Gleixner wrote:
> On Mon, 2007-03-05 at 15:45 -0800, Linus Torvalds wrote:
> > 
> > On Tue, 6 Mar 2007, Thomas Gleixner wrote:
> > > > 
> > > > Subject: macbook pro suspend to ram broken  (clockevents)
> > > > References : http://lkml.org/lkml/2007/3/4/110
> > > > Submitter  : Soeren Sonnenburg <[EMAIL PROTECTED]>
> > > > Caused-By  : Thomas Gleixner <[EMAIL PROTECTED]>
> > > >  commit e9e2cdb412412326c4827fc78ba27f410d837e6e
> > > > Status : unknown
> > > 
> > > I can reproduce this on my dual core VAIO. There are some issues:
> > 
> > Yeah, I think I can too, on my dual-core Mac Mini. 
> > 
> > I'm not done with my bisection, but e9e2cdb4 is among the 28 commits left, 
> > so I'm pretty sure I'm hitting the same bug. I'll do a few more bootups to 
> > be 100% sure.
> 
> I just got the resume fix cleaned up. The suspend / resume thing was
> dropped unintentionally during the -mm code reshuffling.
> 
> It still needs the broadcast fix though.
> 
> Does this make the problem go away ?

yes. works in my isolated test-setup. I'm now going to test this in
git-HEAD with the full config.

Soeren
-- 
Sometimes, there's a moment as you're waking, when you become aware of
the real world around you, but you're still dreaming.


macbook pro suspend to ram broken in linux-2.6.git HEAD

2007-03-04 Thread Soeren Sonnenburg
A rather long git bisect session between v2.6.20 and HEAD identified the
commit below as the cause. Please note that the machine does not
return from resume, and although all PM debug was turned on there is
nothing in the logs. This happens with a minimalistic setup (console only,
no audio/network etc.) without SMP and CONFIG_HPET_TIMER, but with PREEMPT on and
CONFIG_ACPI=y
CONFIG_ACPI_SLEEP=y
CONFIG_ACPI_SLEEP_PROC_FS=y
CONFIG_ACPI_AC=y
CONFIG_ACPI_BATTERY=y
CONFIG_ACPI_BUTTON=y
CONFIG_ACPI_VIDEO=y
CONFIG_ACPI_FAN=y
CONFIG_ACPI_PROCESSOR=y
CONFIG_ACPI_THERMAL=y
CONFIG_ACPI_BLACKLIST_YEAR=0
CONFIG_ACPI_EC=y
CONFIG_ACPI_POWER=y
CONFIG_ACPI_SYSTEM=y
CONFIG_X86_PM_TIMER=y

Any ideas on that issue ?
Soeren

e9e2cdb412412326c4827fc78ba27f410d837e6e is first bad commit
commit e9e2cdb412412326c4827fc78ba27f410d837e6e
Author: Thomas Gleixner <[EMAIL PROTECTED]>
Date:   Fri Feb 16 01:28:04 2007 -0800

[PATCH] clockevents: i386 drivers

Add clockevent drivers for i386: lapic (local) and PIT/HPET (global).  Update
the timer IRQ to call into the PIT/HPET driver's event handler and the
lapic-timer IRQ to call into the lapic clockevent driver.  The assignement of
timer functionality is delegated to the core framework code and replaces the
compile and runtime evalution in do_timer_interrupt_hook()

Use the clockevents broadcast support and implement the lapic_broadcast
function for ACPI.

No changes to existing functionality.

[ kdump fix from Vivek Goyal <[EMAIL PROTECTED]> ]
[ fixes based on review feedback from Arjan van de Ven <[EMAIL PROTECTED]> ]
Cleanups-from: Adrian Bunk <[EMAIL PROTECTED]>
Build-fixes-from: Andrew Morton <[EMAIL PROTECTED]>
Signed-off-by: Thomas Gleixner <[EMAIL PROTECTED]>
Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
Cc: john stultz <[EMAIL PROTECTED]>
Cc: Roman Zippel <[EMAIL PROTECTED]>
Cc: Andi Kleen <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
Signed-off-by: Linus Torvalds <[EMAIL PROTECTED]>

:04 04 8cd88ef4dc6e976c589deda1c558e19ed8dcafde 88ed93cefb145aa545ff4561636cc9342fc7bb42 M  arch
:04 04 5a1342027bc77019bb3ca9295d02bf483410e3d4 3de4aedc5bb5d86287b7a5bad4b673e8ffbc1e44 M  drivers
:04 04 88ade982261c3ce0590c939694994796f02d2e73 d7221468ac776ccc7804f9a46d6d53fa7c294d43 M  include
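
For readers unfamiliar with how a bisect narrows things down: the idea is a binary search over the commit range, testing the midpoint each time. The sketch below models that in plain C under two assumptions that real history does not always satisfy: the commits form a simple linear list, and every commit after the first bad one is also bad. The function and callback names are hypothetical and have nothing to do with git's own implementation.

```c
#include <stddef.h>

/* Hypothetical test callback: returns nonzero if commit `index` is bad. */
typedef int (*is_bad_fn)(size_t index, void *ctx);

/*
 * Commits are numbered in history order. `good` is a known-good index,
 * `bad` a known-bad one, with good < bad. Returns the index of the first
 * bad commit and stores the number of tests performed in *steps.
 */
size_t bisect_first_bad(size_t good, size_t bad, is_bad_fn is_bad,
                        void *ctx, unsigned *steps)
{
    unsigned n = 0;

    while (bad - good > 1) {
        size_t mid = good + (bad - good) / 2;

        if (is_bad(mid, ctx))
            bad = mid;   /* first bad commit is at mid or earlier */
        else
            good = mid;  /* first bad commit is after mid */
        n++;
    }
    if (steps)
        *steps = n;
    return bad;
}

/* Example predicate: everything from *ctx onwards is bad. */
int bad_from_threshold(size_t index, void *ctx)
{
    return index >= *(const size_t *)ctx;
}
```

Each test halves the remaining range, so a range of about a thousand commits needs roughly ten build-and-boot cycles, which is why the session was long but finite.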

-- 
Sometimes, there's a moment as you're waking, when you become aware of
the real world around you, but you're still dreaming.


Re: [linux-usb-devel] 2.6.20-rc6 SCSI error: I/O error - trouble with mass storage devices ?!

2007-01-31 Thread Soeren Sonnenburg
On Tue, 2007-01-30 at 13:08 -0500, Alan Stern wrote:
> On Sat, 27 Jan 2007, Soeren Sonnenburg wrote:

[P990 mass storage trouble]
> > Now I am clueless what could have gone wrong (as I *think* this was all
> > working at some point at least before firmware updates) and what the
> > difference between these mass storage devices is.
> 
> The log revealed that the phone's firmware returns garbage values in the 
> Residue field for some WRITEs.  This patch should take care of it.

I can confirm that this fixes the problem.

Thank you *very* much - you've made my day.
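
For context on the Residue field mentioned above: in the USB mass storage bulk-only protocol the device reports, after each command, how many of the requested bytes it did not process. The patch itself is not shown in this thread, so the following is only a sketch of the general idea, with a hypothetical function name and an `ignore_residue` parameter standing in for a per-device quirk.

```c
#include <stdint.h>

/*
 * Returns how many bytes the host should consider transferred.
 * `requested` is the length the host asked for, `residue` is the value
 * the device reported. `ignore_residue` models a per-device quirk for
 * firmware whose residue cannot be trusted at all.
 */
uint32_t bytes_transferred(uint32_t requested, uint32_t residue,
                           int ignore_residue)
{
    if (ignore_residue)
        return requested;
    /* A residue larger than the request cannot be valid: treat as garbage. */
    if (residue > requested)
        return requested;
    return requested - residue;
}
```

The range check alone only catches values that are obviously impossible; firmware that reports a plausible but wrong residue still needs the quirk path.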

Soeren
-- 
Sometimes, there's a moment as you're waking, when you become aware of
the real world around you, but you're still dreaming.


Re: 2.6.20-rc6 pb_fnmode regression

2007-01-29 Thread Soeren Sonnenburg
On Mon, 2007-01-29 at 12:45 +0100, Jiri Kosina wrote:
> On Mon, 29 Jan 2007, Soeren Sonnenburg wrote:
> 
> > That sounds good for me. Breaking with what was there is not a problem 
> > as long as this feature is still there, it can be done in a more clean 
> > way this way, and the new /sys/foo/bar path is documented (basically 
> > people nowadays expect slight user interface changes between kernel 
> > versions).
> 
> So, does the patch below look OK to powerbook people? The only difference 
> is that the module taking care of pb_fnmode parameter is now hid, instead 

For me yes ... I just rebooted and checked fn_modes ... it works nicely.
So I guess this should be applied ?!

Soeren
-- 
Sometimes, there's a moment as you're waking, when you become aware of
the real world around you, but you're still dreaming.


Re: 2.6.20-rc6 pb_fnmode regression

2007-01-29 Thread Soeren Sonnenburg
On Mon, 2007-01-29 at 12:13 +0100, Jiri Kosina wrote:
> On Mon, 29 Jan 2007, Jiri Kosina wrote:
> 
> > Ah, now I see. The problem is that in pre-2.6.20-rc1 the pb_fnmode was 
> > setting global variable, but after the HID layer rework, this is a 
> > per-hid variable, which is of course not updated when write to sysfs 
> > triggers. I will try to fix this before I send 2.6.20-rc6 updates to 
> > Linus, thanks for pointing this out.
> 
> Actually the cleanest solution would be when I change the code in such a 
> way that pb_fnmode parameter would be passed to hid instead of usbhid 
> module, as this is where the input mapping is being done (you could 
> potentially have a keyboard which needs the very same handling of fn mode 
> as usb powerbook keyboards currently have, but on different transport
> - input mapping is logically transport independent).
> 
> But I guess you will be not OK with breaking the backward compatibility in 
> such way, because all the already existing tutorials, etc. right? 

That sounds good for me. Breaking with what was there is not a problem
as long as this feature is still there, it can be done in a more clean
way this way, and the new  /sys/foo/bar path is documented (basically
people nowadays expect slight user interface changes between kernel
versions).

> Would warning that would trigger when the module parameter is passed to 
> usbhid and would instruct user to pass the parameter to hid module 
> instead, be acceptable? (and then changing the parameter of hid module 
> through sysfs would work as expected again).

I guess this warning is not too useful, except if it is triggered on
echo  >/sys/*/pb_fnmode too (which I suspect is what most people do).

Soeren
-- 
Sometimes, there's a moment as you're waking, when you become aware of
the real world around you, but you're still dreaming.


Re: [PATCH] usbhid quirks for macbook(pro) updated to 2.6.20-rc6

2007-01-29 Thread Soeren Sonnenburg
On Mon, 2007-01-29 at 10:38 +0100, Jiri Kosina wrote:
> On Sat, 27 Jan 2007, Soeren Sonnenburg wrote:
[...]
> Soeren - could you please submit your patch with proper Signed-off-by 
> line?

argh, sorry!

Attached!

Soeren
-- 
Sometimes, there's a moment as you're waking, when you become aware of
the real world around you, but you're still dreaming.
Signed-off-by: Soeren Sonnenburg <[EMAIL PROTECTED]>
Signed-off-by: Sergey Vlasov <[EMAIL PROTECTED]>

diff -ur linux-2.6.20-rc6/drivers/usb/input/hid-core.c linux-2.6.20-rc6-sonne/drivers/usb/input/hid-core.c
--- linux-2.6.20-rc6/drivers/usb/input/hid-core.c	2007-01-25 03:19:28.0 +0100
+++ linux-2.6.20-rc6-sonne/drivers/usb/input/hid-core.c	2007-01-27 14:55:30.0 +0100
@@ -777,6 +777,7 @@
 #define USB_DEVICE_ID_APPLE_GEYSER4_JIS	0x021c
 #define USB_DEVICE_ID_APPLE_FOUNTAIN_TP_ONLY	0x030a
 #define USB_DEVICE_ID_APPLE_GEYSER1_TP_ONLY	0x030b
+#define USB_DEVICE_ID_APPLE_IR		0x8240
 
 #define USB_VENDOR_ID_CHERRY		0x046a
 #define USB_DEVICE_ID_CHERRY_CYMOTION	0x0023
@@ -954,19 +955,21 @@
 
 	{ USB_VENDOR_ID_CHERRY, USB_DEVICE_ID_CHERRY_CYMOTION, HID_QUIRK_CYMOTION },
 
-	{ USB_VENDOR_ID_APPLE, USB_DEVICE_ID_APPLE_FOUNTAIN_ANSI, HID_QUIRK_POWERBOOK_HAS_FN },
-	{ USB_VENDOR_ID_APPLE, USB_DEVICE_ID_APPLE_FOUNTAIN_ISO, HID_QUIRK_POWERBOOK_HAS_FN },
-	{ USB_VENDOR_ID_APPLE, USB_DEVICE_ID_APPLE_GEYSER_ANSI, HID_QUIRK_POWERBOOK_HAS_FN },
-	{ USB_VENDOR_ID_APPLE, USB_DEVICE_ID_APPLE_GEYSER_ISO, HID_QUIRK_POWERBOOK_HAS_FN | HID_QUIRK_POWERBOOK_ISO_KEYBOARD},
-	{ USB_VENDOR_ID_APPLE, USB_DEVICE_ID_APPLE_GEYSER_JIS, HID_QUIRK_POWERBOOK_HAS_FN },
-	{ USB_VENDOR_ID_APPLE, USB_DEVICE_ID_APPLE_GEYSER3_ANSI, HID_QUIRK_POWERBOOK_HAS_FN },
-	{ USB_VENDOR_ID_APPLE, USB_DEVICE_ID_APPLE_GEYSER3_ISO, HID_QUIRK_POWERBOOK_HAS_FN | HID_QUIRK_POWERBOOK_ISO_KEYBOARD},
-	{ USB_VENDOR_ID_APPLE, USB_DEVICE_ID_APPLE_GEYSER3_JIS, HID_QUIRK_POWERBOOK_HAS_FN },
-	{ USB_VENDOR_ID_APPLE, USB_DEVICE_ID_APPLE_GEYSER4_ANSI, HID_QUIRK_POWERBOOK_HAS_FN },
-	{ USB_VENDOR_ID_APPLE, USB_DEVICE_ID_APPLE_GEYSER4_ISO, HID_QUIRK_POWERBOOK_HAS_FN | HID_QUIRK_POWERBOOK_ISO_KEYBOARD},
-	{ USB_VENDOR_ID_APPLE, USB_DEVICE_ID_APPLE_GEYSER4_JIS, HID_QUIRK_POWERBOOK_HAS_FN },
-	{ USB_VENDOR_ID_APPLE, USB_DEVICE_ID_APPLE_FOUNTAIN_TP_ONLY, HID_QUIRK_POWERBOOK_HAS_FN },
-	{ USB_VENDOR_ID_APPLE, USB_DEVICE_ID_APPLE_GEYSER1_TP_ONLY, HID_QUIRK_POWERBOOK_HAS_FN },
+	{ USB_VENDOR_ID_APPLE, USB_DEVICE_ID_APPLE_FOUNTAIN_ANSI, HID_QUIRK_POWERBOOK_HAS_FN | HID_QUIRK_IGNORE_MOUSE },
+	{ USB_VENDOR_ID_APPLE, USB_DEVICE_ID_APPLE_FOUNTAIN_ISO, HID_QUIRK_POWERBOOK_HAS_FN | HID_QUIRK_IGNORE_MOUSE },
+	{ USB_VENDOR_ID_APPLE, USB_DEVICE_ID_APPLE_GEYSER_ANSI, HID_QUIRK_POWERBOOK_HAS_FN | HID_QUIRK_IGNORE_MOUSE },
+	{ USB_VENDOR_ID_APPLE, USB_DEVICE_ID_APPLE_GEYSER_ISO, HID_QUIRK_POWERBOOK_HAS_FN | HID_QUIRK_IGNORE_MOUSE | HID_QUIRK_POWERBOOK_ISO_KEYBOARD},
+	{ USB_VENDOR_ID_APPLE, USB_DEVICE_ID_APPLE_GEYSER_JIS, HID_QUIRK_POWERBOOK_HAS_FN | HID_QUIRK_IGNORE_MOUSE },
+	{ USB_VENDOR_ID_APPLE, USB_DEVICE_ID_APPLE_GEYSER3_ANSI, HID_QUIRK_POWERBOOK_HAS_FN | HID_QUIRK_IGNORE_MOUSE },
+	{ USB_VENDOR_ID_APPLE, USB_DEVICE_ID_APPLE_GEYSER3_ISO, HID_QUIRK_POWERBOOK_HAS_FN | HID_QUIRK_IGNORE_MOUSE | HID_QUIRK_POWERBOOK_ISO_KEYBOARD},
+	{ USB_VENDOR_ID_APPLE, USB_DEVICE_ID_APPLE_GEYSER3_JIS, HID_QUIRK_POWERBOOK_HAS_FN | HID_QUIRK_IGNORE_MOUSE },
+	{ USB_VENDOR_ID_APPLE, USB_DEVICE_ID_APPLE_GEYSER4_ANSI, HID_QUIRK_POWERBOOK_HAS_FN | HID_QUIRK_IGNORE_MOUSE },
+	{ USB_VENDOR_ID_APPLE, USB_DEVICE_ID_APPLE_GEYSER4_ISO, HID_QUIRK_POWERBOOK_HAS_FN | HID_QUIRK_IGNORE_MOUSE | HID_QUIRK_POWERBOOK_ISO_KEYBOARD},
+	{ USB_VENDOR_ID_APPLE, USB_DEVICE_ID_APPLE_GEYSER4_JIS, HID_QUIRK_POWERBOOK_HAS_FN | HID_QUIRK_IGNORE_MOUSE },
+	{ USB_VENDOR_ID_APPLE, USB_DEVICE_ID_APPLE_FOUNTAIN_TP_ONLY, HID_QUIRK_POWERBOOK_HAS_FN | HID_QUIRK_IGNORE_MOUSE },
+	{ USB_VENDOR_ID_APPLE, USB_DEVICE_ID_APPLE_GEYSER1_TP_ONLY, HID_QUIRK_POWERBOOK_HAS_FN | HID_QUIRK_IGNORE_MOUSE },
+
+	{ USB_VENDOR_ID_APPLE, USB_DEVICE_ID_APPLE_IR, HID_QUIRK_IGNORE },
 
 	{ USB_VENDOR_ID_PANJIT, 0x0001, HID_QUIRK_IGNORE },
 	{ USB_VENDOR_ID_PANJIT, 0x0002, HID_QUIRK_IGNORE },
@@ -1072,6 +1075,11 @@
 	if (quirks & HID_QUIRK_IGNORE)
 		return NULL;
 
+	if ((quirks & HID_QUIRK_IGNORE_MOUSE) &&
+		(interface->desc.bInterfaceProtocol == USB_INTERFACE_PROTOCOL_MOUSE))
+			return NULL;
+
+
 	if (usb_get_extra_descriptor(interface, HID_DT_HID, &hdesc) &&
 	(!interface->desc.bNumEndpoints ||
 	 usb_get_extra_descriptor(&interface->endpoint[0], HID_DT_HID, &hdesc))) {

--- linux-2.6.20-rc6/include/linux/hid.h	2007-01-25 03:19:28.0 +0100
+++ linux-2.6.20-rc6-sonne/include/linux/hid.h	2007-01-27 11:05:51.0 +0100
@@ -264,6 +264,7 @@
 #define HID_QUIRK_INVERT_HWHEEL			0x4000
 #define HID_QUIRK_POWERBOOK_ISO_KEYBOARD	0x8000
 #define HID_QUIRK_BAD_RELATIVE_KEYS		0x0001
+#define HID_QUIRK
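
The logic of the hid-core.c hunk above can be tried outside the kernel with a small model: look up the device in a quirk table, then refuse to bind if the device is ignored entirely, or if only its mouse interface is to be ignored. This is a sketch, not the kernel code: the `QUIRK_*` bit values are placeholders (the real values are cut off in the patch above), and the struct and function names are hypothetical. The product IDs come from the patch; the Apple vendor ID and the mouse protocol code are the standard USB values.

```c
#include <stddef.h>
#include <stdint.h>

/* Placeholder quirk bits; not the kernel's actual values. */
#define QUIRK_IGNORE       0x1u
#define QUIRK_IGNORE_MOUSE 0x2u
#define QUIRK_HAS_FN       0x4u

/* HID boot-interface protocol code for a mouse. */
#define USB_INTERFACE_PROTOCOL_MOUSE 2

struct quirk_entry {
    uint16_t vendor;
    uint16_t product;
    uint32_t quirks;
};

static const struct quirk_entry quirk_table[] = {
    { 0x05ac, 0x021c, QUIRK_HAS_FN | QUIRK_IGNORE_MOUSE }, /* Geyser4 JIS */
    { 0x05ac, 0x8240, QUIRK_IGNORE },                      /* Apple IR */
};

uint32_t lookup_quirks(uint16_t vendor, uint16_t product)
{
    size_t i;

    for (i = 0; i < sizeof(quirk_table) / sizeof(quirk_table[0]); i++)
        if (quirk_table[i].vendor == vendor &&
            quirk_table[i].product == product)
            return quirk_table[i].quirks;
    return 0; /* unknown device: no quirks */
}

/* Returns 1 if the HID driver should bind to this interface, 0 if not. */
int should_bind(uint16_t vendor, uint16_t product, uint8_t protocol)
{
    uint32_t quirks = lookup_quirks(vendor, product);

    if (quirks & QUIRK_IGNORE)
        return 0;
    if ((quirks & QUIRK_IGNORE_MOUSE) &&
        protocol == USB_INTERFACE_PROTOCOL_MOUSE)
        return 0;
    return 1;
}
```

The keyboard interface of a listed device still binds, while its mouse interface is skipped; devices not in the table are unaffected.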

Re: 2.6.20-rc6 pb_fnmode regression

2007-01-29 Thread Soeren Sonnenburg
On Mon, 2007-01-29 at 10:55 +0100, Jiri Kosina wrote:
> On Sat, 27 Jan 2007, Soeren Sonnenburg wrote:
> 
> > I realized that any setting to /sys/module/usbhid/parameters/pb_fnmode 
> > is just ignored until the machine does a suspend-resume cycle.
[...]
> I would rather be inclined to just make the 
> /sys/module/usbhid/parameters/pb_fnmode read-only (which is what most of 
> the drivers do anyway), to avoid this kind of confusion.
> 
> Do you have really any strong use-case, when setting the parameter during 
> runtime would be much more useful than just do it during modprobe or 
> rmmod/modprobe cycle?

Well I need in-kernel usbhid and the way this was implemented in 2.6.19
(and before) one could change pb_fnmode on-the-fly. This is mentioned in
all the power/i/mac/book tutorials and everyone is used to switching
modes this way.

I can happily patch the kernel to use the pb_fnmode but nonetheless this
is a regression relative to pre-2.6.20* and will confuse others too...

Soeren
-- 
Sometimes, there's a moment as you're waking, when you become aware of
the real world around you, but you're still dreaming.


2.6.20-rc6 appletouch: incomplete data package (first byte: 2, length: 4).

2007-01-29 Thread Soeren Sonnenburg
Dear all,

I noticed that when I build and load ehci-hcd as a module on this
macbookpro1,1, appletouch stops functioning and logs tons of

appletouch: incomplete data package (first byte: 2, length: 4).
appletouch: incomplete data package (first byte: 2, length: 4).
appletouch: incomplete data package (first byte: 2, length: 4).
appletouch: incomplete data package (first byte: 2, length: 4).
appletouch: incomplete data package (first byte: 2, length: 4).
appletouch: incomplete data package (first byte: 2, length: 4).
appletouch: incomplete data package (first byte: 2, length: 4).
appletouch: incomplete data package (first byte: 2, length: 4).
appletouch: incomplete data package (first byte: 2, length: 4).

*UNTIL* I rmmod appletouch and modprobe appletouch again:

usbcore: deregistering interface driver appletouch
appletouch Geyser 3 inited.
input: appletouch as /class/input/input12
usbcore: registered new interface driver appletouch
appletouch: incomplete data package (first byte: 2, length: 4).

After that it is working.

Soeren
-- 
Sometimes, there's a moment as you're waking, when you become aware of
the real world around you, but you're still dreaming.


[PATCH] 2.6.20-rc6 trivially enable mouse button 2+3 emulation for x86 macs

2007-01-29 Thread Soeren Sonnenburg
Dear all,

I would think this patch is too trivial to not be in 2.6.20:

As macbooks and macbook pros also have to live with a single mouse button,
the following patch just enables the Macintosh device drivers menu in
Kconfig and adds the macintosh dir to the obj-* list to make macbook* users
happy (who have been using exactly that for months).


Signed-off-by: Soeren Sonnenburg <[EMAIL PROTECTED]>

diff -ur linux-2.6.20-rc6/drivers/macintosh/Kconfig linux-2.6.20-rc6-sonne/drivers/macintosh/Kconfig
--- linux-2.6.20-rc6/drivers/macintosh/Kconfig  2007-01-25 03:19:28.0 +0100
+++ linux-2.6.20-rc6-sonne/drivers/macintosh/Kconfig  2007-01-26 21:30:51.0 +0100
@@ -1,6 +1,6 @@

 menu "Macintosh device drivers"
-   depends on PPC || MAC
+   depends on PPC || MAC || X86

 config ADB
bool "Apple Desktop Bus (ADB) support"

diff -ur linux-2.6.20-rc6/drivers/Makefile linux-2.6.20-rc6-sonne/drivers/Makefile
--- linux-2.6.20-rc6/drivers/Makefile   2007-01-25 03:19:28.0 +0100
+++ linux-2.6.20-rc6-sonne/drivers/Makefile 2007-01-26 21:30:51.0 +0100
@@ -30,7 +30,7 @@
 obj-y  += base/ block/ misc/ mfd/ net/ media/
 obj-$(CONFIG_NUBUS)+= nubus/
 obj-$(CONFIG_ATM)  += atm/
-obj-$(CONFIG_PPC_PMAC) += macintosh/
+obj-y  += macintosh/
 obj-$(CONFIG_IDE)  += ide/
 obj-$(CONFIG_FC4)  += fc4/
 obj-$(CONFIG_SCSI) += scsi/
-- 
Sometimes, there's a moment as you're waking, when you become aware of
the real world around you, but you're still dreaming.


[PATCH] usbhid quirks for macbook(pro) updated to 2.6.20-rc6

2007-01-29 Thread Soeren Sonnenburg
On Sat, 2006-12-23 at 11:38 +0100, Soeren Sonnenburg wrote:
> On Fri, 2006-12-15 at 09:56 -0800, Greg KH wrote:
> > On Fri, Dec 15, 2006 at 09:36:04AM +0100, Soeren Sonnenburg wrote:
> > > On Sat, 2006-12-09 at 21:08 -0500, Joseph Fannin wrote:
> > > > On Fri, 2006-12-08 at 18:19 +0100, Soeren Sonnenburg wrote:
> [...]
> > > Greg,
> > > 
> > > I've noticed that this patch is not in 2.6.20-rc1. Could you please
> > > comment on what is wrong with it / whether it will ever have a chance to
> > > be accepted in the way it is done ? 
> > 
> > It's in my queue right now, sorry.  I'll catch up on it in a few hours.

This is the updated version for 2.6.20-rc6.

Soeren
-- 
Sometimes, there's a moment as you're waking, when you become aware of
the real world around you, but you're still dreaming.
diff -ur linux-2.6.20-rc6/drivers/usb/input/hid-core.c linux-2.6.20-rc6-sonne/drivers/usb/input/hid-core.c
--- linux-2.6.20-rc6/drivers/usb/input/hid-core.c	2007-01-25 03:19:28.0 +0100
+++ linux-2.6.20-rc6-sonne/drivers/usb/input/hid-core.c	2007-01-27 14:55:30.0 +0100
@@ -777,6 +777,7 @@
 #define USB_DEVICE_ID_APPLE_GEYSER4_JIS	0x021c
 #define USB_DEVICE_ID_APPLE_FOUNTAIN_TP_ONLY	0x030a
 #define USB_DEVICE_ID_APPLE_GEYSER1_TP_ONLY	0x030b
+#define USB_DEVICE_ID_APPLE_IR		0x8240
 
 #define USB_VENDOR_ID_CHERRY		0x046a
 #define USB_DEVICE_ID_CHERRY_CYMOTION	0x0023
@@ -954,19 +955,21 @@
 
 	{ USB_VENDOR_ID_CHERRY, USB_DEVICE_ID_CHERRY_CYMOTION, HID_QUIRK_CYMOTION },
 
-	{ USB_VENDOR_ID_APPLE, USB_DEVICE_ID_APPLE_FOUNTAIN_ANSI, HID_QUIRK_POWERBOOK_HAS_FN },
-	{ USB_VENDOR_ID_APPLE, USB_DEVICE_ID_APPLE_FOUNTAIN_ISO, HID_QUIRK_POWERBOOK_HAS_FN },
-	{ USB_VENDOR_ID_APPLE, USB_DEVICE_ID_APPLE_GEYSER_ANSI, HID_QUIRK_POWERBOOK_HAS_FN },
-	{ USB_VENDOR_ID_APPLE, USB_DEVICE_ID_APPLE_GEYSER_ISO, HID_QUIRK_POWERBOOK_HAS_FN | HID_QUIRK_POWERBOOK_ISO_KEYBOARD},
-	{ USB_VENDOR_ID_APPLE, USB_DEVICE_ID_APPLE_GEYSER_JIS, HID_QUIRK_POWERBOOK_HAS_FN },
-	{ USB_VENDOR_ID_APPLE, USB_DEVICE_ID_APPLE_GEYSER3_ANSI, HID_QUIRK_POWERBOOK_HAS_FN },
-	{ USB_VENDOR_ID_APPLE, USB_DEVICE_ID_APPLE_GEYSER3_ISO, HID_QUIRK_POWERBOOK_HAS_FN | HID_QUIRK_POWERBOOK_ISO_KEYBOARD},
-	{ USB_VENDOR_ID_APPLE, USB_DEVICE_ID_APPLE_GEYSER3_JIS, HID_QUIRK_POWERBOOK_HAS_FN },
-	{ USB_VENDOR_ID_APPLE, USB_DEVICE_ID_APPLE_GEYSER4_ANSI, HID_QUIRK_POWERBOOK_HAS_FN },
-	{ USB_VENDOR_ID_APPLE, USB_DEVICE_ID_APPLE_GEYSER4_ISO, HID_QUIRK_POWERBOOK_HAS_FN | HID_QUIRK_POWERBOOK_ISO_KEYBOARD},
-	{ USB_VENDOR_ID_APPLE, USB_DEVICE_ID_APPLE_GEYSER4_JIS, HID_QUIRK_POWERBOOK_HAS_FN },
-	{ USB_VENDOR_ID_APPLE, USB_DEVICE_ID_APPLE_FOUNTAIN_TP_ONLY, HID_QUIRK_POWERBOOK_HAS_FN },
-	{ USB_VENDOR_ID_APPLE, USB_DEVICE_ID_APPLE_GEYSER1_TP_ONLY, HID_QUIRK_POWERBOOK_HAS_FN },
+	{ USB_VENDOR_ID_APPLE, USB_DEVICE_ID_APPLE_FOUNTAIN_ANSI, HID_QUIRK_POWERBOOK_HAS_FN | HID_QUIRK_IGNORE_MOUSE },
+	{ USB_VENDOR_ID_APPLE, USB_DEVICE_ID_APPLE_FOUNTAIN_ISO, HID_QUIRK_POWERBOOK_HAS_FN | HID_QUIRK_IGNORE_MOUSE },
+	{ USB_VENDOR_ID_APPLE, USB_DEVICE_ID_APPLE_GEYSER_ANSI, HID_QUIRK_POWERBOOK_HAS_FN | HID_QUIRK_IGNORE_MOUSE },
+	{ USB_VENDOR_ID_APPLE, USB_DEVICE_ID_APPLE_GEYSER_ISO, HID_QUIRK_POWERBOOK_HAS_FN | HID_QUIRK_IGNORE_MOUSE | HID_QUIRK_POWERBOOK_ISO_KEYBOARD},
+	{ USB_VENDOR_ID_APPLE, USB_DEVICE_ID_APPLE_GEYSER_JIS, HID_QUIRK_POWERBOOK_HAS_FN | HID_QUIRK_IGNORE_MOUSE },
+	{ USB_VENDOR_ID_APPLE, USB_DEVICE_ID_APPLE_GEYSER3_ANSI, HID_QUIRK_POWERBOOK_HAS_FN | HID_QUIRK_IGNORE_MOUSE },
+	{ USB_VENDOR_ID_APPLE, USB_DEVICE_ID_APPLE_GEYSER3_ISO, HID_QUIRK_POWERBOOK_HAS_FN | HID_QUIRK_IGNORE_MOUSE | HID_QUIRK_POWERBOOK_ISO_KEYBOARD},
+	{ USB_VENDOR_ID_APPLE, USB_DEVICE_ID_APPLE_GEYSER3_JIS, HID_QUIRK_POWERBOOK_HAS_FN | HID_QUIRK_IGNORE_MOUSE },
+	{ USB_VENDOR_ID_APPLE, USB_DEVICE_ID_APPLE_GEYSER4_ANSI, HID_QUIRK_POWERBOOK_HAS_FN | HID_QUIRK_IGNORE_MOUSE },
+	{ USB_VENDOR_ID_APPLE, USB_DEVICE_ID_APPLE_GEYSER4_ISO, HID_QUIRK_POWERBOOK_HAS_FN | HID_QUIRK_IGNORE_MOUSE | HID_QUIRK_POWERBOOK_ISO_KEYBOARD},
+	{ USB_VENDOR_ID_APPLE, USB_DEVICE_ID_APPLE_GEYSER4_JIS, HID_QUIRK_POWERBOOK_HAS_FN | HID_QUIRK_IGNORE_MOUSE },
+	{ USB_VENDOR_ID_APPLE, USB_DEVICE_ID_APPLE_FOUNTAIN_TP_ONLY, HID_QUIRK_POWERBOOK_HAS_FN | HID_QUIRK_IGNORE_MOUSE },
+	{ USB_VENDOR_ID_APPLE, USB_DEVICE_ID_APPLE_GEYSER1_TP_ONLY, HID_QUIRK_POWERBOOK_HAS_FN | HID_QUIRK_IGNORE_MOUSE },
+
+	{ USB_VENDOR_ID_APPLE, USB_DEVICE_ID_APPLE_IR, HID_QUIRK_IGNORE },
 
 	{ USB_VENDOR_ID_PANJIT, 0x0001, HID_QUIRK_IGNORE },
 	{ USB_VENDOR_ID_PANJIT, 0x0002, HID_QUIRK_IGNORE },
@@ -1072,6 +1075,11 @@
 	if (quirks & HID_QUIRK_IGNORE)
 		return NULL;
 
+	if ((quirks & HID_QUIRK_IGNORE_MOUSE) &&
+		(interface->desc.bInterfaceProtocol == USB_INTERFACE_PROTOCOL_MOUSE))
+			return NULL;
+
+
 	if (usb_get_extra_descriptor(interface, HID_DT_HID, &hdesc) &&
 	(!interface->desc.bNumEndpoints ||
 	 usb_get

2.6.20-rc6 pb_fnmode regression

2007-01-29 Thread Soeren Sonnenburg
Dear all,

I realized that any setting to  /sys/module/usbhid/parameters/pb_fnmode
is just ignored until the machine does a suspend-resume cycle.

I've added a printk in drivers/usb/input/hid-core.c (which is the only
place where hid->pb_fnmode is set) and indeed hid->pb_fnmode is only
changed on module load (in my case usbhid is compiled into the kernel,
so on kernel boot). Adding a printk to hidinput_pb_event()
in drivers/hid/hid-input.c says the same: hid->pb_fnmode cannot be
changed on the fly anymore...

HOWEVER: If I s2ram the machine hid->pb_fnmode is initialized with the
value I put into /sys/module/usbhid/parameters/pb_fnmode .

As I have no idea how sysfs works or whether this is possible there, I now
hope someone more knowledgeable than me can resolve this issue!
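
To make the behaviour described above concrete, here is a small user-space model of it. It is an assumption about the mechanism based on the printk observations, not the actual usbhid code: a global stands in for the module parameter that the sysfs write updates, each device keeps its own copy taken at probe time, and the event path reads only that copy. All names are hypothetical.

```c
/* Stand-in for the module parameter that a sysfs write updates. */
int pb_fnmode_param = 1;

/* Stand-in for the per-device structure holding its own copy. */
struct hid_dev_model {
    int pb_fnmode;
};

/* Probe (or re-initialisation) snapshots the parameter into the device. */
void model_probe(struct hid_dev_model *hid)
{
    hid->pb_fnmode = pb_fnmode_param;
}

/* A write to the sysfs file only changes the global parameter. */
void model_sysfs_write(int value)
{
    pb_fnmode_param = value;
}

/* Event path as observed: reads the per-device snapshot. */
int model_event_mode_stale(const struct hid_dev_model *hid)
{
    return hid->pb_fnmode;
}

/* One possible remedy: read the parameter at event time instead. */
int model_event_mode_live(const struct hid_dev_model *hid)
{
    (void)hid;
    return pb_fnmode_param;
}
```

In this model a write after probe is invisible to the stale event path until probe runs again, which here stands in for whatever re-initialisation happens on resume.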

Soeren
-- 
Sometimes, there's a moment as you're waking, when you become aware of
the real world around you, but you're still dreaming.


Re: SATA hotplug from the user side ?

2007-01-25 Thread Soeren Sonnenburg
On Wed, 2007-01-24 at 13:37 -0200, Henrique de Moraes Holschuh wrote:
> On Wed, 24 Jan 2007, Soeren Sonnenburg wrote:
> > might be a good idea to power down the drive using hdparm -Y followed by
> > a scsiadd -r.
[...]
> > the disk or remove the disk from a dm setup). However it is recommended
> > to power down the drive using hdparm -Y followed by a scsiadd -r as
> > stated above. One can validate which disks are attached using ``scsiadd
> 
> Again, this might change soon.

Ok, I think someone more knowledgeable than me in this area should do the
final polishing and put it in the docs.

I no longer see how I can help here.
Soeren
-- 
Sometimes, there's a moment as you're waking, when you become aware of
the real world around you, but you're still dreaming.


Re: 2.6.19.2 sky2/acpi crashes

2007-01-24 Thread Soeren Sonnenburg
[This mail was also posted to newsgate.kernel.]

On Tue, 23 Jan 2007 11:12:50 +, Andrew Lyon wrote:

> On 1/23/07, Soeren Sonnenburg <[EMAIL PROTECTED]> wrote:
>> On Tue, 23 Jan 2007 08:59:28 +, Lionel Landwerlin wrote:
>>
>> > Hi,
>> >
>> > I'm running a macbook with a Marvell ethernet controller, and I have a
>> > lots of freezes when using the ethernet controller under a load of
>> > ~100K/s. Since I'm running a 2.6.19.2 kernel, I'm able to get some
>> > report from the kernel. Here they are :
>>
>> I am also having trouble with the sky2 module, though I've not yet seen a
>> oops, the driver stopped working after some heavy traffic (copying some G
>> of data). Only rmmod sky2; modprobe sky2 resolved this. (I am also on
>> 2.6.19.2 but I've seen this happen on 2.6.20-rcX too).

[...]
> Ive also had the same problem with both 2.6.19.2 and 2.6.20-rcX,
> motherboard is gigabyte ga-965-ds3 , the networking stops completely
> under moderate traffic, I get the following errors or a complete
> lockup:
> 
> Jan 21 02:08:04 beast NETDEV WATCHDOG: eth0: transmit timed out
> Jan 21 02:08:04 beast sky2 eth0: tx timeout
> Jan 21 02:08:04 beast sky2 eth0: transmit ring 475 .. 452 report=475 done=475
> Jan 21 02:08:04 beast sky2 hardware hung? flushing
> 
> At the time I was downloading an ISO image at 850k/sec, so not really a
> high network load at all.
> 
> rmmod / modprobe does resolve the issue, but more times than not the
> box locks up completely instead of getting those errors.

I am on a completely different system (macbook pro 1,1) with PREEMPT and
cpu frequency scaling on, so I am not sure how this could be related...

Soeren
-- 
Sometimes, there's a moment as you're waking, when you become aware of
the real world around you, but you're still dreaming.


Re: SATA hotplug from the user side ?

2007-01-24 Thread Soeren Sonnenburg
On Wed, 2007-01-24 at 11:07 +0900, Tejun Heo wrote:
> Henrique de Moraes Holschuh wrote:
> > On Tue, 23 Jan 2007, Tejun Heo wrote:
> >> Henrique de Moraes Holschuh wrote:
> >>> Does SATA electrical conector keying let the disk firmware unload
> >>> heads before the user manages to pull it out enough to sever power?
> >> I don't think so.
[...]
> Agreed.

OK, so here comes the next revision, I hope it can be put in the docs/on
the web page now:

SATA Hotplug from the User Side

In general, before unplugging a drive you must stop using it. This means
unmounting all partitions, removing the disk from a potential raid
array / device mapper / crypt setup. Also depending on your disk-bay it
might be a good idea to power down the drive using hdparm -Y followed by
a scsiadd -r.

BIG FAT WARNING: if your SATA bay does not do electrical connector
keying that lets the disk firmware unload the heads before the user
manages to pull the drive out, the drive will do an emergency head
unload, which is not good and will likely reduce the drive's lifetime.

• For SIL3114 and SIL3124, AHCI and the CK804 flavour of sata_nv you -
in principle - don't have to run any commands at all. The driver should
notice when you yank the cable or plug in a new device. All you have to
do is stop using the devices before unplugging (e.g. unmount partitions
on the disk or remove the disk from a dm setup). However it is
recommended to power down the drive using hdparm -Y followed by a
scsiadd -r as stated above. One can check which disks are attached using
``scsiadd -p''. In the following example the disk on scsi host 3 channel
0 id 0 lun 0 will be removed:

# scsiadd -p

Attached devices:
Host: scsi2 Channel: 00 Id: 00 Lun: 00
  Vendor: ATA  Model: ST3400832AS  Rev: 3.01
  Type:   Direct-Access  ANSI SCSI revision: 05
Host: scsi3 Channel: 00 Id: 00 Lun: 00
  Vendor: ATA  Model: ST3400620AS  Rev: 3.AA
  Type:   Direct-Access  ANSI SCSI revision: 05

# scsiadd -r 3 0 0 0
 
Note that the device name may change from e.g. /dev/sdd to /dev/sde on a
remove/reinsert cycle (this can be fixed by using udev-provided
persistent names). Also note that it is perfectly normal to see messages
like this in dmesg:



ata4: exception Emask 0x10 SAct 0x0 SErr 0x1 action 0x2 frozen
ata4: hard resetting port
ata4: SATA link down (SStatus 0 SControl 310)
ata4: failed to recover some devices, retrying in 5 secs
ata4: hard resetting port
ata4: SATA link down (SStatus 0 SControl 310)
ata4: failed to recover some devices, retrying in 5 secs
ata4: hard resetting port
ata4: SATA link down (SStatus 0 SControl 310)
ata4.00: disabled
ata4: EH complete
ata4.00: detaching (SCSI 3:0:0:0)
Synchronizing SCSI cache for disk sdd: 
FAILED
  status = 0, message = 00, host = 4, driver = 00
  <3>ata4: exception Emask 0x10 SAct 0x0 SErr 0x5 action 0x2 frozen
ata4: hard resetting port
ata4: COMRESET failed (device not ready)
ata4: hardreset failed, retrying in 5 secs
ata4: hard resetting port
ata4: COMRESET failed (device not ready)
ata4: hardreset failed, retrying in 5 secs
ata4: hard resetting port


ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
ata4.00: ATA-7, max UDMA/133, 1465149168 sectors: LBA48 NCQ (depth 0/32)
ata4.00: configured for UDMA/100
ata4: EH complete
scsi 3:0:0:0: Direct-Access ATA  ST3750640AS  3.AA PQ: 0
ANSI: 5
SCSI device sde: 1465149168 512-byte hdwr sectors (750156 MB)
sde: Write Protect is off
sde: Mode Sense: 00 3a 00 00
SCSI device sde: drive cache: write back
SCSI device sde: 1465149168 512-byte hdwr sectors (750156 MB)
sde: Write Protect is off
sde: Mode Sense: 00 3a 00 00
SCSI device sde: drive cache: write back
 sde: unknown partition table
sd 3:0:0:0: Attached scsi disk sde
sd 3:0:0:0: Attached scsi generic sg3 type 0

However if you happen to see messages like

scsi 3:0:0:0: rejecting I/O to dead device
scsi 4:0:0:0: rejecting I/O to dead device
scsi 5:0:0:0: rejecting I/O to dead device

you did not stop using the devices before unplugging (check that you
unmounted all partitions and removed the disk from any raid array,
dmsetup or cryptsetup mapping). If there is no pending I/O to the
device, there won't be any 'rejecting I/O to dead device' messages.

• For other chipsets one might additionally have to run scsiadd -s when
reinserting the disk.
• Note that it might not be such a good idea to do the above on drivers
which don't implement the new EH yet. As of January 2007 those are:
sata_nv, sata_promise (getting there) and sata_sx4.
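
The removal steps described above (unmount, hdparm -Y, scsiadd -r) can be
sketched as a small dry-run helper. This is only an illustration: the
function name removal_commands, the mount point and the device name are
hypothetical, and nothing is executed; the helper merely builds the
command lines in the order recommended above so they can be reviewed
before running them by hand.

```python
from typing import List, Sequence, Tuple


def removal_commands(
    device: str,
    mountpoints: Sequence[str],
    hctl: Tuple[int, int, int, int],
) -> List[List[str]]:
    """Return, in order, the commands to run before pulling a SATA disk.

    device      -- block device node, e.g. "/dev/sdd" (hypothetical)
    mountpoints -- mount points of all partitions on that disk
    hctl        -- SCSI host, channel, id, lun as shown by ``scsiadd -p``
    """
    # 1. Stop using the disk: unmount every partition first.
    commands = [["umount", mp] for mp in mountpoints]
    # 2. Put the drive into sleep mode before power is cut.
    commands.append(["hdparm", "-Y", device])
    # 3. Detach the device from the SCSI layer.
    commands.append(["scsiadd", "-r", *(str(n) for n in hctl)])
    return commands


if __name__ == "__main__":
    # Dry run only: print the commands instead of executing them.
    for argv in removal_commands("/dev/sdd", ["/mnt/backup"], (3, 0, 0, 0)):
        print(" ".join(argv))
```

Disks that are part of a raid array, dm or crypt setup need their own
teardown step before the hdparm call; that part is left out here because
it depends on the setup.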

Soeren (being stuck in a train which again seems to be stuck in a lot
of snow somewhere in southern Germany...)
-- 
For the one fact about the future of which we can be certain is that it
will be utterly fantastic. -- Arthur C. Clarke, 1962


Re: 2.6.19.2 sky2/acpi crashes

2007-01-23 Thread Soeren Sonnenburg
On Tue, 23 Jan 2007 08:59:28 +, Lionel Landwerlin wrote:

> Hi,
> 
> I'm running a macbook with a Marvell ethernet controller, and I have a
> lots of freezes when using the ethernet controller under a load of
> ~100K/s. Since I'm running a 2.6.19.2 kernel, I'm able to get some
> report from the kernel. Here they are :

I am also having trouble with the sky2 module. Though I've not yet seen an
oops, the driver stopped working after some heavy traffic (copying some GB
of data). Only rmmod sky2; modprobe sky2 resolved this. (I am also on
2.6.19.2 but I've seen this happen on 2.6.20-rcX too).

Soeren


Re: prioritize PCI traffic ?

2007-01-15 Thread Soeren Sonnenburg
On Mon, 2007-01-15 at 19:23 +0530, Vaidyanathan Srinivasan wrote:
> Soeren Sonnenburg wrote:
> > Dear all,
> > 
> > is it possible to explicitly tell the kernel to prioritize PCI traffic
> > for a number of cards in pci slots x,y,z ?
> > 
> > I am asking as severe ide traffic causes lost frames when watching TV
> > using 2 DVB cards + vdr... This is simply due to the fact that the PCI
> > bus is saturated...
> 
> How do you know that the bus is saturated?

I simply dd if=/dev/sd? of=/dev/null from four brand new sata-harddisks.
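
A rough back-of-the-envelope check of why four parallel dd reads can
saturate the bus. This is a sketch under assumptions not stated in the
thread: a conventional 32-bit/33 MHz PCI bus shared by the SATA
controller and the DVB cards, and a hypothetical sustained rate of
60 MB/s per disk.

```python
from typing import Iterable


def pci_peak_mb_per_s(width_bits: int = 32, clock_mhz: float = 33.33) -> float:
    """Theoretical peak of a shared parallel PCI bus in MB/s.

    Defaults describe conventional 32-bit/33 MHz PCI (an assumption;
    the thread does not say which bus is in use).
    """
    return width_bits / 8 * clock_mhz


def total_demand_mb_per_s(stream_rates_mb_per_s: Iterable[float]) -> float:
    """Sum of the sustained rates of all devices sharing the bus."""
    return sum(stream_rates_mb_per_s)


if __name__ == "__main__":
    peak = pci_peak_mb_per_s()
    # Hypothetical: four disks, each streaming about 60 MB/s via dd.
    demand = total_demand_mb_per_s([60.0] * 4)
    print(f"peak {peak:.1f} MB/s, demand {demand:.1f} MB/s")
    print("bus oversubscribed" if demand > peak else "bus has headroom")
```

Real throughput stays well below the theoretical peak because of
arbitration and protocol overhead, so under these assumptions the DVB
cards would be competing for a bus that is already more than full.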

> Are you streaming data to/from the ide hard disks/CDROM?

yes.

> Do you have DMAs 'ON' for the hard disks?

yes.

> Is everything just fine if there are no IDE traffic?

yes.

> Are you running 2.6 kernel with preempt 'ON'?

no: CONFIG_PREEMPT_NONE=y

> Are all hardware on the same IRQ line? (shared interrupts)

no: libata devices are on IRQ 16 and DVB devices on IRQ 20

> > So, is any prioritizing of the PCI bus possible ?
> 
> The drivers + application indirectly can control priority on the
> bus.  Just reduce the priority of the application that uses IDE and
> see if adjusting nice values of applications can change the scenario.

That unfortunately did not help... no change...

Soeren
-- 
Sometimes, there's a moment as you're waking, when you become aware of
the real world around you, but you're still dreaming.


Re: prioritize PCI traffic ?

2007-01-15 Thread Soeren Sonnenburg
On Mon, 2007-01-15 at 13:01 +0100, Andreas Mohr wrote:
> Hi,
> 
> On Mon, Jan 15, 2007 at 12:07:45PM +0100, Soeren Sonnenburg wrote:
> > Dear all,
> > 
> > is it possible to explicitly tell the kernel to prioritize PCI traffic
> > for a number of cards in pci slots x,y,z ?
> > 
> > I am asking as severe ide traffic causes lost frames when watching TV
> > using 2 DVB cards + vdr... This is simply due to the fact that the PCI
> > bus is saturated...
> > 
> > So, is any prioritizing of the PCI bus possible ?
> 
> You probably need to adjust PCI latency settings via setpci:
> 
> http://www-128.ibm.com/developerworks/library/l-hw2.html

Thanks, but I already tried this...

> Not sure whether this is a LKML related question ;)

Well, I already tried setting maximum latencies etc. on the cards I
want to prioritize, but it made no difference. Maybe this is due to the
fact that a lot more has to be transferred (not just sound, but video
data) and this is not possible in a single transaction ... ?!
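
For reference, the kind of latency tuning meant here can be sketched as
follows. The helper only builds a setpci command line and does not run
it; the function name and the slot address 00:0a.0 are hypothetical, and
setpci takes the register value in hexadecimal.

```python
from typing import List


def latency_timer_command(slot: str, timer: int) -> List[str]:
    """Build a setpci call that sets the PCI latency timer of one device.

    slot  -- bus address as printed by lspci, e.g. "00:0a.0" (hypothetical)
    timer -- latency in PCI clock cycles; the register is 8 bits wide
    """
    if not 0 <= timer <= 0xFF:
        raise ValueError("latency timer is an 8-bit register (0..255)")
    # setpci expects the value in hex, without a 0x prefix.
    return ["setpci", "-v", "-s", slot, f"latency_timer={timer:x}"]


if __name__ == "__main__":
    # Dry run: give a (hypothetical) DVB card the maximum latency.
    print(" ".join(latency_timer_command("00:0a.0", 0xFF)))
```

A higher value lets a device hold the bus longer per burst; it does not
raise the total bandwidth of the bus, which may be why the change had no
visible effect here.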

Soeren
-- 
Sometimes, there's a moment as you're waking, when you become aware of
the real world around you, but you're still dreaming.


prioritize PCI traffic ?

2007-01-15 Thread Soeren Sonnenburg
Dear all,

is it possible to explicitly tell the kernel to prioritize PCI traffic
for a number of cards in pci slots x,y,z ?

I am asking as severe ide traffic causes lost frames when watching TV
using 2 DVB cards + vdr... This is simply due to the fact that the PCI
bus is saturated...

So, is any prioritizing of the PCI bus possible ?

Best
Soeren
-- 
Sometimes, there's a moment as you're waking, when you become aware of
the real world around you, but you're still dreaming.


Re: SATA hotplug from the user side ?

2007-01-12 Thread Soeren Sonnenburg
On Sat, 2007-01-13 at 10:55 +0900, Tejun Heo wrote:
> Soeren Sonnenburg wrote:
> > It is true it detects a removal and newly plugged devices immediately...
> > However it still prints warnings and errors that it could not
> > synchronize SCSI cache for the disks. Then it prints regular 'rejects
> > I/O to dead device' warning messages and on replugging the disks puts
> > them to the next free sd device (e.g. sdc -> sdd).
> 
> You need to stop using the devices before unplugging.  If you have no
> pending IO to the device, there won't be 'rejects IO to dead device'
> messages.  You can ignore the SCSI cache sync failure if the device is
> properly closed before being unplugged.

Jeff & Tejun thanks *a lot* for clarifying this. I am quite happy to see
that this is working very reliably!

> > These messages sound evil - so now the question is should I care ?
> > ( On the other hand it did not crash the machine )
> 
> So, no, you don't really have to care.  Just make sure the device is
> unmounted prior to unplugging.

OK, but then this really should be in the SATA hotplug FAQ (or can one
fix this somehow?)... No user will ignore messages like this. What is
especially annoying is that udev on the first remove/insert cycle
created a new device node so the disk became /dev/sde (was /dev/sdd):
dmesg output of reinserting the disk 2 times follows:


ata4: exception Emask 0x10 SAct 0x0 SErr 0x1 action 0x2 frozen
ata4: hard resetting port
ata4: SATA link down (SStatus 0 SControl 310)
ata4: failed to recover some devices, retrying in 5 secs
ata4: hard resetting port
ata4: SATA link down (SStatus 0 SControl 310)
ata4: failed to recover some devices, retrying in 5 secs
ata4: hard resetting port
ata4: SATA link down (SStatus 0 SControl 310)
ata4.00: disabled
ata4: EH complete
ata4.00: detaching (SCSI 3:0:0:0)
Synchronizing SCSI cache for disk sdd: 
FAILED
  status = 0, message = 00, host = 4, driver = 00
  <3>ata4: exception Emask 0x10 SAct 0x0 SErr 0x5 action 0x2 frozen
ata4: hard resetting port
ata4: COMRESET failed (device not ready)
ata4: hardreset failed, retrying in 5 secs
ata4: hard resetting port
ata4: COMRESET failed (device not ready)
ata4: hardreset failed, retrying in 5 secs
ata4: hard resetting port


ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
ata4.00: ATA-7, max UDMA/133, 1465149168 sectors: LBA48 NCQ (depth 0/32)
ata4.00: configured for UDMA/100
ata4: EH complete
scsi 3:0:0:0: Direct-Access ATA  ST3750640AS  3.AA PQ: 0
ANSI: 5
SCSI device sde: 1465149168 512-byte hdwr sectors (750156 MB)
sde: Write Protect is off
sde: Mode Sense: 00 3a 00 00
SCSI device sde: drive cache: write back
SCSI device sde: 1465149168 512-byte hdwr sectors (750156 MB)
sde: Write Protect is off
sde: Mode Sense: 00 3a 00 00
SCSI device sde: drive cache: write back
 sde: unknown partition table
sd 3:0:0:0: Attached scsi disk sde
sd 3:0:0:0: Attached scsi generic sg3 type 0


ata4: exception Emask 0x10 SAct 0x0 SErr 0x1 action 0x2 frozen
ata4: hard resetting port
ata4: SATA link down (SStatus 0 SControl 310)
ata4: failed to recover some devices, retrying in 5 secs
ata4: hard resetting port
ata4: SATA link down (SStatus 0 SControl 310)
ata4: failed to recover some devices, retrying in 5 secs
ata4: hard resetting port
ata4: SATA link down (SStatus 0 SControl 310)
ata4.00: disabled
ata4: EH complete
ata4.00: detaching (SCSI 3:0:0:0)
Synchronizing SCSI cache for disk sde: 
FAILED
  status = 0, message = 00, host = 4, driver = 00
  <3>ata4: exception Emask 0x10 SAct 0x0 SErr 0x5 action 0x2 frozen
ata4: hard resetting port
ata4: COMRESET failed (device not ready)
ata4: hardreset failed, retrying in 5 secs
ata4: hard resetting port
ata4: COMRESET failed (device not ready)
ata4: hardreset failed, retrying in 5 secs
ata4: hard resetting port


ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
ata4.00: ATA-7, max UDMA/133, 1465149168 sectors: LBA48 NCQ (depth 0/32)
ata4.00: configured for UDMA/100
ata4: EH complete
scsi 3:0:0:0: Direct-Access ATA  ST3750640AS  3.AA PQ: 0
ANSI: 5
SCSI device sde: 1465149168 512-byte hdwr sectors (750156 MB)
sde: Write Protect is off
sde: Mode Sense: 00 3a 00 00
SCSI device sde: drive cache: write back
SCSI device sde: 1465149168 512-byte hdwr sectors (750156 MB)
sde: Write Protect is off
sde: Mode Sense: 00 3a 00 00
SCSI device sde: drive cache: write back
 sde: unknown partition table
sd 3:0:0:0: Attached scsi disk sde
sd 3:0:0:0: Attached scsi generic sg3 type 0

remains /dev/sde ... 
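
To follow which device node a SCSI address ends up on across such
remove/reinsert cycles, the "Attached scsi disk" lines can be pulled out
of a saved dmesg dump. A minimal sketch, assuming the 2.6.19-era message
format shown above; the function name attached_disks is hypothetical.

```python
import re
from typing import List, Tuple

# Matches lines such as "sd 3:0:0:0: Attached scsi disk sde".
ATTACH_RE = re.compile(r"^sd (\d+:\d+:\d+:\d+): Attached scsi disk (\w+)$")


def attached_disks(dmesg_text: str) -> List[Tuple[str, str]]:
    """Return (scsi_address, device_name) pairs in the order they appear."""
    events = []
    for line in dmesg_text.splitlines():
        match = ATTACH_RE.match(line.strip())
        if match:
            events.append((match.group(1), match.group(2)))
    return events


if __name__ == "__main__":
    sample = """\
ata4: EH complete
sd 3:0:0:0: Attached scsi disk sde
sd 3:0:0:0: Attached scsi generic sg3 type 0
"""
    for address, name in attached_disks(sample):
        print(f"{address} -> /dev/{name}")
```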

Soeren
-- 
Sometimes, there's a moment as you're waking, when you become aware of
the real world around you, but you're still dreaming.


Re: SATA hotplug from the user side ?

2007-01-12 Thread Soeren Sonnenburg
On Fri, 2007-01-12 at 12:04 -0500, Jeff Garzik wrote:
> Soeren Sonnenburg wrote:
> > Dear all,
> > 
> > I'd like to try out SATA hotplugging using a SIL3114. Though I was
> > harvesting the web, I could not find any useful information how this is
> > done in practice.
> > 
> > Well I realized that I can still use scsiadd to print and remove
> > devices, e.g.:
> 
> For SIL3114, you shouldn't have to run any commands at all.  It should 
> notice when you yank the cable, or plug in a new device.


It is true it detects a removal and newly plugged devices immediately...
However it still prints warnings and errors that it could not
synchronize SCSI cache for the disks. Then it prints regular 'rejects
I/O to dead device' warning messages and on replugging the disks puts
them to the next free sd device (e.g. sdc -> sdd).

These messages sound evil - so now the question is should I care ?
( On the other hand it did not crash the machine )

What follows is a change between two SATA drives attached to port 4/5 of
the sil (ata5/ata6 here):

ata6: exception Emask 0x10 SAct 0x0 SErr 0x1 action 0x2 frozen
ata6: hard resetting port
ata6: SATA link down (SStatus 0 SControl 310)
ata6: failed to recover some devices, retrying in 5 secs
ata5: exception Emask 0x10 SAct 0x0 SErr 0x1 action 0x2 frozen
ata5: hard resetting port
ata5: SATA link down (SStatus 0 SControl 310)
ata5: failed to recover some devices, retrying in 5 secs
ata6: hard resetting port
ata6: SATA link down (SStatus 0 SControl 310)
ata6: failed to recover some devices, retrying in 5 secs
ata5: hard resetting port
ata5: SATA link down (SStatus 0 SControl 310)
ata5: failed to recover some devices, retrying in 5 secs
ata6: hard resetting port
ata6: SATA link down (SStatus 0 SControl 310)
ata6.00: disabled
ata6: EH complete
ata6.00: detaching (SCSI 5:0:0:0)
Synchronizing SCSI cache for disk sdd: 
FAILED
  status = 0, message = 00, host = 4, driver = 00
  <6>ata5: hard resetting port
ata5: SATA link down (SStatus 0 SControl 310)
ata5.00: disabled
ata5: EH complete
ata5.00: detaching (SCSI 4:0:0:0)
Synchronizing SCSI cache for disk sdc: 
FAILED
  status = 0, message = 00, host = 4, driver = 00
  <3>ata6: exception Emask 0x10 SAct 0x0 SErr 0x5 action 0x2 frozen
ata6: hard resetting port
ata6: port is slow to respond, please be patient (Status 0xff)
ata6: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
ata6.00: ATA-7, max UDMA/133, 1465149168 sectors: LBA48 NCQ (depth 0/32)
ata6.00: configured for UDMA/100
ata6: EH complete
scsi 5:0:0:0: Direct-Access ATA  ST3750640AS  3.AA PQ: 0
ANSI: 5
SCSI device sdf: 1465149168 512-byte hdwr sectors (750156 MB)
sdf: Write Protect is off
sdf: Mode Sense: 00 3a 00 00
SCSI device sdf: drive cache: write back
SCSI device sdf: 1465149168 512-byte hdwr sectors (750156 MB)
sdf: Write Protect is off
sdf: Mode Sense: 00 3a 00 00
SCSI device sdf: drive cache: write back
 sdf: unknown partition table
sd 5:0:0:0: Attached scsi disk sdf
sd 5:0:0:0: Attached scsi generic sg2 type 0
scsi 4:0:0:0: rejecting I/O to dead device
scsi 4:0:0:0: rejecting I/O to dead device
scsi 5:0:0:0: rejecting I/O to dead device
scsi 5:0:0:0: rejecting I/O to dead device
ata5: exception Emask 0x10 SAct 0x0 SErr 0x5 action 0x2 frozen
ata5: hard resetting port
ata5: port is slow to respond, please be patient (Status 0xff)
ata5: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
ata5.00: ATA-7, max UDMA/133, 1465149168 sectors: LBA48 NCQ (depth 0/32)
ata5.00: configured for UDMA/100
ata5: EH complete
scsi 4:0:0:0: Direct-Access ATA  ST3750640AS  3.AA PQ: 0
ANSI: 5
SCSI device sdg: 1465149168 512-byte hdwr sectors (750156 MB)
sdg: Write Protect is off
sdg: Mode Sense: 00 3a 00 00
SCSI device sdg: drive cache: write back
SCSI device sdg: 1465149168 512-byte hdwr sectors (750156 MB)
sdg: Write Protect is off
sdg: Mode Sense: 00 3a 00 00
SCSI device sdg: drive cache: write back
 sdg: unknown partition table
sd 4:0:0:0: Attached scsi disk sdg
sd 4:0:0:0: Attached scsi generic sg3 type 0
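
To see at a glance which devices still had I/O pending when they were
pulled, the 'rejecting I/O to dead device' lines in a log like the one
above can be counted per SCSI address. A small sketch, assuming the
message format shown here; the function name dead_device_rejects is
hypothetical.

```python
import re
from collections import Counter

# Matches lines such as "scsi 4:0:0:0: rejecting I/O to dead device".
REJECT_RE = re.compile(r"^scsi (\d+:\d+:\d+:\d+): rejecting I/O to dead device$")


def dead_device_rejects(dmesg_text: str) -> Counter:
    """Count rejected I/O requests per SCSI address (host:channel:id:lun)."""
    counts: Counter = Counter()
    for line in dmesg_text.splitlines():
        match = REJECT_RE.match(line.strip())
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    sample = """\
sd 5:0:0:0: Attached scsi generic sg2 type 0
scsi 4:0:0:0: rejecting I/O to dead device
scsi 4:0:0:0: rejecting I/O to dead device
scsi 5:0:0:0: rejecting I/O to dead device
scsi 5:0:0:0: rejecting I/O to dead device
"""
    for address, count in sorted(dead_device_rejects(sample).items()):
        print(f"{address}: {count} rejected request(s)")
```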

Best,
Soeren
-- 
For the one fact about the future of which we can be certain is that it
will be utterly fantastic. -- Arthur C. Clarke, 1962


SATA hotplug from the user side ?

2007-01-11 Thread Soeren Sonnenburg
Dear all,

I'd like to try out SATA hotplugging using a SIL3114. Although I searched
the web, I could not find any useful information on how this is done in
practice.

Well I realized that I can still use scsiadd to print and remove
devices, e.g.:

# scsiadd -p

Attached devices:
Host: scsi2 Channel: 00 Id: 00 Lun: 00
  Vendor: ATA  Model: ST3400832AS  Rev: 3.01
  Type:   Direct-Access  ANSI SCSI revision: 05
Host: scsi3 Channel: 00 Id: 00 Lun: 00
  Vendor: ATA  Model: ST3400620AS  Rev: 3.AA
  Type:   Direct-Access  ANSI SCSI revision: 05

# scsiadd -r 3 0 0 0

Is this all one has to do for hotplugging ? I am asking as I find this
in dmesg when I do so (2.6.19.* kernel):

Synchronizing SCSI cache for disk sdb: 
ata4.00: disabled
ata4: exception Emask 0x10 SAct 0x0 SErr 0x1 action 0x2 frozen
ata4: hard resetting port
ata4: SATA link down (SStatus 0 SControl 310)
ata4: EH complete
ata4: exception Emask 0x10 SAct 0x0 SErr 0x5 action 0x2 frozen
ata4: hard resetting port
ata4: port is slow to respond, please be patient (Status 0xff)
ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
ata4.00: ATA-7, max UDMA/133, 781422768 sectors: LBA48 NCQ (depth 0/32)
ata4.00: configured for UDMA/100
ata4: EH complete
scsi 3:0:0:0: rejecting I/O to dead device
scsi 3:0:0:0: rejecting I/O to dead device


Soeren
-- 
For the one fact about the future of which we can be certain is that it
will be utterly fantastic. -- Arthur C. Clarke, 1962

