Executive summary: please try the r12453 sid snapshot once it appears.
The apt repo lines are available at http://wiki.debian.org/DebianKernel

In the later report (Paul Hedderly) the erroring rip
(0xffffffff80227fe2) which corresponds to set_cpus_allowed_ptr+31 (or
0x1f out of 0xe0) is:

0xffffffff80227fda <set_cpus_allowed_ptr+23>:   callq  0xffffffff8022493b <task_rq_lock>
0xffffffff80227fdf <set_cpus_allowed_ptr+28>:   mov    %rax,%r13
0xffffffff80227fe2 <set_cpus_allowed_ptr+31>:   mov    (%rbx),%rax
0xffffffff80227fe5 <set_cpus_allowed_ptr+34>:   and    $0xffffffffffffffff,%eax
0xffffffff80227fe8 <set_cpus_allowed_ptr+37>:   test   %rax,0x3e5819(%rip)        # 0xffffffff8060d808 <cpu_online_map>

The fault is on the address in %rbx (0xffffffffff5f7000).

0xffffffff80227fe2 is in set_cpus_allowed_ptr (kernel/sched.c:5628).
5623            unsigned long flags;
5624            struct rq *rq;
5625            int ret = 0;
5626    
5627            rq = task_rq_lock(p, &flags);
5628            if (!cpus_intersects(*new_mask, cpu_online_map)) {
5629                    ret = -EINVAL;
5630                    goto out;
5631            }
5632    

I believe %rbx is new_mask. In the earlier two reports (both Andrea
Janna's) the erroring rip (0xffffffff80228045) doesn't precisely match
this, but it is reported as set_cpus_allowed_ptr+0x1f/0xe0, the same
offset as in Paul's report, so I think it is safe to say the two
kernels were simply linked slightly differently and it's the same
instruction.
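
The offset comparison can be checked with a little arithmetic. The
sketch below uses only the two rip values and the +0x1f offset quoted
above: subtracting the offset from each faulting rip gives the address
at which set_cpus_allowed_ptr starts in that build. The helper names
function_start and print_function_starts are made up for illustration
and do not come from the kernel.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Start of a function, given a faulting rip and the offset from the
 * oops line (e.g. the 0x1f in set_cpus_allowed_ptr+0x1f/0xe0). */
uint64_t function_start(uint64_t rip, uint64_t offset)
{
    return rip - offset;
}

/* Print where set_cpus_allowed_ptr begins in each reported build. */
void print_function_starts(void)
{
    const uint64_t paul_rip   = 0xffffffff80227fe2ULL; /* later report */
    const uint64_t andrea_rip = 0xffffffff80228045ULL; /* earlier reports */
    const uint64_t offset     = 0x1f;                  /* same in both */

    printf("Paul:   %#" PRIx64 "\n", function_start(paul_rip, offset));
    printf("Andrea: %#" PRIx64 "\n", function_start(andrea_rip, offset));
}
```

For Paul's build this gives 0xffffffff80227fc3, which agrees with the
disassembly above (set_cpus_allowed_ptr+23 at 0xffffffff80227fda). For
Andrea's it gives 0xffffffff80228026, so the function merely sits 0x63
bytes higher in that build.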

The caller of set_cpus_allowed_ptr was
":processor:acpi_processor_get_throttling+0x45/0x6a", i.e. the return
address following this call:

0x00000000000004fa <acpi_processor_get_throttling+64>:  callq  0x4ff <acpi_processor_get_throttling+69>
0x00000000000004ff <acpi_processor_get_throttling+69>:  mov    %rbx,%rdi
(the call target looks odd because this is an unlinked .ko file)

0x4fa is in acpi_processor_get_throttling (drivers/acpi/processor_throttling.c:841).
836                     return -ENODEV;
837             /*
838              * Migrate task to the cpu pointed by pr.
839              */
840             saved_mask = current->cpus_allowed;
841             set_cpus_allowed_ptr(current, &cpumask_of_cpu(pr->id));
842             ret = pr->throttling.acpi_processor_get_throttling(pr);
843             /* restore the previous state */
844             set_cpus_allowed_ptr(current, &saved_mask);
845     

So this suggests that &cpumask_of_cpu(pr->id) is somehow bogus.

pr came from acpi_processor_start() via
acpi_processor_get_throttling_info(). Just before the call to
acpi_processor_get_throttling_info() in acpi_processor_start() we see:

        #ifdef CONFIG_XEN
                BUG_ON(pr->acpi_id >= NR_ACPI_CPUS);
                if (processor_device_array[pr->acpi_id] != NULL &&
                    processor_device_array[pr->acpi_id] != device) {
        #else
                if (processor_device_array[pr->id] != NULL &&
                    processor_device_array[pr->id] != device) {
        #endif /* CONFIG_XEN */
                        printk(KERN_WARNING "BIOS reported wrong ACPI id "
                                "for the processor\n");
                        return -ENODEV;
                }
        #ifdef CONFIG_XEN
                processor_device_array[pr->acpi_id] = device;
                if (pr->id != -1)
                        processors[pr->id] = pr;
        #else
                processor_device_array[pr->id] = device;
        
                processors[pr->id] = pr;
        #endif /* CONFIG_XEN */
        
This code is fairly recent in the linux-2.6.18-xen.hg tree and comes
from a combination of two changesets (one adds the feature, the other
unbreaks native build resulting in the ifdef'ery seen above):

http://xenbits.xensource.com/linux-2.6.18-xen.hg?rev/d62d60eaba6e
http://xenbits.xensource.com/linux-2.6.18-xen.hg?rev/e39cf97647af

There are a bunch of changes subsequent to these, but
http://xenbits.xensource.com/linux-2.6.18-xen.hg?rev/de7f94bd650b looks
pretty interesting:

        changeset:   713:de7f94bd650b
        user:        Keir Fraser <[EMAIL PROTECTED]>
        date:        Tue Oct 28 10:39:11 2008 +0000
        files:       drivers/acpi/processor_core.c
        description:
        dom0: Fix for throttling while pr->id == -1
        
        Signed-off-by: Wei Gang <[EMAIL PROTECTED]>

This changeset is not present in our current kernel tree. I have added
it and it will show up in the snapshot builds shortly.
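
To illustrate why pr->id == -1 would matter here, the following is a
userspace sketch only. It assumes that cpumask_of_cpu() resolves to a
lookup in a table indexed by CPU id, which this message does not
confirm; the type cpumask_sketch_t, the constant NR_CPUS_SKETCH and
both helper functions are hypothetical names, not kernel code.

```c
#include <stddef.h>

#define NR_CPUS_SKETCH 4        /* hypothetical table size */

/* Hypothetical stand-in for a per-CPU mask entry. */
typedef struct { unsigned long bits; } cpumask_sketch_t;

/* Byte offset of the entry for CPU 'id' relative to the table start,
 * assuming (hypothetically) a plain table indexed by CPU id. */
ptrdiff_t mask_entry_offset(int id)
{
    return (ptrdiff_t)id * (ptrdiff_t)sizeof(cpumask_sketch_t);
}

/* Whether 'id' selects an entry that lies inside the table. */
int mask_lookup_is_valid(int id)
{
    return id >= 0 && id < NR_CPUS_SKETCH;
}
```

Under that assumption, an id of -1 yields a negative offset, that is, a
pointer to memory in front of the table rather than to a valid mask,
which would be consistent with the faulting dereference of new_mask
seen above.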

Ian.

-- 
Ian Campbell

No passing.
