On Wed, Jul 25, 2007 at 11:26:42AM -0400, Doug Chapman wrote:
> On Wed, 2007-07-25 at 22:37 +0900, Horms wrote:
>
> > I was also seeing a strange problem relating to the
> > vector domain patch which seemed to be causing
> > corruption of vectors_in_migration, which caused migrate_irqs()
> > to emit spurious IRQ errors (when called by kexec).
> >
> > I'll try and confirm that this patch solves the problem that
> > I was seeing tomorrow.
> >
>
> You may also want to try this patch:
> http://www.mail-archive.com/[email protected]/msg03113.html

Hi Doug, Hi Ishimatsu-san,

I've tested both of these patches against my problem,
and I notice that they have both been incorporated into
Linus's tree.

It seems that "vector-domain - handle assign_irq_vector(AUTO_ASSIGN)"
(8f5ad1a8227aa110d633b5ed04dde535381c16c7) had no effect on
the problem that I was seeing. But "vector-domain - fix vector_table"
(6ffbc82351c62eeeeaeb9e817ddf93049353493d) appears to resolve the
problem.

As I spent quite a lot of time examining this problem I'll
put my findings below, on the off chance they are of use to
someone in the future.

In my .bss I see that vector_table is right next to
vectors_in_migration, so it seems to make a lot of sense
that inappropriate access to vector_table was corrupting
vectors_in_migration. Furthermore, I added a fairly large
array, vectors_in_migration_guard, between vectors_in_migration and
vector_table and the problem went away, which seems to further
back up the idea that the corruption was caused by access to
vector_table.

a000000100587eb8 <vectors_in_migration>:
...
a0000001005884b8 <vector_table>:
...
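
For anyone wanting to reason about the guard-array experiment, here is a
minimal user-space sketch. It is not the kernel code: the sizes, the names
(bss_model, table_write and friends) and the choice of a negative index as
the bad access are all assumptions made for illustration. The only detail
taken from the dump above is the layout, with the vectors_in_migration
stand-in at the lower address and the vector_table stand-in above it (the
two symbols in the dump are 0x600 bytes apart). One flat buffer models the
neighbouring objects so that every access stays inside a single C object.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sizes, chosen for illustration only. */
#define MIGRATION_LEN 16  /* stand-in for vectors_in_migration[] */
#define GUARD_LEN     32  /* stand-in for vectors_in_migration_guard[] */
#define TABLE_LEN     16  /* stand-in for vector_table[] */

/* One flat buffer models the adjacent .bss objects. */
struct bss_model {
    unsigned char mem[MIGRATION_LEN + GUARD_LEN + TABLE_LEN];
    size_t table_off;     /* offset at which the table region starts */
};

/* Zero the model and place the table directly after the migration
 * region, or after the guard region when with_guard is non-zero. */
static void bss_init(struct bss_model *b, int with_guard)
{
    assert(b != NULL);
    memset(b->mem, 0, sizeof b->mem);
    b->table_off = MIGRATION_LEN + (with_guard ? GUARD_LEN : 0);
}

/* Write through the table with a signed index that is not checked
 * against the table bounds. Returns 0 on success, or -1 if the write
 * would leave the model buffer entirely. */
static int table_write(struct bss_model *b, long index, unsigned char val)
{
    long off = (long)b->table_off + index;

    if (off < 0 || (size_t)off >= sizeof b->mem)
        return -1;
    b->mem[off] = val;
    return 0;
}

/* Count the non-zero bytes in the migration region. */
static int migration_dirty(const struct bss_model *b)
{
    int n = 0;

    for (size_t i = 0; i < MIGRATION_LEN; i++)
        n += (b->mem[i] != 0);
    return n;
}
```

Calling bss_init(&b, 0) followed by table_write(&b, -3, 1) leaves one dirty
byte in the migration region, whereas after bss_init(&b, 1) the same write
lands in the guard region and migration_dirty(&b) stays at zero. That is
the behaviour the guard array showed, under the assumptions stated above.
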
I guess that if CPU_HOTPLUG was disabled then some other table
would be corrupted, perhaps one that is accessed much more often
than vectors_in_migration.
For the record, the IRQ errors on kexec
were being caused by fixup_irqs() making inappropriate
calls to generic_handle_irq() due to the corruption of
vectors_in_migration. fixup_irqs() is indirectly called by cpu_down().
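
To illustrate why a single corrupted byte is enough, here is a simplified
user-space model. It assumes that fixup_irqs() replays every irq whose
vectors_in_migration entry is non-zero; the real kernel loop is not quoted
in this mail, and every model_* name below is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_NR_IRQS 8  /* hypothetical; the real NR_IRQS is larger */

/* Stand-in for vectors_in_migration[]: non-zero means "replay this irq". */
static unsigned char model_in_migration[MODEL_NR_IRQS];

/* Number of times each irq has been handed to the handler stand-in. */
static int model_handled[MODEL_NR_IRQS];

/* Stand-in for generic_handle_irq(): it only records the call. */
static void model_handle_irq(unsigned int irq)
{
    assert(irq < MODEL_NR_IRQS);
    model_handled[irq]++;
}

/* Replay every irq flagged as in migration, clearing the flag first.
 * Returns the number of irqs replayed. */
static int model_replay_migrated(void)
{
    int replayed = 0;

    for (unsigned int irq = 0; irq < MODEL_NR_IRQS; irq++) {
        if (model_in_migration[irq]) {
            model_in_migration[irq] = 0;
            model_handle_irq(irq);
            replayed++;
        }
    }
    return replayed;
}
```

In this model, any stray non-zero byte written into model_in_migration[]
results in one handler call for an irq that was never being migrated,
which matches the pattern of the errors in the log.
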
The log on a system with NR_CPUS=4 is below:
# do_kexec
Kexec: Linux->Linux
Create ramdisk
19296 /tmp/initramfs_data.cpio
kexec-ia64 -l "/boot/vmlinux-ia64-kexec.gz" \
--initrd=/tmp/initramfs_data.cpio \
--append="NAME=rx2620 ip=on loglevel=8 console=tty0
console=uart,mmio,0xff5e0000,115200n8"
Kexec
kexec-ia64 -e
Starting new kernel
ifdown: socket: Function not implemented
irq 318, desc: a00000010050cb00, depth: 1, count: 0, unhandled: 0
->handle_irq(): a000000100437c80, __end_rodata+0x34d8/0x13858
->chip(): a000000100563848, no_irq_chip+0x0/0x80
->action(): 0000000000000000
IRQ_DISABLED set
Unexpected irq vector 0x13e on CPU 1!
irq 344, desc: a00000010050d800, depth: 1, count: 0, unhandled: 0
->handle_irq(): a000000100437c80, __end_rodata+0x34d8/0x13858
->chip(): a000000100563848, no_irq_chip+0x0/0x80
->action(): 0000000000000000
IRQ_DISABLED set
Unexpected irq vector 0x158 on CPU 1!
irq 346, desc: a00000010050d900, depth: 1, count: 0, unhandled: 0
->handle_irq(): a000000100437c80, __end_rodata+0x34d8/0x13858
->chip(): a000000100563848, no_irq_chip+0x0/0x80
->action(): 0000000000000000
IRQ_DISABLED set
Unexpected irq vector 0x15a on CPU 1!
irq 350, desc: a00000010050db00, depth: 1, count: 0, unhandled: 0
->handle_irq(): a000000100437c80, __end_rodata+0x34d8/0x13858
->chip(): a000000100563848, no_irq_chip+0x0/0x80
->action(): 0000000000000000
IRQ_DISABLED set
Unexpected irq vector 0x15e on CPU 1!
CPU 1 is now offline
Linux version 2.6.23-rc1-kexec-ge4903fb5-dirty ([EMAIL PROTECTED]) (gcc version
3.4.5) #173 SMP Thu Jul 26 11:36:46 JST 2007
--
Horms
H: http://www.vergenet.net/~horms/
W: http://www.valinux.co.jp/en/