Re: [Xen-ia64-devel] [Patch] fix warnings while rebooting

2006-12-07 Thread Alex Williamson

   Applied, thanks,

Alex

-- 
Alex Williamson HP Open Source  Linux Org.


___
Xen-ia64-devel mailing list
Xen-ia64-devel@lists.xensource.com
http://lists.xensource.com/xen-ia64-devel


Re: [Xen-ia64-devel] [Patch] fix warnings while rebooting

2006-12-07 Thread Alex Williamson
Hi Akio,

   Unfortunately I've found a problem with this patch since committing
it.  My system has a 2 port e1000 card that shows up as PCI devices
:01:02.0 and :01:02.1.  I hide function 1 from dom0 using
pciback.hide=(:01:02.1).  Without trying to start up any guest
domains, on reboot dom0 dies with the NaT consumption fault shown below.
I've disabled the Xen call to free_irq_vector() until we can figure out
what's going wrong.  Thanks,

Alex

Will now restart.
reboot[2702]: NaT consumption 17179869216 [1]
Modules linked in:

Pid: 2702, CPU 0, comm:   reboot
psr : 0010085a6010 ifs : 8309 ip  : [a001000ac9b0]    Not tainted
ip is at notifier_call_chain+0x30/0xc0
unat:  pfs : 8207 rsc : 000b
rnat: f5e9cbd88000 bsps: ffe9 pr  : 0055a959
ldrs:  ccv :  fpsr: 0009804c0270033f
csd :  ssd : 
b0  : a001000af4f0 b6  : a00100017e20 b7  : a00100017df0
f6  : 0 f7  : 0
f8  : 0 f9  : 0
f10 : 0 f11 : 0
r1  : a0010104eac0 r2  : 8792 r3  : e0007e770028
r8  :  r9  : e0007e7702b8 r10 : 
r11 : 0008 r12 : e0007e777d30 r13 : e0007e77
r14 : 45584543 r15 : 0001 r16 : 
r17 : 01234567 r18 : 0020 r19 : 0008
r20 : 20244200 r21 : 0009804c8a70033f r22 : e0007e770f70
r23 : 6fff7fffc0c8 r24 :  r25 : 
r26 : c10a r27 :  r28 : 40002370
r29 : 1213085a6010 r30 :  r31 : a00100c41e00

Call Trace:
 [a0010001cfc0] show_stack+0x40/0xa0
sp=e0007e50 bsp=e0007e7711c0
 [a0010001d8c0] show_regs+0x840/0x880
sp=e0007e777920 bsp=e0007e771168
 [a001000424e0] die+0x1c0/0x3c0
sp=e0007e777920 bsp=e0007e771120
 [a00100042730] die_if_kernel+0x50/0x80
sp=e0007e777940 bsp=e0007e7710f0
 [a00100043880] ia64_fault+0x1120/0x1240
sp=e0007e777940 bsp=e0007e771098
 [a00100069860] xen_leave_kernel+0x0/0x3b0
sp=e0007e777b60 bsp=e0007e771098
 [a001000ac9b0] notifier_call_chain+0x30/0xc0
sp=e0007e777d30 bsp=e0007e771050
 [a001000af4f0] kernel_restart_prepare+0x30/0x80
sp=e0007e777d30 bsp=e0007e771030
 [a001000af560] kernel_restart+0x20/0xe0
sp=e0007e777d30 bsp=e0007e771010
 [a001000b2b50] sys_reboot+0x3b0/0x480
sp=e0007e777d30 bsp=e0007e770f90
 [a00100013f20] ia64_ret_from_syscall+0x0/0x40
sp=e0007e777e30 bsp=e0007e770f90
 [a0010620] __kernel_syscall_via_break+0x0/0x20
sp=e0007e778000 bsp=e0007e770f90
 /etc/rc6.d/S90reboot: line 17:  2702 Segmentation fault  reboot -d -f -i


-- 
Alex Williamson HP Open Source  Linux Org.




Re: [Xen-ia64-devel] [Patch] fix warnings while rebooting

2006-12-07 Thread Akio Takebe
Hi, Alex

>   Unfortunately I've found a problem with this patch since committing
> it.  My system has a 2 port e1000 card that shows up as PCI devices
> :01:02.0 and :01:02.1.  I hide function 1 from dom0 using
> pciback.hide=(:01:02.1).  Without trying to start up any guest
> domains, on reboot dom0 dies with the NaT consumption fault shown below.
This is curious. free_irq_vector() is called from the shutdown handler,
but this notifier_call_chain is called before the shutdown handler.
Hmmm... I'll investigate it.
Could you send me the output of lspci -vv on your system?

> I've disabled the Xen call to free_irq_vector() until we can figure out
> what's going wrong.  Thanks,

Thanks.
Isn't this issue caused by your patch?

> reboot[2702]: NaT consumption 17179869216 [1]
> Modules linked in:
>
> Pid: 2702, CPU 0, comm:   reboot
> psr : 0010085a6010 ifs : 8309 ip  : [a001000ac9b0]    Not tainted
> ip is at notifier_call_chain+0x30/0xc0

The isr value (17179869216 = 0x400000020) indicates a Data NaT Page Consumption fault.
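
The decimal-to-hex reading of the isr value can be checked mechanically. The sketch below is illustrative only: the helper names are hypothetical (not kernel API), and the bit positions (code in bits 15:0, the read-access flag r in bit 34) are an assumption taken from the Itanium architecture manuals that should be verified before relying on it.

```c
#include <stdint.h>
#include <stdio.h>

/* Assumed ISR layout (verify against the Itanium architecture manual):
 * code = bits 15:0, r (read access) = bit 34. */
#define ISR_CODE_MASK UINT64_C(0xffff)
#define ISR_R_BIT     34

/* Hypothetical helper: extract the ISR.code field. */
unsigned isr_code(uint64_t isr)
{
    return (unsigned)(isr & ISR_CODE_MASK);
}

/* Hypothetical helper: 1 if the faulting access was a read. */
int isr_is_read(uint64_t isr)
{
    return (int)((isr >> ISR_R_BIT) & 1);
}

/* Print the value in hex next to the decoded fields. */
void print_isr(uint64_t isr)
{
    printf("isr=0x%llx code=0x%x r=%d\n",
           (unsigned long long)isr, isr_code(isr), isr_is_read(isr));
}
```

Calling print_isr(17179869216ULL) shows the value as 0x400000020, i.e. code 0x20 with the r bit set, which is consistent with a NaT page fault on a data read.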

Below is the disassembly of notifier_call_chain().
r32 is a pointer to the notifier_block list.
I'll check it first.

a001000ac760 <notifier_call_chain>:
a001000ac760:   00 20 25 0c 80 05   [MII]   alloc r36=ar.pfs,9,6,0
a001000ac766:   30 02 00 62 00 a0           mov r35=b0
a001000ac76c:   04 08 00 84                 mov r37=r1
a001000ac770:   11 00 01 40 18 10   [MIB]   ld8 r32=[r32]
a001000ac776:   80 00 00 00 42 00           mov r8=r0
a001000ac77c:   00 00 00 20                 nop.b 0x0;;
a001000ac780:   10 00 00 00 01 00   [MIB]   nop.m 0x0
a001000ac786:   60 00 80 0e 72 03           cmp.eq p6,p7=0,r32
a001000ac78c:   80 00 00 43           (p06) br.cond.dpnt.few a001000ac800 <notifier_call_chain+0xa0>
a001000ac790:   08 00 00 00 01 00   [MMI]   nop.m 0x0
a001000ac796:   80 00 80 30 20 c0           ld8 r8=[r32]        <== HERE???
a001000ac79c:   04 00 01 84                 mov r38=r32
a001000ac7a0:   09 38 01 42 00 21   [MMI]   mov r39=r33
a001000ac7a6:   80 02 88 00 42 00           mov r40=r34
a001000ac7ac:   84 00 01 84                 adds r32=8,r32;;
a001000ac7b0:   0a 70 20 10 18 14   [MMI]   ld8 r14=[r8],8;;
a001000ac7b6:   00 00 00 02 00 c0           nop.m 0x0
a001000ac7bc:   e0 08 00 07                 mov b6=r14
a001000ac7c0:   13 08 00 10 18 10   [MBB]   ld8 r1=[r8]
a001000ac7c6:   00 00 00 00 10 00           nop.b 0x0
a001000ac7cc:   68 00 80 10                 br.call.sptk.many b0=b6;;
a001000ac7d0:   10 00 00 00 01 00   [MIB]   nop.m 0x0
a001000ac7d6:   80 f0 20 12 28 00           tbit.z p8,p9=r8,15
a001000ac7dc:   00 00 00 20                 nop.b 0x0
a001000ac7e0:   1c 08 00 4a 00 21   [MFB]   mov r1=r37
a001000ac7e6:   00 00 00 02 80 04           nop.f 0x0
a001000ac7ec:   20 00 00 43           (p09) br.cond.dpnt.few a001000ac800 <notifier_call_chain+0xa0>
a001000ac7f0:   12 00 01 40 18 10   [MBB]   ld8 r32=[r32]
a001000ac7f6:   00 00 00 00 10 00           nop.b 0x0
a001000ac7fc:   90 ff ff 48                 br.few a001000ac780 <notifier_call_chain+0x20>
a001000ac800:   10 00 00 00 01 00   [MIB]   nop.m 0x0
a001000ac806:   00 18 05 80 03 00           mov b0=r35
a001000ac80c:   00 00 00 20                 nop.b 0x0
a001000ac810:   11 00 00 00 01 00   [MIB]   nop.m 0x0
a001000ac816:   00 20 01 55 00 80           mov.i ar.pfs=r36
a001000ac81c:   08 00 84 00                 br.ret.sptk.many b0;;
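
For readers who do not read ia64 assembly, the listing appears to correspond to a plain linked-list walk. The following is a user-space sketch modeled on the 2.6-era notifier loop, not the exact kernel source: the struct is simplified, the constants are assumed to match the kernel's (bit 15 tested by tbit.z would be NOTIFY_STOP_MASK), and count_calls is a made-up callback for demonstration. The load marked above would be the read of nb->notifier_call, so a stale or corrupted list entry is one plausible way to fault there.

```c
#include <assert.h>
#include <stddef.h>

#define NOTIFY_DONE      0x0000
#define NOTIFY_OK        0x0001
#define NOTIFY_STOP_MASK 0x8000   /* bit 15, as tested by tbit.z p8,p9=r8,15 */

struct notifier_block {
    int (*notifier_call)(struct notifier_block *self,
                         unsigned long event, void *data);
    struct notifier_block *next;  /* offset 8 on a 64-bit target */
    int priority;
};

int notifier_call_chain(struct notifier_block **list,
                        unsigned long event, void *data)
{
    int ret = NOTIFY_DONE;                 /* mov r8=r0 */
    struct notifier_block *nb;

    assert(list != NULL);
    nb = *list;                            /* ld8 r32=[r32] */
    while (nb) {                           /* cmp.eq p6,p7=0,r32 */
        /* ld8 r8=[r32] reads nb->notifier_call; a bad nb faults here */
        ret = nb->notifier_call(nb, event, data);
        if (ret & NOTIFY_STOP_MASK)        /* tbit.z p8,p9=r8,15 */
            return ret;
        nb = nb->next;                     /* adds r32=8,r32 ... ld8 r32=[r32] */
    }
    return ret;
}

/* Demonstration callback (hypothetical): counts how often it is invoked. */
int count_calls(struct notifier_block *self, unsigned long event, void *data)
{
    (void)self;
    (void)event;
    ++*(int *)data;
    return NOTIFY_OK;
}
```

Every entry on the list is dereferenced before its callback runs, which is why a block that was freed or overwritten earlier (for example during boot) may only show up as a fault much later, at reboot time.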


Best Regards,

Akio Takebe




Re: [Xen-ia64-devel] [Patch] fix warnings while rebooting

2006-12-07 Thread Alex Williamson
On Fri, 2006-12-08 at 10:50 +0900, Akio Takebe wrote:
> Hi, Alex
>
> >   Unfortunately I've found a problem with this patch since committing
> > it.  My system has a 2 port e1000 card that shows up as PCI devices
> > :01:02.0 and :01:02.1.  I hide function 1 from dom0 using
> > pciback.hide=(:01:02.1).  Without trying to start up any guest
> > domains, on reboot dom0 dies with the NaT consumption fault shown below.
> This is curious. free_irq_vector() is called from the shutdown handler,
> but this notifier_call_chain is called before the shutdown handler.
> Hmmm... I'll investigate it.

   It looked perhaps like the problem started on bootup.  I saw that
free_irq_vector() was called for the vector assigned to the second
function, right after e1000 claimed the first function on the device.
The network using the first function on the device was actually a little
flaky while dom0 was running too.

> Could you send me the output of lspci -vv on your system?

   Yep, see below.

> > I've disabled the Xen call to free_irq_vector() until we can figure out
> > what's going wrong.  Thanks,
>
> Thanks.
> Isn't this issue caused by your patch?

   No, it only occurs if Xen calls free_irq_vector(), so I suspect the
NaT is just the result of something bad happening much earlier.  Vector
54 in the below output is the one that gets freed.  Thanks,

Alex

-- 
Alex Williamson HP Open Source  Linux Org.


00:01.0 Class ff00: Hewlett-Packard Company Unknown device 1303
	Subsystem: Hewlett-Packard Company Unknown device 1303
	Control: I/O- Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B-
	Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
	Latency: 128 (250ns max), Cache Line Size: 128 bytes
	Interrupt: pin A routed to IRQ 0
	Capabilities: [e8] Power Management version 2
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
		Status: D0 PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [f0] Message Signalled Interrupts: Mask- 64bit+ Queue=0/2 Enable-
		Address:   Data: 

00:01.1 Communication controller: Hewlett-Packard Company Unknown device 1302
	Subsystem: Hewlett-Packard Company Unknown device 1302
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B-
	Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
	Latency: 128 (250ns max), Cache Line Size: 128 bytes
	Interrupt: pin A routed to IRQ 0
	Region 1: Memory at 84054000 (64-bit, non-prefetchable) [size=4K]
	Region 3: Memory at 8402 (64-bit, prefetchable) [size=128K]
	Capabilities: [e8] Power Management version 2
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
		Status: D0 PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [f0] Message Signalled Interrupts: Mask- 64bit+ Queue=0/3 Enable-
		Address:   Data: 

00:01.2 Serial controller: Hewlett-Packard Company Diva Serial [GSP] Multiport UART (prog-if 02 [16550])
	Subsystem: Hewlett-Packard Company Diva RMP3
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B-
	Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
	Latency: 128 (250ns max), Cache Line Size: 128 bytes
	Interrupt: pin A routed to IRQ 0
	Region 1: Memory at 84053000 (64-bit, non-prefetchable) [size=4K]
	Capabilities: [e8] Power Management version 2
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
		Status: D0 PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [f0] Message Signalled Interrupts: Mask- 64bit+ Queue=0/2 Enable-
		Address:   Data: 

00:02.0 USB Controller: NEC Corporation USB (rev 43) (prog-if 10 [OHCI])
	Subsystem: Unknown device 03f0:0226
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
	Latency: 128 (250ns min, 10500ns max), Cache Line Size: 128 bytes
	Interrupt: pin A routed to IRQ 56
	Region 0: Memory at 84052000 (32-bit, non-prefetchable) [size=4K]
	Capabilities: [40] Power Management version 2
		Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0+,D1+,D2+,D3hot+,D3cold-)
		Status: D0 PME-Enable- DSel=0 DScale=0 PME-

00:02.1 USB Controller: NEC Corporation USB (rev 43) (prog-if 10 [OHCI])
	Subsystem: Unknown device 03f0:0226
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
	Status: 

Re: [Xen-ia64-devel] [Patch] fix warnings while rebooting

2006-12-04 Thread Aron Griffis
Hi Akio,

I'm sorry, but I'm confused by your message.  Are you saying there are
two problems here?  One problem in xen/ia64 and one problem in the
e1000 driver?

You sent a patch, I guess that is for xen-ia64-unstable, right?  If
that is ready to be applied, could you include a description of the
problem and what the patch does?

Regarding the e1000 bug, could you comment in the RH bug?
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=208599

Thanks,
Aron



Re: [Xen-ia64-devel] [Patch] fix warnings while rebooting

2006-12-04 Thread Akio Takebe
Hi, Aron

> Hi Akio,
>
> I'm sorry, but I'm confused by your message.  Are you saying there are
> two problems here?  One problem in xen/ia64 and one problem in the
> e1000 driver?
Yes, there are two problems here.
1. A "double free" message is printed.
2. A call trace is printed.

Problem 1 is an issue in free_irq_vector().
My patch fixes this problem.

Problem 2 is an issue in some network drivers.
The suspend handlers of e1000, tg3 and so on do not call free_irq();
free_irq() is called only by their close handlers.
So if the close handlers are not called before the suspend handlers,
iosapic_unregister_intr() calls WARN_ON(1).

iosapic_unregister_intr (unsigned int gsi)
{
	unsigned long flags;
	int irq, vector, index;
[snip...]
		memset(&iosapic_intr_info[vector], 0,
		       sizeof(struct iosapic_intr_info));
		iosapic_intr_info[vector].low32 |= IOSAPIC_MASK;
		INIT_LIST_HEAD(&iosapic_intr_info[vector].rtes);

		if (idesc->action) {
			printk(KERN_ERR
			       "interrupt handlers still exist on "
			       "IRQ %u\n", irq);
			WARN_ON(1);	/* <== HERE!! */
		}

		/* Free the interrupt vector */
		free_irq_vector(vector);
	}
[snip...]
}
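
The ordering problem behind the WARN_ON(1) can be modeled in a few lines of user-space C. This is only a sketch under stated assumptions: every name with a sim_ prefix is hypothetical, and the struct is a stand-in for the kernel's per-IRQ descriptor, whose action field is non-NULL while a handler is registered.

```c
#include <stddef.h>
#include <stdio.h>

/* Stand-in for the per-IRQ descriptor: handler is non-NULL while a
 * driver still has an interrupt handler registered. */
struct irq_state {
    const char *handler;
};

/* Models request_irq(): the driver registers its handler. */
void sim_request_irq(struct irq_state *s, const char *name)
{
    s->handler = name;
}

/* Models free_irq(): only reached from the driver's close handler. */
void sim_free_irq(struct irq_state *s)
{
    s->handler = NULL;
}

/* Models the check in iosapic_unregister_intr().
 * Returns 1 when the warning path would be taken, 0 otherwise. */
int sim_unregister_intr(const struct irq_state *s, unsigned irq)
{
    if (s->handler) {
        fprintf(stderr, "interrupt handlers still exist on IRQ %u\n", irq);
        return 1;
    }
    return 0;
}
```

If the suspend path reaches sim_unregister_intr() without sim_free_irq() having run first, the function returns 1; once the close path has run, it returns 0. That is the difference solutions A and B aim for, while solution C only silences the check.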

I think there are three solutions.
A. Run "/etc/xen/scripts/network-bridge stop" before reboot.
   I think this is the best solution. But if we do that, where should it go:
   /etc/init.d/network or /etc/init.d/xend?
   And what do we do in the case of routing mode?
   
B. Apply the e1000 patch (I think other drivers need a similar patch).
   I think this is a better solution.
   But I'm not familiar with the e1000 driver,
   so I'd like RH engineers and community people to review it.
   
   The patch is below.
   
http://www.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=edd106fc8ac1826dbe231b70ce0762db24133e5c;hp=e78181feb0b94fb6afeaef3b28d4f5df1b847c98
   
C. #ifdef out the WARN_ON(1) in iosapic_unregister_intr().
   This is the easiest solution.
   And because Xen doesn't do I/O hotplug, this may be the best.

> You sent a patch, I guess that is for xen-ia64-unstable, right?  If
> that is ready to be applied, could you include a description of the
> problem and what the patch does?

Yes, my patch is for xen-ia64-unstable.
My patch fixes problem 1 (the double free messages).
I already have patches for RHEL5 beta.
I'll send them to you soon once my patch is applied to xen-ia64-unstable.

- Bug description
  Please see the following two functions.
  assign_irq_vector() is para-virtualized, but free_irq_vector() is not.
  So ia64_vector_mask is not used in the dom0 kernel.
  free_irq_vector() still tries to clear a bit in ia64_vector_mask in the
  dom0 kernel, but ia64_vector_mask is always zero there, so the double free
  message is printed.
  
  
int
assign_irq_vector (int irq)
{
	int pos, vector;

#ifdef CONFIG_XEN
	if (is_running_on_xen()) {
		extern int xen_assign_irq_vector(int);
		return xen_assign_irq_vector(irq);
	}
#endif
 again:
	pos = find_first_zero_bit(ia64_vector_mask, IA64_NUM_DEVICE_VECTORS);
	vector = IA64_FIRST_DEVICE_VECTOR + pos;
	if (vector > IA64_LAST_DEVICE_VECTOR)
		return -ENOSPC;
	if (test_and_set_bit(pos, ia64_vector_mask))
		goto again;
	return vector;
}

void
free_irq_vector (int vector)
{
	int pos;

	if (vector < IA64_FIRST_DEVICE_VECTOR || vector > IA64_LAST_DEVICE_VECTOR)
		return;

	pos = vector - IA64_FIRST_DEVICE_VECTOR;
	if (!test_and_clear_bit(pos, ia64_vector_mask))
		printk(KERN_WARNING "%s: double free!\n", __FUNCTION__);	/* <== HERE */
}
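
The mismatch between the two functions can be reproduced in a small user-space model. This is a sketch, not kernel code: all names ending in _sim are hypothetical, the vector base 0x30 and the 64-entry bitmap are illustrative values, and xen_assign_sim() merely stands in for a vector handed out by the hypervisor without touching the local bitmap.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define FIRST_VECTOR_SIM 0x30   /* illustrative base, not taken from the kernel */
#define NUM_VECTORS_SIM  64

static uint64_t vector_mask_sim;   /* stand-in for ia64_vector_mask */

/* Native path: allocate from the local bitmap and mark the bit. */
int native_assign_sim(void)
{
    for (int pos = 0; pos < NUM_VECTORS_SIM; pos++) {
        uint64_t bit = UINT64_C(1) << pos;
        if (!(vector_mask_sim & bit)) {
            vector_mask_sim |= bit;
            return FIRST_VECTOR_SIM + pos;
        }
    }
    return -1;   /* no vector left */
}

/* Xen path: the vector comes from elsewhere; the bitmap is never touched. */
int xen_assign_sim(void)
{
    static int next = FIRST_VECTOR_SIM;
    return next++;
}

/* Unmodified free path. Returns 1 if "double free" would be reported. */
int free_vector_sim(int vector)
{
    int pos = vector - FIRST_VECTOR_SIM;
    assert(pos >= 0 && pos < NUM_VECTORS_SIM);

    uint64_t bit = UINT64_C(1) << pos;
    int was_set = (vector_mask_sim & bit) != 0;
    vector_mask_sim &= ~bit;
    if (!was_set) {
        printf("free_vector_sim: double free!\n");
        return 1;
    }
    return 0;
}
```

A vector obtained through native_assign_sim() frees cleanly, while a vector obtained through xen_assign_sim() triggers the message on its very first free, because its bit was never set.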



- What does the patch do?
  It para-virtualizes free_irq_vector().
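
The patch itself is not quoted in this message, so the following is only a guess at its general shape, written as a user-space model: on Xen the free request is forwarded to the hypervisor, mirroring the Xen branch in assign_irq_vector(), and the local bitmap is left alone. All names ending in _sim are hypothetical stand-ins, not kernel or Xen API.

```c
#include <assert.h>

int running_on_xen_sim;          /* stand-in for is_running_on_xen() */
int hypervisor_free_calls_sim;   /* requests forwarded to the hypervisor */
int native_free_calls_sim;       /* uses of the local bitmap path */

/* Stand-in for a call that asks the hypervisor to release the vector. */
void xen_free_irq_vector_sim(int vector)
{
    assert(vector >= 0);
    hypervisor_free_calls_sim++;
}

/* Para-virtualized free: same dispatch shape as assign_irq_vector(). */
void free_irq_vector_pv_sim(int vector)
{
    if (running_on_xen_sim) {
        xen_free_irq_vector_sim(vector);
        return;                  /* the local bitmap is never consulted */
    }
    native_free_calls_sim++;     /* native kernels keep using the bitmap */
}
```

With both assign and free routed the same way, the bitmap is either used on both paths (native) or on neither (Xen), so the spurious double free message should no longer appear.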


> Regarding the e1000 bug, could you comment in the RH bug?
> https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=208599

Yes.

Best Regards,

Akio Takebe

