Re: [Xenomai-core] nervous nmi-watchdog

2006-07-09 Thread Gilles Chanteperdrix
Philippe Gerum wrote:
   I never observe this on my dual PIII whereas I always have the NMI
   watchdog option enabled. Are you running with or without the tracer ?
   
  
  w/o.

Enabling the Xenomai NMI watchdog when the nucleus is built into the kernel
breaks the Linux NMI watchdog test. Maybe some setup is skipped at the Linux
level when this test fails, which could explain some weird behaviour
afterwards.

Do you have the message:
Testing NMI watchdog... CPU#0: NMI appears to be stuck (x-y)!

In Linux boot messages ?
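
One way to check for this without reading the whole log is to search the captured boot messages for the failure string. The sketch below is only an illustration: `nmi_test_stuck` is a hypothetical helper name, not something from the kernel or Xenomai, and it assumes the log is already in memory as a NUL-terminated string with the wording quoted above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns true if the boot log text contains the
 * failure line of the Linux NMI watchdog self-test.
 * The log is assumed to be a NUL-terminated string, e.g. captured
 * dmesg output. A NULL log is treated as "no failure found". */
static bool nmi_test_stuck(const char *bootlog)
{
    if (bootlog == NULL)
        return false;
    return strstr(bootlog, "NMI appears to be stuck") != NULL;
}
```

Matching only on the fixed part of the message avoids depending on the CPU number or on the two counter values printed in parentheses.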

-- 


Gilles Chanteperdrix.

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] nervous nmi-watchdog

2006-07-09 Thread Philippe Gerum
On Sun, 2006-07-09 at 14:50 +0200, Gilles Chanteperdrix wrote:
 Philippe Gerum wrote:
I never observe this on my dual PIII whereas I always have the NMI
watchdog option enabled. Are you running with or without the tracer ?

   
   w/o.
 
 enabling xeno nmi watchdog whereas the nucleus module is built-in break
 Linux nmi watchdog test. Maybe some setups are not done at Linux level
 when this test fails, which could explain some weird behaviour
 afterwards.
 
 Do you have the message:
 Testing NMI watchdog... CPU#0: NMI appears to be stuck (x-y)!
 
 In Linux boot messages ?

Nope. The NMI test looks ok.

Linux version 2.6.17-ipipe ([EMAIL PROTECTED]) (gcc version 3.3.3 (Debian
20040321)) #1 SMP Tue Jul 11 03:17:18 CEST 2006
BIOS-provided physical RAM map:
 BIOS-e820:  - 0009fc00 (usable)
 BIOS-e820: 0009fc00 - 000a (reserved)
 BIOS-e820: 000f - 0010 (reserved)
 BIOS-e820: 0010 - 3fffd000 (usable)
 BIOS-e820: 3fffd000 - 3000 (ACPI data)
 BIOS-e820: 3000 - 4000 (ACPI NVS)
 BIOS-e820: fec0 - fec01000 (reserved)
 BIOS-e820: fee0 - fee01000 (reserved)
 BIOS-e820:  - 0001 (reserved)
127MB HIGHMEM available.
896MB LOWMEM available.
found SMP MP-table at 000f6e80
On node 0 totalpages: 262141
  DMA zone: 4096 pages, LIFO batch:0
  Normal zone: 225280 pages, LIFO batch:31
  HighMem zone: 32765 pages, LIFO batch:7
DMI 2.0 present.
Intel MultiProcessor Specification v1.1
Virtual Wire compatibility mode.
OEM ID: OEM0 Product ID: PROD APIC at: 0xFEE0
Processor #1 6:8 APIC version 17
Processor #0 6:8 APIC version 17
I/O APIC #2 Version 17 at 0xFEC0.
Enabling APIC mode:  Flat.  Using 1 I/O APICs
Processors: 2
Allocating PCI resources starting at 5000 (gap: 4000:bec0)
Built 1 zonelists
Kernel command line: root=/dev/hdc1 ro nmi_watchdog=1 vga=1 console=tty0
console=ttyS0,115200
mapped APIC to d000 (fee0)
mapped IOAPIC to c000 (fec0)
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Initializing CPU#0
PID hash table entries: 4096 (order: 12, 16384 bytes)
Detected 751.749 MHz processor.
Using tsc for high-res timesource
I-pipe 1.3-07: pipeline enabled.
Console: colour VGA+ 80x50
Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
Memory: 1032916k/1048564k available (2491k kernel code, 15100k reserved,
743k data, 272k init, 131060k highmem)
Checking if this processor honours the WP bit even in supervisor mode...
Ok.
Calibrating delay using timer specific routine.. 1505.65 BogoMIPS
(lpj=7528285)
Mount-cache hash table entries: 512
CPU: After generic identify, caps: 0383fbff   
  
CPU: After vendor identify, caps: 0383fbff   
  
CPU: L1 I cache: 16K, L1 D cache: 16K
CPU: L2 cache: 256K
CPU: After all inits, caps: 0383fbff   0040 
 
Checking 'hlt' instruction... OK.
Freeing SMP alternatives: 28k freed
CPU0: Intel Pentium III (Coppermine) stepping 03
Booting processor 1/0 eip 2000
Initializing CPU#1
Calibrating delay using timer specific routine.. 1503.45 BogoMIPS
(lpj=7517293)
CPU: After generic identify, caps: 0383fbff   
  
CPU: After vendor identify, caps: 0383fbff   
  
CPU: L1 I cache: 16K, L1 D cache: 16K
CPU: L2 cache: 256K
CPU: After all inits, caps: 0383fbff   0040 
 
CPU1: Intel Pentium III (Coppermine) stepping 03
Total of 2 processors activated (3009.11 BogoMIPS).
ExtINT not setup in hardware but reported by MP table
ENABLING IO-APIC IRQs
..TIMER: vector=0x31 apic1=0 pin1=2 apic2=0 pin2=0
checking TSC synchronization across 2 CPUs: passed.
Brought up 2 CPUs
migration_cost=1897
NET: Registered protocol family 16
EISA bus registered
PCI: PCI BIOS revision 2.10 entry at 0xf0730, last bus=1
Setting up standard PCI resources
usbcore: registered new driver usbfs
usbcore: registered new driver hub
PCI: Probing PCI hardware
PCI: Probing PCI hardware (bus 00)
PCI quirk: region e400-e43f claimed by PIIX4 ACPI
PCI quirk: region e800-e80f claimed by PIIX4 SMB
PIIX4 devres B PIO at 0290-0297
Boot video device is :01:00.0
PCI: Using IRQ router PIIX/ICH [8086/7110] at :00:04.0
PCI: Bridge: :00:01.0
  IO window: disabled.
  MEM window: e180-e2df
  PREFETCH window: e2f0-e3ff
NET: Registered protocol family 2
IP route cache hash table entries: 32768 (order: 5, 131072 bytes)
TCP established hash table entries: 131072 (order: 9, 2621440 bytes)
TCP bind hash table entries: 65536 (order: 8, 1310720 bytes)
TCP: Hash tables configured (established 131072 bind 

Re: [Xenomai-core] nervous nmi-watchdog

2006-07-09 Thread Jan Kiszka
Philippe Gerum wrote:
 On Sun, 2006-07-09 at 14:50 +0200, Gilles Chanteperdrix wrote:
 Philippe Gerum wrote:
I never observe this on my dual PIII whereas I always have the NMI
watchdog option enabled. Are you running with or without the tracer ?

   
   w/o.

 enabling xeno nmi watchdog whereas the nucleus module is built-in break
 Linux nmi watchdog test. Maybe some setups are not done at Linux level
 when this test fails, which could explain some weird behaviour
 afterwards.

 Do you have the message:
 Testing NMI watchdog... CPU#0: NMI appears to be stuck (x-y)!

 In Linux boot messages ?
 
 Nope. The NMI test looks ok.
 

Re: [Xenomai-core] nervous nmi-watchdog

2006-07-09 Thread Philippe Gerum
On Sun, 2006-07-09 at 18:56 +0200, Jan Kiszka wrote:
 I can confirm the failing NMI test here on my notebook with both nucleus
 and native skin built into the kernel. I haven't seen false positive
 NMIs yet, but the tracer is still on. Will switch off and re-check.

FWIW, here, the false positive is raised immediately when starting the
latency test.

-- 
Philippe.





[Xenomai-core] [PATCH] suspend your real-time system

2006-07-09 Thread Jan Kiszka
Hi,

it's a bit crazy but it seems to work: This tiny patch against an I-pipe
kernel really allows suspend to disk/ram for a *running* Xenomai system.
Only the TSC-based timers are skewed after resume, so that, e.g., the
latency test pauses for longer at that point. But a serious use case would
rather shut down the real-time application before suspending anyway.

Jan


---
 kernel/power/swsusp.c |4 
 1 file changed, 4 insertions(+)

Index: linux-2.6.17-ipipe/kernel/power/swsusp.c
===
--- linux-2.6.17-ipipe.orig/kernel/power/swsusp.c
+++ linux-2.6.17-ipipe/kernel/power/swsusp.c
@@ -217,6 +217,7 @@ int swsusp_suspend(void)
 	if ((error = arch_prepare_suspend()))
 		return error;
 	local_irq_disable();
+	local_irq_disable_hw();
 	/* At this point, device_suspend() has been called, but *not*
 	 * device_power_down(). We *must* device_power_down() now.
 	 * Otherwise, drivers for some devices (e.g. interrupt controllers)
@@ -242,6 +243,7 @@ Restore_highmem:
 	restore_highmem();
 	device_power_up();
 Enable_irqs:
+	local_irq_enable_hw();
 	local_irq_enable();
 	return error;
 }
@@ -250,6 +252,7 @@ int swsusp_resume(void)
 {
 	int error;
 	local_irq_disable();
+	local_irq_disable_hw();
 	if (device_power_down(PMSG_FREEZE))
 		printk(KERN_ERR "Some devices failed to power down, very bad\n");
 	/* We'll ignore saved state, but this gets preempt count (etc) right */
@@ -268,6 +271,7 @@ int swsusp_resume(void)
 	restore_highmem();
 	touch_softlockup_watchdog();
 	device_power_up();
+	local_irq_enable_hw();
 	local_irq_enable();
 	return error;
 }
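
As a rough illustration of why the extra `_hw` calls matter: the sketch below assumes the usual I-pipe behaviour, in which `local_irq_disable()` only stalls the Linux domain virtually while interrupts can still reach the real-time domain, and only `local_irq_disable_hw()` masks them at the CPU. It is a toy user-space model; the struct and all function names in it are made up for illustration and are not kernel or I-pipe APIs.

```c
#include <stdbool.h>

/* Toy model of two-level interrupt masking under an interrupt pipeline.
 * Both flags and all function names are illustrative only. */
struct irq_model {
    bool root_stalled; /* set by the virtualized local_irq_disable() */
    bool hw_masked;    /* set by local_irq_disable_hw() */
};

/* An interrupt reaches the real-time domain unless masked in hardware. */
static bool rt_domain_sees_irq(const struct irq_model *m)
{
    return !m->hw_masked;
}

/* An interrupt reaches Linux handlers only if neither level blocks it. */
static bool linux_sees_irq(const struct irq_model *m)
{
    return !m->hw_masked && !m->root_stalled;
}

/* Mirrors the order used in the patch: virtual disable, then hardware. */
static void model_enter_critical(struct irq_model *m)
{
    m->root_stalled = true;
    m->hw_masked = true;
}

/* Mirrors the order used in the patch: hardware enable, then virtual. */
static void model_leave_critical(struct irq_model *m)
{
    m->hw_masked = false;
    m->root_stalled = false;
}
```

In this model, stalling only the root domain leaves `rt_domain_sees_irq()` true, which is the situation the patch avoids while devices are powered down.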






Re: [Xenomai-core] nervous nmi-watchdog

2006-07-09 Thread Jan Kiszka
Philippe Gerum wrote:
 On Sun, 2006-07-09 at 18:56 +0200, Jan Kiszka wrote:
 I can confirm the failing NMI test here on my notebook with both nucleus
 and native skin built into the kernel. I haven't seen false positive
 NMIs yet, but the tracer is still on. Will switch off and re-check.
 
 FWIW, here, the false positive is raised immediately when starting the
 latency test.
 

Still no luck here, i.e. the watchdog only triggers when I reduce the
threshold. But I'm not on the box where I originally observed this
effect, and that one is out of reach for me now. Remains mysterious,
maybe chipset-dependent. Fortunately, it's only a debug feature.

Jan


PS: Hey, looks good for France. I do have to switch the program now. :)





Re: [Xenomai-core] nervous nmi-watchdog

2006-07-09 Thread Gilles Chanteperdrix
Philippe Gerum wrote:
  On Sun, 2006-07-09 at 18:56 +0200, Jan Kiszka wrote:
   I can confirm the failing NMI test here on my notebook with both nucleus
   and native skin built into the kernel. I haven't seen false positive
   NMIs yet, but the tracer is still on. Will switch off and re-check.
  
  FWIW, here, the false positive is raised immediately when starting the
  latency test.

The attached patch attempts to work around these early shots, counts them,
and displays the counts in /proc/xenomai/nmi_early_shots. Could you try it on
your box and see whether the counters in /proc change?
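
The decision made on each watchdog NMI can be modelled in user space as follows. This is a simplified sketch, not the patch itself: the field names follow the patch, but `struct wd_model` and `wd_model_tick` are hypothetical, `maxlat_tsc` stands in for `rthal_maxlat_tsc`, and the TSC is replaced by a plain integer argument.

```c
#include <stdbool.h>
#include <stdint.h>

/* User-space model of the watchdog state touched by the patch. */
struct wd_model {
    bool armed;
    unsigned early_shots;
    uint64_t arm_date;         /* TSC value when the watchdog was armed */
    uint64_t next_linux_check; /* TSC value of the next Linux-side check */
};

/* Returns true if the NMI counts as a real latency overrun (where the
 * patch would call rthal_nmi_emergency()), false otherwise.
 * An NMI that arrives less than maxlat_tsc ticks after arming is an
 * "early shot": it is counted and the check is pushed back to the
 * point where the full delay has elapsed. */
static bool wd_model_tick(struct wd_model *wd, uint64_t now,
                          uint64_t maxlat_tsc)
{
    if (!wd->armed)
        return false;

    if (now - wd->arm_date < maxlat_tsc) {
        ++wd->early_shots;
        wd->next_linux_check = wd->arm_date + maxlat_tsc;
        return false;
    }

    return true;
}
```

With `arm_date = 1000` and `maxlat_tsc = 1000`, a tick at 1500 is counted as an early shot, while a tick at 2000 is treated as a genuine overrun.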

-- 


Gilles Chanteperdrix.
Index: ksrc/arch/i386/nmi.c
===
--- ksrc/arch/i386/nmi.c	(revision 1316)
+++ ksrc/arch/i386/nmi.c	(working copy)
@@ -61,6 +61,9 @@
 unsigned long perfctr_msr;
 unsigned long long next_linux_check;
 unsigned int p4_cccr_val;
+
+   unsigned early_shots;
+   unsigned long long arm_date;
 };
 char __pad[SMP_CACHE_BYTES];
 } rthal_nmi_wd_t cacheline_aligned;
@@ -108,8 +111,13 @@
 rthal_nmi_wd_t *wd = &rthal_nmi_wds[cpu];
 unsigned long long now;
 
-if (wd->armed)
+   if (wd->armed) {
+   if (rthal_rdtsc() - wd->arm_date < rthal_maxlat_tsc) {
+   ++wd->early_shots;
+   wd->next_linux_check = wd->arm_date + rthal_maxlat_tsc;
+   } else
 rthal_nmi_emergency(regs);
+   }
 
 now = rthal_rdtsc();
 
@@ -142,6 +150,27 @@
 wrmsrl(wd->perfctr_msr, now - wd->next_linux_check);
 }
 
+static int earlyshots_read_proc(char *page,
+   char **start,
+   off_t off, int count, int *eof, void *data)
+{
+   int i, len = 0;
+
+   for_each_online_cpu(i)
+   len += sprintf(page + len, "CPU#%d: %u\n",
+  i, rthal_nmi_wds[i].early_shots);
+   len -= off;
+   if (len <= off + count)
+   *eof = 1;
+   *start = page + off;
+   if (len > count)
+   len = count;
+   if (len < 0)
+   len = 0;
+
+   return len;
+}
+
 int rthal_nmi_request(void (*emergency) (struct pt_regs *))
 {
 if (!nmi_active || !nmi_watchdog_tick)
@@ -180,6 +209,11 @@
 rthal_linux_nmi_tick = nmi_watchdog_tick;
 wmb();
 nmi_watchdog_tick = rthal_nmi_watchdog_tick;
+
+   __rthal_add_proc_leaf("nmi_early_shots",
+ earlyshots_read_proc,
+ NULL, NULL, rthal_proc_root);
+
 return 0;
 }
 
@@ -188,6 +222,8 @@
 if (!rthal_linux_nmi_tick)
 return;
 
+   remove_proc_entry("nmi_early_shots", rthal_proc_root);
+
 wrmsrl(rthal_nmi_perfctr_msr, 0 - RTHAL_CPU_FREQ);
 touch_nmi_watchdog();
 wmb();
@@ -215,6 +251,7 @@
 rthal_local_irq_restore(flags);
 }
 
+   wd->arm_date = rthal_rdtsc();
 wrmsrl(wd->perfctr_msr, 0 - delay);
 wmb();
 wd->armed = 1;
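
The offset and length arithmetic at the end of `earlyshots_read_proc()` follows the old `read_proc` convention and is easy to get wrong, so here it is isolated as a user-space sketch. `proc_window` is a hypothetical name, and `total` stands for the number of bytes that `sprintf()` wrote into the page; the comparisons are copied from the patch as-is.

```c
#include <sys/types.h>

/* Model of the tail of a read_proc handler: given total bytes of
 * formatted output, the caller's offset and its buffer size, return
 * how many bytes to hand back and set *eof when the patch would.
 * *eof is left untouched otherwise, as in the patch. */
static int proc_window(int total, off_t off, int count, int *eof)
{
    int len = total;

    len -= off;               /* bytes remaining past the offset */
    if (len <= off + count)
        *eof = 1;             /* patch's end-of-file condition */
    if (len > count)
        len = count;          /* never return more than requested */
    if (len < 0)
        len = 0;              /* offset beyond the end of the data */

    return len;
}
```

For example, with 20 bytes of output, a read of 8 bytes at offset 0 returns 8 without signalling end of file, while a read at offset 30 returns 0 and sets `*eof`.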