Re: [Xenomai-core] latency hangs on AT91RM9200

2007-03-08 Thread Steven Scholz
Hi Gilles,

 Ok, found the bug (actually, Philippe did), as almost expected, the way
 it is related to the latency program period is not really obvious. The
 bug is that in the macro irq_handler in entry-armv.S, the return value
 (in r0) of __ipipe_grab_irq is overridden by the subsequent call to
 get_irqnr_and_base.
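
To make the clobbering concrete, here is a minimal user-space C model of the problem described above. It is only an illustration: the single shared variable r0, the helper names ending in _model, and the values 1 and 0 are all assumptions, not taken from entry-armv.S or from the actual patch.

```c
/* Hypothetical stand-in for the ARM register r0, shared by every "call". */
static int r0;

/* Models __ipipe_grab_irq: leaves its result ("verdict") in r0. */
static void grab_irq_model(void)
{
	r0 = 1;
}

/* Models get_irqnr_and_base: reuses r0 for its own purposes. */
static void get_irqnr_model(void)
{
	r0 = 0;
}

/* Buggy ordering: the verdict is read only after r0 has been reused. */
static int buggy_handler(void)
{
	grab_irq_model();
	get_irqnr_model();
	return r0;		/* the verdict from grab_irq_model() is lost */
}

/* One possible fix: copy the verdict out of r0 before it is reused. */
static int fixed_handler(void)
{
	int verdict;

	grab_irq_model();
	verdict = r0;		/* saved before the next helper runs */
	get_irqnr_model();
	return verdict;
}
```

In this model buggy_handler() returns the value left behind by the second helper, while fixed_handler() returns the value produced by the first one.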

 
 Here comes a patch. Note that it will only work correctly with
 CONFIG_PREEMPT disabled for now.

Any progress with CONFIG_PREEMPT enabled ?

Thanks!

Steven

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] latency hangs on AT91RM9200

2007-03-08 Thread Gilles Chanteperdrix
Steven Scholz wrote:
 Gilles Chanteperdrix wrote:
 
Steven Scholz wrote:
  Hi Gilles,
  
   Ok, found the bug (actually, Philippe did), as almost expected, the way
   it is related to the latency program period is not really obvious. The
   bug is that in the macro irq_handler in entry-armv.S, the return value
   (in r0) of __ipipe_grab_irq is overridden by the subsequent call to
   get_irqnr_and_base.
  
   
   Here comes a patch. Note that it will only work correctly with
   CONFIG_PREEMPT disabled for now.
  
  Any progress with CONFIG_PREEMPT enabled ?

Enabling CONFIG_PREEMPT almost works, but system calls from non-real-time
tasks fail from time to time.
 
 
 But do I need a new patch?
 
 Because with the last patch I get a BUG() in schedule ...

The patch that almost works with CONFIG_PREEMPT only exists on Philippe's
disk; it has not been released.

-- 
 Gilles Chanteperdrix



Re: [Xenomai-core] latency hangs on AT91RM9200

2007-02-23 Thread Gilles Chanteperdrix
Steven Scholz wrote:
 Hi,
 
 I am picking this issue up again.
 
 I am running 2.6.19 + adeos-ipipe-2.6.19-arm-1.6-02.patch + 
 xenomai-svn-2007-02-22
 on an AT91RM9200 (160MHz/80MHz).
 
 When starting latency -p 200 it runs for a while printing
 
 RTT|  00:05:37  (periodic user-mode task, 200 us period, priority 99)
 RTH|-lat min|-lat avg|-lat max|-overrun|lat best|---lat worst
 RTD|  11.200| 139.200| 236.800|   1|  10.800| 280.800
 RTD|  11.200| 146.400| 253.200|   1|  10.800| 280.800
 RTD|  11.200| 144.400| 240.400|   1|  10.800| 280.800
 
 but then hangs. The timer LED stops blinking. No "soft lockup detected"
 message appears.

The only explanation I have is that the period is too small. I do not
observe the same behaviour with latency -p 1000. Note that setting the
period to a value comparable to the latency is not considered a normal
use of Xenomai. When setting the period to 100 us on x86, the latency is
less than 50 us (and most of the time a lot less than that), so the
period is at least twice the latency. If you observe a latency of 300
us, you should select a period of at least 600 us to run the test in the
same conditions.
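
The sizing rule above amounts to simple arithmetic. A minimal sketch in C, assuming the factor of two is used as a lower bound; min_period_us() and period_is_sane() are hypothetical helper names, not part of Xenomai:

```c
#include <stdint.h>

/*
 * Rule of thumb from the message above: keep the period at least
 * twice the worst-case latency observed on the target.
 */
static uint32_t min_period_us(uint32_t worst_latency_us)
{
	return 2u * worst_latency_us;
}

/* Returns 1 when the chosen period respects the rule, 0 otherwise. */
static int period_is_sane(uint32_t period_us, uint32_t worst_latency_us)
{
	return period_us >= min_period_us(worst_latency_us);
}
```

With the worst-case latency of 280.8 us reported above (rounded up to 281), period_is_sane(200, 281) is 0 while period_is_sane(1000, 281) is 1, and min_period_us(300) gives the 600 us mentioned in the text.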

 
 Using a BDI200, it looks like kernel/sched.c:schedule() is returning
 at the lines
 
 #ifdef CONFIG_IPIPE
 	if (unlikely(!ipipe_root_domain_p))
 		return;
 #endif /* CONFIG_IPIPE */
 
 When stepping through, I only see it entering schedule() and leaving
 it at the above lines, and in include/linux/proc_fs.h:proc_net_fops_create() 
 ...

Ok. Thanks for pointing this out. That is interesting, but not very
informative. It would help if you could get the full backtrace. It
would also be interesting to set a breakpoint on the timer interrupt
handler and to follow what happens from timer interrupt to timer
interrupt.

As far as I remember, Xenomai never calls schedule from a real-time
context, so maybe you can call panic in schedule instead of returning.
I will try to trigger a tracer freeze and dump the tracer at this point
in order to have a better idea of what happens.

-- 
 Gilles Chanteperdrix



Re: [Xenomai-core] latency hangs on AT91RM9200

2007-02-23 Thread Philippe Gerum
On Fri, 2007-02-23 at 12:27 +0100, Steven Scholz wrote:

 #ifdef CONFIG_IPIPE
 	if (unlikely(!ipipe_root_domain_p))
 		return;
 #endif /* CONFIG_IPIPE */
 
 When stepping through, I only see it entering schedule() and leaving
 it at the above lines, and in include/linux/proc_fs.h:proc_net_fops_create() 
 ...
 

This is exactly the kind of issue which 1.6-02 is expected to solve;
this bug has been identified with all earlier versions, so there must be
another spot where the fix is missing in the Adeos patch.

-- 
Philippe.





Re: [Xenomai-core] latency hangs on AT91RM9200

2007-02-23 Thread Philippe Gerum
On Fri, 2007-02-23 at 14:16 +0100, Gilles Chanteperdrix wrote:

 As far as I remember, Xenomai never calls schedule from a real-time
 context, so maybe you can call panic in schedule instead of returning.
 I will try to trigger a tracer freeze and dump the tracer at this point
 in order to have a better idea of what happens.
 

True. Just BUG() instead of returning from schedule() in this case would
do.

-- 
Philippe.





Re: [Xenomai-core] latency hangs on AT91RM9200

2007-02-23 Thread Steven Scholz
Steven Scholz wrote:
 Hi,
 
 I am picking this issue up again.
 
 I am running 2.6.19 + adeos-ipipe-2.6.19-arm-1.6-02.patch + 
 xenomai-svn-2007-02-22
 on an AT91RM9200 (160MHz/80MHz).
 
 When starting latency -p 200 it runs for a while printing
 
 RTT|  00:05:37  (periodic user-mode task, 200 us period, priority 99)
 RTH|-lat min|-lat avg|-lat max|-overrun|lat best|---lat worst
 RTD|  11.200| 139.200| 236.800|   1|  10.800| 280.800
 RTD|  11.200| 146.400| 253.200|   1|  10.800| 280.800
 RTD|  11.200| 144.400| 240.400|   1|  10.800| 280.800
 
 but then hangs. The timer LED stops blinking. No "soft lockup detected"
 message appears.

Easy to reproduce with

~ # cat /dev/zero > /dev/null &
~ # latency -p 200

--
Steven



Re: [Xenomai-core] latency hangs on AT91RM9200

2007-02-23 Thread Steven Scholz
Gilles,

 I am running 2.6.19 + adeos-ipipe-2.6.19-arm-1.6-02.patch + 
 xenomai-svn-2007-02-22
 on an AT91RM9200 (160MHz/80MHz).

 When starting latency -p 200 it runs for a while printing

 RTT|  00:05:37  (periodic user-mode task, 200 us period, priority 99)
 RTH|-lat min|-lat avg|-lat max|-overrun|lat best|---lat worst
 RTD|  11.200| 139.200| 236.800|   1|  10.800| 280.800

 but then hangs. The timer LED stops blinking. No "soft lockup detected"
 message appears.
 
 The only explanation I have is that the period is too small. I do not
 observe the same behaviour with latency -p 1000. Note that setting the
 period to a value comparable to the latency is not considered a normal
 use of Xenomai. 

Sure but I would still not expect the system to hang!
As I said missing a deadline is bad but ok.
But hanging the whole system is not quite ok.

 Using a BDI200, it looks like kernel/sched.c:schedule() is returning
 at the lines

 #ifdef CONFIG_IPIPE
 	if (unlikely(!ipipe_root_domain_p))
 		return;
 #endif /* CONFIG_IPIPE */

 When stepping through, I only see it entering schedule() and leaving
 it at the above lines, and in include/linux/proc_fs.h:proc_net_fops_create() 
 ...
 
 Ok. Thanks for pointing this out. That is interesting, but not very
 informative. It would help if you could get the full backtrace. It
 would also be interesting to set a breakpoint on the timer interrupt
 handler and to follow what happens from timer interrupt to timer
 interrupt.

I tried! Attached is the patch I used. Since the scheduler hangs, I can't use 
normal printk(), right?

*ipipe_current_domain != ipipe_root_domain !
*ipipe_current_domain = c01fc2c0
*ipipe_root_domain= c01af2c0

But I don't get the output of __backtrace()!

my_printk() works with __backtrace(). The dump of a soft lockup works.


Steven

Index: linux-2.6.19/arch/arm/lib/backtrace.S
===
--- linux-2.6.19.orig/arch/arm/lib/backtrace.S
+++ linux-2.6.19/arch/arm/lib/backtrace.S
@@ -100,7 +100,7 @@ ENTRY(c_backtrace)
  */
 1007:		ldr	r0, =.Lbad
 		mov	r1, frame
-		bl	printk
+		bl	my_printk
 		ldmfd	sp!, {r4 - r8, pc}
 		.ltorg
 		
@@ -134,12 +134,12 @@ ENTRY(c_backtrace)
 		ldr	r2, [stack], #-4
 		mov	r1, reg
 		adr	r0, .Lfp
-		bl	printk
+		bl	my_printk
 2:		subs	reg, reg, #1
 		bpl	1b
 		teq	r7, #0
 		adrne	r0, .Lcr
-		blne	printk
+		blne	my_printk
 		mov	r0, stack
 		ldmfd	sp!, {instr, reg, stack, r7, r8, pc}
 
Index: linux-2.6.19/include/linux/kernel.h
===
--- linux-2.6.19.orig/include/linux/kernel.h
+++ linux-2.6.19/include/linux/kernel.h
@@ -146,6 +146,12 @@ asmlinkage int vprintk(const char *fmt, 
 	__attribute__ ((format (printf, 1, 0)));
 asmlinkage int printk(const char * fmt, ...)
 	__attribute__ ((format (printf, 1, 2)));
+
+
+asmlinkage int my_printk(const char * fmt, ...)
+__attribute__ ((format (printf, 1, 2)));
+
+
 #else
 static inline int vprintk(const char *s, va_list args)
 	__attribute__ ((format (printf, 1, 0)));
Index: linux-2.6.19/kernel/printk.c
===
--- linux-2.6.19.orig/kernel/printk.c
+++ linux-2.6.19/kernel/printk.c
@@ -524,6 +524,21 @@ void __ipipe_flush_printk (unsigned virq
 	spin_unlock_irqrestore(__ipipe_printk_lock, flags);
 }
 
+/*FIXME*/
+extern void printascii(const char *);
+asmlinkage int my_printk(const char *fmt, ...)
+{
+va_list va;
+char buff[256];
+
+va_start(va, fmt);
+vsprintf(buff, fmt, va);
+va_end(va);
+
+printascii(buff);
+	return 0;
+}
+
 asmlinkage int printk(const char *fmt, ...)
 {
 	int r, fbytes, oldcount;
Index: linux-2.6.19/kernel/sched.c
===
--- linux-2.6.19.orig/kernel/sched.c
+++ linux-2.6.19/kernel/sched.c
@@ -3327,8 +3327,14 @@ asmlinkage void __sched schedule(void)
 	struct rq *rq;
 
 #ifdef CONFIG_IPIPE
-	if (unlikely(!ipipe_root_domain_p))
+	if (unlikely(!ipipe_root_domain_p)) {
+		my_printk("ipipe_current_domain != ipipe_root_domain !\n");
+		my_printk("ipipe_current_domain = %p\n", ipipe_current_domain);
+		my_printk("ipipe_root_domain= %p\n", ipipe_root_domain);
+		__backtrace();
+		while (1) { barrier();};
 		return;
+	}
 #endif /* CONFIG_IPIPE */
 	/*
 	 * Test if we are atomic.  Since do_exit() needs to call into
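
The my_printk() helper in the patch above formats into a 256-byte stack buffer with vsprintf(), which does not check the buffer size. A minimal user-space sketch of the same idea using vsnprintf() follows; emit_ascii() is a hypothetical stand-in for the kernel's printascii(), and the 64-byte buffer is an arbitrary choice made only to keep the truncation case easy to see.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical sink replacing the UART, so the result can be inspected. */
static char last_line[256];

/* Hypothetical stand-in for printascii(): records the text. */
static void emit_ascii(const char *s)
{
	strncpy(last_line, s, sizeof(last_line) - 1);
	last_line[sizeof(last_line) - 1] = '\0';
}

/* Same shape as my_printk(), but output is truncated, never overflowed. */
static int bounded_printk(const char *fmt, ...)
{
	char buf[64];
	va_list ap;
	int n;

	va_start(ap, fmt);
	n = vsnprintf(buf, sizeof(buf), fmt, ap);	/* always NUL-terminates */
	va_end(ap);

	emit_ascii(buf);
	return n;	/* length the untruncated message would have had */
}
```

A return value of sizeof(buf) or more tells the caller that the message was cut short, which a vsprintf()-based helper cannot report.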



Re: [Xenomai-core] latency hangs on AT91RM9200

2007-02-23 Thread Gilles Chanteperdrix
Steven Scholz wrote:
 Gilles,
 
 
I am running 2.6.19 + adeos-ipipe-2.6.19-arm-1.6-02.patch + 
xenomai-svn-2007-02-22
on an AT91RM9200 (160MHz/80MHz).

When starting latency -p 200 it runs for a while printing

RTT|  00:05:37  (periodic user-mode task, 200 us period, priority 99)
RTH|-lat min|-lat avg|-lat max|-overrun|lat best|---lat worst
RTD|  11.200| 139.200| 236.800|   1|  10.800| 280.800

but then hangs. The timer LED stops blinking. No "soft lockup detected"
message appears.

The only explanation I have is that the period is too small. I do not
observe the same behaviour with latency -p 1000. Note that setting the
period to a value comparable to the latency is not considered a normal
use of Xenomai. 
 
 
 Sure but I would still not expect the system to hang!
 As I said missing a deadline is bad but ok.
 But hanging the whole system is not quite ok.

I want this bug solved too, especially since I am not sure that we will
only see it with too short periods.

 
 
Using a BDI200, it looks like kernel/sched.c:schedule() is returning
at the lines

#ifdef CONFIG_IPIPE
	if (unlikely(!ipipe_root_domain_p))
		return;
#endif /* CONFIG_IPIPE */

When stepping through, I only see it entering schedule() and leaving
it at the above lines, and in include/linux/proc_fs.h:proc_net_fops_create() 
...

Ok. Thanks for pointing this out. That is interesting, but not very
informative. It would help if you could get the full backtrace. It
would also be interesting to set a breakpoint on the timer interrupt
handler and to follow what happens from timer interrupt to timer
interrupt.
 
 
 I tried! Attached is the patch I used. Since the scheduler hangs, I can't use 
 normal printk(), right?
 
 *ipipe_current_domain != ipipe_root_domain !
 *ipipe_current_domain = c01fc2c0
 *ipipe_root_domain= c01af2c0
 
 But I don't get the output of __backtrace()!
 
 my_printk() works with __backtrace(). The dump of a soft lockup works.

I would add the call to printascii(printk_buff) directly in vprintk, and
use printk. Note however that special care must be taken to avoid
recursion when calling printk inside schedule, because printk may use
schedule. Anyway, I think the tracer will give better results than a
simple backtrace.
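
The recursion hazard mentioned above can be illustrated with a small user-space model. This is only a sketch under stated assumptions: model_schedule() and model_printk() are hypothetical stand-ins for the kernel functions, and the plain int flag is only adequate for a single, non-preemptible context; it is not how the kernel itself guards printk.

```c
static int in_diagnostics;		/* non-zero while diagnostics are being printed */
static int diagnostics_emitted;		/* completed top-level diagnostic runs */
static int nested_entries_skipped;	/* re-entries refused by the guard */

static void model_schedule(void);

/* Stands in for printk(), which may itself end up calling schedule(). */
static void model_printk(const char *msg)
{
	(void)msg;
	model_schedule();
}

/* Stands in for schedule() with debug output added on the error path. */
static void model_schedule(void)
{
	if (in_diagnostics) {
		/* Re-entered from our own debug output: do not print again. */
		nested_entries_skipped++;
		return;
	}

	in_diagnostics = 1;
	model_printk("not running over the root domain\n");
	diagnostics_emitted++;
	in_diagnostics = 0;
}
```

Without the in_diagnostics check, model_schedule() and model_printk() would call each other until the stack overflows; with it, each top-level call prints once and the nested entry returns immediately.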

-- 
 Gilles Chanteperdrix



Re: [Xenomai-core] latency hangs on AT91RM9200

2007-02-23 Thread Steven Scholz
Gilles,

 Sure but I would still not expect the system to hang!
 As I said missing a deadline is bad but ok.
 But hanging the whole system is not quite ok.
 
 I want this bug solved too, especially since I am not sure that we will
 only see it with too short periods.

That makes two of us! ;-)

 I would add the call to printascii(printk_buff) directly in vprintk, and
 use printk. Note however that special care must be taken to avoid
 recursion when calling printk inside schedule, because printk may use
 schedule. Anyway, I think the tracer will give better results than a
 simple backtrace.
Ok. Thanks.

So what exactly shall I do? I have never worked with the tracer.

Steven



Re: [Xenomai-core] latency hangs on AT91RM9200

2007-02-23 Thread Steven Scholz
Philippe,

 But I don't get the output of __backtrace()!
 Before calling your backtrace helper, try adding:

 ipipe_set_printk_sync(ipipe_current_domain);
 And then use printk() instead of my_printk()?

 
 Yes, switching this on is a brute force attempt to bypass any
 buffering and allow printk to call the console driver directly
 regardless of the current domain - this may, or may not work, depending
 on the level of brokenness of the current situation (that said, I
 don't get why printascii() as used by my_printk() does not send the
 characters to the uart as expected).

Ok. Thanks.

But since BUG() does the backtrace as well, there's no need for my hack.
As you said replacing return with BUG() is enough.

But as you can see from the backtrace, there's not much info.

Steven




Re: [Xenomai-core] latency hangs on AT91RM9200

2007-02-23 Thread Gilles Chanteperdrix
Steven Scholz wrote:
 Hi,
 
 
schedule. Anyway, I think the tracer will give better results than a
simple backtrace.

Ok. Thanks.

So what exactly shall I do? I have never worked with the tracer.
 
 
 I just enabled
 
 CONFIG_IPIPE_DEBUG=y
 CONFIG_IPIPE_TRACE=y
 CONFIG_IPIPE_TRACE_ENABLE=y
 CONFIG_IPIPE_TRACE_MCOUNT=y
 CONFIG_IPIPE_TRACE_IRQSOFF=y
 CONFIG_IPIPE_TRACE_SHIFT=15
 # CONFIG_IPIPE_TRACE_VMALLOC is not set
 CONFIG_IPIPE_TRACE_ENABLE_VALUE=1
 
 but get
 
   CC  arch/arm/kernel/asm-offsets.s
 In file included from include/linux/bitops.h:9,
  from include/linux/thread_info.h:20,
  from include/linux/preempt.h:9,
  from include/linux/spinlock.h:49,
  from include/linux/capability.h:45,
  from include/linux/sched.h:46,
  from arch/arm/kernel/asm-offsets.c:13:
 include/asm/bitops.h: In function `atomic_set_bit':
 include/asm/bitops.h:40: warning: implicit declaration of function 
 `local_test_iflag_hw'

At first sight, replacing local_test_iflag_hw with
raw_irqs_disabled_flags should work.

-- 
 Gilles Chanteperdrix



Re: [Xenomai-core] latency hangs on AT91RM9200

2007-02-23 Thread Jan Kiszka
Steven Scholz wrote:
 Jan,
 
 So what exactly shall I do? I have never worked with the tracer.

 Start here: http://www.xenomai.org/index.php/I-pipe:Tracer

 I haven't followed all details (while hacking on other bugs :)), but you
 have two options to catch a trace: the one described on that page *if*
 your board survives the crash, or via ipipe_trace_panic_freeze()
 followed by ipipe_trace_panic_dump() (+ switching to sync printk mode
 first).
 
 Do I need CONFIG_IPIPE_TRACE_MCOUNT=y for the ipipe_trace_panic_dump()?

Yes, because this is what adds per-function call trace points. Otherwise
the information is fairly thin.

Jan





Re: [Xenomai-core] latency hangs on AT91RM9200

2007-02-23 Thread Steven Scholz
Hi all,

 I am running 2.6.19 + adeos-ipipe-2.6.19-arm-1.6-02.patch + 
 xenomai-svn-2007-02-22
 on an AT91RM9200 (160MHz/80MHz).
 
 When starting latency -p 200 it runs for a while printing
 
 RTT|  00:05:37  (periodic user-mode task, 200 us period, priority 99)
 RTH|-lat min|-lat avg|-lat max|-overrun|lat best|---lat worst
 RTD|  11.200| 139.200| 236.800|   1|  10.800| 280.800
 RTD|  11.200| 146.400| 253.200|   1|  10.800| 280.800
 RTD|  11.200| 144.400| 240.400|   1|  10.800| 280.800
 
 but then hangs. The timer LED stops blinking. No "soft lockup detected"
 message appears.

After patching kernel/sched.c

 #ifdef CONFIG_IPIPE
-   if (unlikely(!ipipe_root_domain_p))
-   return;
+   if (unlikely(!ipipe_root_domain_p)) {
+   ipipe_set_printk_sync(ipipe_current_domain);
+   ipipe_trace_panic_freeze();
+   ipipe_trace_panic_dump();
+   BUG();
+   }
 #endif /* CONFIG_IPIPE */

~ # cat /dev/zero > /dev/null &
~ # latency -p 400
== Sampling period: 400 us
== Test mode: periodic user-mode task
== All results in microseconds
warming up...
RTT|  00:00:01  (periodic user-mode task, 400 us period, priority 99)
RTH|-lat min|-lat avg|-lat max|-overrun|lat best|---lat worst
RTD| 146.000| 187.200| 258.000|   0| 146.000| 258.000
...
RTD|  72.400| 188.800|3793.600|  97|  68.800|4746.800
RTD|  70.800| 188.800|3256.400| 107|  68.800|4746.800
I-pipe tracer log (30 points):
func                   0 ipipe_trace_panic_freeze+0x10 (schedule+0x54)
func                  -2 schedule+0x14 (ret_slow_syscall+0x0)
func                  -6 __ipipe_walk_pipeline+0x10 (__ipipe_handle_irq+0x190)
[  183] display- 0   -11 xnpod_schedule+0x60c (xnintr_irq_handler+0x128)
[  184] samplin 99   -14 xnpod_schedule+0xb4 (xnpod_suspend_thread+0x178)
func                 -16 xnpod_schedule+0x14 (xnpod_suspend_thread+0x178)
func                 -18 xnpod_suspend_thread+0x14 (xnpod_wait_thread_period+0xb0)
func                 -21 xnpod_wait_thread_period+0x14 (rt_task_wait_period+0x4c)
func                 -23 rt_task_wait_period+0x10 (__rt_task_wait_period+0x54)
func                 -25 __rt_task_wait_period+0x14 (hisyscall_event+0x160)
func                 -27 hisyscall_event+0x14 (__ipipe_dispatch_event+0xc0)
func                 -29 __ipipe_dispatch_event+0x14 (__ipipe_syscall_root+0x88)
func                 -31 __ipipe_syscall_root+0x10 (vector_swi+0x68)
func                 -35 rt_timer_tsc+0x10 (__rt_timer_tsc+0x1c)
func                 -36 __rt_timer_tsc+0x14 (hisyscall_event+0x160)
func                 -39 hisyscall_event+0x14 (__ipipe_dispatch_event+0xc0)
func                 -40 __ipipe_dispatch_event+0x14 (__ipipe_syscall_root+0x88)
func                 -42 __ipipe_syscall_root+0x10 (vector_swi+0x68)
func                 -46 __ipipe_restore_pipeline_head+0x10 (xnpod_wait_thread_period+0x1b4)
[  184] samplin 99   -49 xnpod_schedule+0x60c (xnpod_suspend_thread+0x178)
[  183] display- 0   -53 xnpod_schedule+0xb4 (xnintr_irq_handler+0x128)
func                 -55 xnpod_schedule+0x14 (xnintr_irq_handler+0x128)
func                 -60 __ipipe_mach_set_dec+0x10 (xntimer_tick_aperiodic+0x2fc)
[  184] samplin 99   -69 xnpod_resume_thread+0x5c (xnthread_periodic_handler+0x30)
func                 -71 xnpod_resume_thread+0x10 (xnthread_periodic_handler+0x30)
func                 -73 xnthread_periodic_handler+0x10 (xntimer_tick_aperiodic+0xcc)
func                 -77 xntimer_tick_aperiodic+0x14 (xnpod_announce_tick+0x14)
func                 -79 xnpod_announce_tick+0x10 (xnintr_irq_handler+0x54)
func                 -82 xnintr_irq_handler+0x14 (xnintr_clock_handler+0x20)
func                 -84 xnintr_clock_handler+0x10 (__ipipe_dispatch_wired+0xe4)
kernel BUG at kernel/sched.c:3337!
Unable to handle kernel NULL pointer dereference at virtual address 
pgd = c1a44000
[] *pgd=21a1a031, *pte=, *ppte=
Internal error: Oops: 817 [#1]
Modules linked in:
CPU: 0
PC is at __bug+0x44/0x58
LR is at __ipipe_sync_stage+0x10/0x294
pc : [c001ed08]lr : [c0051414]Not tainted
sp : c1e8ff64  ip :   fp : c1e8ff74
r10: 003a5b10  r9 : c1e8e000  r8 : 
r7 :   r6 :   r5 : c01ba860  r4 : 
r3 :   r2 : c01ba880  r1 :   r0 : 0001
Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  Segment user
Control: C000717F
Table: 21A44000  DAC: 0015
Process display-181 (pid: 183, stack limit = 0xc1e8e250)
Stack: (0xc1e8ff64 to 0xc1e9)
ff60:   c1e8ffac c1e8ff78 c0181588 c001ecd4 c1e8ff84 c0020340
ff80: c002007c  fefff000    c1e8e000 003a5b10
ffa0:  c1e8ffb0 c001ae04 c0181530 0011b333 3300 07d0 
ffc0: 2028 0320   b714 

Re: [Xenomai-core] latency hangs on AT91RM9200

2007-02-23 Thread Gilles Chanteperdrix
Steven Scholz wrote:
 Hi all,
 
 
I am running 2.6.19 + adeos-ipipe-2.6.19-arm-1.6-02.patch + 
xenomai-svn-2007-02-22
on an AT91RM9200 (160MHz/80MHz).

When starting latency -p 200 it runs for a while printing

RTT|  00:05:37  (periodic user-mode task, 200 us period, priority 99)
RTH|-lat min|-lat avg|-lat max|-overrun|lat best|---lat worst
RTD|  11.200| 139.200| 236.800|   1|  10.800| 280.800
RTD|  11.200| 146.400| 253.200|   1|  10.800| 280.800
RTD|  11.200| 144.400| 240.400|   1|  10.800| 280.800

but then hangs. The timer LED stops blinking. No "soft lockup detected"
message appears.
 
 
 After patching kernel/sched.c
 
  #ifdef CONFIG_IPIPE
 -   if (unlikely(!ipipe_root_domain_p))
 -   return;
 +   if (unlikely(!ipipe_root_domain_p)) {
 +   ipipe_set_printk_sync(ipipe_current_domain);
 +   ipipe_trace_panic_freeze();
 +   ipipe_trace_panic_dump();
 +   BUG();
 +   }
  #endif /* CONFIG_IPIPE */
 
 ~ # cat /dev/zero > /dev/null &
 ~ # latency -p 400
 == Sampling period: 400 us
 == Test mode: periodic user-mode task
 == All results in microseconds
 warming up...
 RTT|  00:00:01  (periodic user-mode task, 400 us period, priority 99)
 RTH|-lat min|-lat avg|-lat max|-overrun|lat best|---lat worst
 RTD| 146.000| 187.200| 258.000|   0| 146.000| 258.000
 ...
 RTD|  72.400| 188.800|3793.600|  97|  68.800|4746.800
 RTD|  70.800| 188.800|3256.400| 107|  68.800|4746.800
 I-pipe tracer log (30 points):
 func                   0 ipipe_trace_panic_freeze+0x10 (schedule+0x54)
 func                  -2 schedule+0x14 (ret_slow_syscall+0x0)
 func                  -6 __ipipe_walk_pipeline+0x10 (__ipipe_handle_irq+0x190)
 [  183] display- 0   -11 xnpod_schedule+0x60c (xnintr_irq_handler+0x128)
 [  184] samplin 99   -14 xnpod_schedule+0xb4 (xnpod_suspend_thread+0x178)
 func                 -16 xnpod_schedule+0x14 (xnpod_suspend_thread+0x178)
 func                 -18 xnpod_suspend_thread+0x14 (xnpod_wait_thread_period+0xb0)
 func                 -21 xnpod_wait_thread_period+0x14 (rt_task_wait_period+0x4c)
 func                 -23 rt_task_wait_period+0x10 (__rt_task_wait_period+0x54)
 func                 -25 __rt_task_wait_period+0x14 (hisyscall_event+0x160)
 func                 -27 hisyscall_event+0x14 (__ipipe_dispatch_event+0xc0)
 func                 -29 __ipipe_dispatch_event+0x14 (__ipipe_syscall_root+0x88)
 func                 -31 __ipipe_syscall_root+0x10 (vector_swi+0x68)
 func                 -35 rt_timer_tsc+0x10 (__rt_timer_tsc+0x1c)
 func                 -36 __rt_timer_tsc+0x14 (hisyscall_event+0x160)
 func                 -39 hisyscall_event+0x14 (__ipipe_dispatch_event+0xc0)
 func                 -40 __ipipe_dispatch_event+0x14 (__ipipe_syscall_root+0x88)
 func                 -42 __ipipe_syscall_root+0x10 (vector_swi+0x68)
 func                 -46 __ipipe_restore_pipeline_head+0x10 (xnpod_wait_thread_period+0x1b4)
 [  184] samplin 99   -49 xnpod_schedule+0x60c (xnpod_suspend_thread+0x178)
 [  183] display- 0   -53 xnpod_schedule+0xb4 (xnintr_irq_handler+0x128)
 func                 -55 xnpod_schedule+0x14 (xnintr_irq_handler+0x128)
 func                 -60 __ipipe_mach_set_dec+0x10 (xntimer_tick_aperiodic+0x2fc)
 [  184] samplin 99   -69 xnpod_resume_thread+0x5c (xnthread_periodic_handler+0x30)
 func                 -71 xnpod_resume_thread+0x10 (xnthread_periodic_handler+0x30)
 func                 -73 xnthread_periodic_handler+0x10 (xntimer_tick_aperiodic+0xcc)
 func                 -77 xntimer_tick_aperiodic+0x14 (xnpod_announce_tick+0x14)
 func                 -79 xnpod_announce_tick+0x10 (xnintr_irq_handler+0x54)
 func                 -82 xnintr_irq_handler+0x14 (xnintr_clock_handler+0x20)
 func                 -84 xnintr_clock_handler+0x10 (__ipipe_dispatch_wired+0xe4)
 kernel BUG at kernel/sched.c:3337!
 Unable to handle kernel NULL pointer dereference at virtual address 
 pgd = c1a44000
 [] *pgd=21a1a031, *pte=, *ppte=
 Internal error: Oops: 817 [#1]
 Modules linked in:
 CPU: 0
 PC is at __bug+0x44/0x58
 LR is at __ipipe_sync_stage+0x10/0x294
 pc : [c001ed08]lr : [c0051414]Not tainted
 sp : c1e8ff64  ip :   fp : c1e8ff74
 r10: 003a5b10  r9 : c1e8e000  r8 : 
 r7 :   r6 :   r5 : c01ba860  r4 : 
 r3 :   r2 : c01ba880  r1 :   r0 : 0001
 Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  Segment user
 Control: C000717F
 Table: 21A44000  DAC: 0015
 Process display-181 (pid: 183, stack limit = 0xc1e8e250)
 Stack: (0xc1e8ff64 to 0xc1e9)
 ff60:   c1e8ffac c1e8ff78 c0181588 c001ecd4 c1e8ff84 c0020340
 ff80: c002007c  fefff000    c1e8e000 003a5b10
 ffa0:  c1e8ffb0