Re: [Xenomai-core] latency hangs on AT91RM9200

2007-03-08 Thread Gilles Chanteperdrix
Steven Scholz wrote:
> Gilles Chanteperdrix wrote:
> 
>>Steven Scholz wrote:
>> > Hi Gilles,
>> > 
>> > >> Ok, found the bug (actually, Philippe did), as almost expected, the way
>> > >> it is related to the latency program period is not really obvious. The
>> > >> bug is that in the macro "irq_handler" in entry-armv.S, the return value
>> > >> (in r0) of __ipipe_grab_irq is overriden by the subsequent call to
>> > >> get_irqnr_and_base.
>> > >>
>> > > 
>> > > Here comes a patch. Note that it will only work correctly with
>> > > CONFIG_PREEMPT disabled for now.
>> > 
>> > Any progress with CONFIG_PREEMPT enabled ?
>>
>>enabling CONFIG_PREEMPT almost works. But non real-time tasks system
>>calls fail from time to time.
> 
> 
> But I do need a new patch?
> 
> Cause with last patch I get "BUG()" in schedule ...

The patch that almost works with CONFIG_PREEMPT only exist on Philippe's
disk, it has not been released.

-- 
 Gilles Chanteperdrix

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] latency hangs on AT91RM9200

2007-03-08 Thread Steven Scholz
Gilles Chanteperdrix wrote:
> Steven Scholz wrote:
>  > Hi Gilles,
>  > 
>  > >> Ok, found the bug (actually, Philippe did), as almost expected, the way
>  > >> it is related to the latency program period is not really obvious. The
>  > >> bug is that in the macro "irq_handler" in entry-armv.S, the return value
>  > >> (in r0) of __ipipe_grab_irq is overriden by the subsequent call to
>  > >> get_irqnr_and_base.
>  > >>
>  > > 
>  > > Here comes a patch. Note that it will only work correctly with
>  > > CONFIG_PREEMPT disabled for now.
>  > 
>  > Any progress with CONFIG_PREEMPT enabled ?
> 
> enabling CONFIG_PREEMPT almost works. But non real-time tasks system
> calls fail from time to time.

But I do need a new patch?

Cause with last patch I get "BUG()" in schedule ...

Steven


___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] latency hangs on AT91RM9200

2007-03-08 Thread Gilles Chanteperdrix
Steven Scholz wrote:
 > Hi Gilles,
 > 
 > >> Ok, found the bug (actually, Philippe did), as almost expected, the way
 > >> it is related to the latency program period is not really obvious. The
 > >> bug is that in the macro "irq_handler" in entry-armv.S, the return value
 > >> (in r0) of __ipipe_grab_irq is overriden by the subsequent call to
 > >> get_irqnr_and_base.
 > >>
 > > 
 > > Here comes a patch. Note that it will only work correctly with
 > > CONFIG_PREEMPT disabled for now.
 > 
 > Any progress with CONFIG_PREEMPT enabled ?

enabling CONFIG_PREEMPT almost works. But non real-time tasks system
calls fail from time to time.

-- 


Gilles Chanteperdrix.

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] latency hangs on AT91RM9200

2007-03-08 Thread Steven Scholz
Hi Gilles,

>> Ok, found the bug (actually, Philippe did), as almost expected, the way
>> it is related to the latency program period is not really obvious. The
>> bug is that in the macro "irq_handler" in entry-armv.S, the return value
>> (in r0) of __ipipe_grab_irq is overriden by the subsequent call to
>> get_irqnr_and_base.
>>
> 
> Here comes a patch. Note that it will only work correctly with
> CONFIG_PREEMPT disabled for now.

Any progress with CONFIG_PREEMPT enabled ?

Thanks!

Steven

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] latency hangs on AT91RM9200

2007-02-26 Thread Steven Scholz
Hi Gilles,

>> Ok, found the bug (actually, Philippe did), as almost expected, the way
>> it is related to the latency program period is not really obvious. The
>> bug is that in the macro "irq_handler" in entry-armv.S, the return value
>> (in r0) of __ipipe_grab_irq is overriden by the subsequent call to
>> get_irqnr_and_base.
> 
> Here comes a patch. Note that it will only work correctly with
> CONFIG_PREEMPT disabled for now.

Right now we are testing it on two boards.
Looks good.

Thanks a million!

--
Steven

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] latency hangs on AT91RM9200

2007-02-26 Thread Gilles Chanteperdrix
Gilles Chanteperdrix wrote:
> Ok, found the bug (actually, Philippe did), as almost expected, the way
> it is related to the latency program period is not really obvious. The
> bug is that in the macro "irq_handler" in entry-armv.S, the return value
> (in r0) of __ipipe_grab_irq is overriden by the subsequent call to
> get_irqnr_and_base.
> 

Here comes a patch. Note that it will only work correctly with
CONFIG_PREEMPT disabled for now.

-- 
 Gilles Chanteperdrix
--- arch/arm/kernel/entry-armv.S~	2007-02-26 11:55:18.0 +0100
+++ arch/arm/kernel/entry-armv.S	2007-02-26 12:07:45.0 +0100
@@ -29,12 +29,16 @@
  * Interrupt handling.  Preserves r7, r8, r9
  */
 	.macro	irq_handler
-1:	get_irqnr_and_base r0, r6, r5, lr
+#ifdef CONFIG_IPIPE
+	mov r0, #2
+#endif
+1:	get_irqnr_and_base r4, r6, r5, lr
 	movne	r1, sp
 	@
 	@ routine called with r0 = irq number, r1 = struct pt_regs *
 	@
 	adrne	lr, 1b
+	movne   r0, r4
 #ifdef CONFIG_IPIPE
 	bne	__ipipe_grab_irq
 #else
@@ -48,19 +52,23 @@
 	 * this macro assumes that irqstat (r6) and base (r5) are
 	 * preserved from get_irqnr_and_base above
 	 */
-	test_for_ipi r0, r6, r5, lr
+	test_for_ipi r4, r6, r5, lr
 	movne	r0, sp
 	adrne	lr, 1b
 	bne	do_IPI
 
 #ifdef CONFIG_LOCAL_TIMERS
-	test_for_ltirq r0, r6, r5, lr
+	test_for_ltirq r4, r6, r5, lr
 	movne	r0, sp
 	adrne	lr, 1b
 	bne	do_local_timer
 #endif
 #endif
-
+#ifdef CONFIG_IPIPE
+	cmp	r0, #2
+	moveq	r0, sp
+	bleq	__ipipe_check_root
+#endif	
 	.endm
 
 /*
___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] latency hangs on AT91RM9200

2007-02-26 Thread Gilles Chanteperdrix
Gilles Chanteperdrix wrote:
> Steven Scholz wrote:
> 
>>Hi all,
>>
>>
>>
>>>I am running 2.6.19 + adeos-ipipe-2.6.19-arm-1.6-02.patch + 
>>>xenomai-svn-2007-02-22
>>>on an AT91RM9200 (160MHz/80MHz).
>>>
>>>When starting "latency -p 200" it runs for a while printing
>>>
>>>RTT|  00:05:37  (periodic user-mode task, 200 us period, priority 99)
>>>RTH|-lat min|-lat avg|-lat max|-overrun|lat best|---lat worst
>>>RTD|  11.200| 139.200| 236.800|   1|  10.800| 280.800
>>>RTD|  11.200| 146.400| 253.200|   1|  10.800| 280.800
>>>RTD|  11.200| 144.400| 240.400|   1|  10.800| 280.800
>>>
>>>but then hangs. The timer LED stops blinking. No "soft lockup detected" 
>>>appears.
>>
>>
>>After patching kernel/sched.c
>>
>> #ifdef CONFIG_IPIPE
>>-   if (unlikely(!ipipe_root_domain_p))
>>-   return;
>>+   if (unlikely(!ipipe_root_domain_p)) {
>>+   ipipe_set_printk_sync(ipipe_current_domain);
>>+   ipipe_trace_panic_freeze();
>>+   ipipe_trace_panic_dump();
>>+   BUG();
>>+   }
>> #endif /* CONFIG_IPIPE */
>>
>>~ # cat /dev/zero > /dev/null &
>>~ # latency -p 400
>>== Sampling period: 400 us
>>== Test mode: periodic user-mode task
>>== All results in microseconds
>>warming up...
>>RTT|  00:00:01  (periodic user-mode task, 400 us period, priority 99)
>>RTH|-lat min|-lat avg|-lat max|-overrun|lat best|---lat worst
>>RTD| 146.000| 187.200| 258.000|   0| 146.000| 258.000
>>...
>>RTD|  72.400| 188.800|3793.600|  97|  68.800|4746.800
>>RTD|  70.800| 188.800|3256.400| 107|  68.800|4746.800
>>I-pipe tracer log (30 points):
>>func0 ipipe_trace_panic_freeze+0x10 (schedule+0x54)
>>func   -2 schedule+0x14 (ret_slow_syscall+0x0)
>>func   -6 __ipipe_walk_pipeline+0x10 
>>(__ipipe_handle_irq+0x190)
>>[  183] display- 0-11 xnpod_schedule+0x60c (xnintr_irq_handler+0x128)
>>[  184] samplin 99-14 xnpod_schedule+0xb4 (xnpod_suspend_thread+0x178)
>>func  -16 xnpod_schedule+0x14 (xnpod_suspend_thread+0x178)
>>func  -18 xnpod_suspend_thread+0x14 
>>(xnpod_wait_thread_period+0xb0)
>>func  -21 xnpod_wait_thread_period+0x14 
>>(rt_task_wait_period+0x4c)
>>func  -23 rt_task_wait_period+0x10 
>>(__rt_task_wait_period+0x54)
>>func  -25 __rt_task_wait_period+0x14 (hisyscall_event+0x160)
>>func  -27 hisyscall_event+0x14 (__ipipe_dispatch_event+0xc0)
>>func  -29 __ipipe_dispatch_event+0x14 
>>(__ipipe_syscall_root+0x88)
>>func  -31 __ipipe_syscall_root+0x10 (vector_swi+0x68)
>>func  -35 rt_timer_tsc+0x10 (__rt_timer_tsc+0x1c)
>>func  -36 __rt_timer_tsc+0x14 (hisyscall_event+0x160)
>>func  -39 hisyscall_event+0x14 (__ipipe_dispatch_event+0xc0)
>>func  -40 __ipipe_dispatch_event+0x14 
>>(__ipipe_syscall_root+0x88)
>>func  -42 __ipipe_syscall_root+0x10 (vector_swi+0x68)
>>func  -46 __ipipe_restore_pipeline_head+0x10 
>>(xnpod_wait_thread_period+0x1b4)
>>[  184] samplin 99-49 xnpod_schedule+0x60c (xnpod_suspend_thread+0x178)
>>[  183] display- 0-53 xnpod_schedule+0xb4 (xnintr_irq_handler+0x128)
>>func  -55 xnpod_schedule+0x14 (xnintr_irq_handler+0x128)
>>func  -60 __ipipe_mach_set_dec+0x10 
>>(xntimer_tick_aperiodic+0x2fc)
>>[  184] samplin 99-69 xnpod_resume_thread+0x5c 
>>(xnthread_periodic_handler+0x30)
>>func  -71 xnpod_resume_thread+0x10 
>>(xnthread_periodic_handler+0x30)
>>func  -73 xnthread_periodic_handler+0x10 
>>(xntimer_tick_aperiodic+0xcc)
>>func  -77 xntimer_tick_aperiodic+0x14 
>>(xnpod_announce_tick+0x14)
>>func  -79 xnpod_announce_tick+0x10 (xnintr_irq_handler+0x54)
>>func  -82 xnintr_irq_handler+0x14 (xnintr_clock_handler+0x20)
>>func  -84 xnintr_clock_handler+0x10 
>>(__ipipe_dispatch_wired+0xe4)
>>kernel BUG at kernel/sched.c:3337!
>>Unable to handle kernel NULL pointer dereference at virtual address 
>>pgd = c1a44000
>>[] *pgd=21a1a031, *pte=, *ppte=
>>Internal error: Oops: 817 [#1]
>>Modules linked in:
>>CPU: 0
>>PC is at __bug+0x44/0x58
>>LR is at __ipipe_sync_stage+0x10/0x294
>>pc : []lr : []Not tainted
>>sp : c1e8ff64  ip :   fp : c1e8ff74
>>r10: 003a5b10  r9 : c1e8e000  r8 : 
>>r7 :   r6 :   r5 : c01ba860  r4 : 
>>r3 :   r2 : c01ba880  r1 :   r0 : 0001
>>Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  Segment user
>>Control: C000717F
>>Table: 21A44000  DAC: 0015
>>Process display-181 (pid: 183, stack limit = 0xc1e8e250)
>>Stack: (0xc1e8ff64 to 0xc1e9)
>>ff60:  ff

Re: [Xenomai-core] latency hangs on AT91RM9200

2007-02-26 Thread Steven Scholz
Jan,

>>> BTW, do you have the nucleus option "disable priority coupling" set or
>>> not in your config (the latter is default, ie. it's on)? The other bug
>>> is triggered when prio coupling is on, and doesn't appear while it's
>>> off. Just to collect more patterns...
>> CONFIG_XENO_OPT_PRIOCPL=y
> 
> [Ah, using trunk already]
> 
> Mind given a try what happens when it's off?

I switched it off:

# CONFIG_XENO_OPT_PRIOCPL is not set

Still It crashes/hangs after some time.
You want to see the log file?
I assume it will look pretty much the same ...

--
Steven

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] latency hangs on AT91RM9200

2007-02-23 Thread Gilles Chanteperdrix
Steven Scholz wrote:
> Hi all,
> 
> 
>>I am running 2.6.19 + adeos-ipipe-2.6.19-arm-1.6-02.patch + 
>>xenomai-svn-2007-02-22
>>on an AT91RM9200 (160MHz/80MHz).
>>
>>When starting "latency -p 200" it runs for a while printing
>>
>>RTT|  00:05:37  (periodic user-mode task, 200 us period, priority 99)
>>RTH|-lat min|-lat avg|-lat max|-overrun|lat best|---lat worst
>>RTD|  11.200| 139.200| 236.800|   1|  10.800| 280.800
>>RTD|  11.200| 146.400| 253.200|   1|  10.800| 280.800
>>RTD|  11.200| 144.400| 240.400|   1|  10.800| 280.800
>>
>>but then hangs. The timer LED stops blinking. No "soft lockup detected" 
>>appears.
> 
> 
> After patching kernel/sched.c
> 
>  #ifdef CONFIG_IPIPE
> -   if (unlikely(!ipipe_root_domain_p))
> -   return;
> +   if (unlikely(!ipipe_root_domain_p)) {
> +   ipipe_set_printk_sync(ipipe_current_domain);
> +   ipipe_trace_panic_freeze();
> +   ipipe_trace_panic_dump();
> +   BUG();
> +   }
>  #endif /* CONFIG_IPIPE */
> 
> ~ # cat /dev/zero > /dev/null &
> ~ # latency -p 400
> == Sampling period: 400 us
> == Test mode: periodic user-mode task
> == All results in microseconds
> warming up...
> RTT|  00:00:01  (periodic user-mode task, 400 us period, priority 99)
> RTH|-lat min|-lat avg|-lat max|-overrun|lat best|---lat worst
> RTD| 146.000| 187.200| 258.000|   0| 146.000| 258.000
> ...
> RTD|  72.400| 188.800|3793.600|  97|  68.800|4746.800
> RTD|  70.800| 188.800|3256.400| 107|  68.800|4746.800
> I-pipe tracer log (30 points):
> func0 ipipe_trace_panic_freeze+0x10 (schedule+0x54)
> func   -2 schedule+0x14 (ret_slow_syscall+0x0)
> func   -6 __ipipe_walk_pipeline+0x10 
> (__ipipe_handle_irq+0x190)
> [  183] display- 0-11 xnpod_schedule+0x60c (xnintr_irq_handler+0x128)
> [  184] samplin 99-14 xnpod_schedule+0xb4 (xnpod_suspend_thread+0x178)
> func  -16 xnpod_schedule+0x14 (xnpod_suspend_thread+0x178)
> func  -18 xnpod_suspend_thread+0x14 
> (xnpod_wait_thread_period+0xb0)
> func  -21 xnpod_wait_thread_period+0x14 
> (rt_task_wait_period+0x4c)
> func  -23 rt_task_wait_period+0x10 
> (__rt_task_wait_period+0x54)
> func  -25 __rt_task_wait_period+0x14 (hisyscall_event+0x160)
> func  -27 hisyscall_event+0x14 (__ipipe_dispatch_event+0xc0)
> func  -29 __ipipe_dispatch_event+0x14 
> (__ipipe_syscall_root+0x88)
> func  -31 __ipipe_syscall_root+0x10 (vector_swi+0x68)
> func  -35 rt_timer_tsc+0x10 (__rt_timer_tsc+0x1c)
> func  -36 __rt_timer_tsc+0x14 (hisyscall_event+0x160)
> func  -39 hisyscall_event+0x14 (__ipipe_dispatch_event+0xc0)
> func  -40 __ipipe_dispatch_event+0x14 
> (__ipipe_syscall_root+0x88)
> func  -42 __ipipe_syscall_root+0x10 (vector_swi+0x68)
> func  -46 __ipipe_restore_pipeline_head+0x10 
> (xnpod_wait_thread_period+0x1b4)
> [  184] samplin 99-49 xnpod_schedule+0x60c (xnpod_suspend_thread+0x178)
> [  183] display- 0-53 xnpod_schedule+0xb4 (xnintr_irq_handler+0x128)
> func  -55 xnpod_schedule+0x14 (xnintr_irq_handler+0x128)
> func  -60 __ipipe_mach_set_dec+0x10 
> (xntimer_tick_aperiodic+0x2fc)
> [  184] samplin 99-69 xnpod_resume_thread+0x5c 
> (xnthread_periodic_handler+0x30)
> func  -71 xnpod_resume_thread+0x10 
> (xnthread_periodic_handler+0x30)
> func  -73 xnthread_periodic_handler+0x10 
> (xntimer_tick_aperiodic+0xcc)
> func  -77 xntimer_tick_aperiodic+0x14 
> (xnpod_announce_tick+0x14)
> func  -79 xnpod_announce_tick+0x10 (xnintr_irq_handler+0x54)
> func  -82 xnintr_irq_handler+0x14 (xnintr_clock_handler+0x20)
> func  -84 xnintr_clock_handler+0x10 
> (__ipipe_dispatch_wired+0xe4)
> kernel BUG at kernel/sched.c:3337!
> Unable to handle kernel NULL pointer dereference at virtual address 
> pgd = c1a44000
> [] *pgd=21a1a031, *pte=, *ppte=
> Internal error: Oops: 817 [#1]
> Modules linked in:
> CPU: 0
> PC is at __bug+0x44/0x58
> LR is at __ipipe_sync_stage+0x10/0x294
> pc : []lr : []Not tainted
> sp : c1e8ff64  ip :   fp : c1e8ff74
> r10: 003a5b10  r9 : c1e8e000  r8 : 
> r7 :   r6 :   r5 : c01ba860  r4 : 
> r3 :   r2 : c01ba880  r1 :   r0 : 0001
> Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  Segment user
> Control: C000717F
> Table: 21A44000  DAC: 0015
> Process display-181 (pid: 183, stack limit = 0xc1e8e250)
> Stack: (0xc1e8ff64 to 0xc1e9)
> ff60:   c1e8ffac c1e8ff78 c0181588 c001ecd4 c1e8ff84 c0

Re: [Xenomai-core] latency hangs on AT91RM9200

2007-02-23 Thread Jan Kiszka
Gilles Chanteperdrix wrote:
> Steven Scholz wrote:
>> Hi all,
>>
>>
>>> I am running 2.6.19 + adeos-ipipe-2.6.19-arm-1.6-02.patch + 
>>> xenomai-svn-2007-02-22
>>> on an AT91RM9200 (160MHz/80MHz).
>>>
>>> When starting "latency -p 200" it runs for a while printing
>>>
>>> RTT|  00:05:37  (periodic user-mode task, 200 us period, priority 99)
>>> RTH|-lat min|-lat avg|-lat max|-overrun|lat best|---lat 
>>> worst
>>> RTD|  11.200| 139.200| 236.800|   1|  10.800| 
>>> 280.800
>>> RTD|  11.200| 146.400| 253.200|   1|  10.800| 
>>> 280.800
>>> RTD|  11.200| 144.400| 240.400|   1|  10.800| 
>>> 280.800
>>>
>>> but then hangs. The timer LED stops blinking. No "soft lockup detected" 
>>> appears.
>>
>> After patching kernel/sched.c
>>
>>  #ifdef CONFIG_IPIPE
>> -   if (unlikely(!ipipe_root_domain_p))
>> -   return;
>> +   if (unlikely(!ipipe_root_domain_p)) {
>> +   ipipe_set_printk_sync(ipipe_current_domain);
>> +   ipipe_trace_panic_freeze();
>> +   ipipe_trace_panic_dump();
>> +   BUG();
>> +   }
>>  #endif /* CONFIG_IPIPE */
>>
>> ~ # cat /dev/zero > /dev/null &
>> ~ # latency -p 400
>> == Sampling period: 400 us
>> == Test mode: periodic user-mode task
>> == All results in microseconds
>> warming up...
>> RTT|  00:00:01  (periodic user-mode task, 400 us period, priority 99)
>> RTH|-lat min|-lat avg|-lat max|-overrun|lat best|---lat worst
>> RTD| 146.000| 187.200| 258.000|   0| 146.000| 258.000
>> ...
>> RTD|  72.400| 188.800|3793.600|  97|  68.800|4746.800
>> RTD|  70.800| 188.800|3256.400| 107|  68.800|4746.800
>> I-pipe tracer log (30 points):
>> func0 ipipe_trace_panic_freeze+0x10 (schedule+0x54)
>> func   -2 schedule+0x14 (ret_slow_syscall+0x0)
>> func   -6 __ipipe_walk_pipeline+0x10 
>> (__ipipe_handle_irq+0x190)
>> [  183] display- 0-11 xnpod_schedule+0x60c (xnintr_irq_handler+0x128)
>> [  184] samplin 99-14 xnpod_schedule+0xb4 (xnpod_suspend_thread+0x178)
>> func  -16 xnpod_schedule+0x14 (xnpod_suspend_thread+0x178)
>> func  -18 xnpod_suspend_thread+0x14 
>> (xnpod_wait_thread_period+0xb0)
>> func  -21 xnpod_wait_thread_period+0x14 
>> (rt_task_wait_period+0x4c)
>> func  -23 rt_task_wait_period+0x10 
>> (__rt_task_wait_period+0x54)
>> func  -25 __rt_task_wait_period+0x14 (hisyscall_event+0x160)
>> func  -27 hisyscall_event+0x14 (__ipipe_dispatch_event+0xc0)
>> func  -29 __ipipe_dispatch_event+0x14 
>> (__ipipe_syscall_root+0x88)
>> func  -31 __ipipe_syscall_root+0x10 (vector_swi+0x68)
>> func  -35 rt_timer_tsc+0x10 (__rt_timer_tsc+0x1c)
>> func  -36 __rt_timer_tsc+0x14 (hisyscall_event+0x160)
>> func  -39 hisyscall_event+0x14 (__ipipe_dispatch_event+0xc0)
>> func  -40 __ipipe_dispatch_event+0x14 
>> (__ipipe_syscall_root+0x88)
>> func  -42 __ipipe_syscall_root+0x10 (vector_swi+0x68)
>> func  -46 __ipipe_restore_pipeline_head+0x10 
>> (xnpod_wait_thread_period+0x1b4)
>> [  184] samplin 99-49 xnpod_schedule+0x60c (xnpod_suspend_thread+0x178)
>> [  183] display- 0-53 xnpod_schedule+0xb4 (xnintr_irq_handler+0x128)
>> func  -55 xnpod_schedule+0x14 (xnintr_irq_handler+0x128)
>> func  -60 __ipipe_mach_set_dec+0x10 
>> (xntimer_tick_aperiodic+0x2fc)
>> [  184] samplin 99-69 xnpod_resume_thread+0x5c 
>> (xnthread_periodic_handler+0x30)
>> func  -71 xnpod_resume_thread+0x10 
>> (xnthread_periodic_handler+0x30)
>> func  -73 xnthread_periodic_handler+0x10 
>> (xntimer_tick_aperiodic+0xcc)
>> func  -77 xntimer_tick_aperiodic+0x14 
>> (xnpod_announce_tick+0x14)
>> func  -79 xnpod_announce_tick+0x10 (xnintr_irq_handler+0x54)
>> func  -82 xnintr_irq_handler+0x14 (xnintr_clock_handler+0x20)
>> func  -84 xnintr_clock_handler+0x10 
>> (__ipipe_dispatch_wired+0xe4)
>> kernel BUG at kernel/sched.c:3337!
>> Unable to handle kernel NULL pointer dereference at virtual address 
>> pgd = c1a44000
>> [] *pgd=21a1a031, *pte=, *ppte=
>> Internal error: Oops: 817 [#1]
>> Modules linked in:
>> CPU: 0
>> PC is at __bug+0x44/0x58
>> LR is at __ipipe_sync_stage+0x10/0x294
>> pc : []lr : []Not tainted
>> sp : c1e8ff64  ip :   fp : c1e8ff74
>> r10: 003a5b10  r9 : c1e8e000  r8 : 
>> r7 :   r6 :   r5 : c01ba860  r4 : 
>> r3 :   r2 : c01ba880  r1 :   r0 : 0001
>> Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  Segment user
>> Control: C000717F
>> Table: 21A44000  DAC: 0015
>> Process 

Re: [Xenomai-core] latency hangs on AT91RM9200

2007-02-23 Thread Gilles Chanteperdrix
Steven Scholz wrote:
> Hi all,
> 
> 
>>I am running 2.6.19 + adeos-ipipe-2.6.19-arm-1.6-02.patch + 
>>xenomai-svn-2007-02-22
>>on an AT91RM9200 (160MHz/80MHz).
>>
>>When starting "latency -p 200" it runs for a while printing
>>
>>RTT|  00:05:37  (periodic user-mode task, 200 us period, priority 99)
>>RTH|-lat min|-lat avg|-lat max|-overrun|lat best|---lat worst
>>RTD|  11.200| 139.200| 236.800|   1|  10.800| 280.800
>>RTD|  11.200| 146.400| 253.200|   1|  10.800| 280.800
>>RTD|  11.200| 144.400| 240.400|   1|  10.800| 280.800
>>
>>but then hangs. The timer LED stops blinking. No "soft lockup detected" 
>>appears.
> 
> 
> After patching kernel/sched.c
> 
>  #ifdef CONFIG_IPIPE
> -   if (unlikely(!ipipe_root_domain_p))
> -   return;
> +   if (unlikely(!ipipe_root_domain_p)) {
> +   ipipe_set_printk_sync(ipipe_current_domain);
> +   ipipe_trace_panic_freeze();
> +   ipipe_trace_panic_dump();
> +   BUG();
> +   }
>  #endif /* CONFIG_IPIPE */
> 
> ~ # cat /dev/zero > /dev/null &
> ~ # latency -p 400
> == Sampling period: 400 us
> == Test mode: periodic user-mode task
> == All results in microseconds
> warming up...
> RTT|  00:00:01  (periodic user-mode task, 400 us period, priority 99)
> RTH|-lat min|-lat avg|-lat max|-overrun|lat best|---lat worst
> RTD| 146.000| 187.200| 258.000|   0| 146.000| 258.000
> ...
> RTD|  72.400| 188.800|3793.600|  97|  68.800|4746.800
> RTD|  70.800| 188.800|3256.400| 107|  68.800|4746.800
> I-pipe tracer log (30 points):
> func0 ipipe_trace_panic_freeze+0x10 (schedule+0x54)
> func   -2 schedule+0x14 (ret_slow_syscall+0x0)
> func   -6 __ipipe_walk_pipeline+0x10 
> (__ipipe_handle_irq+0x190)
> [  183] display- 0-11 xnpod_schedule+0x60c (xnintr_irq_handler+0x128)
> [  184] samplin 99-14 xnpod_schedule+0xb4 (xnpod_suspend_thread+0x178)
> func  -16 xnpod_schedule+0x14 (xnpod_suspend_thread+0x178)
> func  -18 xnpod_suspend_thread+0x14 
> (xnpod_wait_thread_period+0xb0)
> func  -21 xnpod_wait_thread_period+0x14 
> (rt_task_wait_period+0x4c)
> func  -23 rt_task_wait_period+0x10 
> (__rt_task_wait_period+0x54)
> func  -25 __rt_task_wait_period+0x14 (hisyscall_event+0x160)
> func  -27 hisyscall_event+0x14 (__ipipe_dispatch_event+0xc0)
> func  -29 __ipipe_dispatch_event+0x14 
> (__ipipe_syscall_root+0x88)
> func  -31 __ipipe_syscall_root+0x10 (vector_swi+0x68)
> func  -35 rt_timer_tsc+0x10 (__rt_timer_tsc+0x1c)
> func  -36 __rt_timer_tsc+0x14 (hisyscall_event+0x160)
> func  -39 hisyscall_event+0x14 (__ipipe_dispatch_event+0xc0)
> func  -40 __ipipe_dispatch_event+0x14 
> (__ipipe_syscall_root+0x88)
> func  -42 __ipipe_syscall_root+0x10 (vector_swi+0x68)
> func  -46 __ipipe_restore_pipeline_head+0x10 
> (xnpod_wait_thread_period+0x1b4)
> [  184] samplin 99-49 xnpod_schedule+0x60c (xnpod_suspend_thread+0x178)
> [  183] display- 0-53 xnpod_schedule+0xb4 (xnintr_irq_handler+0x128)
> func  -55 xnpod_schedule+0x14 (xnintr_irq_handler+0x128)
> func  -60 __ipipe_mach_set_dec+0x10 
> (xntimer_tick_aperiodic+0x2fc)
> [  184] samplin 99-69 xnpod_resume_thread+0x5c 
> (xnthread_periodic_handler+0x30)
> func  -71 xnpod_resume_thread+0x10 
> (xnthread_periodic_handler+0x30)
> func  -73 xnthread_periodic_handler+0x10 
> (xntimer_tick_aperiodic+0xcc)
> func  -77 xntimer_tick_aperiodic+0x14 
> (xnpod_announce_tick+0x14)
> func  -79 xnpod_announce_tick+0x10 (xnintr_irq_handler+0x54)
> func  -82 xnintr_irq_handler+0x14 (xnintr_clock_handler+0x20)
> func  -84 xnintr_clock_handler+0x10 
> (__ipipe_dispatch_wired+0xe4)
> kernel BUG at kernel/sched.c:3337!
> Unable to handle kernel NULL pointer dereference at virtual address 
> pgd = c1a44000
> [] *pgd=21a1a031, *pte=, *ppte=
> Internal error: Oops: 817 [#1]
> Modules linked in:
> CPU: 0
> PC is at __bug+0x44/0x58
> LR is at __ipipe_sync_stage+0x10/0x294
> pc : []lr : []Not tainted
> sp : c1e8ff64  ip :   fp : c1e8ff74
> r10: 003a5b10  r9 : c1e8e000  r8 : 
> r7 :   r6 :   r5 : c01ba860  r4 : 
> r3 :   r2 : c01ba880  r1 :   r0 : 0001
> Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  Segment user
> Control: C000717F
> Table: 21A44000  DAC: 0015
> Process display-181 (pid: 183, stack limit = 0xc1e8e250)
> Stack: (0xc1e8ff64 to 0xc1e9)
> ff60:   c1e8ffac c1e8ff78 c0181588 c001ecd4 c1e8ff84 c0

Re: [Xenomai-core] latency hangs on AT91RM9200

2007-02-23 Thread Steven Scholz
Hi all,

> I am running 2.6.19 + adeos-ipipe-2.6.19-arm-1.6-02.patch + 
> xenomai-svn-2007-02-22
> on an AT91RM9200 (160MHz/80MHz).
> 
> When starting "latency -p 200" it runs for a while printing
> 
> RTT|  00:05:37  (periodic user-mode task, 200 us period, priority 99)
> RTH|-lat min|-lat avg|-lat max|-overrun|lat best|---lat worst
> RTD|  11.200| 139.200| 236.800|   1|  10.800| 280.800
> RTD|  11.200| 146.400| 253.200|   1|  10.800| 280.800
> RTD|  11.200| 144.400| 240.400|   1|  10.800| 280.800
> 
> but then hangs. The timer LED stops blinking. No "soft lockup detected" 
> appears.

After patching kernel/sched.c

 #ifdef CONFIG_IPIPE
-   if (unlikely(!ipipe_root_domain_p))
-   return;
+   if (unlikely(!ipipe_root_domain_p)) {
+   ipipe_set_printk_sync(ipipe_current_domain);
+   ipipe_trace_panic_freeze();
+   ipipe_trace_panic_dump();
+   BUG();
+   }
 #endif /* CONFIG_IPIPE */

~ # cat /dev/zero > /dev/null &
~ # latency -p 400
== Sampling period: 400 us
== Test mode: periodic user-mode task
== All results in microseconds
warming up...
RTT|  00:00:01  (periodic user-mode task, 400 us period, priority 99)
RTH|-lat min|-lat avg|-lat max|-overrun|lat best|---lat worst
RTD| 146.000| 187.200| 258.000|   0| 146.000| 258.000
...
RTD|  72.400| 188.800|3793.600|  97|  68.800|4746.800
RTD|  70.800| 188.800|3256.400| 107|  68.800|4746.800
I-pipe tracer log (30 points):
func0 ipipe_trace_panic_freeze+0x10 (schedule+0x54)
func   -2 schedule+0x14 (ret_slow_syscall+0x0)
func   -6 __ipipe_walk_pipeline+0x10 (__ipipe_handle_irq+0x190)
[  183] display- 0-11 xnpod_schedule+0x60c (xnintr_irq_handler+0x128)
[  184] samplin 99-14 xnpod_schedule+0xb4 (xnpod_suspend_thread+0x178)
func  -16 xnpod_schedule+0x14 (xnpod_suspend_thread+0x178)
func  -18 xnpod_suspend_thread+0x14 
(xnpod_wait_thread_period+0xb0)
func  -21 xnpod_wait_thread_period+0x14 
(rt_task_wait_period+0x4c)
func  -23 rt_task_wait_period+0x10 (__rt_task_wait_period+0x54)
func  -25 __rt_task_wait_period+0x14 (hisyscall_event+0x160)
func  -27 hisyscall_event+0x14 (__ipipe_dispatch_event+0xc0)
func  -29 __ipipe_dispatch_event+0x14 
(__ipipe_syscall_root+0x88)
func  -31 __ipipe_syscall_root+0x10 (vector_swi+0x68)
func  -35 rt_timer_tsc+0x10 (__rt_timer_tsc+0x1c)
func  -36 __rt_timer_tsc+0x14 (hisyscall_event+0x160)
func  -39 hisyscall_event+0x14 (__ipipe_dispatch_event+0xc0)
func  -40 __ipipe_dispatch_event+0x14 
(__ipipe_syscall_root+0x88)
func  -42 __ipipe_syscall_root+0x10 (vector_swi+0x68)
func  -46 __ipipe_restore_pipeline_head+0x10 
(xnpod_wait_thread_period+0x1b4)
[  184] samplin 99-49 xnpod_schedule+0x60c (xnpod_suspend_thread+0x178)
[  183] display- 0-53 xnpod_schedule+0xb4 (xnintr_irq_handler+0x128)
func  -55 xnpod_schedule+0x14 (xnintr_irq_handler+0x128)
func  -60 __ipipe_mach_set_dec+0x10 
(xntimer_tick_aperiodic+0x2fc)
[  184] samplin 99-69 xnpod_resume_thread+0x5c 
(xnthread_periodic_handler+0x30)
func  -71 xnpod_resume_thread+0x10 
(xnthread_periodic_handler+0x30)
func  -73 xnthread_periodic_handler+0x10 
(xntimer_tick_aperiodic+0xcc)
func  -77 xntimer_tick_aperiodic+0x14 (xnpod_announce_tick+0x14)
func  -79 xnpod_announce_tick+0x10 (xnintr_irq_handler+0x54)
func  -82 xnintr_irq_handler+0x14 (xnintr_clock_handler+0x20)
func  -84 xnintr_clock_handler+0x10 
(__ipipe_dispatch_wired+0xe4)
kernel BUG at kernel/sched.c:3337!
Unable to handle kernel NULL pointer dereference at virtual address 
pgd = c1a44000
[] *pgd=21a1a031, *pte=, *ppte=
Internal error: Oops: 817 [#1]
Modules linked in:
CPU: 0
PC is at __bug+0x44/0x58
LR is at __ipipe_sync_stage+0x10/0x294
pc : []lr : []Not tainted
sp : c1e8ff64  ip :   fp : c1e8ff74
r10: 003a5b10  r9 : c1e8e000  r8 : 
r7 :   r6 :   r5 : c01ba860  r4 : 
r3 :   r2 : c01ba880  r1 :   r0 : 0001
Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  Segment user
Control: C000717F
Table: 21A44000  DAC: 0015
Process display-181 (pid: 183, stack limit = 0xc1e8e250)
Stack: (0xc1e8ff64 to 0xc1e9)
ff60:   c1e8ffac c1e8ff78 c0181588 c001ecd4 c1e8ff84 c0020340
ff80: c002007c  fefff000    c1e8e000 003a5b10
ffa0:  c1e8ffb0 c001ae04 c0181530 0011b333 3300 07d0 
ffc0: 2028 0320   b714 

Re: [Xenomai-core] latency hangs on AT91RM9200

2007-02-23 Thread Jan Kiszka
Steven Scholz wrote:
> Jan,
> 
>>> So what exactly shell I do? I have never worked with "the tracer".
>>>
>> Start here: http://www.xenomai.org/index.php/I-pipe:Tracer
>>
>> I haven't followed all details (while hacking on other bugs :)), but you
>> have two options to catch a trace: the one described on that page *if*
>> your board survives the crash, or via ipipe_trace_panic_freeze()
>> followed by ipipe_trace_panic_dump() (+ switching to sync printk mode
>> first).
> 
> Do I need CONFIG_IPIPE_TRACE_MCOUNT=y for the ipipe_trace_panic_dump()?

Yes, because this is what adds per-function call trace points. Otherwise
the information is fairly thin.

Jan



signature.asc
Description: OpenPGP digital signature
___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] latency hangs on AT91RM9200

2007-02-23 Thread Gilles Chanteperdrix
Steven Scholz wrote:
> Hi,
> 
> 
>>>schedule. Anyway, I think the tracer will give better results than a
>>>simple backtrace.
>>
>>Ok. Thanks.
>>
>>So what exactly shell I do? I have never worked with "the tracer".
> 
> 
> Just enabled
> 
> CONFIG_IPIPE_DEBUG=y
> CONFIG_IPIPE_TRACE=y
> CONFIG_IPIPE_TRACE_ENABLE=y
> CONFIG_IPIPE_TRACE_MCOUNT=y
> CONFIG_IPIPE_TRACE_IRQSOFF=y
> CONFIG_IPIPE_TRACE_SHIFT=15
> # CONFIG_IPIPE_TRACE_VMALLOC is not set
> CONFIG_IPIPE_TRACE_ENABLE_VALUE=1
> 
> but get
> 
>   CC  arch/arm/kernel/asm-offsets.s
> In file included from include/linux/bitops.h:9,
>  from include/linux/thread_info.h:20,
>  from include/linux/preempt.h:9,
>  from include/linux/spinlock.h:49,
>  from include/linux/capability.h:45,
>  from include/linux/sched.h:46,
>  from arch/arm/kernel/asm-offsets.c:13:
> include/asm/bitops.h: In function `atomic_set_bit':
> include/asm/bitops.h:40: warning: implicit declaration of function 
> `local_test_iflag_hw'

At first sight, replacing local_test_iflag_hw with
raw_irqs_disabled_flags should work.

-- 
 Gilles Chanteperdrix

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] latency hangs on AT91RM9200

2007-02-23 Thread Steven Scholz
Jan,

>> So what exactly shell I do? I have never worked with "the tracer".
>>
> 
> Start here: http://www.xenomai.org/index.php/I-pipe:Tracer
> 
> I haven't followed all details (while hacking on other bugs :)), but you
> have two options to catch a trace: the one described on that page *if*
> your board survives the crash, or via ipipe_trace_panic_freeze()
> followed by ipipe_trace_panic_dump() (+ switching to sync printk mode
> first).

Do I need CONFIG_IPIPE_TRACE_MCOUNT=y for the ipipe_trace_panic_dump()?

Steven

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] latency hangs on AT91RM9200

2007-02-23 Thread Jan Kiszka
Steven Scholz wrote:
> Jan,
> 
>> BTW, do you have the nucleus option "disable priority coupling" set or
>> not in your config (the latter is default, ie. it's on)? The other bug
>> is triggered when prio coupling is on, and doesn't appear while it's
>> off. Just to collect more patterns...
> 
> CONFIG_XENO_OPT_PRIOCPL=y

[Ah, using trunk already]

Mind given a try what happens when it's off?

Jan



signature.asc
Description: OpenPGP digital signature
___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] latency hangs on AT91RM9200

2007-02-23 Thread Steven Scholz
Jan,

> BTW, do you have the nucleus option "disable priority coupling" set or
> not in your config (the latter is default, ie. it's on)? The other bug
> is triggered when prio coupling is on, and doesn't appear while it's
> off. Just to collect more patterns...

CONFIG_XENO_OPT_PRIOCPL=y

Steven

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] latency hangs on AT91RM9200

2007-02-23 Thread Jan Kiszka
Steven Scholz wrote:
> Gilles,
> 
>>> Sure but I would still not expect the system to hang!
>>> As I said missing a deadline is bad but ok.
>>> But hanging the whole system is not quite ok.
>> I want this bug solved too, especially since I am not sure that we will
>> only see it with too short periods.
> 
> Makes us two! ;-)
> 

BTW, do you have the nucleus option "disable priority coupling" set or
not in your config (the latter is default, ie. it's on)? The other bug
is triggered when prio coupling is on, and doesn't appear while it's
off. Just to collect more patterns...

Jan



signature.asc
Description: OpenPGP digital signature
___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] latency hangs on AT91RM9200

2007-02-23 Thread Steven Scholz
Hi,

>> schedule. Anyway, I think the tracer will give better results than a
>> simple backtrace.
> Ok. Thanks.
> 
> So what exactly shell I do? I have never worked with "the tracer".

Just enabled

CONFIG_IPIPE_DEBUG=y
CONFIG_IPIPE_TRACE=y
CONFIG_IPIPE_TRACE_ENABLE=y
CONFIG_IPIPE_TRACE_MCOUNT=y
CONFIG_IPIPE_TRACE_IRQSOFF=y
CONFIG_IPIPE_TRACE_SHIFT=15
# CONFIG_IPIPE_TRACE_VMALLOC is not set
CONFIG_IPIPE_TRACE_ENABLE_VALUE=1

but get

  CC  arch/arm/kernel/asm-offsets.s
In file included from include/linux/bitops.h:9,
 from include/linux/thread_info.h:20,
 from include/linux/preempt.h:9,
 from include/linux/spinlock.h:49,
 from include/linux/capability.h:45,
 from include/linux/sched.h:46,
 from arch/arm/kernel/asm-offsets.c:13:
include/asm/bitops.h: In function `atomic_set_bit':
include/asm/bitops.h:40: warning: implicit declaration of function 
`local_test_iflag_hw'

on compiling.

--
Steven

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] latency hangs on AT91RM9200

2007-02-23 Thread Jan Kiszka
Steven Scholz wrote:
> Gilles,
> 
>>> Sure but I would still not expect the system to hang!
>>> As I said missing a deadline is bad but ok.
>>> But hanging the whole system is not quite ok.
>> I want this bug solved too, especially since I am not sure that we will
>> only see it with too short periods.
> 
> Makes us two! ;-)
> 
>> I would add the call to printascii(printk_buff) directly in vprintk, and
>> use printk. Note however that special care must be taken to avoid
>> recursion when calling printk inside schedule, because printk may use
>> schedule. Anyway, I think the tracer will give better results than a
>> simple backtrace.
> Ok. Thanks.
> 
> So what exactly shell I do? I have never worked with "the tracer".
> 

Start here: http://www.xenomai.org/index.php/I-pipe:Tracer

I haven't followed all details (while hacking on other bugs :)), but you
have two options to catch a trace: the one described on that page *if*
your board survives the crash, or via ipipe_trace_panic_freeze()
followed by ipipe_trace_panic_dump() (+ switching to sync printk mode
first).

I'm currently cooking a patch to include the IRQ and domain states in
the latter output (they are available via /proc/ipipe/trace/frozen already).

Jan



signature.asc
Description: OpenPGP digital signature
___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] latency hangs on AT91RM9200

2007-02-23 Thread Gilles Chanteperdrix
Steven Scholz wrote:
> Gilles,
> 
> 
>>>Sure but I would still not expect the system to hang!
>>>As I said missing a deadline is bad but ok.
>>>But hanging the whole system is not quite ok.
>>
>>I want this bug solved too, especially since I am not sure that we will
>>only see it with too short periods.
> 
> 
> Makes us two! ;-)
> 
> 
>>I would add the call to printascii(printk_buff) directly in vprintk, and
>>use printk. Note however that special care must be taken to avoid
>>recursion when calling printk inside schedule, because printk may use
>>schedule. Anyway, I think the tracer will give better results than a
>>simple backtrace.
> 
> Ok. Thanks.
> 
> So what exactly shell I do? I have never worked with "the tracer".

There is an entry in Xenomai wiki about the I-pipe tracer, see:

http://www.xenomai.org/index.php/I-pipe:Tracer

-- 
 Gilles Chanteperdrix

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] latency hangs on AT91RM9200

2007-02-23 Thread Steven Scholz
Philippe,

 But I don't get the output of __backtrace()!
>>> Before calling your backtrace helper, try adding:
>>>
>>> ipipe_set_printk_sync(ipipe_current_domain);
>> And then use printk() instead of my_printk()?
>>
> 
> Yes, switching this on is a brute force attempt to bypass any
> bufferization and allow printk to call the console driver directly
> regardless of the current domain - this may, or may not work, depending
> on the level of brokenness of the current situation (this said, if I
> don't get why printascii() as used by my_printk() does not send the
> characters to the uart as expected).

Ok. Thanks.

But since BUG() does the backtrace as well, there's no need for my hack.
As you said replacing return with BUG() is enough.

But as you can see from the backtrace, there's not muzch info 

Steven


___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] latency hangs on AT91RM9200

2007-02-23 Thread Steven Scholz
Gilles,

>> Sure but I would still not expect the system to hang!
>> As I said missing a deadline is bad but ok.
>> But hanging the whole system is not quite ok.
> 
> I want this bug solved too, especially since I am not sure that we will
> only see it with too short periods.

Makes us two! ;-)

> I would add the call to printascii(printk_buff) directly in vprintk, and
> use printk. Note however that special care must be taken to avoid
> recursion when calling printk inside schedule, because printk may use
> schedule. Anyway, I think the tracer will give better results than a
> simple backtrace.
Ok. Thanks.

So what exactly shell I do? I have never worked with "the tracer".

Steven

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] latency hangs on AT91RM9200

2007-02-23 Thread Philippe Gerum
On Fri, 2007-02-23 at 14:56 +0100, Steven Scholz wrote:
> Philippe,
> 
> >> I tried! Attached the patch I used. Since teh scheduler hangs I can't use 
> >> normal printk(), right?
> >>
> >> *ipipe_current_domain != ipipe_root_domain !
> >> *ipipe_current_domain = c01fc2c0
> >> *ipipe_root_domain= c01af2c0
> >>
> >> But I don't get the output of __backtrace()!
> > 
> > Before calling your backtrace helper, try adding:
> > 
> > ipipe_set_printk_sync(ipipe_current_domain);
> 
> And then use printk() instead of my_printk()?
> 

Yes, switching this on is a brute force attempt to bypass any
bufferization and allow printk to call the console driver directly
regardless of the current domain - this may, or may not work, depending
on the level of brokenness of the current situation (this said, if I
don't get why printascii() as used by my_printk() does not send the
characters to the uart as expected).

> Steven
-- 
Philippe.



___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] latency hangs on AT91RM9200

2007-02-23 Thread Gilles Chanteperdrix
Steven Scholz wrote:
> Gilles,
> 
> 
>>>I am running 2.6.19 + adeos-ipipe-2.6.19-arm-1.6-02.patch + 
>>>xenomai-svn-2007-02-22
>>>on an AT91RM9200 (160MHz/80MHz).
>>>
>>>When starting "latency -p 200" it runs for a while printing
>>>
>>>RTT|  00:05:37  (periodic user-mode task, 200 us period, priority 99)
>>>RTH|-lat min|-lat avg|-lat max|-overrun|lat best|---lat worst
>>>RTD|  11.200| 139.200| 236.800|   1|  10.800| 280.800
>>>
>>>but then hangs. The timer LED stops blinking. No "soft lockup detected" 
>>>appears.
>>
>>The only explanation I have is that the period is too small. I do not
>>observe the same behaviour with latency -p 1000. Note that setting the
>>period to a value comparable to the latency is not considered a normal
>>use of Xenomai. 
> 
> 
> Sure but I would still not expect the system to hang!
> As I said missing a deadline is bad but ok.
> But hanging the whole system is not quite ok.

I want this bug solved too, especially since I am not sure that we will
only see it with too short periods.

> 
> 
>>>Using a BDI200 it looks like that in kernel/sched.c:schedule() he is 
>>>returning
>>>in the lines
>>>
>>>#ifdef CONFIG_IPIPE
>>>if (unlikely(!ipipe_root_domain_p))
>>>return;
>>>#endif /* CONFIG_IPIPE */
>>>
>>>When stepping trough I only see him getting into schedule() but leaving
>>>it in the above lines and in include/linux/proc_fs.h:proc_net_fops_create() 
>>>...
>>
>>Ok. Thanks for pointing this out. That is interesting, but not very
>>informative. It would be interesting if you could get the full
>>backtrace. What would be also interesting would be to set a break point
>>on the timer interrupt handler and to follow what happens from timer
>>interrupt to timer interrupt.
> 
> 
> I tried! Attached the patch I used. Since teh scheduler hangs I can't use 
> normal printk(), right?
> 
> *ipipe_current_domain != ipipe_root_domain !
> *ipipe_current_domain = c01fc2c0
> *ipipe_root_domain= c01af2c0
> 
> But I don't get the output of __backtrace()!
> 
> my_printk() works with __backtrace(). The dump of a soft lockup works.

I would add the call to printascii(printk_buff) directly in vprintk, and
use printk. Note however that special care must be taken to avoid
recursion when calling printk inside schedule, because printk may use
schedule. Anyway, I think the tracer will give better results than a
simple backtrace.

-- 
 Gilles Chanteperdrix

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] latency hangs on AT91RM9200

2007-02-23 Thread Steven Scholz
Philippe,

>> I tried! Attached the patch I used. Since teh scheduler hangs I can't use 
>> normal printk(), right?
>>
>> *ipipe_current_domain != ipipe_root_domain !
>> *ipipe_current_domain = c01fc2c0
>> *ipipe_root_domain= c01af2c0
>>
>> But I don't get the output of __backtrace()!
> 
> Before calling your backtrace helper, try adding:
> 
>   ipipe_set_printk_sync(ipipe_current_domain);

And then use printk() instead of my_printk()?

Steven

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] latency hangs on AT91RM9200

2007-02-23 Thread Steven Scholz
Gilles,

>> I am running 2.6.19 + adeos-ipipe-2.6.19-arm-1.6-02.patch + 
>> xenomai-svn-2007-02-22
>> on an AT91RM9200 (160MHz/80MHz).
>>
>> When starting "latency -p 200" it runs for a while printing
>>
>> RTT|  00:05:37  (periodic user-mode task, 200 us period, priority 99)
>> RTH|-lat min|-lat avg|-lat max|-overrun|lat best|---lat worst
>> RTD|  11.200| 139.200| 236.800|   1|  10.800| 280.800
>>
>> but then hangs. The timer LED stops blinking. No "soft lockup detected" 
>> appears.
> 
> The only explanation I have is that the period is too small. I do not
> observe the same behaviour with latency -p 1000. Note that setting the
> period to a value comparable to the latency is not considered a normal
> use of Xenomai. 

Sure but I would still not expect the system to hang!
As I said missing a deadline is bad but ok.
But hanging the whole system is not quite ok.

>> Using a BDI200 it looks like that in kernel/sched.c:schedule() he is 
>> returning
>> in the lines
>>
>> #ifdef CONFIG_IPIPE
>> if (unlikely(!ipipe_root_domain_p))
>> return;
>> #endif /* CONFIG_IPIPE */
>>
>> When stepping trough I only see him getting into schedule() but leaving
>> it in the above lines and in include/linux/proc_fs.h:proc_net_fops_create() 
>> ...
> 
> Ok. Thanks for pointing this out. That is interesting, but not very
> informative. It would be interesting if you could get the full
> backtrace. What would be also interesting would be to set a break point
> on the timer interrupt handler and to follow what happens from timer
> interrupt to timer interrupt.

I tried! Attached the patch I used. Since teh scheduler hangs I can't use 
normal printk(), right?

*ipipe_current_domain != ipipe_root_domain !
*ipipe_current_domain = c01fc2c0
*ipipe_root_domain= c01af2c0

But I don't get the output of __backtrace()!

my_printk() works with __backtrace(). The dump of a soft lockup works.


Steven

Index: linux-2.6.19/arch/arm/lib/backtrace.S
===
--- linux-2.6.19.orig/arch/arm/lib/backtrace.S
+++ linux-2.6.19/arch/arm/lib/backtrace.S
@@ -100,7 +100,7 @@ ENTRY(c_backtrace)
  */
 1007:		ldr	r0, =.Lbad
 		mov	r1, frame
-		bl	printk
+		bl	my_printk
 		ldmfd	sp!, {r4 - r8, pc}
 		.ltorg
 		
@@ -134,12 +134,12 @@ ENTRY(c_backtrace)
 		ldr	r2, [stack], #-4
 		mov	r1, reg
 		adr	r0, .Lfp
-		bl	printk
+		bl	my_printk
 2:		subs	reg, reg, #1
 		bpl	1b
 		teq	r7, #0
 		adrne	r0, .Lcr
-		blne	printk
+		blne	my_printk
 		mov	r0, stack
 		ldmfd	sp!, {instr, reg, stack, r7, r8, pc}
 
Index: linux-2.6.19/include/linux/kernel.h
===
--- linux-2.6.19.orig/include/linux/kernel.h
+++ linux-2.6.19/include/linux/kernel.h
@@ -146,6 +146,12 @@ asmlinkage int vprintk(const char *fmt, 
 	__attribute__ ((format (printf, 1, 0)));
 asmlinkage int printk(const char * fmt, ...)
 	__attribute__ ((format (printf, 1, 2)));
+
+
+asmlinkage int my_printk(const char * fmt, ...)
+__attribute__ ((format (printf, 1, 2)));
+
+
 #else
 static inline int vprintk(const char *s, va_list args)
 	__attribute__ ((format (printf, 1, 0)));
Index: linux-2.6.19/kernel/printk.c
===
--- linux-2.6.19.orig/kernel/printk.c
+++ linux-2.6.19/kernel/printk.c
@@ -524,6 +524,21 @@ void __ipipe_flush_printk (unsigned virq
 	spin_unlock_irqrestore(&__ipipe_printk_lock, flags);
 }
 
+/*FIXME*/
+extern void printascii(const char *);
+asmlinkage int my_printk(const char *fmt, ...)
+{
+va_list va;
+char buff[256];
+
+va_start(va, fmt);
+vsprintf(buff, fmt, va);
+va_end(va);
+
+printascii(buff);
+	return 0;
+}
+
 asmlinkage int printk(const char *fmt, ...)
 {
 	int r, fbytes, oldcount;
Index: linux-2.6.19/kernel/sched.c
===
--- linux-2.6.19.orig/kernel/sched.c
+++ linux-2.6.19/kernel/sched.c
@@ -3327,8 +3327,14 @@ asmlinkage void __sched schedule(void)
 	struct rq *rq;
 
 #ifdef CONFIG_IPIPE
-	if (unlikely(!ipipe_root_domain_p))
+	if (unlikely(!ipipe_root_domain_p)) {
+		my_printk("ipipe_current_domain != ipipe_root_domain !\n");
+		my_printk("ipipe_current_domain = %p\n", ipipe_current_domain);
+		my_printk("ipipe_root_domain= %p\n", ipipe_root_domain);
+		__backtrace();
+		while (1) { barrier();};
 		return;
+	}
 #endif /* CONFIG_IPIPE */
 	/*
 	 * Test if we are atomic.  Since do_exit() needs to call into

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] latency hangs on AT91RM9200

2007-02-23 Thread Philippe Gerum
On Fri, 2007-02-23 at 14:50 +0100, Steven Scholz wrote:
> Gilles,
> 
> >> I am running 2.6.19 + adeos-ipipe-2.6.19-arm-1.6-02.patch + 
> >> xenomai-svn-2007-02-22
> >> on an AT91RM9200 (160MHz/80MHz).
> >>
> >> When starting "latency -p 200" it runs for a while printing
> >>
> >> RTT|  00:05:37  (periodic user-mode task, 200 us period, priority 99)
> >> RTH|-lat min|-lat avg|-lat max|-overrun|lat best|---lat 
> >> worst
> >> RTD|  11.200| 139.200| 236.800|   1|  10.800| 
> >> 280.800
> >>
> >> but then hangs. The timer LED stops blinking. No "soft lockup detected" 
> >> appears.
> > 
> > The only explanation I have is that the period is too small. I do not
> > observe the same behaviour with latency -p 1000. Note that setting the
> > period to a value comparable to the latency is not considered a normal
> > use of Xenomai. 
> 
> Sure but I would still not expect the system to hang!
> As I said missing a deadline is bad but ok.
> But hanging the whole system is not quite ok.
> 
> >> Using a BDI200 it looks like that in kernel/sched.c:schedule() he is 
> >> returning
> >> in the lines
> >>
> >> #ifdef CONFIG_IPIPE
> >> if (unlikely(!ipipe_root_domain_p))
> >> return;
> >> #endif /* CONFIG_IPIPE */
> >>
> >> When stepping trough I only see him getting into schedule() but leaving
> >> it in the above lines and in 
> >> include/linux/proc_fs.h:proc_net_fops_create() ...
> > 
> > Ok. Thanks for pointing this out. That is interesting, but not very
> > informative. It would be interesting if you could get the full
> > backtrace. What would be also interesting would be to set a break point
> > on the timer interrupt handler and to follow what happens from timer
> > interrupt to timer interrupt.
> 
> I tried! Attached the patch I used. Since teh scheduler hangs I can't use 
> normal printk(), right?
> 
> *ipipe_current_domain != ipipe_root_domain !
> *ipipe_current_domain = c01fc2c0
> *ipipe_root_domain= c01af2c0
> 
> But I don't get the output of __backtrace()!

Before calling your backtrace helper, try adding:

ipipe_set_printk_sync(ipipe_current_domain);

> 
> my_printk() works with __backtrace(). The dump of a soft lockup works.
> 
> 
> Steven
> 
> ___
> Xenomai-core mailing list
> Xenomai-core@gna.org
> https://mail.gna.org/listinfo/xenomai-core
-- 
Philippe.



___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] latency hangs on AT91RM9200

2007-02-23 Thread Steven Scholz
Steven Scholz wrote:
> Hi,
> 
> i pick up this issue again.
> 
> I am running 2.6.19 + adeos-ipipe-2.6.19-arm-1.6-02.patch + 
> xenomai-svn-2007-02-22
> on an AT91RM9200 (160MHz/80MHz).
> 
> When starting "latency -p 200" it runs for a while printing
> 
> RTT|  00:05:37  (periodic user-mode task, 200 us period, priority 99)
> RTH|-lat min|-lat avg|-lat max|-overrun|lat best|---lat worst
> RTD|  11.200| 139.200| 236.800|   1|  10.800| 280.800
> RTD|  11.200| 146.400| 253.200|   1|  10.800| 280.800
> RTD|  11.200| 144.400| 240.400|   1|  10.800| 280.800
> 
> but then hangs. The timer LED stops blinking. No "soft lockup detected" 
> appears.

Easy to reproduce with

~ # cat /dev/zero > /dev/null &
~ # latency -p 200

--
Steven

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] latency hangs on AT91RM9200

2007-02-23 Thread Steven Scholz
Philippe Gerum wrote:
> On Fri, 2007-02-23 at 14:16 +0100, Gilles Chanteperdrix wrote:
> 
>> I do not think to remember that there are cases where calling schedule
>> from a real-time context is done by Xenomai, so maybe you can call panic
>> in schedule instead of returning. I will try and trig a tracer freeze
>> and dump the tracer at this point in order to have a better idea of what
>>  happens.
>>
> 
> True. Just BUG() instead of returning from schedule() in this case would
> do.

Ok. I put in a BUG() :

RTD|  16.000| 149.200| 240.800|   3|  11.600| 284.000
kernel BUG at kernel/sched.c:3331!
Unable to handle kernel NULL pointer dereference at virtual address 
pgd = c1f34000
[] *pgd=21f28031, *pte=, *ppte=
Internal error: Oops: 817 [#1]
Modules linked in:
CPU: 0
PC is at __bug+0x40/0x54
LR is at vscnprintf+0x14/0x28
pc : []lr : []Not tainted
sp : c02b7f64  ip : c02b7f20  fp : c02b7f74
r10: 40247000  r9 : c02b6000  r8 : 
r7 :   r6 :   r5 : c0019de4  r4 : 
r3 :   r2 : c0185845  r1 : 3fda  r0 : 0002
Flags: nzCv  IRQs off  FIQs on  Mode SVC_32  Segment user
Control: C000717F
Table: 21F34000  DAC: 0015
Process display-181 (pid: 183, stack limit = 0xc02b6250)
Stack: (0xc02b7f64 to 0xc02b8000)
7f60:   c02b7fac c02b7f78 c0178434 c001db84 c02b7f84 c001f140
7f80: c001ee88  fefff000    c02b6000 40247000
7fa0:  c02b7fb0 c0019de4 c0178410 000e  0003 0352
7fc0: 999a 4023b3c8 0008  4024888c 000c 40247000 be7ff4bc
7fe0: 999a be7ff3b8 999a 4017f0b8 2010  0004 c0188328
Backtrace:
[] (__bug+0x0/0x54) from [] (schedule+0x34/0x838)
* r4 =  *
[] (schedule+0x0/0x838) from [] (ret_slow_syscall+0x0/0x10)
Code: 1b00479a e59f0014 eb004798 e3a03000 (e5833000)


:-(

Won't help much?

--
Steven

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] latency hangs on AT91RM9200

2007-02-23 Thread Philippe Gerum
On Fri, 2007-02-23 at 14:16 +0100, Gilles Chanteperdrix wrote:

> I do not think to remember that there are cases where calling schedule
> from a real-time context is done by Xenomai, so maybe you can call panic
> in schedule instead of returning. I will try and trig a tracer freeze
> and dump the tracer at this point in order to have a better idea of what
>  happens.
> 

True. Just BUG() instead of returning from schedule() in this case would
do.

-- 
Philippe.



___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] latency hangs on AT91RM9200

2007-02-23 Thread Philippe Gerum
On Fri, 2007-02-23 at 12:27 +0100, Steven Scholz wrote:

> #ifdef CONFIG_IPIPE
> if (unlikely(!ipipe_root_domain_p))
> return;
> #endif /* CONFIG_IPIPE */
> 
> When stepping trough I only see him getting into schedule() but leaving
> it in the above lines and in include/linux/proc_fs.h:proc_net_fops_create() 
> ...
> 

This is exactely the kind of issue which 1.6-02 is expected to solve;
this bug has been identified with all earlier versions, so there must be
another spot where the latter fix is missing in the Adeos patch.

-- 
Philippe.



___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] latency hangs on AT91RM9200

2007-02-23 Thread Gilles Chanteperdrix
Steven Scholz wrote:
> Hi,
> 
> i pick up this issue again.
> 
> I am running 2.6.19 + adeos-ipipe-2.6.19-arm-1.6-02.patch + 
> xenomai-svn-2007-02-22
> on an AT91RM9200 (160MHz/80MHz).
> 
> When starting "latency -p 200" it runs for a while printing
> 
> RTT|  00:05:37  (periodic user-mode task, 200 us period, priority 99)
> RTH|-lat min|-lat avg|-lat max|-overrun|lat best|---lat worst
> RTD|  11.200| 139.200| 236.800|   1|  10.800| 280.800
> RTD|  11.200| 146.400| 253.200|   1|  10.800| 280.800
> RTD|  11.200| 144.400| 240.400|   1|  10.800| 280.800
> 
> but then hangs. The timer LED stops blinking. No "soft lockup detected" 
> appears.

The only explanation I have is that the period is too small. I do not
observe the same behaviour with latency -p 1000. Note that setting the
period to a value comparable to the latency is not considered a normal
use of Xenomai. When setting the period to 100 us on x86, the latency is
less than 50 us (and most of the time a lot less than that), so the
period is at least twice the latency. If you observe a latency of 300
us, you should select a period of at least 600 us to run the test in the
same conditions.

> 
> Using a BDI200 it looks like that in kernel/sched.c:schedule() he is returning
> in the lines
> 
> #ifdef CONFIG_IPIPE
> if (unlikely(!ipipe_root_domain_p))
> return;
> #endif /* CONFIG_IPIPE */
> 
> When stepping trough I only see him getting into schedule() but leaving
> it in the above lines and in include/linux/proc_fs.h:proc_net_fops_create() 
> ...

Ok. Thanks for pointing this out. That is interesting, but not very
informative. It would be interesting if you could get the full
backtrace. What would be also interesting would be to set a break point
on the timer interrupt handler and to follow what happens from timer
interrupt to timer interrupt.

I do not think to remember that there are cases where calling schedule
from a real-time context is done by Xenomai, so maybe you can call panic
in schedule instead of returning. I will try and trig a tracer freeze
and dump the tracer at this point in order to have a better idea of what
 happens.

-- 
 Gilles Chanteperdrix

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


[Xenomai-core] latency hangs on AT91RM9200

2007-02-23 Thread Steven Scholz
Hi,

i pick up this issue again.

I am running 2.6.19 + adeos-ipipe-2.6.19-arm-1.6-02.patch + 
xenomai-svn-2007-02-22
on an AT91RM9200 (160MHz/80MHz).

When starting "latency -p 200" it runs for a while printing

RTT|  00:05:37  (periodic user-mode task, 200 us period, priority 99)
RTH|-lat min|-lat avg|-lat max|-overrun|lat best|---lat worst
RTD|  11.200| 139.200| 236.800|   1|  10.800| 280.800
RTD|  11.200| 146.400| 253.200|   1|  10.800| 280.800
RTD|  11.200| 144.400| 240.400|   1|  10.800| 280.800

but then hangs. The timer LED stops blinking. No "soft lockup detected" appears.

Using a BDI200 it looks like that in kernel/sched.c:schedule() he is returning
in the lines

#ifdef CONFIG_IPIPE
if (unlikely(!ipipe_root_domain_p))
return;
#endif /* CONFIG_IPIPE */

When stepping trough I only see him getting into schedule() but leaving
it in the above lines and in include/linux/proc_fs.h:proc_net_fops_create() ...

-- 
Steven

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core