Re: [Xenomai-core] crashing 2.6.22

2007-10-09 Thread Philippe Gerum
On Mon, 2007-10-08 at 10:45 +0200, Gilles Chanteperdrix wrote:
> > Ooops. By reading all my mails, I would have avoided reinventing this
> > wheel on my own. Your patch is almost what I posted yesterday to fix the
> > vmalloc issue.
> >
> > Looks like we no longer need the last hunk of it on recent kernels, right?
> 
> Yes, it fixes an issue which was fixed a long time ago.
> 
Yeah, my mistake. I postponed this patch and then forgot to push it
forward again after the testing period in my cooker. Sorry about that.

> > Jan
> >
> > PS: We should really consider using bug trackers for Adeos and Xenomai!
> > I have a few (minor) patches hanging around as well, but things quickly
> > get lost when bigger problems pop up.
> 
> We have bug trackers; the point is to think about using them.

Indeed. We even have patch trackers we could activate. Either we use
them, or any patch sent for inclusion should be posted in a separate
mail to the -core list, with a [PATCH] header. I sometimes miss
patches because they are part of a lengthy conversation, buried under
lots of mail that I may fast-forward through a bit too aggressively.
> 
-- 
Philippe.



___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core


Re: [Xenomai-core] crashing 2.6.22

2007-10-08 Thread Gilles Chanteperdrix
On 10/8/07, Jan Kiszka <[EMAIL PROTECTED]> wrote:
> Gilles Chanteperdrix wrote:
> > On 9/30/07, Jan Kiszka <[EMAIL PROTECTED]> wrote:
> >> Philippe Gerum wrote:
> >>> On Sun, 2007-09-30 at 13:42 +0200, Jan Kiszka wrote:
>  Jan Kiszka wrote:
> > Philippe Gerum wrote:
> >> On Sun, 2007-09-30 at 12:22 +0200, Jan Kiszka wrote:
>  ...
> >>>  And a third
> >>> one only gives me "Detected illicit call from domain Xenomai" before 
> >>> the
> >>> box reboots. :(
> >> Grmff... Do you run with your smp_processor_id() instrumentation in?
> > Yes, but I suspect this is just a symptom of some severe memory
> > corruption that (also?) hits I-pipe data structures. I just put in some
> > different instrumentation, and that warning is gone, the box just hangs
> > hard at a different point. Very unfriendly.
>  Hah! Got some crash log by hacking a raw printk-to-uart:
> 
>  [...]
>  <6>Xenomai: starting RTDM services.
>  <6>NET: Registered protocol family 10
>  <6>lo: Disabled Privacy Extensions
>  <6>ADDRCONF(NETDEV_UP): eth0: link is not ready
>  <3>I-pipe: Detected illicit call from domain 'Xenomai'
>  <3>into a service reserved for domain 'Linux' and below.
> f3a6bc18   c05dad6c f3a6bc3c c0105fc3 c03513c7 
>  c05dc100
> 0009 f3a6bc54 c01479cb c03592f8 c0357ae2 c035e069 f3a6bc88 
>  f3a6bc70
> c0127224 c0111df8  f3a6bd74  f3a6bd74 f3a6bc80 
>  c012727f
>  Call Trace:
>   [] show_trace_log_lvl+0x1f/0x40
>   [] show_stack_log_lvl+0xb1/0xe0
>   [] show_stack+0x33/0x40
>   [] ipipe_check_context+0x7b/0x90
>   [] __atomic_notifier_call_chain+0x24/0x60
>   [] atomic_notifier_call_chain+0x1f/0x30
>   [] notify_die+0x32/0x40
>   [] do_invalid_op+0x59/0xa0
>   [] __ipipe_handle_exception+0x7b/0x144
>   [] error_code+0x6f/0x7c
> >>> Wow. Why that?
> >>>
>   [] __ipipe_handle_exception+0x83/0x144
>   [] error_code+0x6f/0x7c
> >>> And this? We should not get any exception over an IPI3 handler. I guess
> >>> the double fault may be explained by this root cause.
> >>>
>   [] __ipipe_handle_irq+0x4f/0x140
>   [] ipipe_ipi3+0x26/0x40
> >>> Our LAPIC timer vector. Are you running full modular or statically btw?
> >> Fully modular. Compiling the nucleus in makes the lock-up move to
> >> another, once again invisible spot.
> >>
> >> I nailed down the fault address in the scenario above. It's in the
> >> nucleus module, at the first byte of xntimer_tick_aperiodic. Are we
> >> losing module text pages over time? This function must have been
> >> executed before as the timer was armed while I collected the
> >> /proc/modules and then triggered the crash.
> >
> > There is a pending issue about vmalloced areas, which I completely forgot:
> > https://mail.gna.org/public/xenomai-core/2007-02/msg00138.html
> >
>
> Ooops. By reading all my mails, I would have avoided reinventing this
> wheel on my own. Your patch is almost what I posted yesterday to fix the
> vmalloc issue.
>
> Looks like we no longer need the last hunk of it on recent kernels, right?

Yes, it fixes an issue which was fixed a long time ago.

> Jan
>
> PS: We should really consider using bug trackers for Adeos and Xenomai!
> I have a few (minor) patches hanging around as well, but things quickly
> get lost when bigger problems pop up.

We have bug trackers; the point is to think about using them.

-- 
   Gilles Chanteperdrix



Re: [Xenomai-core] crashing 2.6.22

2007-10-08 Thread Jan Kiszka
Gilles Chanteperdrix wrote:
> On 9/30/07, Jan Kiszka <[EMAIL PROTECTED]> wrote:
>> Philippe Gerum wrote:
>>> On Sun, 2007-09-30 at 13:42 +0200, Jan Kiszka wrote:
 Jan Kiszka wrote:
> Philippe Gerum wrote:
>> On Sun, 2007-09-30 at 12:22 +0200, Jan Kiszka wrote:
 ...
>>>  And a third
>>> one only gives me "Detected illicit call from domain Xenomai" before the
>>> box reboots. :(
>> Grmff... Do you run with your smp_processor_id() instrumentation in?
> Yes, but I suspect this is just a symptom of some severe memory
> corruption that (also?) hits I-pipe data structures. I just put in some
> different instrumentation, and that warning is gone, the box just hangs
> hard at a different point. Very unfriendly.
 Hah! Got some crash log by hacking a raw printk-to-uart:

 [...]
 <6>Xenomai: starting RTDM services.
 <6>NET: Registered protocol family 10
 <6>lo: Disabled Privacy Extensions
 <6>ADDRCONF(NETDEV_UP): eth0: link is not ready
 <3>I-pipe: Detected illicit call from domain 'Xenomai'
 <3>into a service reserved for domain 'Linux' and below.
f3a6bc18   c05dad6c f3a6bc3c c0105fc3 c03513c7 
 c05dc100
0009 f3a6bc54 c01479cb c03592f8 c0357ae2 c035e069 f3a6bc88 
 f3a6bc70
c0127224 c0111df8  f3a6bd74  f3a6bd74 f3a6bc80 
 c012727f
 Call Trace:
  [] show_trace_log_lvl+0x1f/0x40
  [] show_stack_log_lvl+0xb1/0xe0
  [] show_stack+0x33/0x40
  [] ipipe_check_context+0x7b/0x90
  [] __atomic_notifier_call_chain+0x24/0x60
  [] atomic_notifier_call_chain+0x1f/0x30
  [] notify_die+0x32/0x40
  [] do_invalid_op+0x59/0xa0
  [] __ipipe_handle_exception+0x7b/0x144
  [] error_code+0x6f/0x7c
>>> Wow. Why that?
>>>
  [] __ipipe_handle_exception+0x83/0x144
  [] error_code+0x6f/0x7c
>>> And this? We should not get any exception over an IPI3 handler. I guess
>>> the double fault may be explained by this root cause.
>>>
  [] __ipipe_handle_irq+0x4f/0x140
  [] ipipe_ipi3+0x26/0x40
>>> Our LAPIC timer vector. Are you running full modular or statically btw?
>> Fully modular. Compiling the nucleus in makes the lock-up move to
>> another, once again invisible spot.
>>
>> I nailed down the fault address in the scenario above. It's in the
>> nucleus module, at the first byte of xntimer_tick_aperiodic. Are we
>> losing module text pages over time? This function must have been
>> executed before as the timer was armed while I collected the
>> /proc/modules and then triggered the crash.
> 
> There is a pending issue about vmalloced areas, which I completely forgot:
> https://mail.gna.org/public/xenomai-core/2007-02/msg00138.html
> 

Ooops. By reading all my mails, I would have avoided reinventing this
wheel on my own. Your patch is almost what I posted yesterday to fix the
vmalloc issue.

Looks like we no longer need the last hunk of it on recent kernels, right?

Jan

PS: We should really consider using bug trackers for Adeos and Xenomai!
I have a few (minor) patches hanging around as well, but things quickly
get lost when bigger problems pop up.





Re: [Xenomai-core] crashing 2.6.22

2007-10-01 Thread Labozzetta, Saverio




>-----Original Message-----
>From: [EMAIL PROTECTED] on behalf of Jan Kiszka
>Sent: Mon 2007-10-01 11:32 AM
>To: Gilles Chanteperdrix
>Cc: xenomai-core
>Subject: Re: [Xenomai-core] crashing 2.6.22
> 
>Gilles Chanteperdrix wrote:
>> On 10/1/07, Jan Kiszka <[EMAIL PROTECTED]> wrote:
>>> Gilles Chanteperdrix wrote:
>>>> On 9/30/07, Jan Kiszka <[EMAIL PROTECTED]> wrote:
>>>>> Philippe Gerum wrote:
>>>>>> On Sun, 2007-09-30 at 13:42 +0200, Jan Kiszka wrote:
>>>>>>> Jan Kiszka wrote:
>>>>>>>> Philippe Gerum wrote:
>>>>>>>>> On Sun, 2007-09-30 at 12:22 +0200, Jan Kiszka wrote:
>>>>>>> ...
>>>>>>>>>>  And a third
>>>>>>>>>> one only gives me "Detected illicit call from domain Xenomai" before 
>>>>>>>>>> the
>>>>>>>>>> box reboots. :(
>>>>>>>>> Grmff... Do you run with your smp_processor_id() instrumentation in?
>>>>>>>> Yes, but I suspect this is just a symptom of some severe memory
>>>>>>>> corruption that (also?) hits I-pipe data structures. I just put in some
>>>>>>>> different instrumentation, and that warning is gone, the box just hangs
>>>>>>>> hard at a different point. Very unfriendly.
>>>>>>> Hah! Got some crash log by hacking a raw printk-to-uart:
>>>>>>>
>>>>>>> [...]
>>>>>>> <6>Xenomai: starting RTDM services.
>>>>>>> <6>NET: Registered protocol family 10
>>>>>>> <6>lo: Disabled Privacy Extensions
>>>>>>> <6>ADDRCONF(NETDEV_UP): eth0: link is not ready
>>>>>>> <3>I-pipe: Detected illicit call from domain 'Xenomai'
>>>>>>> <3>into a service reserved for domain 'Linux' and below.
>>>>>>>f3a6bc18   c05dad6c f3a6bc3c c0105fc3 c03513c7 
>>>>>>> c05dc100
>>>>>>>0009 f3a6bc54 c01479cb c03592f8 c0357ae2 c035e069 f3a6bc88 
>>>>>>> f3a6bc70
>>>>>>>c0127224 c0111df8  f3a6bd74  f3a6bd74 f3a6bc80 
>>>>>>> c012727f
>>>>>>> Call Trace:
>>>>>>>  [] show_trace_log_lvl+0x1f/0x40
>>>>>>>  [] show_stack_log_lvl+0xb1/0xe0
>>>>>>>  [] show_stack+0x33/0x40
>>>>>>>  [] ipipe_check_context+0x7b/0x90
>>>>>>>  [] __atomic_notifier_call_chain+0x24/0x60
>>>>>>>  [] atomic_notifier_call_chain+0x1f/0x30
>>>>>>>  [] notify_die+0x32/0x40
>>>>>>>  [] do_invalid_op+0x59/0xa0
>>>>>>>  [] __ipipe_handle_exception+0x7b/0x144
>>>>>>>  [] error_code+0x6f/0x7c
>>>>>> Wow. Why that?
>>>>>>
>>>>>>>  [] __ipipe_handle_exception+0x83/0x144
>>>>>>>  [] error_code+0x6f/0x7c
>>>>>> And this? We should not get any exception over an IPI3 handler. I guess
>>>>>> the double fault may be explained by this root cause.
>>>>>>
>>>>>>>  [] __ipipe_handle_irq+0x4f/0x140
>>>>>>>  [] ipipe_ipi3+0x26/0x40
>>>>>> Our LAPIC timer vector. Are you running full modular or statically btw?
>>>>> Fully modular. Compiling the nucleus in makes the lock-up move to
>>>>> another, once again invisible spot.
>>>>>
>>>>> I nailed down the fault address in the scenario above. It's in the
>>>>> nucleus module, at the first byte of xntimer_tick_aperiodic. Are we
>>>>> losing module text pages over time? This function must have been
>>>>> executed before as the timer was armed while I collected the
>>>>> /proc/modules and then triggered the crash.
>>>> There is a pending issue about vmalloced areas, which I completely forgot:
>>>> https://mail.gna.org/public/xenomai-core/2007-02/msg00138.html
>>>>
>>> Would this explain my problems which are already visible without any
>>> Xenomai application running (and also without unloading the modules
>>> again, to answer Philippe's question)? Hell, I would love to find the
>>> reason here, debugging this stuff stopped being fun a long time ago...
>> 
>> It would explain bugs involving a race between task creation and
>> vmalloc/ioremap. But the bug would only happen with Xenomai tasks
>> running,
>
>I don't need to start any Xenomai task to trigger the problem.
>
>> otherwise, the vmalloced/ioremaped area would be mapped lazily as usual.
>
>I guess module text pages are not mapped lazily, otherwise quite a lot 
>of things would have fallen apart much earlier, right?

AFAIK, once inserted, module text pages are part of the kernel, so they
have to be reliably available for as long as the services the module
offers are registered. The insertion function allocates the memory,
writes the module text into it and makes it part of the kernel, so it
is kept in main memory.

  Saverio

>
>Jan
>
>-- 
>Siemens AG, Corporate Technology, CT SE 2
>Corporate Competence Center Embedded Linux
>





Re: [Xenomai-core] crashing 2.6.22

2007-10-01 Thread Gilles Chanteperdrix
On 10/1/07, Jan Kiszka <[EMAIL PROTECTED]> wrote:
> Gilles Chanteperdrix wrote:
> > On 10/1/07, Jan Kiszka <[EMAIL PROTECTED]> wrote:
> >> Gilles Chanteperdrix wrote:
> >>> On 9/30/07, Jan Kiszka <[EMAIL PROTECTED]> wrote:
>  Philippe Gerum wrote:
> > On Sun, 2007-09-30 at 13:42 +0200, Jan Kiszka wrote:
> >> Jan Kiszka wrote:
> >>> Philippe Gerum wrote:
>  On Sun, 2007-09-30 at 12:22 +0200, Jan Kiszka wrote:
> >> ...
> >  And a third
> > one only gives me "Detected illicit call from domain Xenomai" 
> > before the
> > box reboots. :(
>  Grmff... Do you run with your smp_processor_id() instrumentation in?
> >>> Yes, but I suspect this is just a symptom of some severe memory
> >>> corruption that (also?) hits I-pipe data structures. I just put in 
> >>> some
> >>> different instrumentation, and that warning is gone, the box just 
> >>> hangs
> >>> hard at a different point. Very unfriendly.
> >> Hah! Got some crash log by hacking a raw printk-to-uart:
> >>
> >> [...]
> >> <6>Xenomai: starting RTDM services.
> >> <6>NET: Registered protocol family 10
> >> <6>lo: Disabled Privacy Extensions
> >> <6>ADDRCONF(NETDEV_UP): eth0: link is not ready
> >> <3>I-pipe: Detected illicit call from domain 'Xenomai'
> >> <3>into a service reserved for domain 'Linux' and below.
> >>f3a6bc18   c05dad6c f3a6bc3c c0105fc3 c03513c7 
> >> c05dc100
> >>0009 f3a6bc54 c01479cb c03592f8 c0357ae2 c035e069 f3a6bc88 
> >> f3a6bc70
> >>c0127224 c0111df8  f3a6bd74  f3a6bd74 f3a6bc80 
> >> c012727f
> >> Call Trace:
> >>  [] show_trace_log_lvl+0x1f/0x40
> >>  [] show_stack_log_lvl+0xb1/0xe0
> >>  [] show_stack+0x33/0x40
> >>  [] ipipe_check_context+0x7b/0x90
> >>  [] __atomic_notifier_call_chain+0x24/0x60
> >>  [] atomic_notifier_call_chain+0x1f/0x30
> >>  [] notify_die+0x32/0x40
> >>  [] do_invalid_op+0x59/0xa0
> >>  [] __ipipe_handle_exception+0x7b/0x144
> >>  [] error_code+0x6f/0x7c
> > Wow. Why that?
> >
> >>  [] __ipipe_handle_exception+0x83/0x144
> >>  [] error_code+0x6f/0x7c
> > And this? We should not get any exception over an IPI3 handler. I guess
> > the double fault may be explained by this root cause.
> >
> >>  [] __ipipe_handle_irq+0x4f/0x140
> >>  [] ipipe_ipi3+0x26/0x40
> > Our LAPIC timer vector. Are you running full modular or statically btw?
>  Fully modular. Compiling the nucleus in makes the lock-up move to
>  another, once again invisible spot.
> 
>  I nailed down the fault address in the scenario above. It's in the
>  nucleus module, at the first byte of xntimer_tick_aperiodic. Are we
>  losing module text pages over time? This function must have been
>  executed before as the timer was armed while I collected the
>  /proc/modules and then triggered the crash.
> >>> There is a pending issue about vmalloced areas, which I completely forgot:
> >>> https://mail.gna.org/public/xenomai-core/2007-02/msg00138.html
> >>>
> >> Would this explain my problems which are already visible without any
> >> Xenomai application running (and also without unloading the modules
> >> again, to answer Philippe's question)? Hell, I would love to find the
> >> reason here, debugging this stuff stopped being fun a long time ago...
> >
> > It would explain bugs involving a race between task creation and
> > vmalloc/ioremap. But the bug would only happen with Xenomai tasks
> > running,
>
> I don't need to start any Xenomai task to trigger the problem.
>
> > otherwise, the vmalloced/ioremaped area would be mapped lazily as usual.
>
> I guess module text pages are not mapped lazily, otherwise quite a lot
> of things would have fallen apart much earlier, right?

This would happen when a task and a module are created at the same
time, and the module would be mapped lazily only for the newly created
task.
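To make that window concrete, here is a minimal user-space sketch. It is a toy model, not kernel or I-pipe code: it assumes that a new mapping is pushed eagerly to every task already visible in a task list, and that task creation first copies the kernel mappings and only later publishes the task. All names (task_copy_mappings, task_publish, module_map_eagerly, task_sees) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SLOTS     4  /* toy number of kernel mapping slots (assumption) */
#define MAX_TASKS 4  /* toy task list capacity (assumption) */

/* Toy task: only its private copy of the kernel mappings is modelled. */
struct task {
    unsigned long pgd[SLOTS];
};

static unsigned long master_pgd[SLOTS];   /* reference kernel mappings */
static struct task *task_list[MAX_TASKS]; /* tasks visible to propagation */
static size_t nr_tasks;

/* Creation, step 1: snapshot the kernel mappings as they are right now. */
void task_copy_mappings(struct task *t)
{
    memcpy(t->pgd, master_pgd, sizeof(master_pgd));
}

/* Creation, step 2: make the task visible to later propagation passes. */
bool task_publish(struct task *t)
{
    if (nr_tasks == MAX_TASKS)
        return false;
    task_list[nr_tasks++] = t;
    return true;
}

/* Module load: update the reference, then push to every published task. */
void module_map_eagerly(int slot, unsigned long val)
{
    assert(slot >= 0 && slot < SLOTS);
    master_pgd[slot] = val;
    for (size_t i = 0; i < nr_tasks; i++)
        task_list[i]->pgd[slot] = val;
}

/* True if the task can reach the slot without any lazy fix-up. */
bool task_sees(const struct task *t, int slot)
{
    assert(slot >= 0 && slot < SLOTS);
    return t->pgd[slot] != 0;
}
```

If module_map_eagerly() runs between task_copy_mappings() and task_publish() for a given task, every older task sees the new slot, while the task being created misses both the snapshot and the propagation pass. In this model that task alone would still depend on a lazy fix-up.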

-- 
   Gilles Chanteperdrix



Re: [Xenomai-core] crashing 2.6.22

2007-10-01 Thread Jan Kiszka
Gilles Chanteperdrix wrote:
> On 10/1/07, Jan Kiszka <[EMAIL PROTECTED]> wrote:
>> Gilles Chanteperdrix wrote:
>>> On 9/30/07, Jan Kiszka <[EMAIL PROTECTED]> wrote:
 Philippe Gerum wrote:
> On Sun, 2007-09-30 at 13:42 +0200, Jan Kiszka wrote:
>> Jan Kiszka wrote:
>>> Philippe Gerum wrote:
 On Sun, 2007-09-30 at 12:22 +0200, Jan Kiszka wrote:
>> ...
>  And a third
> one only gives me "Detected illicit call from domain Xenomai" before 
> the
> box reboots. :(
 Grmff... Do you run with your smp_processor_id() instrumentation in?
>>> Yes, but I suspect this is just a symptom of some severe memory
>>> corruption that (also?) hits I-pipe data structures. I just put in some
>>> different instrumentation, and that warning is gone, the box just hangs
>>> hard at a different point. Very unfriendly.
>> Hah! Got some crash log by hacking a raw printk-to-uart:
>>
>> [...]
>> <6>Xenomai: starting RTDM services.
>> <6>NET: Registered protocol family 10
>> <6>lo: Disabled Privacy Extensions
>> <6>ADDRCONF(NETDEV_UP): eth0: link is not ready
>> <3>I-pipe: Detected illicit call from domain 'Xenomai'
>> <3>into a service reserved for domain 'Linux' and below.
>>f3a6bc18   c05dad6c f3a6bc3c c0105fc3 c03513c7 
>> c05dc100
>>0009 f3a6bc54 c01479cb c03592f8 c0357ae2 c035e069 f3a6bc88 
>> f3a6bc70
>>c0127224 c0111df8  f3a6bd74  f3a6bd74 f3a6bc80 
>> c012727f
>> Call Trace:
>>  [] show_trace_log_lvl+0x1f/0x40
>>  [] show_stack_log_lvl+0xb1/0xe0
>>  [] show_stack+0x33/0x40
>>  [] ipipe_check_context+0x7b/0x90
>>  [] __atomic_notifier_call_chain+0x24/0x60
>>  [] atomic_notifier_call_chain+0x1f/0x30
>>  [] notify_die+0x32/0x40
>>  [] do_invalid_op+0x59/0xa0
>>  [] __ipipe_handle_exception+0x7b/0x144
>>  [] error_code+0x6f/0x7c
> Wow. Why that?
>
>>  [] __ipipe_handle_exception+0x83/0x144
>>  [] error_code+0x6f/0x7c
> And this? We should not get any exception over an IPI3 handler. I guess
> the double fault may be explained by this root cause.
>
>>  [] __ipipe_handle_irq+0x4f/0x140
>>  [] ipipe_ipi3+0x26/0x40
> Our LAPIC timer vector. Are you running full modular or statically btw?
 Fully modular. Compiling the nucleus in makes the lock-up move to
 another, once again invisible spot.

 I nailed down the fault address in the scenario above. It's in the
 nucleus module, at the first byte of xntimer_tick_aperiodic. Are we
 losing module text pages over time? This function must have been
 executed before as the timer was armed while I collected the
 /proc/modules and then triggered the crash.
>>> There is a pending issue about vmalloced areas, which I completely forgot:
>>> https://mail.gna.org/public/xenomai-core/2007-02/msg00138.html
>>>
>> Would this explain my problems which are already visible without any
>> Xenomai application running (and also without unloading the modules
>> again, to answer Philippe's question)? Hell, I would love to find the
>> reason here, debugging this stuff stopped being fun a long time ago...
> 
> It would explain bugs involving a race between task creation and
> vmalloc/ioremap. But the bug would only happen with Xenomai tasks
> running,

I don't need to start any Xenomai task to trigger the problem.

> otherwise, the vmalloced/ioremaped area would be mapped lazily as usual.

I guess module text pages are not mapped lazily, otherwise quite a lot 
of things would have fallen apart much earlier, right?

Jan

-- 
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux



Re: [Xenomai-core] crashing 2.6.22

2007-10-01 Thread Gilles Chanteperdrix
On 10/1/07, Jan Kiszka <[EMAIL PROTECTED]> wrote:
> Gilles Chanteperdrix wrote:
> > On 9/30/07, Jan Kiszka <[EMAIL PROTECTED]> wrote:
> >> Philippe Gerum wrote:
> >>> On Sun, 2007-09-30 at 13:42 +0200, Jan Kiszka wrote:
>  Jan Kiszka wrote:
> > Philippe Gerum wrote:
> >> On Sun, 2007-09-30 at 12:22 +0200, Jan Kiszka wrote:
>  ...
> >>>  And a third
> >>> one only gives me "Detected illicit call from domain Xenomai" before 
> >>> the
> >>> box reboots. :(
> >> Grmff... Do you run with your smp_processor_id() instrumentation in?
> > Yes, but I suspect this is just a symptom of some severe memory
> > corruption that (also?) hits I-pipe data structures. I just put in some
> > different instrumentation, and that warning is gone, the box just hangs
> > hard at a different point. Very unfriendly.
>  Hah! Got some crash log by hacking a raw printk-to-uart:
> 
>  [...]
>  <6>Xenomai: starting RTDM services.
>  <6>NET: Registered protocol family 10
>  <6>lo: Disabled Privacy Extensions
>  <6>ADDRCONF(NETDEV_UP): eth0: link is not ready
>  <3>I-pipe: Detected illicit call from domain 'Xenomai'
>  <3>into a service reserved for domain 'Linux' and below.
> f3a6bc18   c05dad6c f3a6bc3c c0105fc3 c03513c7 
>  c05dc100
> 0009 f3a6bc54 c01479cb c03592f8 c0357ae2 c035e069 f3a6bc88 
>  f3a6bc70
> c0127224 c0111df8  f3a6bd74  f3a6bd74 f3a6bc80 
>  c012727f
>  Call Trace:
>   [] show_trace_log_lvl+0x1f/0x40
>   [] show_stack_log_lvl+0xb1/0xe0
>   [] show_stack+0x33/0x40
>   [] ipipe_check_context+0x7b/0x90
>   [] __atomic_notifier_call_chain+0x24/0x60
>   [] atomic_notifier_call_chain+0x1f/0x30
>   [] notify_die+0x32/0x40
>   [] do_invalid_op+0x59/0xa0
>   [] __ipipe_handle_exception+0x7b/0x144
>   [] error_code+0x6f/0x7c
> >>> Wow. Why that?
> >>>
>   [] __ipipe_handle_exception+0x83/0x144
>   [] error_code+0x6f/0x7c
> >>> And this? We should not get any exception over an IPI3 handler. I guess
> >>> the double fault may be explained by this root cause.
> >>>
>   [] __ipipe_handle_irq+0x4f/0x140
>   [] ipipe_ipi3+0x26/0x40
> >>> Our LAPIC timer vector. Are you running full modular or statically btw?
> >> Fully modular. Compiling the nucleus in makes the lock-up move to
> >> another, once again invisible spot.
> >>
> >> I nailed down the fault address in the scenario above. It's in the
> >> nucleus module, at the first byte of xntimer_tick_aperiodic. Are we
> >> losing module text pages over time? This function must have been
> >> executed before as the timer was armed while I collected the
> >> /proc/modules and then triggered the crash.
> >
> > There is a pending issue about vmalloced areas, which I completely forgot:
> > https://mail.gna.org/public/xenomai-core/2007-02/msg00138.html
> >
>
> Would this explain my problems which are already visible without any
> Xenomai application running (and also without unloading the modules
> again, to answer Philippe's question)? Hell, I would love to find the
> reason here, debugging this stuff stopped being fun a long time ago...

It would explain bugs involving a race between task creation and
vmalloc/ioremap. But the bug would only happen with Xenomai tasks
running; otherwise, the vmalloced/ioremapped area would be mapped
lazily as usual.
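As an illustration of what "mapped lazily" means here, the following user-space sketch models it under stated assumptions. A new vmalloc-style mapping is recorded only in a reference directory, each task holds a copy taken at creation time, and a missing entry is copied over on the first fault. The fault_allowed flag stands in for the idea that a real-time context cannot afford that fault. This is a toy model, not kernel code, and all names (pgdir, map_in_master, task_create, access_slot) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define KERNEL_SLOTS 8  /* toy count of top-level kernel entries (assumption) */

/* Toy page directory: a nonzero slot means "mapping present". */
struct pgdir {
    unsigned long slot[KERNEL_SLOTS];
};

enum access_result { ACCESS_OK, ACCESS_FAULTED, ACCESS_FATAL };

/* Stands in for the kernel's reference directory. */
static struct pgdir master;

/* vmalloc-like mapping: only the reference directory is updated. */
void map_in_master(int idx, unsigned long val)
{
    assert(idx >= 0 && idx < KERNEL_SLOTS);
    master.slot[idx] = val;
}

/* Task creation: the kernel part is copied from the reference at that time. */
void task_create(struct pgdir *task)
{
    memcpy(task, &master, sizeof(*task));
}

/*
 * Touch a kernel address through the task's own directory.
 * If the entry is missing, a fault is needed to copy it from the
 * reference; when no fault can be tolerated, the access is fatal.
 */
enum access_result access_slot(struct pgdir *task, int idx, bool fault_allowed)
{
    assert(idx >= 0 && idx < KERNEL_SLOTS);
    if (task->slot[idx])
        return ACCESS_OK;
    if (!fault_allowed || !master.slot[idx])
        return ACCESS_FATAL;
    task->slot[idx] = master.slot[idx];  /* lazy fix-up */
    return ACCESS_FAULTED;
}
```

In this model, a task created before map_in_master() gets ACCESS_FAULTED on its first ordinary access and ACCESS_OK afterwards, but ACCESS_FATAL if the first access happens where faults are not allowed. Whether this is what actually hits the box discussed in this thread is not established here.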

-- 
   Gilles Chanteperdrix



Re: [Xenomai-core] crashing 2.6.22

2007-10-01 Thread Jan Kiszka
Gilles Chanteperdrix wrote:
> On 9/30/07, Jan Kiszka <[EMAIL PROTECTED]> wrote:
>> Philippe Gerum wrote:
>>> On Sun, 2007-09-30 at 13:42 +0200, Jan Kiszka wrote:
 Jan Kiszka wrote:
> Philippe Gerum wrote:
>> On Sun, 2007-09-30 at 12:22 +0200, Jan Kiszka wrote:
 ...
>>>  And a third
>>> one only gives me "Detected illicit call from domain Xenomai" before the
>>> box reboots. :(
>> Grmff... Do you run with your smp_processor_id() instrumentation in?
> Yes, but I suspect this is just a symptom of some severe memory
> corruption that (also?) hits I-pipe data structures. I just put in some
> different instrumentation, and that warning is gone, the box just hangs
> hard at a different point. Very unfriendly.
 Hah! Got some crash log by hacking a raw printk-to-uart:

 [...]
 <6>Xenomai: starting RTDM services.
 <6>NET: Registered protocol family 10
 <6>lo: Disabled Privacy Extensions
 <6>ADDRCONF(NETDEV_UP): eth0: link is not ready
 <3>I-pipe: Detected illicit call from domain 'Xenomai'
 <3>into a service reserved for domain 'Linux' and below.
f3a6bc18   c05dad6c f3a6bc3c c0105fc3 c03513c7 
 c05dc100
0009 f3a6bc54 c01479cb c03592f8 c0357ae2 c035e069 f3a6bc88 
 f3a6bc70
c0127224 c0111df8  f3a6bd74  f3a6bd74 f3a6bc80 
 c012727f
 Call Trace:
  [] show_trace_log_lvl+0x1f/0x40
  [] show_stack_log_lvl+0xb1/0xe0
  [] show_stack+0x33/0x40
  [] ipipe_check_context+0x7b/0x90
  [] __atomic_notifier_call_chain+0x24/0x60
  [] atomic_notifier_call_chain+0x1f/0x30
  [] notify_die+0x32/0x40
  [] do_invalid_op+0x59/0xa0
  [] __ipipe_handle_exception+0x7b/0x144
  [] error_code+0x6f/0x7c
>>> Wow. Why that?
>>>
  [] __ipipe_handle_exception+0x83/0x144
  [] error_code+0x6f/0x7c
>>> And this? We should not get any exception over an IPI3 handler. I guess
>>> the double fault may be explained by this root cause.
>>>
  [] __ipipe_handle_irq+0x4f/0x140
  [] ipipe_ipi3+0x26/0x40
>>> Our LAPIC timer vector. Are you running full modular or statically btw?
>> Fully modular. Compiling the nucleus in makes the lock-up move to
>> another, once again invisible spot.
>>
>> I nailed down the fault address in the scenario above. It's in the
>> nucleus module, at the first byte of xntimer_tick_aperiodic. Are we
>> losing module text pages over time? This function must have been
>> executed before as the timer was armed while I collected the
>> /proc/modules and then triggered the crash.
> 
> There is a pending issue about vmalloced areas, which I completely forgot:
> https://mail.gna.org/public/xenomai-core/2007-02/msg00138.html
> 

Would this explain my problems which are already visible without any 
Xenomai application running (and also without unloading the modules 
again, to answer Philippe's question)? Hell, I would love to find the 
reason here, debugging this stuff stopped being fun a long time ago...

Jan

-- 
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux



Re: [Xenomai-core] crashing 2.6.22

2007-10-01 Thread Gilles Chanteperdrix
On 9/30/07, Jan Kiszka <[EMAIL PROTECTED]> wrote:
> Philippe Gerum wrote:
> > On Sun, 2007-09-30 at 13:42 +0200, Jan Kiszka wrote:
> >> Jan Kiszka wrote:
> >>> Philippe Gerum wrote:
>  On Sun, 2007-09-30 at 12:22 +0200, Jan Kiszka wrote:
> >> ...
> >  And a third
> > one only gives me "Detected illicit call from domain Xenomai" before the
> > box reboots. :(
>  Grmff... Do you run with your smp_processor_id() instrumentation in?
> >>> Yes, but I suspect this is just a symptom of some severe memory
> >>> corruption that (also?) hits I-pipe data structures. I just put in some
> >>> different instrumentation, and that warning is gone, the box just hangs
> >>> hard at a different point. Very unfriendly.
> >> Hah! Got some crash log by hacking a raw printk-to-uart:
> >>
> >> [...]
> >> <6>Xenomai: starting RTDM services.
> >> <6>NET: Registered protocol family 10
> >> <6>lo: Disabled Privacy Extensions
> >> <6>ADDRCONF(NETDEV_UP): eth0: link is not ready
> >> <3>I-pipe: Detected illicit call from domain 'Xenomai'
> >> <3>into a service reserved for domain 'Linux' and below.
> >>f3a6bc18   c05dad6c f3a6bc3c c0105fc3 c03513c7 
> >> c05dc100
> >>0009 f3a6bc54 c01479cb c03592f8 c0357ae2 c035e069 f3a6bc88 
> >> f3a6bc70
> >>c0127224 c0111df8  f3a6bd74  f3a6bd74 f3a6bc80 
> >> c012727f
> >> Call Trace:
> >>  [] show_trace_log_lvl+0x1f/0x40
> >>  [] show_stack_log_lvl+0xb1/0xe0
> >>  [] show_stack+0x33/0x40
> >>  [] ipipe_check_context+0x7b/0x90
> >>  [] __atomic_notifier_call_chain+0x24/0x60
> >>  [] atomic_notifier_call_chain+0x1f/0x30
> >>  [] notify_die+0x32/0x40
> >>  [] do_invalid_op+0x59/0xa0
> >>  [] __ipipe_handle_exception+0x7b/0x144
> >>  [] error_code+0x6f/0x7c
> >
> > Wow. Why that?
> >
> >>  [] __ipipe_handle_exception+0x83/0x144
> >>  [] error_code+0x6f/0x7c
> >
> > And this? We should not get any exception over an IPI3 handler. I guess
> > the double fault may be explained by this root cause.
> >
> >>  [] __ipipe_handle_irq+0x4f/0x140
> >>  [] ipipe_ipi3+0x26/0x40
> >
> > Our LAPIC timer vector. Are you running full modular or statically btw?
>
> Fully modular. Compiling the nucleus in makes the lock-up move to
> another, once again invisible spot.
>
> I nailed down the fault address in the scenario above. It's in the
> nucleus module, at the first byte of xntimer_tick_aperiodic. Are we
> losing module text pages over time? This function must have been
> executed before as the timer was armed while I collected the
> /proc/modules and then triggered the crash.

There is a pending issue about vmalloced areas, which I completely forgot:
https://mail.gna.org/public/xenomai-core/2007-02/msg00138.html

-- 
   Gilles Chanteperdrix



Re: [Xenomai-core] crashing 2.6.22

2007-09-30 Thread Philippe Gerum
On Sun, 2007-09-30 at 17:31 +0200, Jan Kiszka wrote:
> Philippe Gerum wrote:
> > On Sun, 2007-09-30 at 13:42 +0200, Jan Kiszka wrote:
> >> Jan Kiszka wrote:
> >>> Philippe Gerum wrote:
> >>>> On Sun, 2007-09-30 at 12:22 +0200, Jan Kiszka wrote:
> >> ...
> >>>>> And a third
> >>>>> one only gives me "Detected illicit call from domain Xenomai" before the
> >>>>> box reboots. :(
> >>>> Grmff... Do you run with your smp_processor_id() instrumentation in?
> >>> Yes, but I suspect this is just a symptom of some severe memory
> >>> corruption that (also?) hits I-pipe data structures. I just put in some
> >>> different instrumentation, and that warning is gone, the box just hangs
> >>> hard at a different point. Very unfriendly.
> >> Hah! Got some crash log by hacking a raw printk-to-uart:
> >>
> >> [...]
> >> <6>Xenomai: starting RTDM services.
> >> <6>NET: Registered protocol family 10
> >> <6>lo: Disabled Privacy Extensions
> >> <6>ADDRCONF(NETDEV_UP): eth0: link is not ready
> >> <3>I-pipe: Detected illicit call from domain 'Xenomai'
> >> <3>into a service reserved for domain 'Linux' and below.
> >>f3a6bc18   c05dad6c f3a6bc3c c0105fc3 c03513c7 c05dc100
> >>0009 f3a6bc54 c01479cb c03592f8 c0357ae2 c035e069 f3a6bc88 f3a6bc70
> >>c0127224 c0111df8  f3a6bd74  f3a6bd74 f3a6bc80 c012727f
> >> Call Trace:
> >>  [] show_trace_log_lvl+0x1f/0x40
> >>  [] show_stack_log_lvl+0xb1/0xe0
> >>  [] show_stack+0x33/0x40
> >>  [] ipipe_check_context+0x7b/0x90
> >>  [] __atomic_notifier_call_chain+0x24/0x60
> >>  [] atomic_notifier_call_chain+0x1f/0x30
> >>  [] notify_die+0x32/0x40
> >>  [] do_invalid_op+0x59/0xa0
> >>  [] __ipipe_handle_exception+0x7b/0x144
> >>  [] error_code+0x6f/0x7c
> > 
> > Wow. Why that?
> > 
> >>  [] __ipipe_handle_exception+0x83/0x144
> >>  [] error_code+0x6f/0x7c
> > 
> > And this? We should not get any exception over an IPI3 handler. I guess
> > the double fault may be explained by this root cause.
> > 
> >>  [] __ipipe_handle_irq+0x4f/0x140
> >>  [] ipipe_ipi3+0x26/0x40
> > 
> > Our LAPIC timer vector. Are you running full modular or statically btw?
> 
> Fully modular. Compiling the nucleus in makes the lock-up move to
> another, once again invisible spot.
> 
> I nailed down the fault address in the scenario above. It's in the
> nucleus module, at the first byte of xntimer_tick_aperiodic. Are we
> losing module text pages over time?
> This function must have been
> executed before as the timer was armed while I collected the
> /proc/modules and then triggered the crash.

The timer is routed when the first skin binds to the nucleus. Modules
are unmapped while the box goes down for reboot, so maybe the timer is
not released in the LAPIC case upon such event. IIRC, I fixed a similar
issue in the PIT case recently, where rthal_timer_release() would not
call ipipe_release_tickdev(). It would be interesting to know whether
rthal_timer_release() is ever called at all upon shutdown. If not, the
kernel event notifier is likely going to be our friend soon...
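
A rough user-space model of the notifier idea mentioned above may help
picture it: a shutdown event walks a callback chain, and one callback hands
the tick device back before module text goes away. Everything here
(shutdown_chain_*, model_timer_*, the EV_* values) is hypothetical and made
up for illustration; it is neither the kernel's notifier API nor Xenomai's
rthal code.

```c
#include <stddef.h>

/* Hypothetical event codes, not the kernel's own reboot event values. */
enum { EV_RESTART = 1, EV_HALT = 2 };

/* Minimal stand-in for a notifier block: a callback plus a list link. */
struct notifier {
    int (*call)(struct notifier *nb, unsigned long event, void *data);
    struct notifier *next;
};

static struct notifier *shutdown_chain;

/* Push a callback onto the head of the chain. */
static void shutdown_chain_register(struct notifier *nb)
{
    nb->next = shutdown_chain;
    shutdown_chain = nb;
}

/* Invoke every registered callback; returns how many were called. */
static int shutdown_chain_fire(unsigned long event)
{
    struct notifier *nb;
    int called = 0;

    for (nb = shutdown_chain; nb != NULL; nb = nb->next) {
        nb->call(nb, event, NULL);
        called++;
    }
    return called;
}

/* 1 while the modelled real-time side owns the tick device. */
static int timer_grabbed;

static void model_timer_request(void) { timer_grabbed = 1; }
static void model_timer_release(void) { timer_grabbed = 0; }

/* Shutdown callback: give the timer back on restart or halt. */
static int model_shutdown_event(struct notifier *nb, unsigned long event,
                                void *data)
{
    (void)nb;
    (void)data;
    if (event == EV_RESTART || event == EV_HALT)
        model_timer_release();
    return 0;
}

static struct notifier model_shutdown_nb = { model_shutdown_event, NULL };
```

In this model, calling model_timer_request(), registering
model_shutdown_nb and then firing EV_RESTART leaves timer_grabbed at 0.
In the real system the release step would be whatever
rthal_timer_release() does; whether it is reached at shutdown is exactly
the open question above.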

> 
> Jan
> 
-- 
Philippe.





Re: [Xenomai-core] crashing 2.6.22 (was: [Xenomai-help] Non-APIC setup broken for 2.4-SVN?)

2007-09-30 Thread Philippe Gerum
On Sun, 2007-09-30 at 13:42 +0200, Jan Kiszka wrote:
> Jan Kiszka wrote:
> > Philippe Gerum wrote:
> >> On Sun, 2007-09-30 at 12:22 +0200, Jan Kiszka wrote:
> ...
> >>>  And a third
> >>> one only gives me "Detected illicit call from domain Xenomai" before the
> >>> box reboots. :(
> >> Grmff... Do you run with your smp_processor_id() instrumentation in?
> > 
> > Yes, but I suspect this is just a symptom of some severe memory
> > corruption that (also?) hits I-pipe data structures. I just put in some
> > different instrumentation, and that warning is gone, the box just hangs
> > hard at a different point. Very unfriendly.
> 
> Hah! Got some crash log by hacking a raw printk-to-uart:
> 

Btw, if it's based on fiddling with the 16550 directly as I imagine it
is, you may want to push me a patch. We already have this feature for
blackfin and powerpc (__ipipe_serial_debug(const char *fmt, ...)), and
it definitely makes sense to have it for x86* too.
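
For readers unfamiliar with this kind of hack, a minimal sketch of polled
16550 output follows. It is not Jan's patch and not the existing
__ipipe_serial_debug() code; the names raw_uart, raw_uart_putc,
raw_uart_printf and the fake_* backend are hypothetical, and port I/O is
hidden behind callbacks so the sketch builds in user space. The register
offsets (THR at 0, LSR at 5, THRE bit 0x20) are the standard 16550 ones.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define UART_THR      0     /* transmit holding register (write) */
#define UART_LSR      5     /* line status register (read) */
#define UART_LSR_THRE 0x20  /* THR empty: safe to write next byte */

/* Hypothetical handle: base port plus the I/O accessors to use. */
struct raw_uart {
    uint16_t base;
    uint8_t (*in8)(uint16_t port);
    void (*out8)(uint8_t val, uint16_t port);
};

/* Busy-wait until the transmitter is free, then emit one byte. */
static void raw_uart_putc(const struct raw_uart *u, char c)
{
    while (!(u->in8((uint16_t)(u->base + UART_LSR)) & UART_LSR_THRE))
        ;
    u->out8((uint8_t)c, (uint16_t)(u->base + UART_THR));
}

/* Format into a stack buffer and push it out byte by byte.
 * Returns the number of formatted characters (before CR insertion). */
static int raw_uart_printf(const struct raw_uart *u, const char *fmt, ...)
{
    char buf[256];
    va_list ap;
    int n, i;

    va_start(ap, fmt);
    n = vsnprintf(buf, sizeof(buf), fmt, ap);
    va_end(ap);
    if (n < 0)
        return n;
    if (n >= (int)sizeof(buf))
        n = (int)sizeof(buf) - 1;  /* output was truncated */

    for (i = 0; i < n; i++) {
        if (buf[i] == '\n')
            raw_uart_putc(u, '\r');  /* terminals expect CR+LF */
        raw_uart_putc(u, buf[i]);
    }
    return n;
}

/* Fake backend for trying the code without hardware: the transmitter
 * is always ready and written bytes are captured in fake_wire. */
static char fake_wire[512];
static size_t fake_len;

static uint8_t fake_in8(uint16_t port)
{
    (void)port;
    return UART_LSR_THRE;
}

static void fake_out8(uint8_t val, uint16_t port)
{
    (void)port;
    if (fake_len < sizeof(fake_wire) - 1) {
        fake_wire[fake_len++] = (char)val;
        fake_wire[fake_len] = '\0';
    }
}
```

On real x86 hardware the accessors would be inb()/outb() wrappers and the
base would typically be 0x3f8 for the first serial port. No locking and no
interrupts are involved, which is the point when the regular printk path
is unusable.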

-- 
Philippe.





Re: [Xenomai-core] crashing 2.6.22

2007-09-30 Thread Jan Kiszka
Philippe Gerum wrote:
> On Sun, 2007-09-30 at 13:42 +0200, Jan Kiszka wrote:
>> Jan Kiszka wrote:
>>> Philippe Gerum wrote:
>>>> On Sun, 2007-09-30 at 12:22 +0200, Jan Kiszka wrote:
>> ...
>>>>> And a third
>>>>> one only gives me "Detected illicit call from domain Xenomai" before the
>>>>> box reboots. :(
>>>> Grmff... Do you run with your smp_processor_id() instrumentation in?
>>> Yes, but I suspect this is just a symptom of some severe memory
>>> corruption that (also?) hits I-pipe data structures. I just put in some
>>> different instrumentation, and that warning is gone, the box just hangs
>>> hard at a different point. Very unfriendly.
>> Hah! Got some crash log by hacking a raw printk-to-uart:
>>
>> [...]
>> <6>Xenomai: starting RTDM services.
>> <6>NET: Registered protocol family 10
>> <6>lo: Disabled Privacy Extensions
>> <6>ADDRCONF(NETDEV_UP): eth0: link is not ready
>> <3>I-pipe: Detected illicit call from domain 'Xenomai'
>> <3>into a service reserved for domain 'Linux' and below.
>>f3a6bc18   c05dad6c f3a6bc3c c0105fc3 c03513c7 c05dc100
>>0009 f3a6bc54 c01479cb c03592f8 c0357ae2 c035e069 f3a6bc88 f3a6bc70
>>c0127224 c0111df8  f3a6bd74  f3a6bd74 f3a6bc80 c012727f
>> Call Trace:
>>  [] show_trace_log_lvl+0x1f/0x40
>>  [] show_stack_log_lvl+0xb1/0xe0
>>  [] show_stack+0x33/0x40
>>  [] ipipe_check_context+0x7b/0x90
>>  [] __atomic_notifier_call_chain+0x24/0x60
>>  [] atomic_notifier_call_chain+0x1f/0x30
>>  [] notify_die+0x32/0x40
>>  [] do_invalid_op+0x59/0xa0
>>  [] __ipipe_handle_exception+0x7b/0x144
>>  [] error_code+0x6f/0x7c
> 
> Wow. Why that?
> 
>>  [] __ipipe_handle_exception+0x83/0x144
>>  [] error_code+0x6f/0x7c
> 
> And this? We should not get any exception over an IPI3 handler. I guess
> the double fault may be explained by this root cause.
> 
>>  [] __ipipe_handle_irq+0x4f/0x140
>>  [] ipipe_ipi3+0x26/0x40
> 
> Our LAPIC timer vector. Are you running full modular or statically btw?

Fully modular. Compiling the nucleus in makes the lock-up move to
another, once again invisible spot.

I nailed down the fault address in the scenario above. It's in the
nucleus module, at the first byte of xntimer_tick_aperiodic. Are we
losing module text pages over time? This function must have been
executed before as the timer was armed while I collected the
/proc/modules and then triggered the crash.
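
One way to do this kind of lookup, sketched under assumptions: take the
load address and size that /proc/modules prints for each module and test
whether the faulting address falls inside that range. addr_in_module and
struct mod_hit are hypothetical names, the sample line in the note below
is invented, and the parser assumes the usual
"name size refcount deps state address" layout.

```c
#include <stdio.h>

/* Result of a successful lookup (hypothetical helper type). */
struct mod_hit {
    char name[64];
    unsigned long base;    /* load address reported by /proc/modules */
    unsigned long offset;  /* faulting address minus base */
};

/*
 * Check one /proc/modules line against an address.
 * Returns 1 and fills *hit if addr lies inside the module,
 * 0 if it does not, -1 if the line cannot be parsed.
 */
static int addr_in_module(const char *line, unsigned long addr,
                          struct mod_hit *hit)
{
    char name[64];
    unsigned long size, base;

    /* Skip refcount, dependency list and state; keep name, size, address. */
    if (sscanf(line, "%63s %lu %*s %*s %*s %lx", name, &size, &base) != 3)
        return -1;

    if (addr < base || addr - base >= size)
        return 0;

    snprintf(hit->name, sizeof(hit->name), "%s", name);
    hit->base = base;
    hit->offset = addr - base;
    return 1;
}
```

With an invented line such as "demo_mod 8192 1 - Live 0xf8a00000", the
address 0xf8a00010 is reported as a hit at offset 0x10. The offset can
then be compared against the module's symbol table (for instance with nm
or objdump on the .ko) to name the function.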

Jan





Re: [Xenomai-core] crashing 2.6.22 (was: [Xenomai-help] Non-APIC setup broken for 2.4-SVN?)

2007-09-30 Thread Philippe Gerum
On Sun, 2007-09-30 at 13:42 +0200, Jan Kiszka wrote:
> Jan Kiszka wrote:
> > Philippe Gerum wrote:
> >> On Sun, 2007-09-30 at 12:22 +0200, Jan Kiszka wrote:
> ...
> >>>  And a third
> >>> one only gives me "Detected illicit call from domain Xenomai" before the
> >>> box reboots. :(
> >> Grmff... Do you run with your smp_processor_id() instrumentation in?
> > 
> > Yes, but I suspect this is just a symptom of some severe memory
> > corruption that (also?) hits I-pipe data structures. I just put in some
> > different instrumentation, and that warning is gone, the box just hangs
> > hard at a different point. Very unfriendly.
> 
> Hah! Got some crash log by hacking a raw printk-to-uart:
> 
> [...]
> <6>Xenomai: starting RTDM services.
> <6>NET: Registered protocol family 10
> <6>lo: Disabled Privacy Extensions
> <6>ADDRCONF(NETDEV_UP): eth0: link is not ready
> <3>I-pipe: Detected illicit call from domain 'Xenomai'
> <3>into a service reserved for domain 'Linux' and below.
>f3a6bc18   c05dad6c f3a6bc3c c0105fc3 c03513c7 c05dc100
>0009 f3a6bc54 c01479cb c03592f8 c0357ae2 c035e069 f3a6bc88 f3a6bc70
>c0127224 c0111df8  f3a6bd74  f3a6bd74 f3a6bc80 c012727f
> Call Trace:
>  [] show_trace_log_lvl+0x1f/0x40
>  [] show_stack_log_lvl+0xb1/0xe0
>  [] show_stack+0x33/0x40
>  [] ipipe_check_context+0x7b/0x90
>  [] __atomic_notifier_call_chain+0x24/0x60
>  [] atomic_notifier_call_chain+0x1f/0x30
>  [] notify_die+0x32/0x40
>  [] do_invalid_op+0x59/0xa0
>  [] __ipipe_handle_exception+0x7b/0x144
>  [] error_code+0x6f/0x7c

Wow. Why that?

>  [] __ipipe_handle_exception+0x83/0x144
>  [] error_code+0x6f/0x7c

And this? We should not get any exception over an IPI3 handler. I guess
the double fault may be explained by this root cause.

>  [] __ipipe_handle_irq+0x4f/0x140
>  [] ipipe_ipi3+0x26/0x40

Our LAPIC timer vector. Are you running full modular or statically btw?

>  [] mcount+0x24/0x29
>  [] kunmap_atomic+0x9/0x60
>  [] __handle_mm_fault+0x210/0x910
>  [] do_page_fault+0x1dc/0x5f0
>  [] __ipipe_handle_exception+0x7b/0x144
>  [] error_code+0x6f/0x7c
>  ===
> I-pipe tracer log (30 points):
> #*func    0 ipipe_trace_panic_freeze+0x8 (ipipe_check_context+0x40)
> #*func    0 ipipe_check_context+0xc (__atomic_notifier_call_chain+0x24)
> #*func    0 __atomic_notifier_call_chain+0x14 (atomic_notifier_call_chain+0x1f)
> #*func    0 atomic_notifier_call_chain+0xb (notify_die+0x32)
> #*func    0 notify_die+0xb (do_invalid_op+0x59)
> #*func    0 do_invalid_op+0x10 (__ipipe_handle_exception+0x7b)
> #*func   -1 __ipipe_handle_exception+0xe (error_code+0x6f)
> #*func   -1 __ipipe_restore_root+0x8 (__ipipe_handle_exception+0x83)
>  |  #*func   -2 do_page_fault+0xe (__ipipe_handle_exception+0x7b)
>  |  # func   -2 __ipipe_handle_exception+0xe (error_code+0x6f)
>  |   +func   -3 __ipipe_dispatch_wired+0x16 (__ipipe_handle_irq+0x4f)
>  |   +func   -3 __ipipe_ack_apic+0x8 (__ipipe_handle_irq+0x8f)
>  |   +func   -3 __ipipe_handle_irq+0x14 (ipipe_ipi3+0x26)
>  +func   -3 kunmap_atomic+0x9 (__handle_mm_fault+0x210)
>  +func   -3 ipipe_check_context+0xc (__handle_mm_fault+0x204)
>  +func   -4 page_add_file_rmap+0x8 (__handle_mm_fault+0x586)
>  +func   -4 ipipe_check_context+0xc (__handle_mm_fault+0x196)
>  +func   -4 kmap_atomic_prot+0xb (kmap_atomic+0x13)
>  +func   -4 kmap_atomic+0x8 (__handle_mm_fault+0x186)
>  +func   -4 mark_page_accessed+0x9 (filemap_nopage+0x13c)
>  +func   -4 ipipe_check_context+0xc (find_get_page+0x65)
>  #func   -4 __ipipe_unstall_root+0x8 (find_get_page+0x5b)
>  #func   -4 radix_tree_lookup+0x16 (find_get_page+0x36)
>  #func   -4 ipipe_check_context+0xc (find_get_page+0x2d)
>  +func   -5 ipipe_check_context+0xc (find_get_page+0x18)
>  +func   -5 find_get_page+0xa (filemap_nopage+0x1de)
>  +func   -5 filemap_nopage+0xe (__handle_mm_fault+0x11f)
>  +func   -5 ipipe_check_context+0xc (kunmap_atomic+0x50)
>  +func   -5 kunmap_atomic+0x9 (__handle_mm_fault+0xcc)
>  +func   -5 kmap_atomic_prot+0xb (kmap_atomic+0x13)
> <0>PANIC: double fault, gdt at c0392000 [255 bytes]
> <0>double fault, tss at c038d7e0
> <0>eip = c0127266, esp = dfec1ff8
> <0>eax = c05dad6c, ebx = dfec20f4, ecx = dfec2008, edx = 0009
> <0>esi = , edi = dfec20f4
> 
> Double fault, explains why it is so slippery... And the crash looks a
> bit like that backtr