[PATCH v2] cobalt: Make some private functions static

2018-11-27 Thread Jan Kiszka via Xenomai
Signed-off-by: Jan Kiszka 
---

Changes in v2:
 - decouple from "cobalt: Return error from xnsched_setparam" which is
   going to be addressed differently

 kernel/cobalt/sched-idle.c | 14 +++---
 kernel/cobalt/sched-rt.c   | 14 +++---
 kernel/cobalt/sched-weak.c | 14 +++---
 3 files changed, 21 insertions(+), 21 deletions(-)

diff --git a/kernel/cobalt/sched-idle.c b/kernel/cobalt/sched-idle.c
index e5ca8aee87..e6799824a8 100644
--- a/kernel/cobalt/sched-idle.c
+++ b/kernel/cobalt/sched-idle.c
@@ -23,25 +23,25 @@ static struct xnthread *xnsched_idle_pick(struct xnsched *sched)
return &sched->rootcb;
 }
 
-bool xnsched_idle_setparam(struct xnthread *thread,
-  const union xnsched_policy_param *p)
+static bool xnsched_idle_setparam(struct xnthread *thread,
+ const union xnsched_policy_param *p)
 {
return __xnsched_idle_setparam(thread, p);
 }
 
-void xnsched_idle_getparam(struct xnthread *thread,
-  union xnsched_policy_param *p)
+static void xnsched_idle_getparam(struct xnthread *thread,
+ union xnsched_policy_param *p)
 {
__xnsched_idle_getparam(thread, p);
 }
 
-void xnsched_idle_trackprio(struct xnthread *thread,
-  const union xnsched_policy_param *p)
+static void xnsched_idle_trackprio(struct xnthread *thread,
+  const union xnsched_policy_param *p)
 {
__xnsched_idle_trackprio(thread, p);
 }
 
-void xnsched_idle_protectprio(struct xnthread *thread, int prio)
+static void xnsched_idle_protectprio(struct xnthread *thread, int prio)
 {
__xnsched_idle_protectprio(thread, prio);
 }
diff --git a/kernel/cobalt/sched-rt.c b/kernel/cobalt/sched-rt.c
index 114ddad214..24570322a0 100644
--- a/kernel/cobalt/sched-rt.c
+++ b/kernel/cobalt/sched-rt.c
@@ -91,25 +91,25 @@ void xnsched_rt_tick(struct xnsched *sched)
xnsched_putback(sched->curr);
 }
 
-bool xnsched_rt_setparam(struct xnthread *thread,
-const union xnsched_policy_param *p)
+static bool xnsched_rt_setparam(struct xnthread *thread,
+   const union xnsched_policy_param *p)
 {
return __xnsched_rt_setparam(thread, p);
 }
 
-void xnsched_rt_getparam(struct xnthread *thread,
-union xnsched_policy_param *p)
+static void xnsched_rt_getparam(struct xnthread *thread,
+   union xnsched_policy_param *p)
 {
__xnsched_rt_getparam(thread, p);
 }
 
-void xnsched_rt_trackprio(struct xnthread *thread,
- const union xnsched_policy_param *p)
+static void xnsched_rt_trackprio(struct xnthread *thread,
+const union xnsched_policy_param *p)
 {
__xnsched_rt_trackprio(thread, p);
 }
 
-void xnsched_rt_protectprio(struct xnthread *thread, int prio)
+static void xnsched_rt_protectprio(struct xnthread *thread, int prio)
 {
__xnsched_rt_protectprio(thread, prio);
 }
diff --git a/kernel/cobalt/sched-weak.c b/kernel/cobalt/sched-weak.c
index fc778b8535..bd3872f104 100644
--- a/kernel/cobalt/sched-weak.c
+++ b/kernel/cobalt/sched-weak.c
@@ -44,8 +44,8 @@ static struct xnthread *xnsched_weak_pick(struct xnsched *sched)
return xnsched_getq(&sched->weak.runnable);
 }
 
-bool xnsched_weak_setparam(struct xnthread *thread,
-  const union xnsched_policy_param *p)
+static bool xnsched_weak_setparam(struct xnthread *thread,
+ const union xnsched_policy_param *p)
 {
if (!xnthread_test_state(thread, XNBOOST))
xnthread_set_state(thread, XNWEAK);
@@ -53,14 +53,14 @@ bool xnsched_weak_setparam(struct xnthread *thread,
return xnsched_set_effective_priority(thread, p->weak.prio);
 }
 
-void xnsched_weak_getparam(struct xnthread *thread,
-  union xnsched_policy_param *p)
+static void xnsched_weak_getparam(struct xnthread *thread,
+ union xnsched_policy_param *p)
 {
p->weak.prio = thread->cprio;
 }
 
-void xnsched_weak_trackprio(struct xnthread *thread,
-   const union xnsched_policy_param *p)
+static void xnsched_weak_trackprio(struct xnthread *thread,
+  const union xnsched_policy_param *p)
 {
if (p)
thread->cprio = p->weak.prio;
@@ -68,7 +68,7 @@ void xnsched_weak_trackprio(struct xnthread *thread,
thread->cprio = thread->bprio;
 }
 
-void xnsched_weak_protectprio(struct xnthread *thread, int prio)
+static void xnsched_weak_protectprio(struct xnthread *thread, int prio)
 {
if (prio > XNSCHED_WEAK_MAX_PRIO)
prio = XNSCHED_WEAK_MAX_PRIO;
-- 
2.16.4
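
For readers less familiar with the pattern behind this patch: handlers like these can be made static when they are only reached through a per-policy descriptor of function pointers defined in the same file. The following is a minimal userspace sketch of that idea, not Cobalt's actual code. All names (`demo_thread`, `demo_sched_class`, `demo_rt_*`, `demo_sched_rt`) are hypothetical and the structures are heavily simplified.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for illustration only. */
struct demo_thread {
	int cprio;	/* current priority */
	int bprio;	/* base priority */
};

struct demo_sched_class {
	const char *name;
	bool (*setparam)(struct demo_thread *thread, int prio);
	void (*getparam)(struct demo_thread *thread, int *prio);
	void (*trackprio)(struct demo_thread *thread, const int *prio);
};

/* File-local handlers: only reachable through the descriptor below. */
static bool demo_rt_setparam(struct demo_thread *thread, int prio)
{
	bool changed = thread->cprio != prio;

	thread->cprio = prio;
	return changed;
}

static void demo_rt_getparam(struct demo_thread *thread, int *prio)
{
	assert(prio != NULL);
	*prio = thread->cprio;
}

static void demo_rt_trackprio(struct demo_thread *thread, const int *prio)
{
	/* A NULL parameter means "fall back to the base priority". */
	thread->cprio = prio ? *prio : thread->bprio;
}

/* The only symbol with external linkage in this file. */
struct demo_sched_class demo_sched_rt = {
	.name = "rt",
	.setparam = demo_rt_setparam,
	.getparam = demo_rt_getparam,
	.trackprio = demo_rt_trackprio,
};
```

Callers would go through `demo_sched_rt.setparam(...)` and friends, so the handler names never need to be visible outside the file, and the compiler can flag a handler that is no longer referenced.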



Re: [PATCH] cobalt/kernel: Simplify mayday processing

2018-11-27 Thread Jan Kiszka via Xenomai

On 27.11.18 19:57, Philippe Gerum wrote:
> On 11/27/18 7:15 PM, Jan Kiszka wrote:
>> On 08.11.18 14:24, Philippe Gerum wrote:
>>> On 11/8/18 2:15 PM, Jan Kiszka wrote:
>>>> On 08.11.18 14:09, Philippe Gerum wrote:
>>>>> On 11/8/18 2:05 PM, Philippe Gerum via Xenomai wrote:
>>>>>> On 11/5/18 1:20 PM, Jan Kiszka wrote:
>>>>>>> The mayday mechanism exists in order to kick a xenomai userspace task
>>>>>>> into secondary mode while it is running userspace code. For that, we ask
>>>>>>> I-pipe to call us back when the task was interrupted and is about to
>>>>>>> return to userspace.
>>>>>>>
>>>>>>> So far we defer the relaxation from that callback via a VDSO-like
>>>>>>> mechanism that triggers a special syscall to the return path of that
>>>>>>> very same syscall. However, that is not desirable because it is a
>>>>>>> complex, arch-specific mechanism that can easily break and,
>>>>>>> specifically, that destroys the backtrace of ptraced tasks.
>>>>>>>
>>>>>>> Fortunately, we can fulfill the needs of mayday also by relaxing
>>>>>>> the task directly from the mayday callback. Tested successfully on
>>>>>>> x86-64 and ARM.
>>>>>>
>>>>>> ARM-wise, this change requires ipipe-core-4.14.71-arm-3 or later to work
>>>>>> properly. This method would cause an interrupt state breakage with any
>>>>>> older ARM patch.
>>>>>
>>>>> ppc32 is fine back to kernel 4.1 regarding this (did not check earlier
>>>>> stuff), and arm64 needs the 4.14-based split series to work properly
>>>>> (4.9 is wrong).
>>>>
>>>> Is the reason for arm64 on 4.9 the same as for arm?
>>>
>>> Yes, same pattern.
>>
>> Is ARM64 on 4.9 considered stable?
>
> I don't think so. I would recommend 4.14 as a stable starting point for
> Xenomai/arm64.

Should we add a warning to the nucleus that triggers when ARM64 is built
against <4.14?

The mayday issue would also be trivial to address in 4.9, 100% analogously to
the ARM patches, but that would be pointless if no one should use that kernel
for ARM64.


Jan

--
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux
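
As an illustration of the warning proposed above, here is a minimal userspace sketch of the version check. In-kernel, the natural form would presumably be a preprocessor test along the lines of `#if defined(CONFIG_ARM64) && LINUX_VERSION_CODE < KERNEL_VERSION(4,14,0)` followed by `#warning`, but that is an assumption about how it might be done, not an existing Cobalt check. `DEMO_KERNEL_VERSION` and `demo_arm64_kernel_too_old()` are hypothetical names; the macro mirrors the classic 4.x-era `KERNEL_VERSION` encoding.

```c
#include <assert.h>

/*
 * Local stand-in for KERNEL_VERSION() from <linux/version.h> as defined
 * in 4.x kernels, so that this sketch builds in userspace.
 */
#define DEMO_KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))

/*
 * Hypothetical helper: returns 1 if a build for arm64 against the given
 * kernel version code should emit the warning discussed in the thread.
 */
static int demo_arm64_kernel_too_old(int is_arm64, unsigned int version_code)
{
	return is_arm64 && version_code < DEMO_KERNEL_VERSION(4, 14, 0);
}
```

With this encoding, 4.9.0 compares lower than 4.14.0, so an arm64 build against 4.9 would be flagged while 4.14.71 would not.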



Re: [PATCH] cobalt/kernel: Simplify mayday processing

2018-11-27 Thread Jan Kiszka via Xenomai

On 27.11.18 19:54, Philippe Gerum wrote:
> On 11/27/18 6:44 PM, Jan Kiszka wrote:
>> On 08.11.18 14:24, Philippe Gerum wrote:
>>> On 11/8/18 2:14 PM, Jan Kiszka wrote:
>>>> On 08.11.18 14:05, Philippe Gerum wrote:
>>>>> On 11/5/18 1:20 PM, Jan Kiszka wrote:
>>>>>> The mayday mechanism exists in order to kick a xenomai userspace task
>>>>>> into secondary mode while it is running userspace code. For that, we ask
>>>>>> I-pipe to call us back when the task was interrupted and is about to
>>>>>> return to userspace.
>>>>>>
>>>>>> So far we defer the relaxation from that callback via a VDSO-like
>>>>>> mechanism that triggers a special syscall to the return path of that
>>>>>> very same syscall. However, that is not desirable because it is a
>>>>>> complex, arch-specific mechanism that can easily break and,
>>>>>> specifically, that destroys the backtrace of ptraced tasks.
>>>>>>
>>>>>> Fortunately, we can fulfill the needs of mayday also by relaxing
>>>>>> the task directly from the mayday callback. Tested successfully on
>>>>>> x86-64 and ARM.
>>>>>
>>>>> ARM-wise, this change requires ipipe-core-4.14.71-arm-3 or later to work
>>>>> properly. This method would cause an interrupt state breakage with any
>>>>> older ARM patch.
>>>>
>>>> That would be a change compared to my past ARM experiments (4.x, if not
>>>> 3.x based). Can you point me to the problem in more detail?
>>>
>>> Calling __ipipe_notify_trap(IPIPE_TRAP_MAYDAY, regs) may result in
>>> re-enabling IRQs unexpectedly for the caller, due to the callee
>>> relaxing. AFAIR, a core debug level assertion should trip when that
>>> happens.
>>
>> Coming back to that topic: Just tested ARM on 4.4 with all debug
>> switches of core and ipipe enabled but nothing triggered. So, either the
>> issue is more subtle, or it is not really an issue.
>
> xnthread_relax() forces hard irqs on:
>
> __ipipe_exit_irq -> back from irq_handler (entry-armv.S), so you end up
> running the interrupt return path with irqs on both for svc or usr
> modes, including the regular preemption management stuff which assumes
> the opposite.
>
> WARN_ON_ONCE(!hard_irqs_disabled()) should trigger from
> __ipipe_check_root_interruptible() if added there.

Ah! Now we are on the same page. But this does not trigger anymore, and that's
because of these two:

- https://gitlab.denx.de/Xenomai/ipipe/commit/b2bb695a1ece2320ab04cba9aa359bed8953baf7
- https://gitlab.denx.de/Xenomai/ipipe/commit/da42030eb2e1afc6f3f8f4f6d40ae9b336e3d635

I noticed that issue back then and urged addressing it. So we are already
safe on ARM as well. That's good.


Jan

--
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux



Re: [PATCH] cobalt/kernel: Simplify mayday processing

2018-11-27 Thread Philippe Gerum via Xenomai
On 11/27/18 7:15 PM, Jan Kiszka wrote:
> On 08.11.18 14:24, Philippe Gerum wrote:
>> On 11/8/18 2:15 PM, Jan Kiszka wrote:
>>> On 08.11.18 14:09, Philippe Gerum wrote:
>>>> On 11/8/18 2:05 PM, Philippe Gerum via Xenomai wrote:
>>>>> On 11/5/18 1:20 PM, Jan Kiszka wrote:
>>>>>> The mayday mechanism exists in order to kick a xenomai userspace task
>>>>>> into secondary mode while it is running userspace code. For that, we ask
>>>>>> I-pipe to call us back when the task was interrupted and is about to
>>>>>> return to userspace.
>>>>>>
>>>>>> So far we defer the relaxation from that callback via a VDSO-like
>>>>>> mechanism that triggers a special syscall to the return path of that
>>>>>> very same syscall. However, that is not desirable because it is a
>>>>>> complex, arch-specific mechanism that can easily break and,
>>>>>> specifically, that destroys the backtrace of ptraced tasks.
>>>>>>
>>>>>> Fortunately, we can fulfill the needs of mayday also by relaxing
>>>>>> the task directly from the mayday callback. Tested successfully on
>>>>>> x86-64 and ARM.
>>>>>
>>>>> ARM-wise, this change requires ipipe-core-4.14.71-arm-3 or later to work
>>>>> properly. This method would cause an interrupt state breakage with any
>>>>> older ARM patch.
>>>>>
>>>>
>>>> ppc32 is fine back to kernel 4.1 regarding this (did not check earlier
>>>> stuff), and arm64 needs the 4.14-based split series to work properly
>>>> (4.9 is wrong).
>>>
>>> Is the reason for arm64 on 4.9 the same as for arm?
>>>
>>
>> Yes, same pattern.
>>
> 
> Is ARM64 on 4.9 considered stable?

I don't think so. I would recommend 4.14 as a stable starting point for
Xenomai/arm64.

> Then I would look into that path as
> well.
> 
> But given that ARM64 will be newly introduced with 3.1 only and that 4.9
> will likely be out of our legacy maintenance by then, the simpler
> solution for mayday topic could just be raising the minimum supported
> version to 4.14 for that arch (if there is a real issue).
> 
> Jan
> 


-- 
Philippe.



Re: [PATCH] cobalt/kernel: Simplify mayday processing

2018-11-27 Thread Philippe Gerum via Xenomai
On 11/27/18 6:44 PM, Jan Kiszka wrote:
> On 08.11.18 14:24, Philippe Gerum wrote:
>> On 11/8/18 2:14 PM, Jan Kiszka wrote:
>>> On 08.11.18 14:05, Philippe Gerum wrote:
>>>> On 11/5/18 1:20 PM, Jan Kiszka wrote:
>>>>> The mayday mechanism exists in order to kick a xenomai userspace task
>>>>> into secondary mode while it is running userspace code. For that, we ask
>>>>> I-pipe to call us back when the task was interrupted and is about to
>>>>> return to userspace.
>>>>>
>>>>> So far we defer the relaxation from that callback via a VDSO-like
>>>>> mechanism that triggers a special syscall to the return path of that
>>>>> very same syscall. However, that is not desirable because it is a
>>>>> complex, arch-specific mechanism that can easily break and,
>>>>> specifically, that destroys the backtrace of ptraced tasks.
>>>>>
>>>>> Fortunately, we can fulfill the needs of mayday also by relaxing
>>>>> the task directly from the mayday callback. Tested successfully on
>>>>> x86-64 and ARM.
>>>>
>>>> ARM-wise, this change requires ipipe-core-4.14.71-arm-3 or later to work
>>>> properly. This method would cause an interrupt state breakage with any
>>>> older ARM patch.
>>>
>>> That would be a change compared to my past ARM experiments (4.x, if not
>>> 3.x based). Can you point me to the problem in more detail?
>>>
>>
>> Calling __ipipe_notify_trap(IPIPE_TRAP_MAYDAY, regs) may result in
>> re-enabling IRQs unexpectedly for the caller, due to the callee
>> relaxing. AFAIR, a core debug level assertion should trip when that
>> happens.
>>
> 
> Coming back to that topic: Just tested ARM on 4.4 with all debug
> switches of core and ipipe enabled but nothing triggered. So, either the
> issue is more subtle, or it is not really an issue.

xnthread_relax() forces hard irqs on:

__ipipe_exit_irq -> back from irq_handler (entry-armv.S), so you end up
running the interrupt return path with irqs on both for svc or usr
modes, including the regular preemption management stuff which assumes
the opposite.

WARN_ON_ONCE(!hard_irqs_disabled()) should trigger from
__ipipe_check_root_interruptible() if added there.

> 
> I suppose I need more background here to understand the risk you see. If
> there is a problem, I'd either like to backport a solution from 4.14 or
> detect what the kernel supports and scale down mayday depending on that,
> at least on ARM.
> 
> BTW, this is the call path we are talking about:
> 
> #0  handle_mayday_event (regs=<optimized out>) at
> ../kernel/xenomai/posix/process.c:1028
> #1  0xc02c3c70 in __ipipe_notify_trap (exception=9, regs=0xece07fb0) at
> ../kernel/ipipe/core.c:1065
> #2  0xc02c52a8 in __ipipe_call_mayday (regs=0xece07fb0) at
> ../kernel/ipipe/core.c:1610
> #3  0xc021e448 in __ipipe_exit_irq (regs=<optimized out>) at
> ../arch/arm/kernel/ipipe.c:357
> #4  0xc020a57c in __ipipe_exit_irq (regs=<optimized out>) at
> ../arch/arm/kernel/ipipe.c:396
> #5  __ipipe_grab_irq (irq=<optimized out>, regs=<optimized out>) at
> ../arch/arm/kernel/ipipe.c:395
> #6  0xc020ac0c in ipipe_handle_multi_irq (regs=<optimized out>,
> irq=<optimized out>) at ../arch/arm/include/asm/ipipe.h:232
> #7  ipipe_handle_domain_irq (regs=<optimized out>, hwirq=<optimized out>,
> domain=<optimized out>) at ../arch/arm/include/asm/ipipe.h:249
> #8  gic_handle_irq (regs=0xece07fb0) at ../drivers/irqchip/irq-gic.c:367
> 
> I wonder what the difference is to a normal Xenomai reschedule at the
> end of an interrupt.
> 

The rescheduling bits for xnthread_relax() are very specific, so that we
can transfer control of the current context to another scheduler logic
in a safe way. Looking at the way XNRELAX is dealt with by
xnthread_suspend() may help.

-- 
Philippe.
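
The interrupt state breakage described above can be pictured with a small userspace model. This is only a sketch under stated assumptions: a boolean stands in for the CPU interrupt mask, `demo_relax()` stands in for the effect attributed to xnthread_relax() (hard irqs forced on), and `demo_check_root_interruptible()` counts the cases where a check like WARN_ON_ONCE(!hard_irqs_disabled()) would fire. All `demo_*` names are hypothetical, and the `restore_irq_state` switch is just one conceivable remedy, not necessarily what any I-pipe commit actually does.

```c
#include <assert.h>
#include <stdbool.h>

static bool demo_hard_irqs_off;		/* models the CPU interrupt mask */
static unsigned int demo_warn_count;	/* how often the check would warn */

/* Models a callee that relaxes the task and thereby forces hard irqs on. */
static void demo_relax(void)
{
	demo_hard_irqs_off = false;
}

/* Models a debug check placed on the interrupt return path. */
static void demo_check_root_interruptible(void)
{
	if (!demo_hard_irqs_off)
		demo_warn_count++;
}

/*
 * Models the tail of interrupt handling: the return path assumes hard
 * irqs are still off after the mayday notification has run.
 */
static void demo_exit_irq(bool mayday_pending, bool restore_irq_state)
{
	demo_hard_irqs_off = true;	/* the handler runs with hard irqs off */

	if (mayday_pending) {
		demo_relax();
		if (restore_irq_state)
			demo_hard_irqs_off = true;	/* assumed remedy */
	}

	demo_check_root_interruptible();
}
```

In this model the check only fires when a mayday event is pending and nothing restores the masked state afterwards, which matches the shape of the problem described in the thread.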



Re: [PATCH] cobalt/kernel: Simplify mayday processing

2018-11-27 Thread Jan Kiszka via Xenomai

On 08.11.18 14:24, Philippe Gerum wrote:
> On 11/8/18 2:15 PM, Jan Kiszka wrote:
>> On 08.11.18 14:09, Philippe Gerum wrote:
>>> On 11/8/18 2:05 PM, Philippe Gerum via Xenomai wrote:
>>>> On 11/5/18 1:20 PM, Jan Kiszka wrote:
>>>>> The mayday mechanism exists in order to kick a xenomai userspace task
>>>>> into secondary mode while it is running userspace code. For that, we ask
>>>>> I-pipe to call us back when the task was interrupted and is about to
>>>>> return to userspace.
>>>>>
>>>>> So far we defer the relaxation from that callback via a VDSO-like
>>>>> mechanism that triggers a special syscall to the return path of that
>>>>> very same syscall. However, that is not desirable because it is a
>>>>> complex, arch-specific mechanism that can easily break and,
>>>>> specifically, that destroys the backtrace of ptraced tasks.
>>>>>
>>>>> Fortunately, we can fulfill the needs of mayday also by relaxing
>>>>> the task directly from the mayday callback. Tested successfully on
>>>>> x86-64 and ARM.
>>>>
>>>> ARM-wise, this change requires ipipe-core-4.14.71-arm-3 or later to work
>>>> properly. This method would cause an interrupt state breakage with any
>>>> older ARM patch.
>>>
>>> ppc32 is fine back to kernel 4.1 regarding this (did not check earlier
>>> stuff), and arm64 needs the 4.14-based split series to work properly
>>> (4.9 is wrong).
>>
>> Is the reason for arm64 on 4.9 the same as for arm?
>
> Yes, same pattern.

Is ARM64 on 4.9 considered stable? Then I would look into that path as well.

But given that ARM64 will be newly introduced with 3.1 only and that 4.9 will
likely be out of our legacy maintenance by then, the simpler solution for mayday
topic could just be raising the minimum supported version to 4.14 for that arch
(if there is a real issue).


Jan

--
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux



Re: [PATCH] cobalt/kernel: Simplify mayday processing

2018-11-27 Thread Jan Kiszka via Xenomai

On 08.11.18 14:24, Philippe Gerum wrote:
> On 11/8/18 2:14 PM, Jan Kiszka wrote:
>> On 08.11.18 14:05, Philippe Gerum wrote:
>>> On 11/5/18 1:20 PM, Jan Kiszka wrote:
>>>> The mayday mechanism exists in order to kick a xenomai userspace task
>>>> into secondary mode while it is running userspace code. For that, we ask
>>>> I-pipe to call us back when the task was interrupted and is about to
>>>> return to userspace.
>>>>
>>>> So far we defer the relaxation from that callback via a VDSO-like
>>>> mechanism that triggers a special syscall to the return path of that
>>>> very same syscall. However, that is not desirable because it is a
>>>> complex, arch-specific mechanism that can easily break and,
>>>> specifically, that destroys the backtrace of ptraced tasks.
>>>>
>>>> Fortunately, we can fulfill the needs of mayday also by relaxing
>>>> the task directly from the mayday callback. Tested successfully on
>>>> x86-64 and ARM.
>>>
>>> ARM-wise, this change requires ipipe-core-4.14.71-arm-3 or later to work
>>> properly. This method would cause an interrupt state breakage with any
>>> older ARM patch.
>>
>> That would be a change compared to my past ARM experiments (4.x, if not
>> 3.x based). Can you point me to the problem in more detail?
>
> Calling __ipipe_notify_trap(IPIPE_TRAP_MAYDAY, regs) may result in
> re-enabling IRQs unexpectedly for the caller, due to the callee
> relaxing. AFAIR, a core debug level assertion should trip when that happens.

Coming back to that topic: Just tested ARM on 4.4 with all debug switches of
core and ipipe enabled but nothing triggered. So, either the issue is more
subtle, or it is not really an issue.

I suppose I need more background here to understand the risk you see. If there
is a problem, I'd either like to backport a solution from 4.14 or detect what
the kernel supports and scale down mayday depending on that, at least on ARM.

BTW, this is the call path we are talking about:

#0  handle_mayday_event (regs=<optimized out>) at
../kernel/xenomai/posix/process.c:1028
#1  0xc02c3c70 in __ipipe_notify_trap (exception=9, regs=0xece07fb0) at
../kernel/ipipe/core.c:1065
#2  0xc02c52a8 in __ipipe_call_mayday (regs=0xece07fb0) at
../kernel/ipipe/core.c:1610
#3  0xc021e448 in __ipipe_exit_irq (regs=<optimized out>) at
../arch/arm/kernel/ipipe.c:357
#4  0xc020a57c in __ipipe_exit_irq (regs=<optimized out>) at
../arch/arm/kernel/ipipe.c:396
#5  __ipipe_grab_irq (irq=<optimized out>, regs=<optimized out>) at
../arch/arm/kernel/ipipe.c:395
#6  0xc020ac0c in ipipe_handle_multi_irq (regs=<optimized out>,
irq=<optimized out>) at ../arch/arm/include/asm/ipipe.h:232
#7  ipipe_handle_domain_irq (regs=<optimized out>, hwirq=<optimized out>,
domain=<optimized out>) at ../arch/arm/include/asm/ipipe.h:249
#8  gic_handle_irq (regs=0xece07fb0) at ../drivers/irqchip/irq-gic.c:367

I wonder what the difference is to a normal Xenomai reschedule at the end of an
interrupt.


Jan

--
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux



Re: rt_e1000e: Detected Hardware Unit Hang

2018-11-27 Thread Jan Kiszka via Xenomai

On 27.11.18 10:47, Means Lee wrote:
> I can't reduce the size of my patch against the existing e1000e real-time
> driver in Xenomai, because the non-real-time driver has evolved a lot. So I
> am providing the diffs of the three files I changed when porting the driver.
> In my view, the hardware operations should be unchanged, so I focused on the
> changes in netdev.c. I modified param.c and e1000.h, changing the private
> data structure and parameters a little.
>
> For the questions you asked:
> - The upstream Linux driver I ported over works fine with my hardware, even
>   when I put it under heavy load (UDP broadcast storm).
> - When I hit the hardware unit hang, the Tx completion interrupts did not
>   disappear, but they were reduced a lot.
> - I didn't enable CONFIG_IPIPE_DEBUG_CONTEXT, but I do use locks in several
>   places.

Then this should be the next thing to do. This not only detects direct locking
issues but also those triggered by calling into Linux functions that take normal
locks.

> - Then the TX path:
>    - The interrupt registration code is shown below:
>        err = rtdm_irq_request(&adapter->irq_handle,
>                  adapter->pdev->irq, e1000_intr_msi, 0,
>                  netdev->name, adapter);
>    - I write the new ring buffer index to the TDT register to notify the
>      hardware that there is a packet to be sent.
>        writel(tx_ring->next_to_use, tx_ring->tail); // after writel, the
>                                 // interrupt routine shall be launched
>    - If 'event flow' means the events during the transmit process, the event

I mean specifically if both the vanilla driver as well as the ported version
take the same code path and receive the same interrupts when sending packets.
Of course, you cannot put identical packet load on both because the higher
RTnet layers do not exist for vanilla Linux. You may capture the outgoing
traffic under RTnet (RTcap) and replay that under Linux.

> flow is shown below:
>      e1000e_xmit_frame        send a packet atomically
>        e1000_tx_map           use DMA to map the packet (mapped before, so
>                               just get the DMA address)
>        e1000_tx_queue         make sure the tx ring buffer index is right
>        write the TDT register to tell the hardware to send a packet
> After the hardware has sent a packet, it is supposed to trigger a TX
> completion interrupt and the driver shall respond:
>      e1000_intr_msi           triggered on Tx/Rx completion
>        e1000_clean_tx_irq     recycle the transmit resources
>
> By the way, I found that every time the master station sends a Ready frame
> belonging to RTcfg, this Tx hang shows up. The communication before that
> works fine: the TDMA sync frames are sent properly and every stage before
> the Ready frame goes well.
> If I let it stay in the RTCFG_MAIN_CLIENT_2 stage (by then the master and
> slave know each other), master and slave can communicate properly. So the
> Ready frame triggers this problem, but why? Why would a frame of a specific
> format trigger the hardware hang?

RTcfg is unlikely to be the reason, but maybe the transmission pattern triggers
the issue in the driver.


Jan

--
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux
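
To make the TX event flow discussed above easier to follow, here is a small userspace model of a transmit ring: submission advances `next_to_use` and publishes it as the tail, completion advances `next_to_clean`, and a hang is suspected when the oldest pending descriptor stays uncompleted for too long. This is a simplified sketch, not the e1000e driver's code. The `demo_*` names, the ring size and the timeout handling are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_RING_SIZE 8u

struct demo_tx_ring {
	unsigned int next_to_use;	/* next slot the driver fills */
	unsigned int next_to_clean;	/* oldest slot awaiting completion */
	unsigned int tail;		/* models the TDT register value */
	bool done[DEMO_RING_SIZE];	/* models per-descriptor "done" status */
	unsigned long stamp[DEMO_RING_SIZE];	/* submission time per slot */
};

/* Queue one packet and publish the new index, like writel() to the tail. */
static bool demo_xmit(struct demo_tx_ring *r, unsigned long now)
{
	unsigned int next = (r->next_to_use + 1) % DEMO_RING_SIZE;

	if (next == r->next_to_clean)
		return false;	/* ring full */

	r->done[r->next_to_use] = false;
	r->stamp[r->next_to_use] = now;
	r->next_to_use = next;
	r->tail = r->next_to_use;
	return true;
}

/* Models the hardware marking a descriptor as transmitted. */
static void demo_hw_complete(struct demo_tx_ring *r, unsigned int slot)
{
	r->done[slot] = true;
}

/* Models the completion handler: recycle all finished descriptors. */
static unsigned int demo_clean_tx(struct demo_tx_ring *r)
{
	unsigned int cleaned = 0;

	while (r->next_to_clean != r->next_to_use && r->done[r->next_to_clean]) {
		r->next_to_clean = (r->next_to_clean + 1) % DEMO_RING_SIZE;
		cleaned++;
	}
	return cleaned;
}

/* A hang is suspected if the oldest pending slot is overdue and not done. */
static bool demo_tx_hung(const struct demo_tx_ring *r, unsigned long now,
			 unsigned long timeout)
{
	unsigned int i = r->next_to_clean;

	if (i == r->next_to_use)
		return false;	/* nothing pending */

	return !r->done[i] && now - r->stamp[i] > timeout;
}
```

Instrumenting both the vanilla and the ported driver at the points corresponding to `demo_xmit()` and `demo_clean_tx()` is one way to compare their event flows for the same traffic.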



Re:Bottom half's in xenomai 2.6

2018-11-27 Thread Manikanta Mylavarapu via Xenomai
Hi.
How can I implement a bottom half on Xenomai 2.6? In Linux, I used
tasklet-like mechanisms for this.
Please help me.


Re: beaglebone: arm: xeno-test gives segmentation fault - test dlopen failed

2018-11-27 Thread Pintu Agarwal via Xenomai
On Tue, Nov 27, 2018 at 3:28 PM Pintu Agarwal  wrote:
>
> Dear Henning, Greg and all,
>
> Sorry, re-checking after a long time.
>
> The dlopen test is still failing on Beagle Bone black with xenomai-3.
> Yesterday, I just freshly cloned the xenomai-3 repo, build and install
> on beagle bone black, then run xeno-test/smokey test.
> Found that dlopen test is still crashing. I though this issue is
> already fixed, but not.
>
> Here are the results:
>
> ~/xenomai-3# cat .git/config
> url = https://git.xenomai.org/xenomai-3.git
>
> ~/xenomai-3# git branch
> * master
>
> ./configure --with-pic --with-core=cobalt --enable-smp --disable-tls
> --enable-dlopen-libs
> make
> make install
>
> :/usr/xenomai/bin# ./smokey --run
> arith OK
> bufp skipped (no kernel support)
> cpu_affinity OK
> fpu_stress OK
> iddp skipped (no kernel support)
> leaks OK
> memory_coreheap OK
> memory_heapmem OK
> memory_tlsf OK
> net_packet_dgram skipped (no kernel support)
> net_packet_raw skipped (no kernel support)
> net_udp skipped (no kernel support)
> posix_clock OK
> posix_cond OK
> posix_fork OK
> posix_mutex OK
> posix_select OK
> rtdm skipped (no kernel support)
> sched_quota skipped (no kernel support)
> sched_tp skipped (no kernel support)
> setsched OK
> sigdebug skipped (no kernel support)
> timerfd OK
> tsc OK
> vdso_access OK
> xddp skipped (no kernel support)
> Segmentation fault
> ./smokey: test dlopen failed: Unknown error -1
>
> ~/xenomai-3# /usr/xenomai/bin/dlopentest
> Segmentation fault
>
> So, basically, dlopentest results into Segmentation Fault on arm.
>
> Note:
> Disabling dlopen from "configure" does not cause this issue, and
> xeno-test/smokey test pass successfully.
> But I am wondering why dlopen test may fail on arm (on x86 it works).
>
> So, it some packages or configuration that I am missing on my Beagle Bone ?
> Can anybody describe the root cause (if already investigated earlier)...
>

Hi,

I am investigating this issue.
So far I have found that a normal "dlopen" test works fine on the Beagle Bone.
There is some crash happening in: my_dlopen()

Hopefully, I will be able to fix this issue and release a patch in about two
weeks' time.

Thanks,
Pintu

> Thanks,
> Pintu
>
> On Fri, Jun 29, 2018 at 10:42 PM Pintu Kumar  wrote:
> >
> > On Fri, Jun 29, 2018 at 9:37 PM Henning Schild
> >  wrote:
> > >
> > > Am Fri, 29 Jun 2018 17:52:44 +0200
> > > schrieb Philippe Gerum :
> > >
> > > > On 06/29/2018 05:42 PM, Henning Schild wrote:
> > > > > Hi,
> > > > >
> > > > > i had a closer look. You might want to revert the following commit
> > > > > https://gitlab.denx.de/Xenomai/xenomai/commit/408c93e26438c83c08f216a2c8bd7079253cf71a
> > > > >
> > > > > It does include the testcase in the build even if the feature is
> > > > > disabled.
> > > > >
> > > >
> > > > I don't think so. It includes the testcase in the source distribution
> > > > allowing "make distcheck" to succeed, even if the test has not be
> > > > built in the smokey driver, which fixes a build regression introduced
> > > > by dlopen. Tests built in the executable must be listed in
> > > > COBALT_SUBDIRS, which dlopen isn't unless --enable-dlopen-libs has
> > > > been given.
> > >
> > > Yes, you are right. I just saw a "+= dlopen" in the resulting Makefile
> > > but that should not cause the problem.
> > >
> > > Pintu: Please build in a completely fresh environment i.e. "git clean
> > > -dfx; autoreconf -i; ./configure". And now give me
> > >
> > > grep dlopening -A1 config.log
> > > grep am__append_1 testsuite/smokey/Makefile
> > > autoreconf --version
> > >
> >
> > OK. Thank you so much.
> > I will try this on Monday and let you know.
> >
> > > Ideally with a hash i can find on gitlab.denx.de.
> > >
> > > Henning
> > >
> > >
> > > > If you want to exclude the dlopen test from the test series at
> > > > runtime, you can pass "--exclude=dlopen" on the command line to
> > > > smokey, as mentioned in the help strings (smokey --help).
> > > >
> >
> > OK. Thanks for this info. However, when I run xeno-test it
> > automatically includes smokey test.
> > How do I exclude from xeno-test itself ?
> > We have a criteria to attach xeno-test report as the success result.
> >
> > >



Re: beaglebone: arm: xeno-test gives segmentation fault - test dlopen failed

2018-11-27 Thread Pintu Agarwal via Xenomai
Dear Henning, Greg and all,

Sorry, re-checking after a long time.

The dlopen test is still failing on Beagle Bone Black with xenomai-3.
Yesterday, I freshly cloned the xenomai-3 repo, built and installed it
on the Beagle Bone Black, then ran the xeno-test/smokey test.
I found that the dlopen test is still crashing. I thought this issue was
already fixed, but it is not.

Here are the results:

~/xenomai-3# cat .git/config
url = https://git.xenomai.org/xenomai-3.git

~/xenomai-3# git branch
* master

./configure --with-pic --with-core=cobalt --enable-smp --disable-tls
--enable-dlopen-libs
make
make install

:/usr/xenomai/bin# ./smokey --run
arith OK
bufp skipped (no kernel support)
cpu_affinity OK
fpu_stress OK
iddp skipped (no kernel support)
leaks OK
memory_coreheap OK
memory_heapmem OK
memory_tlsf OK
net_packet_dgram skipped (no kernel support)
net_packet_raw skipped (no kernel support)
net_udp skipped (no kernel support)
posix_clock OK
posix_cond OK
posix_fork OK
posix_mutex OK
posix_select OK
rtdm skipped (no kernel support)
sched_quota skipped (no kernel support)
sched_tp skipped (no kernel support)
setsched OK
sigdebug skipped (no kernel support)
timerfd OK
tsc OK
vdso_access OK
xddp skipped (no kernel support)
Segmentation fault
./smokey: test dlopen failed: Unknown error -1

~/xenomai-3# /usr/xenomai/bin/dlopentest
Segmentation fault

So, basically, dlopentest results in a segmentation fault on ARM.

Note:
Disabling dlopen in "configure" avoids this issue, and the
xeno-test/smokey tests then pass successfully.
But I am wondering why the dlopen test fails on ARM (on x86 it works).

So, is it some package or configuration that I am missing on my Beagle Bone?
Can anybody describe the root cause (if already investigated earlier)?

Thanks,
Pintu

On Fri, Jun 29, 2018 at 10:42 PM Pintu Kumar  wrote:
>
> On Fri, Jun 29, 2018 at 9:37 PM Henning Schild
>  wrote:
> >
> > Am Fri, 29 Jun 2018 17:52:44 +0200
> > schrieb Philippe Gerum :
> >
> > > On 06/29/2018 05:42 PM, Henning Schild wrote:
> > > > Hi,
> > > >
> > > > i had a closer look. You might want to revert the following commit
> > > > https://gitlab.denx.de/Xenomai/xenomai/commit/408c93e26438c83c08f216a2c8bd7079253cf71a
> > > >
> > > > It does include the testcase in the build even if the feature is
> > > > disabled.
> > > >
> > >
> > > I don't think so. It includes the testcase in the source distribution
> > > allowing "make distcheck" to succeed, even if the test has not be
> > > built in the smokey driver, which fixes a build regression introduced
> > > by dlopen. Tests built in the executable must be listed in
> > > COBALT_SUBDIRS, which dlopen isn't unless --enable-dlopen-libs has
> > > been given.
> >
> > Yes, you are right. I just saw a "+= dlopen" in the resulting Makefile
> > but that should not cause the problem.
> >
> > Pintu: Please build in a completely fresh environment i.e. "git clean
> > -dfx; autoreconf -i; ./configure". And now give me
> >
> > grep dlopening -A1 config.log
> > grep am__append_1 testsuite/smokey/Makefile
> > autoreconf --version
> >
>
> OK. Thank you so much.
> I will try this on Monday and let you know.
>
> > Ideally with a hash i can find on gitlab.denx.de.
> >
> > Henning
> >
> >
> > > If you want to exclude the dlopen test from the test series at
> > > runtime, you can pass "--exclude=dlopen" on the command line to
> > > smokey, as mentioned in the help strings (smokey --help).
> > >
>
> OK. Thanks for this info. However, when I run xeno-test it
> automatically includes smokey test.
> How do I exclude from xeno-test itself ?
> We have a criteria to attach xeno-test report as the success result.
>
> >



Re: rt_e1000e: Detected Hardware Unit Hang

2018-11-27 Thread Means Lee via Xenomai
I can't reduce the size of my patch against the existing e1000e real-time
driver in Xenomai, because the non-real-time driver has evolved a lot. So I
am providing the diffs of the three files I changed when porting the driver.
In my view, the hardware operations should be unchanged, so I focused on the
changes in netdev.c. I modified param.c and e1000.h, changing the private
data structure and parameters a little.

For the questions you asked:
- The upstream Linux driver I ported over works fine with my hardware, even
  when I put it under heavy load (UDP broadcast storm).
- When I hit the hardware unit hang, the Tx completion interrupts did not
  disappear, but they were reduced a lot.
- I didn't enable CONFIG_IPIPE_DEBUG_CONTEXT, but I do use locks in several
  places.
- Then the TX path:
   - The interrupt registration code is shown below:
       err = rtdm_irq_request(&adapter->irq_handle,
                 adapter->pdev->irq, e1000_intr_msi, 0,
                 netdev->name, adapter);
   - I write the new ring buffer index to the TDT register to notify the
     hardware that there is a packet to be sent.
       writel(tx_ring->next_to_use, tx_ring->tail); // after writel, the
                                // interrupt routine shall be launched
   - If 'event flow' means the events during the transmit process, the
     event flow is shown below:
       e1000e_xmit_frame        send a packet atomically
         e1000_tx_map           use DMA to map the packet (mapped before, so
                                just get the DMA address)
         e1000_tx_queue         make sure the tx ring buffer index is right
         write the TDT register to tell the hardware to send a packet
     After the hardware has sent a packet, it is supposed to trigger a TX
     completion interrupt and the driver shall respond:
       e1000_intr_msi           triggered on Tx/Rx completion
         e1000_clean_tx_irq     recycle the transmit resources

By the way, I found that every time the master station sends a Ready frame
belonging to RTcfg, this Tx hang shows up. The communication before that
works fine: the TDMA sync frames are sent properly and every stage before
the Ready frame goes well.
If I let it stay in the RTCFG_MAIN_CLIENT_2 stage (by then the master and
slave know each other), master and slave can communicate properly. So the
Ready frame triggers this problem, but why? Why would a frame of a specific
format trigger the hardware hang?

On Fri, Nov 23, 2018 at 7:59 PM, Jan Kiszka wrote:

> On 21.11.18 02:36, Means Lee via Xenomai wrote:
> > Sure thing. As I ported the e1000e-rt driver from the mainline kernel
> > e1000e driver, whose commit id is
> > 089d7720383d7bc9ca6b8824a05dfa66f80d1f41, the patch file is kind of
> > huge, so I attach them here in this mail.
> > diff-with-nrt.diff is the diff between the mainline e1000e driver and
> > my ported driver.
> > diff-with-old-rt-e1000e.diff is the diff between the Xenomai v3.0.7
> > driver and my ported driver.
> >
>
> Unified diff ("diff -u"), please. I got that offlist, which is more
> readable, but it remains huge.
>
> So, let's analyze systematically:
>   - the upstream Linux driver you ported over works fine with your
>     hardware, correct?
>   - hardware unit hang may mean that no TX completion interrupt arrived -
>     can you confirm this based on /proc/xenomai/irq?
>   - did you enable CONFIG_IPIPE_DEBUG_CONTEXT? It can reveal invalid lock
>     usage (common mistake when porting Linux drivers over)
>   - then look into the TX path
>      - is the interrupt registered properly?
>      - is packet submission happening?
>      - is any interrupt arriving?
>      - compare event flow to vanilla Linux driver (add instrumentation to
>        both)
>
> Jan
>
> --
> Siemens AG, Corporate Technology, CT RDA IOT SES-DE
> Corporate Competence Center Embedded Linux
>
-- next part --
A non-text attachment was scrubbed...
Name: e1000.diff
Type: text/x-patch
Size: 2569 bytes
Desc: not available
URL: 
<http://xenomai.org/pipermail/xenomai/attachments/20181127/f7d11681/attachment.bin>
-- next part --
A non-text attachment was scrubbed...
Name: param.diff
Type: text/x-patch
Size: 800 bytes
Desc: not available
URL: 
<http://xenomai.org/pipermail/xenomai/attachments/20181127/f7d11681/attachment-0001.bin>
-- next part --
A non-text attachment was scrubbed...
Name: netdev.diff
Type: text/x-patch
Size: 118926 bytes
Desc: not available
URL: 
<http://xenomai.org/pipermail/xenomai/attachments/20181127/f7d11681/attachment-0002.bin>