[PATCH v2] cobalt: Make some private functions static
Signed-off-by: Jan Kiszka
---

Changes in v2:
 - decouple from "cobalt: Return error from xnsched_setparam" which is
   going to be addressed differently

 kernel/cobalt/sched-idle.c | 14 +++---
 kernel/cobalt/sched-rt.c   | 14 +++---
 kernel/cobalt/sched-weak.c | 14 +++---
 3 files changed, 21 insertions(+), 21 deletions(-)

diff --git a/kernel/cobalt/sched-idle.c b/kernel/cobalt/sched-idle.c
index e5ca8aee87..e6799824a8 100644
--- a/kernel/cobalt/sched-idle.c
+++ b/kernel/cobalt/sched-idle.c
@@ -23,25 +23,25 @@ static struct xnthread *xnsched_idle_pick(struct xnsched *sched)
 	return &sched->rootcb;
 }
 
-bool xnsched_idle_setparam(struct xnthread *thread,
-			   const union xnsched_policy_param *p)
+static bool xnsched_idle_setparam(struct xnthread *thread,
+				  const union xnsched_policy_param *p)
 {
 	return __xnsched_idle_setparam(thread, p);
 }
 
-void xnsched_idle_getparam(struct xnthread *thread,
-			   union xnsched_policy_param *p)
+static void xnsched_idle_getparam(struct xnthread *thread,
+				  union xnsched_policy_param *p)
 {
 	__xnsched_idle_getparam(thread, p);
 }
 
-void xnsched_idle_trackprio(struct xnthread *thread,
-			    const union xnsched_policy_param *p)
+static void xnsched_idle_trackprio(struct xnthread *thread,
+				   const union xnsched_policy_param *p)
 {
 	__xnsched_idle_trackprio(thread, p);
 }
 
-void xnsched_idle_protectprio(struct xnthread *thread, int prio)
+static void xnsched_idle_protectprio(struct xnthread *thread, int prio)
 {
 	__xnsched_idle_protectprio(thread, prio);
 }
diff --git a/kernel/cobalt/sched-rt.c b/kernel/cobalt/sched-rt.c
index 114ddad214..24570322a0 100644
--- a/kernel/cobalt/sched-rt.c
+++ b/kernel/cobalt/sched-rt.c
@@ -91,25 +91,25 @@ void xnsched_rt_tick(struct xnsched *sched)
 	xnsched_putback(sched->curr);
 }
 
-bool xnsched_rt_setparam(struct xnthread *thread,
-			 const union xnsched_policy_param *p)
+static bool xnsched_rt_setparam(struct xnthread *thread,
+				const union xnsched_policy_param *p)
 {
 	return __xnsched_rt_setparam(thread, p);
 }
 
-void xnsched_rt_getparam(struct xnthread *thread,
-			 union xnsched_policy_param *p)
+static void xnsched_rt_getparam(struct xnthread *thread,
+				union xnsched_policy_param *p)
 {
 	__xnsched_rt_getparam(thread, p);
 }
 
-void xnsched_rt_trackprio(struct xnthread *thread,
-			  const union xnsched_policy_param *p)
+static void xnsched_rt_trackprio(struct xnthread *thread,
+				 const union xnsched_policy_param *p)
 {
 	__xnsched_rt_trackprio(thread, p);
 }
 
-void xnsched_rt_protectprio(struct xnthread *thread, int prio)
+static void xnsched_rt_protectprio(struct xnthread *thread, int prio)
 {
 	__xnsched_rt_protectprio(thread, prio);
 }
diff --git a/kernel/cobalt/sched-weak.c b/kernel/cobalt/sched-weak.c
index fc778b8535..bd3872f104 100644
--- a/kernel/cobalt/sched-weak.c
+++ b/kernel/cobalt/sched-weak.c
@@ -44,8 +44,8 @@ static struct xnthread *xnsched_weak_pick(struct xnsched *sched)
 	return xnsched_getq(&sched->weak.runnable);
 }
 
-bool xnsched_weak_setparam(struct xnthread *thread,
-			   const union xnsched_policy_param *p)
+static bool xnsched_weak_setparam(struct xnthread *thread,
+				  const union xnsched_policy_param *p)
 {
 	if (!xnthread_test_state(thread, XNBOOST))
 		xnthread_set_state(thread, XNWEAK);
@@ -53,14 +53,14 @@ bool xnsched_weak_setparam(struct xnthread *thread,
 	return xnsched_set_effective_priority(thread, p->weak.prio);
 }
 
-void xnsched_weak_getparam(struct xnthread *thread,
-			   union xnsched_policy_param *p)
+static void xnsched_weak_getparam(struct xnthread *thread,
+				  union xnsched_policy_param *p)
 {
 	p->weak.prio = thread->cprio;
 }
 
-void xnsched_weak_trackprio(struct xnthread *thread,
-			    const union xnsched_policy_param *p)
+static void xnsched_weak_trackprio(struct xnthread *thread,
+				   const union xnsched_policy_param *p)
 {
 	if (p)
 		thread->cprio = p->weak.prio;
@@ -68,7 +68,7 @@ void xnsched_weak_trackprio(struct xnthread *thread,
 		thread->cprio = thread->bprio;
 }
 
-void xnsched_weak_protectprio(struct xnthread *thread, int prio)
+static void xnsched_weak_protectprio(struct xnthread *thread, int prio)
 {
 	if (prio > XNSCHED_WEAK_MAX_PRIO)
 		prio = XNSCHED_WEAK_MAX_PRIO;
-- 
2.16.4
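
Archive note: the patch above only changes linkage. A function can be made static without breaking other files as long as outside code reaches it through a pointer rather than by name. The sketch below illustrates that pattern in plain C. It assumes the handlers are published through a per-policy descriptor of function pointers, which the patch itself does not show; every demo_* name is a hypothetical stand-in, not a Cobalt type or symbol.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-ins; not the real Cobalt types. */
struct demo_thread {
	int bprio;	/* base priority */
	int cprio;	/* current priority */
};

union demo_policy_param {
	struct { int prio; } weak;
};

/* Descriptor through which other files would reach the handlers. */
struct demo_sched_class {
	bool (*setparam)(struct demo_thread *thread,
			 const union demo_policy_param *p);
	void (*getparam)(struct demo_thread *thread,
			 union demo_policy_param *p);
	void (*trackprio)(struct demo_thread *thread,
			  const union demo_policy_param *p);
};

/* Internal linkage: the linker exports none of these three names. */
static bool demo_weak_setparam(struct demo_thread *thread,
			       const union demo_policy_param *p)
{
	bool changed = thread->cprio != p->weak.prio;

	thread->cprio = p->weak.prio;
	return changed;
}

static void demo_weak_getparam(struct demo_thread *thread,
			       union demo_policy_param *p)
{
	p->weak.prio = thread->cprio;
}

static void demo_weak_trackprio(struct demo_thread *thread,
				const union demo_policy_param *p)
{
	/* NULL means "fall back to the base priority" in this sketch. */
	if (p)
		thread->cprio = p->weak.prio;
	else
		thread->cprio = thread->bprio;
}

/* The only symbol with external linkage in this file. */
const struct demo_sched_class demo_sched_class_weak = {
	.setparam = demo_weak_setparam,
	.getparam = demo_weak_getparam,
	.trackprio = demo_weak_trackprio,
};
```

A caller in another file would only ever write something like demo_sched_class_weak.setparam(&thread, &param), so the static keyword on the handlers costs nothing and lets the compiler flag any handler that ends up unused.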
Re: [PATCH] cobalt/kernel: Simplify mayday processing
On 27.11.18 19:57, Philippe Gerum wrote:
> On 11/27/18 7:15 PM, Jan Kiszka wrote:
>> On 08.11.18 14:24, Philippe Gerum wrote:
>>> On 11/8/18 2:15 PM, Jan Kiszka wrote:
>>>> On 08.11.18 14:09, Philippe Gerum wrote:
>>>>> On 11/8/18 2:05 PM, Philippe Gerum via Xenomai wrote:
>>>>>> On 11/5/18 1:20 PM, Jan Kiszka wrote:
>>>>>>> The mayday mechanism exists in order to kick a xenomai userspace task
>>>>>>> into secondary mode while it is running userspace code. For that, we
>>>>>>> ask I-pipe to call us back when the task was interrupted and is about
>>>>>>> to return to userspace.
>>>>>>>
>>>>>>> So far we defer the relaxation from that callback via a VDSO-like
>>>>>>> mechanism that triggers a special syscall to the return path of that
>>>>>>> very same syscall. However, that is not desirable because it is a
>>>>>>> complex, arch-specific mechanism that can easily break and,
>>>>>>> specifically, that destroys the backtrace of ptraced tasks.
>>>>>>>
>>>>>>> Fortunately, we can fulfill the needs of mayday also by relaxing
>>>>>>> the task directly from the mayday callback. Tested successfully on
>>>>>>> x86-64 and ARM.
>>>>>>
>>>>>> ARM-wise, this change requires ipipe-core-4.14.71-arm-3 or later to
>>>>>> work properly. This method would cause an interrupt state breakage
>>>>>> with any older ARM patch.
>>>>>
>>>>> ppc32 is fine back to kernel 4.1 regarding this (did not check earlier
>>>>> stuff), and arm64 needs the 4.14-based split series to work properly
>>>>> (4.9 is wrong).
>>>>
>>>> Is the reason for arm64 on 4.9 the same as for arm?
>>>
>>> Yes, same pattern.
>>
>> Is ARM64 on 4.9 considered stable?
>
> I don't think so. I would recommend 4.14 as a stable starting point for
> Xenomai/arm64.

Should we add a warning to the nucleus that triggers when ARM64 is built
against <4.14? The mayday issue would also be trivial to address in 4.9,
100% analogously to the ARM patches, but that would be pointless if no one
should use that kernel for ARM64.

Jan

-- 
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux
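
Archive note: the warning proposed above could take roughly the following shape. This is a minimal sketch only, not code from the Xenomai tree. DEMO_ARM64 and DEMO_KERNEL_CODE are placeholders for whatever a real build would provide (for instance CONFIG_ARM64 and LINUX_VERSION_CODE), and demo_kernel_recommended() is a hypothetical run-time mirror of the same predicate so that it can be exercised outside a kernel build.

```c
#include <assert.h>
#include <stdbool.h>

/* Same packing the kernel's KERNEL_VERSION() macro uses for 4.x trees. */
#define DEMO_KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))

/* Minimum suggested in the thread for arm64. */
#define DEMO_ARM64_MIN_KERNEL DEMO_KERNEL_VERSION(4, 14, 0)

/*
 * Build-time form. Neither placeholder macro is defined in this sketch,
 * so the warning does not fire when the file is compiled as is.
 */
#if defined(DEMO_ARM64) && DEMO_KERNEL_CODE < DEMO_ARM64_MIN_KERNEL
#warning "arm64 kernels older than 4.14 are not recommended"
#endif

/* Run-time form of the same rule; it only models the arm64 case. */
bool demo_kernel_recommended(bool is_arm64, int kernel_code)
{
	if (!is_arm64)
		return true;
	return kernel_code >= DEMO_ARM64_MIN_KERNEL;
}
```

With this rule, demo_kernel_recommended(true, DEMO_KERNEL_VERSION(4, 9, 0)) is false, while any 4.14.x code passes. A build-time #warning keeps the check free at run time, at the price of being easy to miss in a noisy build log.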
Re: [PATCH] cobalt/kernel: Simplify mayday processing
On 27.11.18 19:54, Philippe Gerum wrote:
> On 11/27/18 6:44 PM, Jan Kiszka wrote:
>> On 08.11.18 14:24, Philippe Gerum wrote:
>>> On 11/8/18 2:14 PM, Jan Kiszka wrote:
>>>> On 08.11.18 14:05, Philippe Gerum wrote:
>>>>> On 11/5/18 1:20 PM, Jan Kiszka wrote:
>>>>>> The mayday mechanism exists in order to kick a xenomai userspace task
>>>>>> into secondary mode while it is running userspace code. For that, we
>>>>>> ask I-pipe to call us back when the task was interrupted and is about
>>>>>> to return to userspace.
>>>>>>
>>>>>> So far we defer the relaxation from that callback via a VDSO-like
>>>>>> mechanism that triggers a special syscall to the return path of that
>>>>>> very same syscall. However, that is not desirable because it is a
>>>>>> complex, arch-specific mechanism that can easily break and,
>>>>>> specifically, that destroys the backtrace of ptraced tasks.
>>>>>>
>>>>>> Fortunately, we can fulfill the needs of mayday also by relaxing
>>>>>> the task directly from the mayday callback. Tested successfully on
>>>>>> x86-64 and ARM.
>>>>>
>>>>> ARM-wise, this change requires ipipe-core-4.14.71-arm-3 or later to
>>>>> work properly. This method would cause an interrupt state breakage
>>>>> with any older ARM patch.
>>>>
>>>> That would be a change compared to my past ARM experiments (4.x, if not
>>>> 3.x based). Can you point me to the problem in more detail?
>>>
>>> Calling __ipipe_notify_trap(IPIPE_TRAP_MAYDAY, regs) may result in
>>> re-enabling IRQs unexpectedly for the caller, due to the callee
>>> relaxing. AFAIR, a core debug level assertion should trip when that
>>> happens.
>>
>> Coming back to that topic: Just tested ARM on 4.4 with all debug
>> switches of core and ipipe enabled but nothing triggered. So, either the
>> issue is more subtle, or it is not really an issue.
>
> xnthread_relax() forces hard irqs on:
>
> __ipipe_exit_irq -> back from irq_handler (entry-armv.S), so you end up
> running the interrupt return path with irqs on both for svc or usr
> modes, including the regular preemption management stuff which assumes
> the opposite.
>
> WARN_ON_ONCE(!hard_irqs_disabled()) should trigger from
> __ipipe_check_root_interruptible() if added there.

Ah! Now we are on the same page. But this does not trigger anymore, and
that's because of these two:

 - https://gitlab.denx.de/Xenomai/ipipe/commit/b2bb695a1ece2320ab04cba9aa359bed8953baf7
 - https://gitlab.denx.de/Xenomai/ipipe/commit/da42030eb2e1afc6f3f8f4f6d40ae9b336e3d635

I noticed that issue back then and urged that it be addressed. So we are
already safe on ARM as well. That's good.

Jan

-- 
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux
Re: [PATCH] cobalt/kernel: Simplify mayday processing
On 11/27/18 7:15 PM, Jan Kiszka wrote:
> On 08.11.18 14:24, Philippe Gerum wrote:
>> On 11/8/18 2:15 PM, Jan Kiszka wrote:
>>> On 08.11.18 14:09, Philippe Gerum wrote:
>>>> On 11/8/18 2:05 PM, Philippe Gerum via Xenomai wrote:
>>>>> On 11/5/18 1:20 PM, Jan Kiszka wrote:
>>>>>> The mayday mechanism exists in order to kick a xenomai userspace task
>>>>>> into secondary mode while it is running userspace code. For that, we
>>>>>> ask I-pipe to call us back when the task was interrupted and is about
>>>>>> to return to userspace.
>>>>>>
>>>>>> So far we defer the relaxation from that callback via a VDSO-like
>>>>>> mechanism that triggers a special syscall to the return path of that
>>>>>> very same syscall. However, that is not desirable because it is a
>>>>>> complex, arch-specific mechanism that can easily break and,
>>>>>> specifically, that destroys the backtrace of ptraced tasks.
>>>>>>
>>>>>> Fortunately, we can fulfill the needs of mayday also by relaxing
>>>>>> the task directly from the mayday callback. Tested successfully on
>>>>>> x86-64 and ARM.
>>>>>
>>>>> ARM-wise, this change requires ipipe-core-4.14.71-arm-3 or later to
>>>>> work properly. This method would cause an interrupt state breakage
>>>>> with any older ARM patch.
>>>>
>>>> ppc32 is fine back to kernel 4.1 regarding this (did not check earlier
>>>> stuff), and arm64 needs the 4.14-based split series to work properly
>>>> (4.9 is wrong).
>>>
>>> Is the reason for arm64 on 4.9 the same as for arm?
>>
>> Yes, same pattern.
>
> Is ARM64 on 4.9 considered stable?

I don't think so. I would recommend 4.14 as a stable starting point for
Xenomai/arm64.

> Then I would look into that path as well.
>
> But given that ARM64 will be newly introduced with 3.1 only and that 4.9
> will likely be out of our legacy maintenance by then, the simpler
> solution for mayday topic could just be raising the minimum supported
> version to 4.14 for that arch (if there is a real issue).
>
> Jan

-- 
Philippe.
Re: [PATCH] cobalt/kernel: Simplify mayday processing
On 11/27/18 6:44 PM, Jan Kiszka wrote:
> On 08.11.18 14:24, Philippe Gerum wrote:
>> On 11/8/18 2:14 PM, Jan Kiszka wrote:
>>> On 08.11.18 14:05, Philippe Gerum wrote:
>>>> On 11/5/18 1:20 PM, Jan Kiszka wrote:
>>>>> The mayday mechanism exists in order to kick a xenomai userspace task
>>>>> into secondary mode while it is running userspace code. For that, we
>>>>> ask I-pipe to call us back when the task was interrupted and is about
>>>>> to return to userspace.
>>>>>
>>>>> So far we defer the relaxation from that callback via a VDSO-like
>>>>> mechanism that triggers a special syscall to the return path of that
>>>>> very same syscall. However, that is not desirable because it is a
>>>>> complex, arch-specific mechanism that can easily break and,
>>>>> specifically, that destroys the backtrace of ptraced tasks.
>>>>>
>>>>> Fortunately, we can fulfill the needs of mayday also by relaxing
>>>>> the task directly from the mayday callback. Tested successfully on
>>>>> x86-64 and ARM.
>>>>
>>>> ARM-wise, this change requires ipipe-core-4.14.71-arm-3 or later to
>>>> work properly. This method would cause an interrupt state breakage
>>>> with any older ARM patch.
>>>
>>> That would be a change compared to my past ARM experiments (4.x, if not
>>> 3.x based). Can you point me to the problem in more detail?
>>
>> Calling __ipipe_notify_trap(IPIPE_TRAP_MAYDAY, regs) may result in
>> re-enabling IRQs unexpectedly for the caller, due to the callee
>> relaxing. AFAIR, a core debug level assertion should trip when that
>> happens.
>
> Coming back to that topic: Just tested ARM on 4.4 with all debug
> switches of core and ipipe enabled but nothing triggered. So, either the
> issue is more subtle, or it is not really an issue.

xnthread_relax() forces hard irqs on:

__ipipe_exit_irq -> back from irq_handler (entry-armv.S), so you end up
running the interrupt return path with irqs on both for svc or usr
modes, including the regular preemption management stuff which assumes
the opposite.

WARN_ON_ONCE(!hard_irqs_disabled()) should trigger from
__ipipe_check_root_interruptible() if added there.

> I suppose I need more background here to understand the risk you see. If
> there is a problem, I'd either like to backport a solution from 4.14 or
> detect what the kernel supports and scale down mayday depending on that,
> at least on ARM.
>
> BTW, this is the call path we are talking about:
>
> #0 handle_mayday_event (regs=<optimized out>) at ../kernel/xenomai/posix/process.c:1028
> #1 0xc02c3c70 in __ipipe_notify_trap (exception=9, regs=0xece07fb0) at ../kernel/ipipe/core.c:1065
> #2 0xc02c52a8 in __ipipe_call_mayday (regs=0xece07fb0) at ../kernel/ipipe/core.c:1610
> #3 0xc021e448 in __ipipe_exit_irq (regs=<optimized out>) at ../arch/arm/kernel/ipipe.c:357
> #4 0xc020a57c in __ipipe_exit_irq (regs=<optimized out>) at ../arch/arm/kernel/ipipe.c:396
> #5 __ipipe_grab_irq (irq=<optimized out>, regs=<optimized out>) at ../arch/arm/kernel/ipipe.c:395
> #6 0xc020ac0c in ipipe_handle_multi_irq (regs=<optimized out>, irq=<optimized out>) at ../arch/arm/include/asm/ipipe.h:232
> #7 ipipe_handle_domain_irq (regs=<optimized out>, hwirq=<optimized out>, domain=<optimized out>) at ../arch/arm/include/asm/ipipe.h:249
> #8 gic_handle_irq (regs=0xece07fb0) at ../drivers/irqchip/irq-gic.c:367
>
> I wonder what the difference is to a normal Xenomai reschedule at the
> end of an interrupt.

The rescheduling bits for xnthread_relax() are very specific, so that we
can transfer control of the current context to another scheduler logic
in a safe way. Looking at the way XNRELAX is dealt with by
xnthread_suspend() may help.

-- 
Philippe.
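
Archive note: the invariant described in this message can be pictured with a small user-space model. It is a sketch under loud assumptions. Nothing here is kernel or I-pipe code, all demo_* names are hypothetical, and the model only captures the claim made in the thread: the interrupt return path expects hard IRQs to be off, relaxing from the mayday callback turns them on, and a warn-once check placed on that path would catch the mismatch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

static bool demo_hard_irqs_off;	/* models the CPU interrupt mask */
static int demo_warnings;	/* how many times the check fired */

/* Models a relax operation that returns with hard IRQs enabled. */
static void demo_relax(void)
{
	demo_hard_irqs_off = false;
}

/* Models a warn-once check placed on the interrupt return path. */
static void demo_check_return_path(void)
{
	static bool warned;

	if (!demo_hard_irqs_off && !warned) {
		warned = true;
		demo_warnings++;
		fprintf(stderr, "WARNING: hard IRQs on in IRQ return path\n");
	}
}

/*
 * Models one interrupt: entry masks IRQs, the mayday callback may relax
 * the task directly, then the return path runs with its usual assumption.
 * Returns the total number of warnings emitted so far.
 */
int demo_handle_irq(bool mayday_relaxes)
{
	demo_hard_irqs_off = true;
	if (mayday_relaxes)
		demo_relax();
	demo_check_return_path();
	return demo_warnings;
}
```

Calling demo_handle_irq(false) leaves the counter at 0; the first demo_handle_irq(true) raises it to 1 and later calls leave it there, which is the warn-once behaviour. The model says nothing about how a given I-pipe release avoids the problem.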
Re: [PATCH] cobalt/kernel: Simplify mayday processing
On 08.11.18 14:24, Philippe Gerum wrote:
> On 11/8/18 2:15 PM, Jan Kiszka wrote:
>> On 08.11.18 14:09, Philippe Gerum wrote:
>>> On 11/8/18 2:05 PM, Philippe Gerum via Xenomai wrote:
>>>> On 11/5/18 1:20 PM, Jan Kiszka wrote:
>>>>> The mayday mechanism exists in order to kick a xenomai userspace task
>>>>> into secondary mode while it is running userspace code. For that, we
>>>>> ask I-pipe to call us back when the task was interrupted and is about
>>>>> to return to userspace.
>>>>>
>>>>> So far we defer the relaxation from that callback via a VDSO-like
>>>>> mechanism that triggers a special syscall to the return path of that
>>>>> very same syscall. However, that is not desirable because it is a
>>>>> complex, arch-specific mechanism that can easily break and,
>>>>> specifically, that destroys the backtrace of ptraced tasks.
>>>>>
>>>>> Fortunately, we can fulfill the needs of mayday also by relaxing
>>>>> the task directly from the mayday callback. Tested successfully on
>>>>> x86-64 and ARM.
>>>>
>>>> ARM-wise, this change requires ipipe-core-4.14.71-arm-3 or later to
>>>> work properly. This method would cause an interrupt state breakage
>>>> with any older ARM patch.
>>>
>>> ppc32 is fine back to kernel 4.1 regarding this (did not check earlier
>>> stuff), and arm64 needs the 4.14-based split series to work properly
>>> (4.9 is wrong).
>>
>> Is the reason for arm64 on 4.9 the same as for arm?
>
> Yes, same pattern.

Is ARM64 on 4.9 considered stable? Then I would look into that path as
well.

But given that ARM64 will be newly introduced with 3.1 only and that 4.9
will likely be out of our legacy maintenance by then, the simpler
solution for mayday topic could just be raising the minimum supported
version to 4.14 for that arch (if there is a real issue).

Jan

-- 
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux
Re: [PATCH] cobalt/kernel: Simplify mayday processing
On 08.11.18 14:24, Philippe Gerum wrote:
> On 11/8/18 2:14 PM, Jan Kiszka wrote:
>> On 08.11.18 14:05, Philippe Gerum wrote:
>>> On 11/5/18 1:20 PM, Jan Kiszka wrote:
>>>> The mayday mechanism exists in order to kick a xenomai userspace task
>>>> into secondary mode while it is running userspace code. For that, we
>>>> ask I-pipe to call us back when the task was interrupted and is about
>>>> to return to userspace.
>>>>
>>>> So far we defer the relaxation from that callback via a VDSO-like
>>>> mechanism that triggers a special syscall to the return path of that
>>>> very same syscall. However, that is not desirable because it is a
>>>> complex, arch-specific mechanism that can easily break and,
>>>> specifically, that destroys the backtrace of ptraced tasks.
>>>>
>>>> Fortunately, we can fulfill the needs of mayday also by relaxing
>>>> the task directly from the mayday callback. Tested successfully on
>>>> x86-64 and ARM.
>>>
>>> ARM-wise, this change requires ipipe-core-4.14.71-arm-3 or later to
>>> work properly. This method would cause an interrupt state breakage
>>> with any older ARM patch.
>>
>> That would be a change compared to my past ARM experiments (4.x, if not
>> 3.x based). Can you point me to the problem in more detail?
>
> Calling __ipipe_notify_trap(IPIPE_TRAP_MAYDAY, regs) may result in
> re-enabling IRQs unexpectedly for the caller, due to the callee
> relaxing. AFAIR, a core debug level assertion should trip when that
> happens.

Coming back to that topic: Just tested ARM on 4.4 with all debug
switches of core and ipipe enabled but nothing triggered. So, either the
issue is more subtle, or it is not really an issue.

I suppose I need more background here to understand the risk you see. If
there is a problem, I'd either like to backport a solution from 4.14 or
detect what the kernel supports and scale down mayday depending on that,
at least on ARM.

BTW, this is the call path we are talking about:

#0 handle_mayday_event (regs=<optimized out>) at ../kernel/xenomai/posix/process.c:1028
#1 0xc02c3c70 in __ipipe_notify_trap (exception=9, regs=0xece07fb0) at ../kernel/ipipe/core.c:1065
#2 0xc02c52a8 in __ipipe_call_mayday (regs=0xece07fb0) at ../kernel/ipipe/core.c:1610
#3 0xc021e448 in __ipipe_exit_irq (regs=<optimized out>) at ../arch/arm/kernel/ipipe.c:357
#4 0xc020a57c in __ipipe_exit_irq (regs=<optimized out>) at ../arch/arm/kernel/ipipe.c:396
#5 __ipipe_grab_irq (irq=<optimized out>, regs=<optimized out>) at ../arch/arm/kernel/ipipe.c:395
#6 0xc020ac0c in ipipe_handle_multi_irq (regs=<optimized out>, irq=<optimized out>) at ../arch/arm/include/asm/ipipe.h:232
#7 ipipe_handle_domain_irq (regs=<optimized out>, hwirq=<optimized out>, domain=<optimized out>) at ../arch/arm/include/asm/ipipe.h:249
#8 gic_handle_irq (regs=0xece07fb0) at ../drivers/irqchip/irq-gic.c:367

I wonder what the difference is to a normal Xenomai reschedule at the
end of an interrupt.

Jan

-- 
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux
Re: rt_e1000e: Detected Hardware Unit Hang
On 27.11.18 10:47, Means Lee wrote:
> I can't reduce the size of my patch for the e1000e real-time driver that
> exists in Xenomai, because the non-real-time driver has evolved a lot. So
> I am offering diffs of the three files I changed when porting the driver.
> In my view, the hardware operations should be unchanged, so I focused on
> the changes to netdev.c. I modified param.c and e1000.h, changing the
> private data structure and parameters a little.
>
> For the questions you asked:
> - The upstream Linux driver I ported over works fine with my hardware,
>   even when I put it under heavy load (UDP broadcast storm).
> - When I hit the hardware unit hang, the Tx completion interrupt did not
>   disappear, but its rate dropped a lot.
> - I didn't enable CONFIG_IPIPE_DEBUG_CONTEXT, but I do use locks in
>   several places.

Then this should be the next thing to do. This not only detects direct
locking issues but also those triggered by calling into Linux functions
that take normal locks.

> - Then the TX path:
>   - The interrupt registration code is shown below:
>     err = rtdm_irq_request(&adapter->irq_handle, adapter->pdev->irq, e1000_intr_msi, 0, netdev->name, adapter);
>   - I write the new ring buffer index to the TDT register to notify the
>     hardware that there is a packet to send.
>     writel(tx_ring->next_to_use, tx_ring->tail); //after writel, the interrupt routine shall be launched.
>   - If the 'event flow' means the events during the transmit process, the
>     event

I mean specifically whether both the vanilla driver and the ported version
take the same code path and receive the same interrupts when sending
packets. Of course, you cannot put an identical packet load on both
because the higher RTnet layers do not exist for vanilla Linux. You may
capture the outgoing traffic under RTnet (RTcap) and replay that under
Linux.

> flow is shown below:
>     e1000e_xmit_frame    send a packet atomically
>     e1000_tx_map         use DMA to map the packet (mapped before, so just
>                          get the DMA address)
>     e1000_tx_queue       make sure the tx ring buffer index is right;
>                          write the TDT register to tell the hardware to
>                          send a packet
>   After the hardware has sent a packet, it is supposed to trigger a TX
>   completion interrupt and the driver shall respond:
>     e1000_intr_msi       triggered on Tx/Rx completion
>     e1000_clean_tx_irq   recycle the transmit resources
>
> By the way, I found that every time the master station sends a Ready
> frame belonging to RTcfg, this Tx hang shows up. The communication before
> that works fine: the TDMA sync frames are sent properly and every stage
> before the Ready frame goes well. If I let it stay in the
> RTCFG_MAIN_CLIENT_2 stage (so far the master and slave know each other),
> master and slave can communicate properly. So the Ready frame triggers
> this problem, but why? Why would a frame of a specific format trigger the
> hardware hang?

RTcfg is unlikely to be the reason, but maybe the transmission pattern
triggers the issue in the driver.

Jan

-- 
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux
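
Archive note: the transmit flow described in this thread (fill a descriptor, write the tail index, wait for a completion interrupt, reclaim) is a classic producer/consumer ring. The sketch below is a generic user-space model of that bookkeeping, assuming nothing about the real e1000e structures. All demo_* names and the ring size are hypothetical, and tail_reg merely stands in for the TDT register mentioned above.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_RING_SIZE 8u	/* hypothetical descriptor count */

struct demo_tx_ring {
	uint16_t next_to_use;	/* next descriptor the driver fills */
	uint16_t next_to_clean;	/* next descriptor the driver reclaims */
	uint16_t tail_reg;	/* models the TDT register */
	uint16_t hw_done;	/* models how far the NIC has transmitted */
};

/* Free slots; one is kept empty so full and empty are distinguishable. */
unsigned int demo_desc_unused(const struct demo_tx_ring *r)
{
	return (DEMO_RING_SIZE + r->next_to_clean - r->next_to_use - 1u)
		% DEMO_RING_SIZE;
}

/* Queue one packet and ring the doorbell; returns -1 if the ring is full. */
int demo_xmit_frame(struct demo_tx_ring *r)
{
	if (demo_desc_unused(r) == 0)
		return -1;
	r->next_to_use = (uint16_t)((r->next_to_use + 1u) % DEMO_RING_SIZE);
	r->tail_reg = r->next_to_use;	/* stands in for writel(..., tail) */
	return 0;
}

/* Pretend the NIC transmitted up to n of the queued descriptors. */
void demo_hw_transmit(struct demo_tx_ring *r, unsigned int n)
{
	while (n-- && r->hw_done != r->tail_reg)
		r->hw_done = (uint16_t)((r->hw_done + 1u) % DEMO_RING_SIZE);
}

/* Completion handler: reclaim finished descriptors, return how many. */
unsigned int demo_clean_tx_irq(struct demo_tx_ring *r)
{
	unsigned int cleaned = 0;

	while (r->next_to_clean != r->hw_done) {
		r->next_to_clean =
			(uint16_t)((r->next_to_clean + 1u) % DEMO_RING_SIZE);
		cleaned++;
	}
	return cleaned;
}
```

In this model a missing or late completion shows up as next_to_clean lagging behind next_to_use until demo_xmit_frame() starts returning -1. Instrumenting the same three indices in both the vanilla and the ported driver is one way to compare their event flow.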
Re:Bottom half's in xenomai 2.6
Hi. How can I implement a bottom half on Xenomai 2.6? In Linux I used tasklets for this kind of thing. Please help me.
Re: beaglebone: arm: xeno-test gives segmentation fault - test dlopen failed
On Tue, Nov 27, 2018 at 3:28 PM Pintu Agarwal wrote:
>
> Dear Henning, Greg and all,
>
> Sorry, re-checking after a long time.
>
> The dlopen test is still failing on BeagleBone Black with xenomai-3.
> Yesterday, I freshly cloned the xenomai-3 repo, built and installed it
> on the BeagleBone Black, then ran the xeno-test/smokey tests.
> I found that the dlopen test is still crashing. I thought this issue was
> already fixed, but it is not.
>
> Here are the results:
>
> ~/xenomai-3# cat .git/config
> url = https://git.xenomai.org/xenomai-3.git
>
> ~/xenomai-3# git branch
> * master
>
> ./configure --with-pic --with-core=cobalt --enable-smp --disable-tls
> --enable-dlopen-libs
> make
> make install
>
> :/usr/xenomai/bin# ./smokey --run
> arith OK
> bufp skipped (no kernel support)
> cpu_affinity OK
> fpu_stress OK
> iddp skipped (no kernel support)
> leaks OK
> memory_coreheap OK
> memory_heapmem OK
> memory_tlsf OK
> net_packet_dgram skipped (no kernel support)
> net_packet_raw skipped (no kernel support)
> net_udp skipped (no kernel support)
> posix_clock OK
> posix_cond OK
> posix_fork OK
> posix_mutex OK
> posix_select OK
> rtdm skipped (no kernel support)
> sched_quota skipped (no kernel support)
> sched_tp skipped (no kernel support)
> setsched OK
> sigdebug skipped (no kernel support)
> timerfd OK
> tsc OK
> vdso_access OK
> xddp skipped (no kernel support)
> Segmentation fault
> ./smokey: test dlopen failed: Unknown error -1
>
> ~/xenomai-3# /usr/xenomai/bin/dlopentest
> Segmentation fault
>
> So, basically, dlopentest results in a segmentation fault on arm.
>
> Note:
> Disabling dlopen in "configure" does not cause this issue, and the
> xeno-test/smokey tests pass successfully.
> But I am wondering why the dlopen test may fail on arm (on x86 it works).
>
> So, is it some package or configuration that I am missing on my
> BeagleBone?
> Can anybody describe the root cause (if already investigated earlier)...
>

Hi,

I am investigating this issue.
So far I have found that a normal "dlopen" test works fine on the
BeagleBone.
There is some crash happening in: my_dlopen()
Hopefully, I will be able to fix this issue and release a patch in two
weeks' time.

Thanks,
Pintu

> Thanks,
> Pintu
>
> On Fri, Jun 29, 2018 at 10:42 PM Pintu Kumar wrote:
> >
> > On Fri, Jun 29, 2018 at 9:37 PM Henning Schild wrote:
> > >
> > > On Fri, 29 Jun 2018 17:52:44 +0200, Philippe Gerum wrote:
> > >
> > > > On 06/29/2018 05:42 PM, Henning Schild wrote:
> > > > > Hi,
> > > > >
> > > > > i had a closer look. You might want to revert the following commit
> > > > > https://gitlab.denx.de/Xenomai/xenomai/commit/408c93e26438c83c08f216a2c8bd7079253cf71a
> > > > >
> > > > > It does include the testcase in the build even if the feature is
> > > > > disabled.
> > > > >
> > > >
> > > > I don't think so. It includes the testcase in the source distribution
> > > > allowing "make distcheck" to succeed, even if the test has not been
> > > > built in the smokey driver, which fixes a build regression introduced
> > > > by dlopen. Tests built in the executable must be listed in
> > > > COBALT_SUBDIRS, which dlopen isn't unless --enable-dlopen-libs has
> > > > been given.
> > >
> > > Yes, you are right. I just saw a "+= dlopen" in the resulting Makefile
> > > but that should not cause the problem.
> > >
> > > Pintu: Please build in a completely fresh environment i.e. "git clean
> > > -dfx; autoreconf -i; ./configure". And now give me
> > >
> > > grep dlopening -A1 config.log
> > > grep am__append_1 testsuite/smokey/Makefile
> > > autoreconf --version
> > >
> >
> > OK. Thank you so much.
> > I will try this on Monday and let you know.
> >
> > > Ideally with a hash i can find on gitlab.denx.de.
> > >
> > > Henning
> > >
> > > >
> > > > If you want to exclude the dlopen test from the test series at
> > > > runtime, you can pass "--exclude=dlopen" on the command line to
> > > > smokey, as mentioned in the help strings (smokey --help).
> > > >
> >
> > OK. Thanks for this info. However, when I run xeno-test it
> > automatically includes the smokey test.
> > How do I exclude it from xeno-test itself?
> > We have a requirement to attach the xeno-test report as the success
> > result.
Re: beaglebone: arm: xeno-test gives segmentation fault - test dlopen failed
Dear Henning, Greg and all,

Sorry, re-checking after a long time.

The dlopen test is still failing on BeagleBone Black with xenomai-3.
Yesterday, I freshly cloned the xenomai-3 repo, built and installed it
on the BeagleBone Black, then ran the xeno-test/smokey tests.
I found that the dlopen test is still crashing. I thought this issue was
already fixed, but it is not.

Here are the results:

~/xenomai-3# cat .git/config
url = https://git.xenomai.org/xenomai-3.git

~/xenomai-3# git branch
* master

./configure --with-pic --with-core=cobalt --enable-smp --disable-tls
--enable-dlopen-libs
make
make install

:/usr/xenomai/bin# ./smokey --run
arith OK
bufp skipped (no kernel support)
cpu_affinity OK
fpu_stress OK
iddp skipped (no kernel support)
leaks OK
memory_coreheap OK
memory_heapmem OK
memory_tlsf OK
net_packet_dgram skipped (no kernel support)
net_packet_raw skipped (no kernel support)
net_udp skipped (no kernel support)
posix_clock OK
posix_cond OK
posix_fork OK
posix_mutex OK
posix_select OK
rtdm skipped (no kernel support)
sched_quota skipped (no kernel support)
sched_tp skipped (no kernel support)
setsched OK
sigdebug skipped (no kernel support)
timerfd OK
tsc OK
vdso_access OK
xddp skipped (no kernel support)
Segmentation fault
./smokey: test dlopen failed: Unknown error -1

~/xenomai-3# /usr/xenomai/bin/dlopentest
Segmentation fault

So, basically, dlopentest results in a segmentation fault on arm.

Note:
Disabling dlopen in "configure" does not cause this issue, and the
xeno-test/smokey tests pass successfully.
But I am wondering why the dlopen test may fail on arm (on x86 it works).

So, is it some package or configuration that I am missing on my
BeagleBone?
Can anybody describe the root cause (if already investigated earlier)...

Thanks,
Pintu

On Fri, Jun 29, 2018 at 10:42 PM Pintu Kumar wrote:
>
> On Fri, Jun 29, 2018 at 9:37 PM Henning Schild wrote:
> >
> > On Fri, 29 Jun 2018 17:52:44 +0200, Philippe Gerum wrote:
> >
> > > On 06/29/2018 05:42 PM, Henning Schild wrote:
> > > > Hi,
> > > >
> > > > i had a closer look. You might want to revert the following commit
> > > > https://gitlab.denx.de/Xenomai/xenomai/commit/408c93e26438c83c08f216a2c8bd7079253cf71a
> > > >
> > > > It does include the testcase in the build even if the feature is
> > > > disabled.
> > > >
> > >
> > > I don't think so. It includes the testcase in the source distribution
> > > allowing "make distcheck" to succeed, even if the test has not been
> > > built in the smokey driver, which fixes a build regression introduced
> > > by dlopen. Tests built in the executable must be listed in
> > > COBALT_SUBDIRS, which dlopen isn't unless --enable-dlopen-libs has
> > > been given.
> >
> > Yes, you are right. I just saw a "+= dlopen" in the resulting Makefile
> > but that should not cause the problem.
> >
> > Pintu: Please build in a completely fresh environment i.e. "git clean
> > -dfx; autoreconf -i; ./configure". And now give me
> >
> > grep dlopening -A1 config.log
> > grep am__append_1 testsuite/smokey/Makefile
> > autoreconf --version
> >
>
> OK. Thank you so much.
> I will try this on Monday and let you know.
>
> > Ideally with a hash i can find on gitlab.denx.de.
> >
> > Henning
> >
> > >
> > > If you want to exclude the dlopen test from the test series at
> > > runtime, you can pass "--exclude=dlopen" on the command line to
> > > smokey, as mentioned in the help strings (smokey --help).
> > >
>
> OK. Thanks for this info. However, when I run xeno-test it
> automatically includes the smokey test.
> How do I exclude it from xeno-test itself?
> We have a requirement to attach the xeno-test report as the success
> result.
Re: rt_e1000e: Detected Hardware Unit Hang
I can't reduce the size of my patch for the e1000e real-time driver that
exists in Xenomai, because the non-real-time driver has evolved a lot. So I
am offering diffs of the three files I changed when porting the driver. In
my view, the hardware operations should be unchanged, so I focused on the
changes to netdev.c. I modified param.c and e1000.h, changing the private
data structure and parameters a little.

For the questions you asked:
- The upstream Linux driver I ported over works fine with my hardware, even
  when I put it under heavy load (UDP broadcast storm).
- When I hit the hardware unit hang, the Tx completion interrupt did not
  disappear, but its rate dropped a lot.
- I didn't enable CONFIG_IPIPE_DEBUG_CONTEXT, but I do use locks in several
  places.
- Then the TX path:
  - The interrupt registration code is shown below:
    err = rtdm_irq_request(&adapter->irq_handle, adapter->pdev->irq, e1000_intr_msi, 0, netdev->name, adapter);
  - I write the new ring buffer index to the TDT register to notify the
    hardware that there is a packet to send.
    writel(tx_ring->next_to_use, tx_ring->tail); //after writel, the interrupt routine shall be launched.
  - If the 'event flow' means the events during the transmit process, the
    event flow is shown below:
    e1000e_xmit_frame    send a packet atomically
    e1000_tx_map         use DMA to map the packet (mapped before, so just
                         get the DMA address)
    e1000_tx_queue       make sure the tx ring buffer index is right;
                         write the TDT register to tell the hardware to
                         send a packet
  After the hardware has sent a packet, it is supposed to trigger a TX
  completion interrupt and the driver shall respond:
    e1000_intr_msi       triggered on Tx/Rx completion
    e1000_clean_tx_irq   recycle the transmit resources

By the way, I found that every time the master station sends a Ready frame
belonging to RTcfg, this Tx hang shows up. The communication before that
works fine: the TDMA sync frames are sent properly and every stage before
the Ready frame goes well. If I let it stay in the RTCFG_MAIN_CLIENT_2
stage (so far the master and slave know each other), master and slave can
communicate properly. So the Ready frame triggers this problem, but why?
Why would a frame of a specific format trigger the hardware hang?

On Fri, Nov 23, 2018 at 7:59 PM, Jan Kiszka wrote:

> On 21.11.18 02:36, Means Lee via Xenomai wrote:
> > Sure thing. As I ported e1000e-rt driver from mainline kernel e1000e
> > driver, which the commit id is
> > 089d7720383d7bc9ca6b8824a05dfa66f80d1f41, the patch file is
> > kind of huge so I attach them here in this mail.
> > diff-with-nrt.diff is the diff file of mainline driver e1000e and my
> > ported driver.
> > diff-with-old-rt-e1000e.diff is the diff file of xenomai v3.0.7 driver
> > and my ported driver.
> >
>
> Unified diff ("diff -u"), please. I got that offlist, which is more
> readable, but it remains huge.
>
> So, let's analyze systematically:
> - the upstream Linux driver you ported over works fine with your
>   hardware, correct?
> - hardware unit hang may mean that no TX completion interrupt arrived -
>   can you confirm this based on /proc/xenomai/irq?
> - did you enable CONFIG_IPIPE_DEBUG_CONTEXT? It can reveal invalid lock
>   usage (common mistake when porting linux drivers over)
> - then look into the TX path
>   - is the interrupt registered properly?
>   - is packet submission happening?
>   - is any interrupt arriving?
>   - compare event flow to vanilla Linux driver (add instrumentation to
>     both)
>
> Jan
>
> --
> Siemens AG, Corporate Technology, CT RDA IOT SES-DE
> Corporate Competence Center Embedded Linux
>
-- next part --
A non-text attachment was scrubbed...
Name: e1000.diff
Type: text/x-patch
Size: 2569 bytes
Desc: not available
URL: <http://xenomai.org/pipermail/xenomai/attachments/20181127/f7d11681/attachment.bin>
-- next part --
A non-text attachment was scrubbed...
Name: param.diff
Type: text/x-patch
Size: 800 bytes
Desc: not available
URL: <http://xenomai.org/pipermail/xenomai/attachments/20181127/f7d11681/attachment-0001.bin>
-- next part --
A non-text attachment was scrubbed...
Name: netdev.diff
Type: text/x-patch
Size: 118926 bytes
Desc: not available
URL: <http://xenomai.org/pipermail/xenomai/attachments/20181127/f7d11681/attachment-0002.bin>