Re: initial porting of IPIPE x86_64 patches onto Linux stable 5.4.52
On Tuesday, August 25, 2020 9:25:00 AM EDT Lennart Sorensen wrote:
> A 5020 is an e5500, which is nothing like an e500. The e500 is PowerPC
> SPE, while the e5500 (and the e300, e500mc and e6500) are "real" PowerPC
> instead with a somewhat different instruction set.

Sorry, e500 was a typo in that case.

> GCC 9 even dropped support for the PPC SPE, while PPC is fully supported.

Yeah, there are e500 boards being made for military, aerospace, and LEO
(low earth orbit) uses. There is concern that gcc 9 dropped SPE support,
but it's up to vendors to support maintenance of it.

> The e500's existence offends me since it fragments the powerpc
> architecture. Yuck. :)

Well, I won't say anything good or bad about it, but it's the one I know
the most, since it's what I work with all the time. ;)

The worst thing I ever had to deal with was making changes to a 603
instruction set simulator to turn it into a 601. The subtle differences
between the POWER and PowerPC ISAs were quite a pain to find at the time.
This was years ago.

But yeah, all my work (and tested support with Xenomai) has been on an
e500v2 (8548).

Steven
Re: initial porting of IPIPE x86_64 patches onto Linux stable 5.4.52
> On Mon, Aug 24, 2020 at 2:09 AM Jan Kiszka wrote:
> > Greg, with this baseline available, porting ARM/ARM64 to 5.4 should
> > become possible. Thought I would wait until we have basic confidence on
> > x86 into the generic pieces. Steven, same for powerpc.

I'm definitely interested in doing this for PPC e500 32-bit. I now have my
hands on a 64-bit e500 based board (PPC 5020), but it's a loaner COTS board
from an industry partner. I probably won't be able to do Xenomai on that
without a funding source, because of my other obligations.

But I am reasonably sure that moving to 5.4 for e500 32-bit is something my
partners would support. I occasionally get emails asking for more modern
kernels. If this follows the old noarch pattern for the baseline stuff,
that should make it ideal for bringing the rest of it up to date.

Steven
Re: Dovetail <-> PREEMPT_RT hybridization
On Thursday, July 23, 2020 12:23:53 PM EDT Philippe Gerum wrote:
> Two misunderstandings it seems:
>
> - this work is all about evolving Dovetail, not Xenomai. If such work does
> bring the upsides I'm expecting, then I would surely switch EVL to it. In
> parallel, you would still have the opportunity to keep the current Dovetail
> implementation - currently under validation on top of 5.8 - and maintain it
> for Xenomai, once the latter is rebased over the former. You could also
> stick to the I-pipe for Xenomai, so no issue.

That may be my misunderstanding. I thought Dovetail's ultimate goal was at
least the performance of the I-pipe while being simpler to maintain.

> - you seem to be assuming that every code path of the kernel is
> interruptible with the I-pipe/Dovetail, this is not the case, by far. Some
> key portions run with hard irqs off, just because there is no other way to
> 1) share some code paths between the regular kernel and the real-time core,
> 2) the hardware may require it (as hinted in my introductory post). Some of
> those sections may take ages under cache pressure (switch_to comes to
> mind), tenths of micro-seconds, happening mostly randomly from the
> standpoint of the external observer (i.e. you, me). So much for quantifying
> timings by design.

So with switch_to having hard irqs off, the cache pressure should be
deterministic, because there's an upper bound on cache lines and on the
number of memory pages that need to be accessed, and the code path is
pretty straightforward if memory serves. I would think that this being
well bounded supports my initial point.

> We can only figure out a worst-case value by submitting the system to a
> reckless stress workload, for long enough. This game of sharing the very
> same hardware between GPOS and RTOS activities has been based on a
> probabilistic approach so far, which can be summarized as: do your best to
> keep the interrupts enabled as long as possible, ensure fine-grained
> preemption of tasks, make sure to give the result hell to detect issues,
> and hope for the hardware not to rain on the parade.

I agree that in practice a reckless stress workload is necessary to
quantify system latency. However, relying on this is a problem when it
comes time to convince managers who want to spend tons of money on
expensive and proven OS solutions instead of using the fun and cool stuff
we do. ;)

At some point, if possible, someone should try and actually prove the
system given the bounds:

1) There's only so many pages of memory.
2) There's only so much cache and so many cache lines.
3) There's only so many sources of interrupts.
4) There's only so many sources of CPU stalls, and the number of stalls
   should have a limit in hardware.

I can't really think of anything else, and I don't know why there'd be any
sort of randomness on top of this.

One thing we might not be on the same page about is that typically
(especially on single-processor systems), when I talk about
timing-by-design calculations, I am referring to one single high-priority
thing. That could be a timer interrupt to the first instruction running in
that timer interrupt handler, or it could be to the point where the
highest-priority thread in the system resumes.

> Back to the initial point: virtualizing the effect of the local_irq
> helpers you refer to is required when their use is front and center in
> serializing kernel activities. However, in a preempt-rt kernel, most
> interrupt handlers are threaded, regular spinlocks are blocking mutexes
> in disguise, so what remains is:

Yes, but this depends on a cooperative model. Other drivers can mess you
up, as described by you below.

> - sections covered by the raw_spin_lock API, which is primarily a problem
> because we would spin with hard irqs off attempting to acquire the lock.
> There is a proven technical solution to this based on an application of
> interrupt pipelining.

Yes.

> - few remaining local_irq disabled sections which may run for too long,
> but could be relaxed enough in order for the real-time core to preempt
> without prejudice. This is where pro-actively tracing the kernel under
> stress comes into play.

This is my problem with preempt-rt. The I-pipe forces this preemption by
changing what the macros that Linux devs think are turning interrupts off
actually do. We never need to worry about this in the RTOS domain.

> Working on these three aspects specifically does not bring less guarantees
> than hoping for no assembly code to create long uninterruptible sections
> (therefore not covered by local_irq_* helpers), no driver talking to a GPU
> killing latency with CPU stalls, no shared cache architecture causing all
> sort of insane traffic between cache levels, causing memory access speed to
> sink and overall performances to degrade.

I haven't had a chance to work with these sorts of systems, but we are
doing more with ARM processors with multi-level MMUs, and I'm very curious
about how this will play out.
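To make my "well bounded" claim concrete, here is the kind of
back-of-the-envelope ceiling I have in mind for a hard-irqs-off switch_to
under full cache pressure. Every constant below is a made-up placeholder
for illustration, not a measured value; you'd substitute figures from your
own core and memory subsystem:

    /* Hypothetical ceiling for switch_to with hard irqs off, assuming the
     * pathological case where every cache line touched is a miss.
     * All constants are illustrative placeholders, not measurements. */
    #define SWITCH_LINES_TOUCHED  256  /* lines the switch path can touch */
    #define NS_PER_MISS           100  /* worst-case miss penalty to DRAM */

    static unsigned int switch_to_ceiling_ns(void)
    {
            return SWITCH_LINES_TOUCHED * NS_PER_MISS; /* 25600 ns, ~26 us */
    }

The point isn't that ~26 us is good; it's that it is a fixed ceiling you
can derive from the hardware manuals, rather than a tail you have to hunt
for with a stress workload.

Steven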
Re: Dovetail <-> PREEMPT_RT hybridization
On Tuesday, July 21, 2020 1:18:21 PM EDT Philippe Gerum wrote:
> - identifying and quantifying the longest interrupt-free sections in the
> target preempt-rt kernel under meaningful stress load, with the irqoff
> tracer. I wrote down some information [1] about the stress workloads which
> actually make a difference when benchmarking as far as I can tell. At any
> rate, the results we would get there would be crucial in order to figure
> out where to add the out-of-band synchronization points, and likely of some
> interest upstream too. I'm primarily targeting armv7 and armv8, it would be
> great if you could help with x86.

So from my perspective, one of the beauties of Xenomai with the traditional
I-pipe is that you can analyze the fast interrupt path and see that, by
design, you have an upper bound on latency. You can even calculate it. It's
based on the number of CPU cycles at irq entry multiplied by the total
number of IRQs that could happen at the same time. Depending on your
hardware, maybe you know the priority of handling the interrupt in
question. The point was that the system was analyzable by design.

When you start talking about looking for long critical sections and adding
sync points to them, I think you take away the by-design guarantees for
latency. This might make it less suitable for hard real-time systems. IMHO
this is not any better than Preempt-RT. But maybe I am missing something.
:)
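For the archives, the kind of by-design calculation I mean looks roughly
like the sketch below. Every constant is an invented placeholder, not a
measured value; the real numbers would come from the SoC manuals and from
measuring the irq entry path on the target:

    /* Hypothetical by-design latency bound for the fast interrupt path.
     * All constants are illustrative placeholders, not measurements. */
    #define N_IRQ_SOURCES     16   /* IRQs that could be pending at once */
    #define CYCLES_IRQ_ENTRY  400  /* worst-case cycles, exception to handler */
    #define CYCLES_PER_US     800  /* e.g. an 800 MHz core */

    static unsigned int irq_latency_bound_us(void)
    {
            /* worst case: every other source is taken once before ours */
            return (N_IRQ_SOURCES * CYCLES_IRQ_ENTRY) / CYCLES_PER_US; /* 8 us */
    }

A number like that 8 us follows from the design itself, which is exactly
the property a stress-test histogram can't give you.

Steven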
Re: [PATCH] powerpc: ipipe: Do full exit checks after __ipipe_call_mayday
Jan,

I took a look at entry-common.S and entry_32.S, and I think we have the
correct check. The flow is a little different, but it seems to work as far
as I can tell. This was originally Philippe's code. Maybe he can take a
quick look. ;)

Steven
Re: [PATCH] powerpc: ipipe: Do full exit checks after __ipipe_call_mayday
On Thursday, December 19, 2019 11:53:36 AM EST Jan Kiszka wrote:
> Ho, ho, this is an early X-mas gift.

Jan,

I got "gdb ok" when I removed the block I had in the entry_32.S I sent you,
around the recheck/do_user_signal path. I was pretty sure I didn't need the
intret there. Do you remember if you did that as well on your board?

This is the first time the test has passed for me without the extra trace
prints in smokey's gdb.c. In fact, I got the latest xenomai next branch and
tested with that.

Some good news: your uclibc fix worked beautifully, and I can now build the
stock xenomai distro for my board without any of my other patches.

I'm going to look at the mayday/DoSyscall issue you suggested now. I'll be
checking in a cleaned-up entry_32.S as well. Will probably be Monday or so,
but expect 20 more emails from me until then, as per my usual pattern. (You
cc'd the list, so now the list must suffer the consequences of your
actions.)

Steven
Re: [PATCH] powerpc: ipipe: Do full exit checks after __ipipe_call_mayday
Tried running with your patch, Jan. I made sure the files I had been
working on were the same ones I sent you. I wound up with a crash in
do_user_signal during the smokey gdb test.

# LD_LIBRARY_PATH=/usr/xenomai/lib /usr/xenomai/bin/smokey --run=gdb
[   10.650031] Unable to handle kernel paging request for instruction fetch
[   10.656762] Faulting instruction address: 0xc0010b00
[   10.661736] Oops: Kernel access of bad area, sig: 11 [#1]
[   10.667124] BE PREEMPT Aitech SP0S100
[   10.670780] Modules linked in: unix
[   10.674268] CPU: 0 PID: 982 Comm: smokey Not tainted 4.19.55-aitech #96
[   10.680871] I-pipe domain: Linux
[   10.684089] NIP: c0010b00 LR: c0010ad8 CTR: c00a8f1c
[   10.689131] REGS: ee8cbe90 TRAP: 0400  Not tainted (4.19.55-aitech)
[   10.695559] MSR: 9220  CR: 24000284  XER: 2000
[   10.701998]
[   10.701998] GPR00: ee8cbf40 ef156bc0 c0007890 ef156bc0 5d084753
[   10.701998] GPR08: c055b068 0001 c0599a60 00021000 22000282 1004870c
[   10.701998] GPR16: 1004
[   10.701998] GPR24: b7ffc000 1001e480 1001e4b0 10041198 100425d0 0002 b9d0
[   10.736886] NIP [c0010b00] do_user_signal+0x20/0x34
[   10.741755] LR [c0010ad8] recheck+0x48/0x50
[   10.745927] Call Trace:
[   10.748362] Instruction dump:
[   10.751322] 7d400124 48089e21 2c83 4186ffa0 614a8000 7d400124 806100b0 7061
[   10.759066] 41820010 bda10044 5463003c 906100b0 <38610010> 7d244b78 4bff7add b9a10044
[   10.766985] ---[ end trace d102d53b0f8d6db9 ]---
Re: [PATCH] powerpc: ipipe: Do full exit checks after __ipipe_call_mayday
On Thursday, December 19, 2019 12:14:27 PM EST Jan Kiszka wrote:
> Check ipipe-arm/ipipe/master:arch/arm/kernel/entry-common.S for the call
> to ipipe_call_mayday. I suspect that pattern transfers nicely.
>
> Jan

Will do. Can you point me to a smokey test that will prove the
implementation is fixed? Preferably something that shows it's currently
broken.

Steven
Re: [PATCH] powerpc: ipipe: Do full exit checks after __ipipe_call_mayday
On Thursday, December 19, 2019 11:53:36 AM EST Jan Kiszka wrote:
> Ho, ho, this is an early X-mas gift.

Thanks for finding this. I'm a terrible PPC maintainer, so if anyone else
wants to volunteer to take my place... ;)

> Looking at DoSyscall in entry_32.S, it seems we lack such a careful
> check there as well. But that's too much assembly for me ATM.

Thanks for the tip. I should be able to look into this tomorrow for you.
Can you point me to the relevant sections in ARM that I should compare
against? Any more details (such as the desired check/operation) you can
provide would be beneficial.

Steven
xenomai in space
List,

Early Saturday morning a SpaceX Dragon launched with STP-H6 on board. I
wrote some drivers and was on the software architecture board for the
software on the CIB (communications interface bus) on that system. There
are several (9 if I recall) science experiments that all communicate with
ISS (International Space Station) networks through our common
communications interface.

This is the first time that I've sent Xenomai to space. The cargo was
successfully delivered to the ISS about 12 hours ago.

Because of our use of Xenomai, we were able to reduce the size of buffers
in the FPGA, which freed up block RAM for other things (CPU cache among
them) and made the system perform better overall.

Seeing this arrive at the ISS after over a year of testing is a great
achievement for us at Goddard Space Flight Center and for the Xenomai
project. Thanks to all, especially to Philippe, who answers many of my
emails privately. ;)

I hope I can continue to contribute to the Xenomai project in my own small
way.

Steven
Re: I-pipe / Dovetail news
On Thursday, May 2, 2019 12:46:37 PM EDT Philippe Gerum wrote:
> At the end of this process, a Dovetail-based Cobalt core should be
> available for the ARM, ARM64 and x86_64 architectures. The port is made
> in a way that enables the Cobalt core to interface with either the
> I-pipe or Dovetail at build time. ppc32 is likely to follow at some
> point if Steven is ok with this. I could probably help with that, I
> still have my lite52xx workhorse around, and a few 40x and 44x SoCs
> generously offered by Denx Engineering (thanks Wolfgang).

Philippe,

I'm ok with helping. Honestly, I am thinking that it may be beneficial for
us to move to Dovetail, since an ARINC 653 implementation on top of EVL
might be simpler to certify than one on top of Cobalt. Plus, it should be
more efficient, since we'd lose a compatibility layer. I'm going to be
sending an email to my co-workers here shortly about the topic.

I figured the first step would be a Dovetail port. Are you suggesting
that, if I have time to help, you'd rather I prioritize helping with a
Cobalt port over an EVL port on ppc32?

I also have some boards from Wolfgang.

Steven
Re: __ipipe_root_sync
On Friday, April 26, 2019 2:11:30 PM EDT Philippe Gerum wrote:
> However, __ipipe_root_sync() is 100% redundant with sync_root_irqs(),
> which we need in the generic pipelined syscall handling callable from C.
> So the situation is a bit silly ATM. Let's rename sync_root_irqs() to
> __ipipe_root_sync() in -noarch, and drop any arch-local equivalent.

I fully agree with this. I can issue a second patch for 110 once it's done
in noarch, but there's no real reason to hurry on this.

I can confirm, for the 4.14.110 patch I just pushed for powerpc (for all
the thousands of you working with PPC), that its use of __ipipe_root_sync
and sync_root_irqs is equivalent to ARM's.

Steven
release spam
Everyone,

Sorry for the release spam. I accidentally pushed all tags to the tag
server instead of just the one I was working on. Guess I will buy donuts
if I ever make it out to one of the meetings. :)

Steven
__ipipe_root_sync
Why was __ipipe_root_sync moved out of kernel/ipipe/core.c? I see it now in
arch/arm/kernel/ipipe.c. It is the same exact code I had in the PPC branch
in kernel/ipipe/core.c. I can move it to the arch-specific code, but was
wondering why.

Steven
Re: [PATCH] cobalt/kernel: Simplify mayday processing
On Monday, November 5, 2018 7:20:33 AM EST Jan Kiszka wrote:
> I would appreciate if you could test ARM64 and PowerPC for me. Until we
> have QEMU test images for both, it's still tricky for me to do that.

I have something I've got to get done before I can do anything else, but
once that's done I can take a look at this on a PowerPC board.

Steven