One more thing you can try with the TI kernel if your kernel build has 
issues:

The biggest problem is the CPU going in and out of idle states.   Thus, you 
can install the cpufrequtils and linux-cpupower packages and then at boot 
run:

cpufreq-set -g performance
cpupower idle-set -d 1

(and maybe cpupower idle-set -d 0 )

The first will lock the CPU frequency at 1GHz.   The second will disable the 
very costly idle states in the processor.   I believe the M3 processor in 
the L4_WAKEUP is in charge of the power management stuff, which includes the 
CPU idle settings.   Flipping the CPU out of idle seems to take a long time 
and blocks the bus while it waits.   Disabling that state helped a lot.    
Alternatively, you can install the "bone" kernel, which doesn't have the 
idle driver (or at least didn't early last year, not sure anymore).  
Anyway, those had a huge impact, but still weren't 100%, which is why we 
decided to compile our own kernel, completely disabling everything on the 
L4_WAKEUP.
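
For anyone who wants to script this, here is a minimal sketch of the same two settings applied through sysfs rather than through the cpufreq-set and cpupower tools. It is an illustration, not what FPP ships. The function name and the SYSFS_ROOT override are made up for the example (the override only exists so the function can be tried against a scratch directory), and it assumes a single-core board where everything lives under cpu0.

```shell
#!/bin/sh
# Sketch only: apply the governor and idle-state settings via sysfs.
# SYSFS_ROOT is a hypothetical override for dry runs; leave it unset on a board.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

apply_low_latency_settings() {
    cpu_dir="$SYSFS_ROOT/devices/system/cpu/cpu0"

    # Same intent as: cpufreq-set -g performance
    gov="$cpu_dir/cpufreq/scaling_governor"
    if [ -w "$gov" ]; then
        echo performance > "$gov"
    fi

    # Same intent as: cpupower idle-set -d N, for each state number passed in
    for n in "$@"; do
        dis="$cpu_dir/cpuidle/state$n/disable"
        if [ -w "$dis" ]; then
            echo 1 > "$dis"
        fi
    done
}

# Example (as root, e.g. from a boot-time script):
#   apply_low_latency_settings 1
```

Files that are missing or not writable are skipped silently, so it is worth reading the values back afterwards to confirm they took effect.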


Dan


On Thursday, March 25, 2021 at 3:01:56 PM UTC-4 Remy Porter wrote:

> > Personal plug:  I'd be happy to sell capes that don't use gpio0.  
> > https://kulplights.com
>
> Heh, we've already got all the boards for this project. Maybe we'll 
> revisit that design in future projects, though. 
>
> Your software is definitely doing a *lot* more than ours, and certainly 
> much more than we need- we just listen for RGB data on a UDP socket. We, 
> uh… don't really treat them like lights, and instead as very large pixels 
> in a screen. All the mapping/direction/orientation stuff is handled in the 
> render stack, which we build custom for pretty much every project. Last one 
> was a Unity App that was part of a kiosk connected to a gigantic 
> chandelier. Our current project is *kinda* like an architectural scale 
> video wall with a C++/OpenCV app driving the pixels.
>
> I might give the FPP image a shot though, if building my own custom kernel 
> doesn't help. Your guidance on that was *super* helpful, though your FPP 
> kernel did *not* get along with our software (LEDs just didn't work- in 
> lieu of diagnosing that, I just opted to compile my own, which is going on… 
> right now).  Thanks a bunch!
>
> On Tuesday, March 23, 2021 at 4:17:37 PM UTC-4 [email protected] wrote:
>
>> The debs for the kernel are at:
>> https://github.com/FalconChristmas/fpp-linux-kernel/tree/master/debs
>> so you should be able to update to our kernel fairly easily.    If you need 
>> to start building your own kernel, I'd suggest grabbing a Beaglebone AI and 
>> building on that.   It's WAY faster for kernel building.  :).    You can 
>> cross-compile from a debian x86_64, but I was never able to get that to 
>> actually produce proper .deb files that could be installed cleanly on the 
>> BBB so I pretty much just use the AI for kernel builds.  (It's actually the 
>> ONLY thing I use my AI for.)
>>
>> FPP provides a complete UI frontend for configuring the pixel strings and 
>> such and we do allow the various 4 channel types.     It does a lot of 
>> other things as well.   That said, most of these things are done on the ARM 
>> side and not the PRU.  Part of trying to figure out the latency issue was 
>> seeing what makes sense to do on the ARM side and what works best on the PRU 
>> side.    If you actually wanted to try FPP and see if FPP's optimized PRU 
>> code and kernel combination would work, you could use the FPP 4.6.1 image 
>> on an SD card (see the release assets at github).   You would just need to 
>> create a small json file in /opt/fpp/capes/bbb/strings to describe the 
>> pinout of your cape (use any of them in that directory as a starting point) 
>> and it should then "just work".  You would need to configure e1.31/artnet 
>> input universes on the Channel Input tab, put FPP in "bridge" mode, and 
>> then it should work like a normal light controller and accept pixel data.  
>>  (Or use DDP protocol which doesn't require configuring the input)
>>
>> Personal plug:  I'd be happy to sell capes that don't use gpio0.  
>> https://kulplights.com
>>
>>
>>
>> On Tuesday, March 23, 2021 at 3:49:47 PM UTC-4 Remy Porter wrote:
>>
>>> That is *super* helpful. Thanks a bunch. The pin layout we're using on 
>>> our boards uses a lot of GPIO0 already, so it's definitely too late to 
>>> change on this. The way we're banging things out, all the GPIOs are being 
>>> hit at the same time, so the latency does appear to hit our strings. I'll 
>>> try giving your kernel a shot, though- that'll definitely help. And maybe 
>>> I'll move the GPIO0 bits over to the other PRU. I hate to have to do that, 
>>> but if it's what needs done, it's what needs done.
>>>
>>> Also, off topic, but poking at FPP: SK6812s support the WS281x protocol, 
>>> so you mostly already support them, but if you poke around at 
>>> ThrowingBagels' approach a little, it's not a big push to get 32-bit support 
>>> for RGBW LEDs (we use a lot of SK6812s with the warm-white LED, and it 
>>> looks *great*). We've been running a custom hack of LEDScape for *ages*, 
>>> so ThrowingBagels is sorta a consolidation of the features we use, stripped 
>>> down to the bare minimum.
>>>
>>> On Tuesday, March 23, 2021 at 3:17:25 PM UTC-4 [email protected] wrote:
>>>
>>>> Wow... You should have contacted us before doing a lot of that.   I 
>>>> completely re-wrote most of the LEDScape code over the last couple of 
>>>> years to completely optimize things in attempts to reduce some of the 
>>>> timing issues.   Porting to clpru and rproc was already part of that.   
>>>> All my updates are in FPP ( https://github.com/FalconChristmas/fpp ).
>>>>
>>>> Anyway, to answer your question, the issue is specific to GPIO0.  
>>>> GPIO1-3 are not affected by the massive latency issues.   Thus, the best 
>>>> option is to choose GPIO pins on GPIO1-3 and not use the GPIO0 pins.   That 
>>>> wasn't an option for me as we needed to output 48 strings.   In the FPP 
>>>> code, if nothing is using the second PRU (the second PRU could be used for 
>>>> DMX or pixelnet output), we divide the work and have one PRU do the 
>>>> GPIO1-3 and the other do the GPIO0.    If something IS using the other 
>>>> PRU, and the strings are short enough, then we split it on the one PRU and 
>>>> do GPIO1-3 first, then do the GPIO0's.   For the most part, that keeps the 
>>>> GPIO0 problems from affecting all the strings, so the random flashes would 
>>>> really just be on the GPIO0 strings.   In the case where the second PRU is 
>>>> used for something else AND the strings are longer, then we do have to do 
>>>> all 4 GPIO's at once and all of them can be affected, so it's definitely 
>>>> not a perfect solution.   
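
The three cases in that paragraph boil down to a small decision table. The sketch below only restates that logic and is not taken from the FPP source. The function name, the yes/no inputs, and the plan labels are all hypothetical, and what counts as "short enough" is not stated in the thread, so it is left as an input.

```shell
# Sketch only: which output plan applies, per the description above.
#   $1 = "yes" if the second PRU is free, "no" if it is busy (DMX/pixelnet)
#   $2 = "yes" if the strings are short enough to output the banks in sequence
choose_gpio_plan() {
    second_pru_free="$1"
    strings_short="$2"
    if [ "$second_pru_free" = "yes" ]; then
        # One PRU drives GPIO1-3, the other drives GPIO0
        echo "split-across-prus"
    elif [ "$strings_short" = "yes" ]; then
        # One PRU: GPIO1-3 first, then GPIO0
        echo "sequential-on-one-pru"
    else
        # All four banks together; every string is exposed to GPIO0 latency
        echo "all-banks-at-once"
    fi
}
```

Only the last plan lets a GPIO0 stall disturb the strings on GPIO1-3.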
>>>>
>>>> To minimize the issues (but not entirely eliminate them) I do now build a 
>>>> custom 4.19 kernel that disables most of the devices on the L4_WAKEUP 
>>>> interconnect.  Any power management and frequency scaling stuff causes 
>>>> huge issues with GPIO0 latencies, so those are the most important things 
>>>> to disable.     I think my notes are at:
>>>>
>>>>
>>>> https://github.com/FalconChristmas/fpp-linux-kernel/tree/master/bbb-kernel
>>>>
>>>> Not sure if that helps enough for you.  Feel free to ask more 
>>>> questions.  :)
>>>> Dan
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Tuesday, March 23, 2021 at 1:45:51 PM UTC-4 Remy Porter wrote:
>>>>
>>>>> For those that may remember the old LEDScape library, I've been 
>>>>> working on an updated version of that library, which focuses on strips 
>>>>> instead of matrices, uses rproc instead of UIO PRUSS, and updates the PRU 
>>>>> assembly to clpru from pasm. 
>>>>>
>>>>> Link: https://github.com/iontank/ThrowingBagels
>>>>>
>>>>> The key thing you need to know is that we hook up 32 addressable LED 
>>>>> strips and then use the PRU to bitbang out RGB(W) data. We use the PRU 
>>>>> because our timings need to be pretty precise- a few hundred nanoseconds 
>>>>> for each key phase of the operation. 
>>>>>
>>>>> Here's the important issue: we need to address all 32 GPIO pins from 
>>>>> the PRU, but not all of them are bound to the r30 register. So we need to 
>>>>> go through the OCP port. This is exactly how LEDScape worked, and 
>>>>> continues to work, just fine. We've never been able to get LEDScape 
>>>>> working under 4.x kernels, mostly because of UIO problems (which is what 
>>>>> kicked off this whole "move to rproc" thing).
>>>>>
>>>>> My upgrade, ThrowingBagels, uses basically the same core logic on the 
>>>>> PRU, just ported to clpru assembly and running on a 4.19 kernel. And 
>>>>> seemingly randomly, the timings hitch, which causes the LEDs to flicker to 
>>>>> the wrong color. Phases of our bitbang operation will sometimes take 
>>>>> almost twice as long as they should- a sleep that should have been 600ns 
>>>>> ends up taking 1100ns. The only operation happening that doesn't have 
>>>>> guaranteed timings is writing to the GPIO pins via OCP, everything else we 
>>>>> do happens entirely in PRU DRAM. Since this appears to happen randomly, 
>>>>> the hitch *must* be coming from that OCP step, I assume.
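
To put those numbers in PRU terms, here is a small worked conversion from nanoseconds to PRU cycles. It assumes the PRU runs at 200 MHz (5 ns per cycle), which is not stated in the thread. The variable and function names are made up for the example.

```shell
# Sketch only: convert the reported timings into PRU cycles.
PRU_CLOCK_MHZ=200   # assumption: 200 MHz PRU clock, i.e. 5 ns per cycle

ns_to_cycles() {
    # cycles = ns * MHz / 1000, using integer arithmetic
    echo $(( $1 * PRU_CLOCK_MHZ / 1000 ))
}

ns_to_cycles 600             # intended delay: prints 120
ns_to_cycles 1100            # observed delay: prints 220
ns_to_cycles $((1100 - 600)) # the extra 500ns: prints 100
```

Under that assumption the hitch costs roughly 100 cycles in which the PRU is stalled instead of toggling pins.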
>>>>>
>>>>> In support of that hypothesis, if I upgrade from the kernel that ships 
>>>>> with the "AM3358 Debian 10.3 2020-04-06 4GB eMMC IoT Flasher 
>>>>> <https://debian.beagleboard.org/images/bone-eMMC-flasher-debian-10.3-iot-armhf-2020-04-06-4gb.img.xz>"
>>>>>  
>>>>> image to the most recent 4.19 kernel, the problem becomes a lot more 
>>>>> infrequent. We're blasting this data out at 30fps, like video, and when I 
>>>>> cut down on the number of services running and update the kernel, I can 
>>>>> get the glitches down from happening every few seconds to happening every 
>>>>> few tens of seconds.
>>>>>
>>>>> My suspicion, and I can't quite prove anything, is that on 4.19 
>>>>> there's something about the kernel or configuration that sometimes adds 
>>>>> latency to OCP writes, which wasn't there on 3.16. So my key question is: 
>>>>> how do I improve the timing consistency when the PRU uses OCP to write to 
>>>>> DDR RAM? I understand that it will never have *guaranteed* timing, 
>>>>> but sometimes it's hitting me with latencies of up to 500ns. Anything I 
>>>>> can do to minimize that latency would be a huge help.
>>>>>
>>>>> TL;DR: how can I make PRU->OCP->GPIO more consistent in its timing 
>>>>> under a 4.19 kernel?
>>>>>
>>>>
