Raspbian is built for the original Pi, whose CPU does not have a NEON coprocessor. Basically, use a different distro that supports modern Pi hardware.
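As a rough way to see what the CPU itself reports, independent of how the distro packages were built, the feature flags in /proc/cpuinfo can be inspected. The sketch below is only an illustration: the helper names `cpu_features` and `has_neon` are made up for this example, and it assumes a Linux system where 32-bit ARM kernels list `neon` and 64-bit kernels list `asimd` on the `Features` line.

```python
def cpu_features(cpuinfo_text):
    """Collect flags from the 'Features' (ARM) or 'flags' (x86) lines of /proc/cpuinfo text."""
    features = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() in ("features", "flags"):
            features.update(value.split())
    return features


def has_neon(cpuinfo_text):
    """Return True if the CPU reports NEON (32-bit ARM) or ASIMD (64-bit ARM)."""
    return bool(cpu_features(cpuinfo_text) & {"neon", "asimd"})


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print("NEON/ASIMD reported:", has_neon(f.read()))
    except OSError:
        # Not a Linux system, or /proc is not mounted.
        print("/proc/cpuinfo is not available here")
```

This only says what the hardware and kernel expose; whether a given package actually uses those instructions still depends on how it was compiled.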

Philip

On 11/3/19 8:10 AM, Amr Bekhit wrote:
> Hello all,
>
> I'm working on a project that involves selecting and filtering 10-15
> narrow channels (10kHz bandwidth) from a relatively broadband input
> (1MHz). I've been working on trying to implement this as efficiently
> as possible using GNURadio companion (see this email thread
> https://lists.gnu.org/archive/html/discuss-gnuradio/2019-10/msg00192.html).
> I tried a couple of things (using FIR bandpass filters, and mixing
> each channel down to 0Hz then low pass filtering, both in one step and
> in stages) and found that simply using FIR bandpass filters for each
> channel seemed to provide the best performance CPU-wise (20% CPU usage
> on my i7-920 desktop PC). However, the aim is to run this system on a
> Raspberry Pi 4 and unfortunately, the same flow runs at approximately
> 90% CPU and seems to cause lags when sending the data to the SDR
> (LimeSDR-USB).
>
> I see the problem as potentially one of the following:
> - The flow is *still* not as efficient as it could be.
> - The RPi4 is just not powerful enough to run something like this and
>   I need to use something more powerful (perhaps like the x86
>   Lattepanda boards?)
> - GNURadio is not compiled to use NEON optimisations.
>
> I've been exploring the last point recently and wanted to check
> whether NEON optimisations are indeed being utilised. So here's what I
> did:
> - I set up a Raspberry Pi 4 (4GB) using Raspbian Buster.
> - I installed GNURadio from the standard apt repository. This installs
>   GNU Radio v3.7.13.4 and Volk 1.4.
> - I ran volk_profile to tune the library.
> - I then ran the bpf-test flow (attached to this email). The CPU usage
>   is 70%.
>
> Some info about the gnuradio and volk versions:
> gnuradio-config-info --cflags:
> /usr/bin/cc::: -g -O2
> -fdebug-prefix-map=/build/gnuradio-FK7QfY/gnuradio-3.7.13.4=.
> -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time
> -D_FORTIFY_SOURCE=2 -std=gnu99 -fvisibility=hidden -Wsign-compare
> -Wall -Wno-uninitialized
> /usr/bin/c++::: -g -O2
> -fdebug-prefix-map=/build/gnuradio-FK7QfY/gnuradio-3.7.13.4=.
> -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time
> -D_FORTIFY_SOURCE=2 -fvisibility=hidden -Wsign-compare -Wall
> -Wno-uninitialized
>
> volk-config-info --cflags:
> /usr/bin/cc::: -g -O2 -fdebug-prefix-map=/build/volk-zBrTqH/volk-1.4=.
> -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time
> -D_FORTIFY_SOURCE=2 -Wall
> /usr/bin/c++::: -g -O2
> -fdebug-prefix-map=/build/volk-zBrTqH/volk-1.4=.
> -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time
> -D_FORTIFY_SOURCE=2 -Wall
> generic_orc:::GNU::: -g -O2
> -fdebug-prefix-map=/build/volk-zBrTqH/volk-1.4=.
> -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time
> -D_FORTIFY_SOURCE=2 -Wall
>
> volk-config-info --avail-machines
> generic_orc;
>
> So based on that, it appears that the gnuradio and volk packages on
> Raspbian are not built with NEON support. I then set about compiling
> gnuradio and volk from source to ensure that NEON support is included.
> I compiled *both* volk and gnuradio using the
> arm_cortex_a72_hardfp_native.cmake toolchain file that is included in
> the cmake/Toolchains folder in the volk source. I compiled volk
> separately and then when compiling gnuradio set
> ENABLE_INTERNAL_VOLK=OFF. In this case, I ended up compiling Volk v2.0
> and gnuradio v3.9.0.0 (master from git).
> Here are the compiler flags:
>
> gnuradio-config-info --cflags
> /usr/bin/gcc:::-O3 -DNDEBUG -march=armv8-a -mtune=cortex-a72
> -mfpu=neon-fp-armv8 -mfloat-abi=hard -fvisibility=hidden
> -Wsign-compare -Wall -Wno-uninitialized
> /usr/bin/g++:::-O3 -DNDEBUG -march=armv8-a -mtune=cortex-a72
> -mfpu=neon-fp-armv8 -mfloat-abi=hard -fvisibility=hidden
> -Wsign-compare -Wall -Wno-uninitialized
>
> volk-config-info --cflags
> /usr/bin/gcc::: -march=armv8-a -mtune=cortex-a72 -mfpu=neon-fp-armv8
> -mfloat-abi=hard -Wall
> /usr/bin/g++::: -march=armv8-a -mtune=cortex-a72 -mfpu=neon-fp-armv8
> -mfloat-abi=hard -Wall
> generic_orc:::GNU:::-O3 -DNDEBUG -march=armv8-a -mtune=cortex-a72
> -mfpu=neon-fp-armv8 -mfloat-abi=hard -Wall
> neon_orc:::GNU:::-O3 -DNDEBUG -march=armv8-a -mtune=cortex-a72
> -mfpu=neon-fp-armv8 -mfloat-abi=hard -Wall -funsafe-math-optimizations
> neonv7_hardfp_orc:::GNU:::-O3 -DNDEBUG -march=armv8-a
> -mtune=cortex-a72 -mfpu=neon-fp-armv8 -mfloat-abi=hard -Wall
> -funsafe-math-optimizations -mfpu=neon -funsafe-math-optimizations
> -mfloat-abi=hard
>
> volk-config-info --avail-machines
> generic_orc;neon_orc;neonv7_hardfp_orc;
>
> volk-config-info --machine
> neonv7_hardfp_orc
>
> After running volk_profile to tune the library, I then ran the same
> flow, hoping that I'd get improved performance. Unfortunately, the
> performance was *exactly* the same, with CPU usage also at around 70%.
>
> I suspect one of the following:
> - The flow that I created is not using blocks written using
>   Volk/optimised for NEON and as such enabling NEON support would make
>   no difference (doubt it).
> - The gnuradio present in the Raspbian repositories *is actually*
>   compiled using NEON support (despite the cflags showing otherwise) and
>   I'm just simply running into the limitations of the CPU.
> - The gnuradio I compiled myself is actually *not using* NEON support
>   (despite the cflags showing otherwise) and I need to figure out how to
>   enable it.
>
> Any thoughts?
>
> Thanks,
>
> Amr
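
The two `volk-config-info --avail-machines` outputs quoted above can be compared mechanically. The following is a minimal sketch, not part of Volk: the helper names `parse_machines` and `neon_machines` are made up for illustration, and it only assumes the semicolon-separated format shown in the message.

```python
def parse_machines(output):
    """Split the semicolon-separated list printed by `volk-config-info --avail-machines`."""
    return [name.strip() for name in output.split(";") if name.strip()]


def neon_machines(output):
    """Return the machine names that mention NEON (assumption: the name contains 'neon')."""
    return [name for name in parse_machines(output) if "neon" in name.lower()]


# Output strings copied from the quoted message.
apt_build = "generic_orc;"
source_build = "generic_orc;neon_orc;neonv7_hardfp_orc;"

print("apt packages:", neon_machines(apt_build))
print("source build:", neon_machines(source_build))
```

An empty list for the apt packages matches the observation that only `generic_orc` is available there. A non-empty list shows only that NEON machines were compiled in, not that a particular block in the flow uses them.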
