Hi all,

I had a quick look at the issue. Let me summarize what I gathered.
It started with Raspbian not having NEON support enabled, but current Debian disables it as well. Is this a 32-bit or a 64-bit issue, or both? Which compiler is this exactly?
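
To help answer the 32-bit/64-bit question, something like the following could be run on the affected machine. It is only a sketch: it assumes a Linux system that exposes /proc/cpuinfo, the function names `cpu_features` and `has_neon` are made up for illustration, and it is not how VOLK itself detects NEON at runtime. 32-bit ARM kernels list `neon` in the Features line, while 64-bit ones list `asimd` instead.

```python
import platform


def cpu_features(cpuinfo_text):
    """Return the flags from the first 'Features' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "features":
            return set(value.split())
    return set()


def has_neon(cpuinfo_text):
    # 32-bit ARM kernels report "neon"; 64-bit kernels report "asimd".
    return bool(cpu_features(cpuinfo_text) & {"neon", "asimd"})


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            text = f.read()
    except OSError:
        # Not Linux, or /proc is unavailable: report nothing found.
        text = ""
    print("machine:", platform.machine())
    print("NEON advertised by CPU:", has_neon(text))
```

The output only says what the CPU advertises; whether the compiler was allowed to emit NEON is a separate question that depends on the build flags.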

Does it only fail with this one kernel, `volk_16i_max_star_horizontal_16i`? This is one of the few "legacy" kernels [0].
We might really want to remove them in a major release.

I know I could build VOLK on an ARM Chromebook in a Linux container. It works on macOS on M1 (with some CMake fixes). I think I was able to compile it on an RPi3 some time ago. I know cross-compiles are preferable, but let's start small.

I have a suspicion that

> Error: selected processor does not support `vld2.16 {d16-d19},[r4]!' in ARM mode

really does tell us that the target arch and the instruction aren't compatible. If this is only an issue for this particular kernel, we might just remove the NEON version (it's one of those legacy ones, after all). It sounds weird to state that we have NEON but... And then again, we have [1], where we need to fix things depending on the NEON version.
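
To check whether the failure is confined to SIMD instructions, the assembler output can be tallied. This is a rough sketch that assumes the two GNU assembler message forms seen in the log; `failing_instructions` is a hypothetical helper, and the sample is copied from the report with the file path shortened.

```python
import re

# Matches both message forms from the log; the instruction sits between ` and '.
ERROR_RE = re.compile(
    r"Error:\s+selected (FPU|processor) does not support .*?`([^']+)'"
)


def failing_instructions(log_text):
    """Return (kind, mnemonic) pairs for each assembler error in log_text."""
    # The mail client wrapped the log, so collapse all whitespace first.
    flat = " ".join(log_text.split())
    return [(kind, insn.split()[0]) for kind, insn in ERROR_RE.findall(flat)]


sample = """
volk_16i_max_star_horizontal_16i.s:11: Error:
selected FPU does not support instruction -- `vmov.i32 q12,#0'
volk_16i_max_star_horizontal_16i.s:18: Error:
selected processor does not support `vld2.16 {d16-d19},[r4]!' in ARM mode
"""

hits = failing_instructions(sample)
print(hits)
# Mnemonics starting with "v" belong to the VFP / Advanced SIMD instruction sets.
print(all(mnemonic.startswith("v") for _, mnemonic in hits))
```

If every rejected mnemonic is a `v...` instruction, that points at the FPU selection used for this one file rather than at the kernel's logic.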

Cheers
Johannes


[0] https://github.com/gnuradio/volk/blob/237a6fc9242ea8c48d2bbd417a6ea14feaf7314a/lib/kernel_tests.h#L165
[1] https://github.com/gnuradio/volk/blob/237a6fc9242ea8c48d2bbd417a6ea14feaf7314a/include/volk/volk_neon_intrinsics.h#L289

On 23.07.21 00:16, Marcus Müller wrote:
Hi Peter,

thanks for digging in so deeply!

First thing I'm going to do: Loop Johannes Demel in here, he's the most
knowledgeable person on the CPU detection and build issues.

Johannes, see below. There are two things to unpack here:

1. It looks like Debian disables NEON. Do we need to check with Mait on that?
   Or is this somehow "right"?
2. See the build problem; that looks like an arch/µarch mismatch, but I don't
   know how to interpret it.

Maybe you can shed some light on this?

Best regards,
Marcus


On 23.07.21 00:07, peter green wrote:
Package: volk
Version: 2.0.0-1
Severity: important
X-debbugs-cc: mmuel...@gnuradio.org

Hi,

While Raspbian and Debian armhf have different baselines, what they share is
that NEON is not part of the baseline configuration but is present on a large
proportion of the systems people use in practice (for Debian armhf I suspect
nearly all users have NEON; for Raspbian it's fewer because we still have
Pi1/Pi0 users around). So where upstream supports runtime detection of NEON,
that runtime detection should be enabled and used.

Back in 2015 I disabled NEON in Raspbian's volk package. I can't remember why,
but I suspect it was because at the time I had no means of determining whether
the package had runtime CPU detection.

alle_die_mit_der from gnuradio upstream came into #raspbian on irc (to ask about
options for building stuff) and I took the opportunity to talk about the issue
of runtime CPU detection. He guided me on how to test volk (quotes below) and
I thus decided to revert our raspbian neon-disabling changes and build a package
for testing on raspbian bullseye.

<plugwash> The volk package in raspbian currently has neon disabled. from what you have said I strongly suspect it could be re-enabled but before I actually re-enable it I need a test plan that I can use to make sure i'm not breaking anything.
<alle_die_mit_der> VOLK has a unit test for every single "kernel"
<alle_die_mit_der> `make test` is your friend :)
<plugwash> is there a way to run the tests against an installed version of volk?
<alle_die_mit_der> yeah
<alle_die_mit_der> `volk_profile` essentially does the same, while benchmarking them
<--snip-->
<plugwash> does volk_profile use the same runtime cpu detection as normal use of volk?
<alle_die_mit_der> yes
<alle_die_mit_der> it should
<--snip-->
<alle_die_mit_der> you can query the runtime-available platforms with `volk-config-info --avail-machines`

However to my surprise I discovered that the package built on raspbian from
unmodified Debian sources didn't have any neon support either. I discovered that
the CMake scripts were failing to detect Neon because they were not using
-mfpu=neon when building test programs.

I have confirmed this is not a Raspbian-specific issue, and it seems to have
been this way since version 2.0.0, which makes it a regression between buster
and bullseye.

I modified the CMake scripts to use -mfpu=neon when detecting NEON support,
but then the build itself failed with:

cd /volk-2.4.1.new/obj-arm-linux-gnueabihf/lib && /usr/bin/cc -DHAVE_DLFCN_H -DHAVE_FENV_H -D_GLIBCXX_USE_CXX11_ABI=1 -I/volk-2.4.1.new/kernels/volk/asm/neon -I/usr/include/orc-0.4 -I/volk-2.4.1.new/obj-arm-linux-gnueabihf/include -I/volk-2.4.1.new/include -I/volk-2.4.1.new/kernels -I/volk-2.4.1.new/obj-arm-linux-gnueabihf/lib -I/volk-2.4.1.new/lib -I/usr/include/cpu_features -O2 -g -DNDEBUG -fPIC -o CMakeFiles/volk_obj.dir/__/kernels/volk/asm/neon/volk_16i_max_star_horizontal_16i.s.o -c /volk-2.4.1.new/kernels/volk/asm/neon/volk_16i_max_star_horizontal_16i.s
/volk-2.4.1.new/kernels/volk/asm/neon/volk_16i_max_star_horizontal_16i.s: Assembler messages:
/volk-2.4.1.new/kernels/volk/asm/neon/volk_16i_max_star_horizontal_16i.s:11: Error: selected FPU does not support instruction -- `vmov.i32 q12,#0'
/volk-2.4.1.new/kernels/volk/asm/neon/volk_16i_max_star_horizontal_16i.s:18: Error: selected processor does not support `vld2.16 {d16-d19},[r4]!' in ARM mode
/volk-2.4.1.new/kernels/volk/asm/neon/volk_16i_max_star_horizontal_16i.s:22: Error: selected FPU does not support instruction -- `vsub.i16 q10,q8,q9'
/volk-2.4.1.new/kernels/volk/asm/neon/volk_16i_max_star_horizontal_16i.s:23: Error: selected processor does not support `vcge.s16 q11,q10,#0' in ARM mode
/volk-2.4.1.new/kernels/volk/asm/neon/volk_16i_max_star_horizontal_16i.s:24: Error: selected processor does not support `vcgt.s16 q10,q12,q10' in ARM mode
/volk-2.4.1.new/kernels/volk/asm/neon/volk_16i_max_star_horizontal_16i.s:25: Error: selected processor does not support `vand.i16 q11,q8,q11' in ARM mode
/volk-2.4.1.new/kernels/volk/asm/neon/volk_16i_max_star_horizontal_16i.s:26: Error: selected processor does not support `vand.i16 q10,q9,q10' in ARM mode
/volk-2.4.1.new/kernels/volk/asm/neon/volk_16i_max_star_horizontal_16i.s:27: Error: selected FPU does not support instruction -- `vadd.i16 q10,q11,q10'
/volk-2.4.1.new/kernels/volk/asm/neon/volk_16i_max_star_horizontal_16i.s:28: Error: selected processor does not support `vst1.16 {d20-d21},[r12]!' in ARM mode
gmake[4]: *** [lib/CMakeFiles/volk_obj.dir/build.make:1780: lib/CMakeFiles/volk_obj.dir/__/kernels/volk/asm/neon/volk_16i_max_star_horizontal_16i.s.o] Error 1

I could go digging deeper into the build scripts to see if I can figure out how
to make the buildsystem build the neon kernels (but not the generic kernels)
with -mfpu=neon, but I felt it was time to seek advice from those more familiar
with the codebase than me.
