Re: [Patch ARM 1/3] Neon intrinsics TLC : Replace intrinsics with GNU C implementations where possible.

Julian Brown Mon, 28 Apr 2014 04:45:13 -0700

On Mon, 28 Apr 2014 11:44:01 +0100
Ramana Radhakrishnan <ramra...@arm.com> wrote:


> I've special cased the ffast-math case for the _f32 intrinsics to 
> prevent the auto-vectorizer from coming along and vectorizing addv2sf 
> and addv4sf type operations which we don't want to happen by default.
> Patch 1/3 causes apparent "regressions" in the rather ineffective
> neon intrinsics tests that we currently carry soon hopefully to be
> replaced by Christophe Lyon's rewrite that is being reviewed. On the
> whole I deem this patch stack to be safe to go in if necessary. These
> "regressions" are for -O0 with the vbic and vorn intrinsics which
> don't now get combined and well, so be it.

I think reimplementing these intrinsics in C is a mistake if we ever
hope to make big-endian mode work properly, and "fixing" the generated
header file by bypassing the generator makes it harder to accurately
perform the sweeping changes that will probably be necessary to do that.
Recall e.g. the discussion around:

http://gcc.gnu.org/ml/gcc-patches/2013-03/msg00161.html

Generally (though in this case it's merely an implementation detail)
the NEON intrinsics and GCC's generic vector support cannot be expected
to interwork properly (because of incompatible lane ordering). Of
course we get away with it in little-endian mode though, and I guess
the bridge has already been crossed by earlier patches.
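
For illustration only, a minimal sketch of what a "GNU C implementation"
of an intrinsic can look like, using GCC's generic vector extension. The
names sketch_f32x2, sketch_vadd_f32 and sketch_get_lane are hypothetical
and are not taken from the patch or from arm_neon.h. An element-wise add
does not care about lane order, which is why the choice is merely an
implementation detail here. The subscript in the second function is where
the two numbering schemes could disagree on a big-endian target.

```c
/* Hypothetical names throughout; this is an illustrative sketch and
   not the arm_neon.h code from the patch under discussion.  */
typedef float sketch_f32x2 __attribute__ ((vector_size (8)));

/* Intrinsic-style add written with GNU C generic vector arithmetic.
   The operation is element-wise, so lane numbering does not matter.  */
static inline sketch_f32x2
sketch_vadd_f32 (sketch_f32x2 a, sketch_f32x2 b)
{
  return a + b;
}

/* Generic vector subscripting.  Index i here follows GCC's generic
   vector element order, which is the numbering the mail above says
   cannot be assumed to match NEON lane numbers in big-endian mode.  */
static inline float
sketch_get_lane (sketch_f32x2 v, int i)
{
  return v[i];
}
```

With a = { 1.0f, 2.0f } and b = { 10.0f, 20.0f }, sketch_vadd_f32 (a, b)
holds 11.0f at generic index 0 and 22.0f at generic index 1.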

Of course it's possible nobody actually wants to use big-endian NEON,
in which case it's probably time to declare it unsupported?

Julian
