date:20160912

Re: getting an unpleasant lack of symmetry for a cross product with avx2 target only.

2016-09-12 Thread Matt Pharr

I'm sorry, but I'm not quite following what you're expecting versus what you're seeing. For example, if I compile that with --target=avx2, I see a series of three multiplies followed by FMA instructions, which seems about as good as it gets. Is it that you're expecting that if you have code that does

Re: Surprising code being generated by ARM NEON backend

2016-09-12 Thread Dmitry Babokin

Interesting, it's effectively exploiting functional programming to extract parallelism. I've always wondered why there are no implementations of functional languages that specifically target extracting SIMD-level parallelism. Or perhaps I'm just not aware of such languages. Though