On 31 December 2014 at 21:25, Walter Bright via Digitalmars-d <[email protected]> wrote:
> What you can contribute that would be very valuable is what we've discussed
> before - your simd expertise. Your influence is what has shaped the current
> simd support. I don't know anyone who knows even half of what you do about
> simd. What you know could make D really fly with vector math.
>
> You and I both know that auto vectorization, the approach used by everyone
> else, is not the key to high performance simd. We have an opportunity here.

Okay, well it's not really useful without a forceinline attribute. std.simd functions need to be pseudo-intrinsics, i.e. the cost of a function call will definitely negate the work they perform. Yes, they will (probably) be inlined in release, but debug performance is also important, and I can't have maths code that runs much slower in debug builds because it makes a function call, passing structs by value, for every single maths opcode in the hottest loops.

There were also troubles with GDC; I haven't been able to make it emit the opcode that I want. It reinterprets what I asked for and emits something else, depending on the SSE level argument passed to the compiler. There are attributes to set the 'target' per-function, but that didn't work for some reason, so I need to work out if that can be resolved. Otherwise my whole approach (the goal of being able to generate multiple SIMD version code paths for runtime selection) won't work (in GCC)...

We need to get a quality low-level API out there that is portable and fills the gaps in the various architectures; then we can focus on high-level wrappers and niceties.

I really want to see your half-float module merged. Where did that go? I recall some people were saying it should be conflated with the custom-float stuff, so that half-float was just a specialisation of custom float... I'm not so sure about that... but maybe? I have been needing a 3.7 (10bit) float too; maybe that fits in there? That stuff all needs forceinline too to be particularly useful.
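To make the last point a bit more concrete, here is a minimal sketch of decoding a 10-bit float through a helper that is forced inline. It is written in C rather than D so that GCC's existing always_inline attribute can be shown. Everything specific in it is an assumption made for illustration: reading "3.7" as 3 exponent bits above 7 mantissa bits with no sign bit, the exponent bias of 3, and the names FORCE_INLINE and f10_to_float. None of it comes from an existing module.

```c
#include <stdint.h>

/* Hypothetical macro: GCC and Clang honour always_inline even when
   optimisation is off, which is the debug-build behaviour wanted here. */
#if defined(__GNUC__)
#define FORCE_INLINE static inline __attribute__((always_inline))
#else
#define FORCE_INLINE static inline
#endif

/* Assumed layout (not taken from any spec):
   bits 9..7 = exponent, bias 3; bits 6..0 = mantissa; no sign bit. */
enum { F10_MANT_BITS = 7 };

FORCE_INLINE float f10_to_float(uint16_t bits)
{
    uint32_t e = (bits >> F10_MANT_BITS) & 0x7u; /* 3-bit exponent field */
    uint32_t m = bits & 0x7Fu;                   /* 7-bit mantissa field */

    if (e == 0) {
        /* Denormal: (m / 128) * 2^(1 - 3) = m / 512 */
        return (float)m / 512.0f;
    }
    /* Normal: (1 + m / 128) * 2^(e - 3) = ((128 + m) << e) / 1024 */
    return (float)((128u + m) << e) / 1024.0f;
}
```

Under those assumptions, f10_to_float(0x180) gives 1.0f and the largest encoding, 0x3FF, gives 31.875f. Whether a format like this belongs next to half-float or inside a general custom-float type is exactly the open question; the sketch only shows that the decode is a handful of integer ops, so a real function call per conversion would dominate the cost.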
