Klaus, that's a great thread; I completely missed it. It seems that a universal binary, which is what Go requires, would be slow on dispatch, because it would have to check at run time whether each individual intrinsic is supported. Do I understand correctly that, to overcome this, people either compile natively (which we cannot do [and don't want to allow]) or JIT it (which we cannot do, as no runtime assembly is allowed)?
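To make the dispatch question concrete, here is a minimal sketch of the usual alternative to checking on every call: probe the CPU once at startup and bind a function variable to the best implementation. Everything named here is hypothetical. `cpuHasSSSE3` is a hard-coded stand-in for a real CPUID-based probe, and `sumBytesSSSE3` is a placeholder for what would be an assembly-backed routine.

```go
package main

import "fmt"

// cpuHasSSSE3 is a hypothetical feature flag. A real program would set it
// from a CPUID-based probe at startup; it is hard-coded here so the sketch
// runs on any machine.
var cpuHasSSSE3 = false

// sumBytes is what callers use. It is bound to one implementation in init,
// so the feature check happens once rather than on every call.
var sumBytes func(b []byte) int

// sumBytesGeneric is the portable pure-Go implementation.
func sumBytesGeneric(b []byte) int {
	total := 0
	for _, v := range b {
		total += int(v)
	}
	return total
}

// sumBytesSSSE3 stands in for an assembly-backed routine. In this sketch it
// simply delegates to the generic version.
func sumBytesSSSE3(b []byte) int {
	return sumBytesGeneric(b)
}

func init() {
	sumBytes = sumBytesGeneric
	if cpuHasSSSE3 {
		sumBytes = sumBytesSSSE3
	}
}

func main() {
	fmt.Println(sumBytes([]byte{1, 2, 3, 4}))
}
```

With this pattern the per-call cost is an indirect call, which the compiler generally cannot inline. That is acceptable for a routine that processes a whole buffer, but it would not be for something as fine-grained as a single intrinsic.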
Thanks for all the input, much appreciated.

On Friday, 28 October 2016 09:08:47 UTC+1, Klaus Post wrote:
> On Friday, 28 October 2016 02:37:38 UTC+2, Erwin Driessens wrote:
>> I'd love to see SIMD intrinsics in the Go compiler(s), even if it would
>> mean separate packages for all the architectures. I'm not experienced
>> enough to tell how far one could get with designing a cross-platform set
>> of intrinsics instructions.
>
> There are attempts at generalized SIMD out there (google "web simd" for
> instance), but while they may be good for web, I think they fall short for
> a compiled language. The problem is that they either support a lowest
> common denominator or have to fall back to slow reference implementations.
>
> Take for instance the PSHUFB instruction, which allows a very fast
> [16]byte lookup on SSSE3-capable machines. This is helpful in various ways,
> but if it isn't available, the code has to commit the XMM register to
> memory and do 16 lookups, which is at least an order of magnitude slower
> than using the SIMD instruction. Similarly, the RSQRT (low-precision
> reciprocal of the square root) instruction allows a "shortcut", but if it
> isn't available on your architecture, it will likely be very expensive.
>
> Furthermore, keeping it close to the C intrinsics would also make porting
> existing code easier, which I can only see as a positive.
>
> However, adding it is not a trivial task, but with the recent compiler
> rewrite, it has become much more feasible. There are still issues that
> should be worked out, like forced constant parameters, switching between 2
> and 3 parameter (VEX) instructions with compiler flags, intrinsic types,
> etc., and of course the entire matter of defining the intrinsics and
> supplying that information to the compiler.
>
>> Using the hardware when it is available, falling back on emulation when
>> not?
>
> As you might be able to tell, I am not a big fan of emulation for anything
> other than testing purposes. While it might be reasonable in some cases, I
> find that a "clean" Go version is mostly better than an emulated intrinsics
> version.
>
> I could go on. If you haven't already seen it, there are some good ideas
> here: https://groups.google.com/forum/#!topic/golang-nuts/yVOfeHYCIT4
>
> /Klaus
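
For reference, the PSHUFB fallback Klaus describes above (sixteen separate lookups once the register contents are in memory) can be sketched in plain Go. This is only an illustration of the instruction's per-byte behaviour for the 128-bit form; `pshufbScalar` is a hypothetical name, not an existing Go API.

```go
package main

import "fmt"

// pshufbScalar emulates the 128-bit PSHUFB byte shuffle one lane at a time.
// For each index byte: if its high bit is set the output byte is zero,
// otherwise the low four bits select a byte from table.
func pshufbScalar(table, idx [16]byte) [16]byte {
	var out [16]byte
	for i, ix := range idx {
		if ix&0x80 != 0 {
			continue // high bit set: this lane stays zero
		}
		out[i] = table[ix&0x0f]
	}
	return out
}

func main() {
	var table, idx [16]byte
	for i := range table {
		table[i] = byte(i * 2) // arbitrary lookup table
		idx[i] = byte(15 - i)  // read the table in reverse order
	}
	idx[0] = 0x80 // force the first lane to zero
	fmt.Println(pshufbScalar(table, idx))
}
```

The loop performs sixteen dependent byte loads and stores where the hardware instruction does the whole shuffle at once, which is the gap the quoted message refers to.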