On 07.12.18 at 05:22, Matt Turner wrote:
> On Thu, Dec 6, 2018 at 7:22 PM Roland Scheidegger <srol...@vmware.com> wrote:
>>
>> On 07.12.18 at 03:20, Matt Turner wrote:
>>> Since this is for an extension that will be BDW+ can we use the
>>> _cvtss_sh() intrinsic instead? It corresponds to an IVB+ instruction
>>> and even takes the rounding mode directly as an immediate argument.
>>
>> Not saying trying to use it isn't a good idea, but you'd need the right
>> compile flags, and you can't assume it's present, since even the latest
>> pentiums don't support avx (and by extension, f16c). (The same is true
>> for atoms too, of course).
>
> I'm not sure that AVX and F16C are related, but from a quick glance it
> seems that you're right that Atoms ("little core") doesn't support
> F16C. I had no idea :(
>
> As far as I can tell all "big cores" have F16C. That's what
> https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html
> indicates.

That also indicates SNB and up all have AVX. Despite that,
Pentiums/Celerons from those families definitely do not. (I suppose that
means cputype=ivybridge etc. can't be used if you target the
pentiums/celerons, at least not for gcc. I know this was a recurring
problem for llvm with autodetect of cpu type, when it would recognize a
newer core and then try to use avx / avx2 on pentiums, dying in a fire.)

That f16c is implicitly tied to avx seems beyond doubt, since the
instructions (VCVTPH2PS, VCVTPS2PH) only exist with VEX encoding. You
cannot issue VEX-encoded instructions without AVX (VEX-encoding _is_
AVX, regardless of whether you use the 128bit or 256bit variants).

If you don't like that pentiums don't support those, complain to intel
(as it's just disabled, of course).
IMHO it's a bit silly nowadays...
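
For what it's worth, a runtime check along those lines could look roughly like the sketch below. This is only an illustration, assuming GCC or Clang on x86; the helper names (read_xcr0, cpu_has_usable_f16c) are made up here and are not existing mesa functions. It tests the F16C bit together with the AVX and OSXSAVE bits from CPUID leaf 1, then confirms via XCR0 that the OS actually saves the YMM state.

```c
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>

/* Read XCR0 to see which register state the OS saves and restores.
 * Only call this after CPUID has reported OSXSAVE. */
static uint64_t read_xcr0(void)
{
    uint32_t eax, edx;
    __asm__ volatile("xgetbv" : "=a"(eax), "=d"(edx) : "c"(0));
    return ((uint64_t)edx << 32) | eax;
}
#endif

/* Hypothetical helper: returns 1 if VCVTPS2PH/VCVTPH2PS can be used,
 * 0 otherwise (including on non-x86 targets). */
static int cpu_has_usable_f16c(void)
{
#if defined(__x86_64__) || defined(__i386__)
    const uint32_t ECX_OSXSAVE = 1u << 27; /* CPUID.1:ECX bit 27 */
    const uint32_t ECX_AVX     = 1u << 28; /* CPUID.1:ECX bit 28 */
    const uint32_t ECX_F16C    = 1u << 29; /* CPUID.1:ECX bit 29 */
    unsigned int eax, ebx, ecx, edx;

    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;

    /* F16C is VEX-encoded, so require AVX and OS XSAVE support as well. */
    if ((ecx & ECX_OSXSAVE) == 0 || (ecx & ECX_AVX) == 0 ||
        (ecx & ECX_F16C) == 0)
        return 0;

    /* XCR0 bits 1 (SSE state) and 2 (AVX state) must both be enabled. */
    return (read_xcr0() & 0x6) == 0x6;
#else
    return 0;
#endif
}
```

The software conversion path would still be needed as a fallback whenever this returns 0, which is the case on the pentiums/celerons and atoms mentioned above.
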
>
> If we've got to have the code, we might as well use it and not
> complicate it by using _cvtss_sh() then. Dang.
>
> (Unfortunately there seems to be bad information out there confusing
> things though... see
> https://communities.intel.com/thread/121635)

Quite sure this is blatantly false. Seems even intel is confused about
it :-).

Roland

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev