I was referring mainly to the BMI instructions. Some, like bextr, should be recognizable from a pattern, something like ( x>>>shift ) & ((0x1<<len) -1) for bextr(x,shift,len). The GCC intrinsic actually packs shift and length together into a single unsigned int control operand. As for pext and pdep, they are full-fledged functions that could take dozens of instructions to replicate in software.
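To make the comparison concrete, here is a minimal portable C sketch of what those instructions compute, without using any BMI intrinsics. The names bextr_soft, bextr_control and pext_soft are hypothetical helpers made up for illustration. The control layout (start in bits 7:0, length in bits 15:8) follows the BEXTR instruction definition. Whether a given compiler actually folds the first pattern into a single bextr is something I have not verified.

```c
#include <stdint.h>

/* Extract `len` bits of x starting at bit `shift` (the bextr pattern).
   The guards avoid undefined behaviour for shift counts >= 32 in C. */
static uint32_t bextr_soft(uint32_t x, unsigned shift, unsigned len)
{
    if (shift >= 32 || len == 0) return 0;
    x >>= shift;
    if (len >= 32) return x;
    return x & ((UINT32_C(1) << len) - 1);
}

/* Pack shift and length into one control word the way BEXTR reads it:
   start position in bits 7:0, field length in bits 15:8. */
static uint32_t bextr_control(unsigned shift, unsigned len)
{
    return (shift & 0xFFu) | ((len & 0xFFu) << 8);
}

/* Software stand-in for pext: gather the bits of x selected by mask
   into the low bits of the result. One loop iteration per set mask bit. */
static uint32_t pext_soft(uint32_t x, uint32_t mask)
{
    uint32_t result = 0;
    for (uint32_t out = 1; mask != 0; out <<= 1) {
        uint32_t lowest = mask & (0u - mask);  /* isolate lowest set bit */
        if (x & lowest) result |= out;
        mask &= mask - 1;                      /* clear that bit */
    }
    return result;
}
```

The pext loop is why a pattern match is unrealistic there: it is a data-dependent loop of up to 32 iterations, not a fixed expression like the bextr one.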
All of that leads to a question: does Julia compile under the equivalent of gcc's -march=native by default, or does that need to be set somehow?
