Hi,

Here are a few patches that should improve things in lib/int_sqrt. As stated elsewhere, I'm looking at using int_sqrt() to calculate the stdev of a normal distribution and am expecting the input values to be smallish.
In any case, these optimizations should work fine for large numbers too. And if you have a find-last-set or count-leading-zeros instruction they rock ;-)

I can post the tool used to generate the numbers, or do a patch to add it to tools/testing/, if people care.

The cold numbers are fairly sensitive to code layout (GCC version, random changes, etc.), so I expect the branch predictor of my SKL is only partially confused, or there are other things at play. However, the general trend in the numbers seems fairly stable.
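To illustrate why a find-last-set or count-leading-zeros instruction helps, here is a minimal userspace sketch of a bit-by-bit integer square root that picks its starting bit from the position of the highest set bit. It is an illustration only and not necessarily what the patches do: sketch_fls() and sketch_int_sqrt() are hypothetical names, and __builtin_clzl() assumes GCC or Clang.

```c
/*
 * Hypothetical stand-in for a find-last-set helper: index of the highest
 * set bit in x. x must be non-zero. Assumes the GCC/Clang builtin
 * __builtin_clzl().
 */
static inline unsigned int sketch_fls(unsigned long x)
{
	return (unsigned int)(8 * sizeof(x) - 1) - (unsigned int)__builtin_clzl(x);
}

/* Sketch only: returns floor(sqrt(x)). */
unsigned long sketch_int_sqrt(unsigned long x)
{
	unsigned long b, m, y = 0;

	if (x <= 1)
		return x;

	/*
	 * Start at the largest power of four that is <= x by rounding the
	 * index of the top set bit down to an even number. Small inputs
	 * then need only a few loop iterations.
	 */
	m = 1UL << (sketch_fls(x) & ~1U);

	while (m != 0) {
		b = y + m;
		y >>= 1;

		if (x >= b) {
			x -= b;
			y += m;
		}
		m >>= 2;
	}

	return y;
}
```

Without the find-last-set step, m would have to start at the highest power of four that fits in an unsigned long, so small inputs would spend most of their iterations on bits that cannot be set.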

