https://github.com/JuliaLang/julia/pull/11087
On April 30, 2015 at 5:56:02 PM, Stefan Karpinski wrote:

Yeah, this seems like a reasonable change. If you want to make a PR, this shouldn't be too hard. Change the relevant definitions, run `make testall` and see what breaks, fix it, repeat. It will potentially cause some breakage in packages, but this is a good time for that and it shouldn't be too bad.

On Thu, Apr 30, 2015 at 2:39 PM, Sebastian Good wrote:

And I guess as a matter of practicality, a vectorized leading_zeros instruction should leave its results in registers of the same size as it started with, or it would only be possible on Int64s, though I don’t know if LLVM is doing that just yet.

On April 30, 2015 at 2:36:53 PM, Sebastian Good wrote:

Existing compiler intrinsics work this way (__lzcnt, __lzcnt64, __lzcnt16). It came up for me in the following line of code in StreamingStats:

    ρ(s::Uint32) = uint32(uint32(leading_zeros(s)) + 0x00000001)

The outer uint32 is no longer necessary in v0.4 because the addition no longer expands 32-bit operands to a 64-bit result. The inner one is still necessary because leading_zeros still returns a 64-bit result. I imagine there are many little functions like this that should probably act the same way.

I ran into it in my own code for converting IBM/370 floating points to IEEE:

    local norml::UInt32 = leading_zeros(fr)
    fr <<= norml
    ex = (ex << 2) - 130 - norml

Here I had to convert norml to a UInt32 to preserve type stability in the bit-shifting operations that follow, where I’m working with 32-bit numbers. Leaving this convert out causes the usual massive slowdown when converting tens of millions of numbers.

Arguments I can make for making them have the same type (recognizing this is quite subjective!):

- If you’re doing something with leading_zeros, you’re aware you’re working directly in an integer register in binary code; you’re trying to do something clever and you’ll want type stability
- No binary number could have more leading zeros than it itself could represent
- The existing intrinsics are written this way
- Because I ran into it twice and wished it were that way both times! :-D

On April 30, 2015 at 2:16:26 PM, Stefan Karpinski wrote:

I'm not sure why the result of leading_zeros should be of the same type as the argument. What's the use case?
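
For readers following the thread, the use case Sebastian describes can be sketched roughly as below. This is only an illustration, written in current Julia syntax rather than the v0.4 syntax used in the thread. The names `lz_same_type` and `normalize_fraction` are hypothetical and do not come from Base, StreamingStats, or the PR. The sketch assumes only that `leading_zeros` returns some integer type, and converts the count back to the argument's type so that later shifts and arithmetic stay in that type.

```julia
# Hypothetical helper (not part of Base): count leading zeros and return the
# count in the same unsigned type as the argument. The count is at most the
# bit width of T, which always fits in T, so the `% T` conversion is lossless.
lz_same_type(x::T) where {T<:Unsigned} = leading_zeros(x) % T

# Hypothetical example in the spirit of the IBM/370 snippet in the thread:
# shift a 32-bit fraction left until its top bit is set, keeping every
# intermediate value a UInt32.
function normalize_fraction(fr::UInt32)
    norml = lz_same_type(fr)      # UInt32, whatever leading_zeros itself returns
    return fr << norml, norml     # both elements are UInt32
end
```

With these definitions, `normalize_fraction(0x00000001)` yields a tuple of two `UInt32` values, so code that goes on to subtract `norml` from a 32-bit exponent never widens to 64 bits.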
