On Wed, Jul 09, 2014 at 04:24:38PM +0000, Dominikus Dittes Scherkl via Digitalmars-d-learn wrote: > On Wednesday, 9 July 2014 at 14:51:41 UTC, Meta wrote: > >One of the uglier things in D is also a long-standing problem with C > >and C++, in that comparison of signed and unsigned values is allowed. > > I would like that, if it would be implemented along this line: > > /// Returns -1 if a < b, 0 if they are equal or 1 if a > b. > /// this will always yield a correct result, no matter which numeric types > are compared. > /// It uses one extra comparison operation if and only if > /// one type is signed and the other unsigned but the signed value is >= 0 > /// (that is what you need to pay for stupid choice of type). [...]
Yeah, I don't see what's the problem with comparing signed and unsigned values, as long as the result is as expected. Currently, however, this code asserts, which is wrong: uint x = uint.max; int y = -1; assert(x > y); > { > static if(Unqual!T == Unqual!U) Nitpick: should be: static if(is(Unqual!T == Unqual!U)) [...] > else static if(isSigned!T && isUnsigned!U) > { > alias CommonType!(Unsigned!T, U) C; > return (a < 0) ? -1 : opCmp!(cast(C)a, cast(C)b); > } > else static if(isUnsigned!T && isSigned!U) > { > alias CommonType!(T, Unsigned!U) C; > return (b < 0) ? 1 : opCmp!(cast(C)a, cast(C)b); > } [...] Hmm. I wonder if there's a more efficient way to do this. For the comparison s < u, where s is a signed value and u is an unsigned value, whenever s is negative, the return value of opCmp must be negative. Assuming 2's-complement representation of integers, this means we simply copy the MSB of s (i.e., the sign bit) to the result. So we can implement s < u as: enum signbitMask = 1u << (s.sizeof*8 - 1); // this is a compile-time constant return (s - u) | (s & signbitMask); // look ma, no branches! which would translate (roughly) to the assembly code: mov eax, [<address of s>] mov ebx, [<address of u>] mov ecx, eax ; save the value of s for signbit extraction sub eax, ebx ; s - u and ecx, #$10000000 ; s & signbitMask or eax, ecx ; (s - u) | (s & signbitMask) (ret ; this is deleted if opCmp is inlined) which avoids a branch hazard in the CPU pipeline. Similarly, for the comparison u < s, whenever s is negative, then opCmp must always be positive. So this means we copy over the *negation* of the sign bit of s to the result. So we get this for u < s: enum signbitMask = 1u << (s.sizeof*8 - 1); // as before return (u - s) & ~(s & signbitMask); // look ma, no branches! which translates roughly to: mov eax, [<address of u>] mov ebx, [<address of s>] sub eax, ebx ; u - s and ebx, #$10000000 ; s & signbitMask not ebx ; ~(s & signbitMask) and eax, ebx ; (u - s) & ~(s & signbitMask) (ret ; this is deleted if opCmp is inlined) Again, this avoid a branch hazard in the CPU pipeline. In both cases, the first 2 instructions are unnecessary if the values to be compared are already in CPU registers. The naïve implementation of opCmp is just a single sub instruction (this is why opCmp is defined the way it is, BTW), whereas the "smart" signed/unsigned comparison is 4 instructions long. The branched version would look something like this: mov eax, [<address of u>] mov ebx, [<address of s>] cmp ebx, $#0 jge label1 ; first branch mov eax, $#FFFFFFFF jmp label2 ; 2nd branch label1: sub eax, ebx label2: (ret) The 2nd branch can be replaced with ret if opCmp is not inlined, but requiring a function call to compare integers seems excessive, so let's assume it's inlined, in which case the 2nd branch is necessary. So as you can see, the branched version is 5 instructions long, and always causes a CPU pipeline hazard. So I submit that the unbranched version is better. ;-) (So much for premature optimization... now lemme go and actually benchmark this stuff and see how well it actually performs in practice. Often, such kinds of hacks often perform more poorly than expected due to unforeseen complications with today's complex CPU's. So for all I know, I could've just been spouting nonsense above. :P) T -- Debian GNU/Linux: Cray on your desktop.