On Wed, Jul 09, 2014 at 05:43:15PM +0000, Dominikus Dittes Scherkl via Digitalmars-d-learn wrote: > On Wednesday, 9 July 2014 at 17:13:21 UTC, H. S. Teoh via > Digitalmars-d-learn wrote: [..] > >Yeah, I don't see what's the problem with comparing signed and unsigned > >values, as long as the result is as expected. Currently, however, this > >code asserts, which is wrong: > > > > uint x = uint.max; > > int y = -1; > > assert(x > y); > > Yes, this is really bad. > But last time I got the response that this is so to be compatible with > C. That is what I really thought was the reason why D throw away > balast from C, to fix bugs.
I think the slogan was that if something in D looks like C, then it should either have C semantics or not compile. According to this logic, the only recourse here is to prohibit comparison of signed with unsigned, which I don't think is workable because there are many valid use cases for it (plus, it will break a ton of code and people will be very unhappy). I don't like the current behaviour, though. It just reeks of wrong in so many different ways. If you *really* want semantives like the above, you really should be casting y to unsigned so that it's clear what exactly you're trying to achieve. [...] > >Hmm. I wonder if there's a more efficient way to do this. > I'm sure. But I think it should be done at the compiler, not in a > library. Obviously, yes. But I wasn't thinking about implementing opCmp in the library -- that would be strange since ints, of all things, need to have native compiler support. I was thinking more about how the compiler would implement "safe" signed/unsigned comparisons. [...] > >So I submit that the unbranched version is better. ;-) > I don't think so, because the branch will only be taken if the signed > type is >= 0 (in fact unsigned). So if the signed/unsigned comparison > is by accident, you pay the extra runtime. But if it is intentional > the signed value is likely to be negative, so you get a correct result > with no extra cost. Good point. Moreover, I have discovered multiple bugs in my proposed implementation; the correct implementation should be as follows: int compare(int x, uint y) { enum signbitMask = 1u << (int.sizeof*8 - 1); static assert(signbitMask == 0x80000000); // The (x|y) & signbitMask basically means that if either x is negative // or y > int.max, then the result will always be negative (sign bit // set). return (cast(uint)x - y) | ((x | y) & signbitMask); } unittest { // Basic cases assert(compare(5, 10u) < 0); assert(compare(5, 5u) == 0); assert(compare(10, 5u) > 0); // Large cases assert(compare(0, uint.max) < 0); assert(compare(50, uint.max) < 0); // Sign-dependent cases assert(compare(-1, 0u) < 0); assert(compare(-1, 10u) < 0); assert(compare(-1, uint.max) < 0); } int compare(uint x, int y) { enum signbitMask = 1u << (int.sizeof*8 - 1); static assert(signbitMask == 0x80000000); return ((x - y) & ~(x & signbitMask)) | ((cast(uint)y & signbitMask) >> 1); } unittest { // Basic cases assert(compare(0u, 10) < 0); assert(compare(10u, 10) == 0); assert(compare(10u, 5) > 0); // Large cases assert(compare(uint.max, 10) > 0); assert(compare(uint.max, -10) > 0); // Sign-dependent cases assert(compare(0u, -1) > 0); assert(compare(10u, -1) > 0); assert(compare(uint.max, -1) > 0); } Using gdc -O3, I managed to get a very good result for compare(int,uint), only 5 instructions long. However, for compare(uint,int), there is the annoying special case of compare(uint.max, -1), which can only be fixed by the hack ... | ((y & signbitMask) >> 1). Unfortunately, this makes it 11 instructions long, which is unacceptable. So it looks like a simple compare and branch would be far better in the compare(uint,int) case -- it's far more costly to avoid the branch than to live with it. > Even better for constants, where the compiler can not only evaluate > expressions like (uint.max > -1) correct, but it should optimize them > completely away! Actually, with gdc -O3, I found that the body of the above unittests got completely optimized away at compile-time, so that the unittest body is empty in the executable! So even with a library implementation the compiler was able to maximize performance. DMD left the assert calls in, but then it's not exactly known for generating optimal code anyway. > >(So much for premature optimization... now lemme go and actually > >benchmark this stuff and see how well it actually performs in > >practice. > Yes, we should do this. > > >Often, such kinds of hacks often perform more poorly than expected > >due to unforeseen complications with today's complex CPU's. So for > >all I know, I could've just been spouting nonsense above. :P) > I don't see such a compiler change as a hack. It is a strong > improvement IMHO. I was talking about using | and & to get rid of the branch in signed/unsigned comparison. As it turns out, the compare(uint,int) case seems far more costly than a simple compare-and-branch as you had it at the beginning. So at least that part of what I wrote is probably nonsense. :P But I can't say for sure until I actually run some benchmarks on it. T -- "The number you have dialed is imaginary. Please rotate your phone 90 degrees and try again."