Re: Treating the abusive unsigned syndrome

Andrei Alexandrescu Thu, 27 Nov 2008 08:50:14 -0800

Don wrote:

Andrei Alexandrescu wrote:
Don wrote:
Andrei Alexandrescu wrote:
One fear of mine is the reaction of throwing of hands in the air "how many integral types are enough???". However, if we're to judge by the addition of long long and a slew of typedefs to C99 and C++0x, the answer is "plenty". I'd be interested in gaging how people feel about adding two (bits64, bits32) or even four (bits64, bits32, bits16, and bits8) types as basic types. They'd be bitbags with undecided sign ready to be converted to their counterparts of decided sign.
Here I think we have a fundamental disagreement: what is an 'unsigned int'? There are two disparate ideas:
(A) You think that it is an approximation to a natural number, ie, a 'positive int'.
(B) I think that it is a 'number with NO sign'; that is, the sign depends on context. It may, for example, be part of a larger number. Thus, I largely agree with the C behaviour -- once you have an unsigned in a calculation, it's up to the programmer to provide an interpretation.
Unfortunately, the two concepts are mashed together in C-family languages. (B) is the concept supported by the language typing rules, but usage of (A) is widespread in practice.
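As an illustration of the typing rules mentioned above, here is a minimal C++ sketch. C++ is used only because it shares C's conversion rules, and the helper name `mixed_less` is made up for this example. When an int meets an unsigned int in one expression, the int is converted to unsigned first.

```cpp
#include <type_traits>

// The usual arithmetic conversions: int + unsigned int has type unsigned int.
static_assert(std::is_same<decltype(1 + 1u), unsigned int>::value,
              "int + unsigned yields unsigned");

// Hypothetical helper: compares a signed value against an unsigned one the
// way the language does it, i.e. after implicitly converting `i` to unsigned.
inline bool mixed_less(int i, unsigned int u) {
    return i < u;  // many compilers warn about this signed/unsigned comparison
}
```

With these rules, `mixed_less(-1, 1u)` is false, because -1 is reinterpreted as the largest unsigned value before the comparison takes place.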
In fact we are in agreement. C tries to make it usable as both, and partially succeeds by having very lax conversions in all directions. This leads to the occasional puzzling behaviors. I do *want* uint to be an approximation of a natural number, while acknowledging that today it isn't much of that.
If we were going to introduce a slew of new types, I'd want them to be for 'positive int'/'natural int', 'positive byte', etc.
Natural int can always be implicitly converted to either int or uint, with perfect safety. No other conversions are possible without a cast.
Non-negative literals and manifest constants are naturals.

The rules are:
1. Anything involving unsigned is unsigned, (same as C).
2. Else if it contains an integer, it is an integer.
3. (Now we know all quantities are natural):
If it contains a subtraction, it is an integer [Probably allow subtraction of compile-time quantities to remain natural, if the values stay in range; flag an error if an overflow occurs].
4. Else it is a natural.
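The four rules above can be sketched as a small classification function. This is only one reading of the proposal, not an implementation of any compiler: the names `Kind` and `result_kind` are hypothetical, and the bracketed exception for compile-time subtraction in rule 3 is left out.

```cpp
#include <vector>

// Hypothetical classification of an operand or result under the proposal.
enum class Kind { Natural, Int, Unsigned };

// Returns the kind of an expression given the kinds of its operands and
// whether the expression contains a subtraction.
inline Kind result_kind(const std::vector<Kind>& operands, bool has_subtraction) {
    bool has_int = false;
    for (Kind k : operands) {
        if (k == Kind::Unsigned) return Kind::Unsigned;  // rule 1
        if (k == Kind::Int) has_int = true;
    }
    if (has_int) return Kind::Int;          // rule 2
    if (has_subtraction) return Kind::Int;  // rule 3: all operands are natural
    return Kind::Natural;                   // rule 4
}
```

For example, `result_kind({Kind::Natural, Kind::Natural}, true)` gives `Kind::Int`, while the same operands without a subtraction stay `Kind::Natural`.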
The reason I think literals and manifest constants are so important is that they are a significant fraction of the natural numbers in a program.
[Just before posting I've discovered that other people have posted some similar ideas].
That sounds encouraging. One problem is that your approach leaves the unsigned mess as it is, so although natural types are a nice addition, they don't bring a complete solution to the table.
Andrei
Well, it does make unsigned numbers (case (B)) quite obscure and low-level. They could be renamed with uglier names to make this clearer. But since in this proposal there are no implicit conversions from uint to anything, it's hard to do any damage with the unsigned type which results. Basically, with any use of unsigned, the compiler says "I don't know if this thing even has a meaningful sign!".
Alternatively, we could add rule 0: mixing int and unsigned is illegal. But it's OK to mix natural with int, or natural with unsigned. I don't like this as much, since it would make most usage of unsigned ugly; but maybe that's justified.

I think we're heading towards an impasse. We wouldn't want to make things much harder for systems-level programs that mix arithmetic and bit-level operations.

I'm glad there is interest and that quite a few ideas were brought up. Unfortunately, it looks like all have significant disadvantages.

One compromise solution Walter and I discussed in the past is to only sever one of the dangerous implicit conversions: int -> uint. Other than that, it's much like C (everything involving one unsigned is unsigned and unsigned -> signed is implicit). Let's see where that takes us.

(a) There are fewer situations when a small, reasonable number implicitly becomes a large, weird number.

(b) An exception to (a) is that u1 - u2 is also uint, and that's for the sake of C compatibility. I'd gladly drop it if I could and leave operations such as u1 - u2 return a signed number. That assumes the least and works with small, usual values.
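To make point (b) concrete, here is a small C++ sketch of the two behaviors. The function names are hypothetical, and widening to a 64-bit signed type is just one assumed way of getting a signed difference; the post does not say which signed type the result would have.

```cpp
#include <cstdint>

// C-compatible behavior: the difference of two unsigned values is unsigned
// and wraps around modulo 2^32 when u2 > u1.
inline std::uint32_t diff_unsigned(std::uint32_t u1, std::uint32_t u2) {
    return u1 - u2;
}

// Assumed alternative: widen both operands first so the difference is signed.
inline std::int64_t diff_signed(std::uint32_t u1, std::uint32_t u2) {
    return static_cast<std::int64_t>(u1) - static_cast<std::int64_t>(u2);
}
```

With u1 = 2 and u2 = 3, `diff_unsigned` returns 4294967295, the "large, weird number" of point (a), while `diff_signed` returns -1.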

(c) Unlike C, arithmetic and logical operations always return the tightest type possible, not a 32/64 bit value. For example, byte / int yields byte and so on.
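For contrast with point (c), this C++ sketch shows what C-family languages do today: operands narrower than int are promoted before arithmetic, so the result is never the narrow type. Using `signed char` as a stand-in for D's byte is an assumption, and the alias names are made up for the example.

```cpp
#include <type_traits>

// signed char / int: the narrow operand is promoted, the result is int.
using ByteDivInt = decltype(static_cast<signed char>(100) / 7);
static_assert(std::is_same<ByteDivInt, int>::value,
              "byte / int is int under C rules");

// Even signed char / signed char is computed, and typed, as int.
using ByteDivByte =
    decltype(static_cast<signed char>(100) / static_cast<signed char>(7));
static_assert(std::is_same<ByteDivByte, int>::value,
              "byte / byte is int under C rules");
```

Under the rule sketched in (c), the first expression would instead have the narrow type, since the quotient cannot be larger in magnitude than the dividend except in the single case of dividing the minimum value by -1.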


What do you think?


Andrei
