Ken has made what I consider a very reasonable suggestion, to introduce SI prefixes to Python syntax for numbers. For example, typing 1K will be equivalent to 1000.
However, there are some complexities that have been glossed over.

(1) Are the results floats, ints, or something else?

I would expect that 1K would be int 1000, not float 1000. But what about fractional prefixes, like 1m? Should that be a float or a decimal?

If I write 7981m I would expect 7.981, not 7.9809999999999999, so maybe I want a decimal float, not a binary float?

Actually, what I would really want is for the scale factor to be tracked separately. If I write 7981m * 1M, I should end up with 7981000 as an int, not a float. Am I being unreasonable?

Obviously if I write 1.1K then I'm expecting a float. So I'm not *entirely* unreasonable :-)

(2) Decimal or binary scale factors?

The SI units are all decimal, and I think if we support these, we should insist that K == 1000, not 1024. For binary scale factors, there is the IEC standard:

http://physics.nist.gov/cuu/Units/binary.html

which defines Ki = 2**10, Mi = 2**20, etc. (Fortunately this doesn't have to deal with fractional prefixes.) So it would be easy enough to support them as well.

(3) µ or u, k or K?

I'm going to go to the barricades to fight for the real SI prefixes µ and k to be supported. If people want to support the common fakes u and K as well, that's fine, I have no objection, but I think that it's important to support the actual prefixes too.

(Python 3 assumes UTF-8 as the default encoding, so it shouldn't cause any technical difficulties to support µ as syntax. The political difficulties though...)

(4) What about E?

E is tricky if we want 1E to be read as the integer 10**18, because it matches the floating point syntax 1E (which is currently a syntax error). So there's a nasty bit of ambiguity where it may be unclear whether 1E is intended as an int or an incomplete float, and then there's 1E1E which might be read as 1E1*10**18 or as just an error.

Replacing E with (say) X is risky.
The two largest current SI prefixes are Z and Y; it seems very likely that the next one added (if that ever happens) will be X. Actually, using any other letter risks clashing with a future expansion of the SI prefixes.

(5) What about other numeric types?

Just because there's no syntactic support for Fraction and Decimal shouldn't mean we can't use these scale factors with them.

(6) What happens to int(), float() etc?

I wouldn't want int("23K") to suddenly change from being an error to returning 23000. Presumably we would want int to take an optional argument to allow the interpretation of scale factors.

This gives us an advantage: int("23E", scale=True) is unambiguously an int, and we can ignore the fact that it looks like a float.

(7) What about repr() and str()?

I don't think that the repr() or str() of numeric types should change. But perhaps format() could grow some new codes to display numbers using either the most obvious scale factor, or some specific scale factor.

* * *

This leads to my first proposal: require an explicit numeric prefix on numbers before scale factors are allowed, similar to how we treat non-decimal bases.

    8M  # remains a syntax error
    0s8M  # unambiguously an int with a scale factor of M = 10**6
    0s1E1E  # a float 1E1 with a scale factor of E = 10**18
    0s1.E  # a float 1. with a scale factor of E, not an exponent

    int('8M')  # remains a ValueError
    int('0s8M', base=0)  # returns 8*10**6

Or if that's too heavy (two whole characters, plus the suffix!) perhaps we could have a rule that the suffix must follow the final underscore of the number:

    8_M  # int 8*10**6
    123_456_789_M  # int 123456789*10**6
    123_M_456  # still an error
    8._M  # float 8.0*10**6

int() and float() take a keyword-only argument to allow a scale factor when converting from strings:

    int("8_M")  # remains an error
    int("8_M", scale=True)  # allowed

This solves the problem with E and floats.
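To make the underscore rule concrete, here is a rough sketch of how a string conversion with a scale option might behave. The helper name parse_scaled, the SI_FACTORS table and the exact behaviour are assumptions made for illustration only; nothing like this exists in int() or float() today. It also assumes Python 3.6 or later, where int() and float() accept underscores inside numeric strings.

```python
# Hypothetical helper, for illustration only; not an existing or agreed API.
SI_FACTORS = {
    "k": 10**3, "K": 10**3,  # K is the common "fake" spelling of k
    "M": 10**6, "G": 10**9, "T": 10**12,
    "P": 10**15, "E": 10**18,
}


def parse_scaled(text, scale=False):
    """Convert text to an int or float, optionally honouring a trailing _<prefix>."""
    # Only the part after the *final* underscore can be a scale factor.
    head, sep, suffix = text.rpartition("_")
    if scale and sep and suffix in SI_FACTORS:
        factor = SI_FACTORS[suffix]
    else:
        # No recognised suffix: parse the whole string as an ordinary number.
        head, factor = text, 1
    try:
        return int(head) * factor
    except ValueError:
        # Not an int; float() raises ValueError itself if this fails too.
        return float(head) * factor


print(parse_scaled("8_M", scale=True))    # 8000000
print(parse_scaled("8._M", scale=True))   # 8000000.0
print(parse_scaled("1E1_E", scale=True))  # 1e+19
```

In this sketch "8_M" without scale=True and "123_M_456" both raise ValueError, and in "1E1_E" the first E is an exponent while the last one is a scale factor.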
The E is only a scale factor if it immediately follows the final underscore in the float; otherwise it is the regular exponent sign.

Proposal number two: don't make any changes to the syntax, but treat these as *literally* numeric scale factors. Add a simple module to the std lib defining the various factors:

    k = kilo = 10**3
    M = mega = 10**6
    G = giga = 10**9

etc. and then allow the user to literally treat them as scale factors by multiplying:

    from decimal import Decimal
    from fractions import Fraction
    from scaling import *

    int_value = 8*M
    float_value = 8.0*M
    fraction_value = Fraction(1, 8)*M
    decimal_value = Decimal("1.2345")*M

and so forth. The biggest advantage of this is that there are no syntactic changes needed, it is completely backwards compatible, and it works with any numeric type and even non-numbers:

    py> x = [None]*M
    py> len(x)
    1000000

You can even scale by multiple factors:

    x = 8*M*k

Disadvantages: none I can think of.

(Some cleverness may be needed to have fractional scale values work with both floats and Decimals, but that shouldn't be hard.)

--
Steve
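As a rough illustration of the cleverness mentioned for fractional scale values, the factors could be small objects instead of plain ints, so that each one adapts to the type of the other operand. The ScaleFactor class and the names defined below are hypothetical: a sketch under those assumptions, not a proposed implementation of the scaling module.

```python
from decimal import Decimal
from fractions import Fraction


class ScaleFactor:
    """A power-of-ten factor that adapts to the other operand (hypothetical)."""

    def __init__(self, exponent):
        self.exponent = exponent

    def __rmul__(self, value):
        e = self.exponent
        if isinstance(value, Decimal):
            # scaleb() shifts the decimal exponent without binary rounding.
            return value.scaleb(e)
        if e >= 0:
            # A non-negative power of ten is a plain int, so ints stay ints
            # and sequences can still be repeated.
            return value * 10**e
        if isinstance(value, float):
            # Dividing by an exact int gives the correctly rounded float.
            return value / 10**-e
        # ints and Fractions stay exact by becoming Fractions.
        return value * Fraction(1, 10**-e)

    __mul__ = __rmul__


k = kilo = ScaleFactor(3)
M = mega = ScaleFactor(6)
G = giga = ScaleFactor(9)
m = milli = ScaleFactor(-3)

print(8 * M)                # 8000000
print(7981.0 * m)           # 7.981
print(Decimal("7981") * m)  # 7.981
print(len([None] * k))      # 1000
```

With this sketch 7981*m*M comes out as Fraction(7981000, 1): exact, but still not quite the plain int wished for in point (1), so tracking the scale separately would need more than this.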