Re: Rough draft: Proposed format specifier for a thousands separator
Raymond Hettinger pyt...@rcn.com wrote in news:e35271b9-7623-4845-bcb9-d8c33971f...@w24g2000prd.googlegroups.com:

> If anyone here is interested, here is a proposal I posted on the python-ideas list. The idea is to make numbering formatting a little easier with the new format() builtin in Py2.6 and Py3.0: http://docs.python.org/library/string.html#formatspec
[...]
> Comments and suggestions are welcome but I draw the line at supporting Mayan numbering conventions ;-)

Is that inclusive or exclusive?

-- rzed
-- http://mail.python.org/mailman/listinfo/python-list
Re: Rough draft: Proposed format specifier for a thousands separator
On Mon, 16 Mar 2009 02:36:43 -, MRAB goo...@mrabarnett.plus.com wrote:

> The field name can be an integer or an identifier, so the locale could be too, provided that you know where to look it up!
>
>     financial = Locale(group_sep=",", grouping=[3])
>     print("my number is {0:10n:{fin}}".format(1234567, fin=financial))
>
> Then again, shouldn't that be:
>
>     fin = Locale(group_sep=",", grouping=[3])
>     print("my number is {0:{fin}}".format(1234567, fin=financial))

Except that loses you the format, since the locale itself is a collection of parameters the format uses. The locale knows how to do groupings, but not whether to do them, nor what the field width should be. Come to think of it, it doesn't know whether to use the LC_NUMERIC grouping information or the LC_MONETARY grouping information. Hmm.

I can't believe I'm even suggesting this, but how about:

    print("my number is {fin.format('10d', {0}, True)}".format(1235467, fin=financial))

assuming the locale.format() method remains unchanged? That's horrible, and I'm pretty sure it can't be right, but I'm too tired to think of anything more sensible right now.

-- Rhodri James *-* Wildebeeste Herder to the Masses
Re: Rough draft: Proposed format specifier for a thousands separator
Rhodri James wrote: On Mon, 16 Mar 2009 02:36:43 -, MRAB goo...@mrabarnett.plus.com wrote: The field name can be an integer or an identifier, so the locale could be too, provided that you know where to look it up! financial = Locale(group_sep=,, grouping=[3]) print(my number is {0:10n:{fin}}.format(1234567, fin=financial)) Then again, shouldn't that be: fin = Locale(group_sep=,, grouping=[3]) print(my number is {0:{fin}}.format(1234567, fin=financial)) Except that loses you the format, since the locale itself is a collection of parameters the format uses. The locale knows how to do groupings, but not whether to do them, nor what the field width should be. Come to think of it, it doesn't know whether to use the LC_NUMERIC grouping information or the LC_MONETARY grouping information. Hmm. I can't believe I'm even suggesting this, but how about: print(my number is {fin.format(10d, {0}, True)}.format(1235467, fin=financial)) assuming the locale.format() method remains unchanged? That's horrible, and I'm pretty sure it can't be right, but I'm too tired to think of anything more sensible right now. It should probably(?) be: financial = Locale(group_sep=,, grouping=[3]) print(my number is {0:10n:fin}.format(1234567, fin=financial)) The format 10n says whether to use separators or a decimal point; the locale fin says what the separator and the decimal point look like. -- http://mail.python.org/mailman/listinfo/python-list
Re: Rough draft: Proposed format specifier for a thousands separator
On Mon, 16 Mar 2009 23:04:58 -, MRAB goo...@mrabarnett.plus.com wrote: It should probably(?) be: financial = Locale(group_sep=,, grouping=[3]) print(my number is {0:10n:fin}.format(1234567, fin=financial)) The format 10n says whether to use separators or a decimal point; the locale fin says what the separator and the decimal point look like. That works, and isn't an abomination on the face of the existing syntax. Excellent. I'm rather presuming that the n presentation type does grouping. I've only got Python 2.5 here, so I can't check it out (no str.format() method and %n isn't supported by % formatting). If it does, an m type to do the same thing only with the LC_MONETARY group settings instead of the LC_NUMERIC ones would be a good idea. This would be my preferred solution to Raymond's original comma-in-the-format-string proposal, by the way: add an m presentation type as above, and tell people to override the LC_MONETARY group settings in the global locale. It's clear that it's a bodge, and weaning users onto local locales (!) wouldn't be so hard later on. Anyway, time I stopped hypothesising about locales and started looking at the actual code-base, methinks. -- Rhodri James *-* Wildebeeste Herder to the Masses -- http://mail.python.org/mailman/listinfo/python-list
Re: Rough draft: Proposed format specifier for a thousands separator
Rhodri James wrote: On Mon, 16 Mar 2009 23:04:58 -, MRAB goo...@mrabarnett.plus.com wrote: It should probably(?) be: financial = Locale(group_sep=,, grouping=[3]) print(my number is {0:10n:fin}.format(1234567, fin=financial)) The format 10n says whether to use separators or a decimal point; the locale fin says what the separator and the decimal point look like. That works, and isn't an abomination on the face of the existing syntax. Excellent. I'm rather presuming that the n presentation type does grouping. I've only got Python 2.5 here, so I can't check it out (no str.format() method and %n isn't supported by % formatting). If it does, an m type to do the same thing only with the LC_MONETARY group settings instead of the LC_NUMERIC ones would be a good idea. This would be my preferred solution to Raymond's original comma-in-the-format-string proposal, by the way: add an m presentation type as above, and tell people to override the LC_MONETARY group settings in the global locale. It's clear that it's a bodge, and weaning users onto local locales (!) wouldn't be so hard later on. Anyway, time I stopped hypothesising about locales and started looking at the actual code-base, methinks. I'm not against putting a comma in the format to indicate that grouping should be used just as a dot indicates that a decimal point should be used. The locale would say what characters would be used for them. I would prefer the format to have a fixed default so that if you don't specify the locale the result is predictable. -- http://mail.python.org/mailman/listinfo/python-list
Re: Rough draft: Proposed format specifier for a thousands separator
On Tue, 17 Mar 2009 01:47:32 -, MRAB goo...@mrabarnett.plus.com wrote: I'm not against putting a comma in the format to indicate that grouping should be used just as a dot indicates that a decimal point should be used. The locale would say what characters would be used for them. I would prefer the format to have a fixed default so that if you don't specify the locale the result is predictable. Shouldn't that be the global locale? -- Rhodri James *-* Wildebeeste Herder to the Masses -- http://mail.python.org/mailman/listinfo/python-list
Re: Rough draft: Proposed format specifier for a thousands separator
Rhodri James wrote: On Tue, 17 Mar 2009 01:47:32 -, MRAB goo...@mrabarnett.plus.com wrote: I'm not against putting a comma in the format to indicate that grouping should be used just as a dot indicates that a decimal point should be used. The locale would say what characters would be used for them. I would prefer the format to have a fixed default so that if you don't specify the locale the result is predictable. Shouldn't that be the global locale? Other parts of the language, such as str.upper, aren't locale-sensitive, so I think that format shouldn't be either. If you want it to be locale-sensitive, then specify the locale, even if it's the system locale. -- http://mail.python.org/mailman/listinfo/python-list
Re: Rough draft: Proposed format specifier for a thousands separator
On Tue, 17 Mar 2009 02:41:23 -, MRAB goo...@mrabarnett.plus.com wrote: Rhodri James wrote: On Tue, 17 Mar 2009 01:47:32 -, MRAB goo...@mrabarnett.plus.com wrote: I'm not against putting a comma in the format to indicate that grouping should be used just as a dot indicates that a decimal point should be used. The locale would say what characters would be used for them. I would prefer the format to have a fixed default so that if you don't specify the locale the result is predictable. Shouldn't that be the global locale? Other parts of the language, such as str.upper, aren't locale-sensitive, so I think that format shouldn't be either. If you want it to be locale-sensitive, then specify the locale, even if it's the system locale. Yes, but the format type 'n' is currently defined as taking its cues from the global locale, so in that sense format already is locale-sensitive. -- Rhodri James *-* Wildebeeste Herder to the Masses -- http://mail.python.org/mailman/listinfo/python-list
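The point about the 'n' type can be checked directly. The following is a minimal sketch, assuming an interpreter that has the format() builtin (2.6 or later). It pins LC_NUMERIC to the portable "C" locale so the outcome does not depend on the machine; under any other locale name the second result may differ, which is exactly the global-locale sensitivity under discussion.

```python
import locale

# Pin the numeric locale to the portable "C" locale so the result is
# predictable. Other locale names are platform-dependent.
locale.setlocale(locale.LC_NUMERIC, "C")

plain = format(1234567, "d")  # never consults the locale
local = format(1234567, "n")  # consults the current LC_NUMERIC settings

# The "C" locale defines no digit grouping, so both strings match here.
print(plain, local)
```

Changing the setlocale() call changes what 'n' produces for every caller in the process, while 'd' stays fixed.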
Re: Rough draft: Proposed format specifier for a thousands separator
Tim Rowe digi...@gmail.com wrote:

[snip]

> If Finance users and non-professional programmers find the locale approach to be frustrating, arcane and non-obvious then by all means propose a way of making it simpler and clearer, but not a bodge that will increase the amount of bad software in the world.

I do not follow the reasoning behind this.

It seems to be based on an assumption that the locale approach is some sort of holy grail that solves these problems, and that anybody who does not like or use it is automatically guilty of writing crap code.

No account seems to be taken of the fact that the locale approach is a global one that forces uniformity on everything done on a PC or by a user. So when you want to make a report in a format that would suit what your foreign visitors are used to, do you have to change your server's locale, and change it back again afterwards, or what?

The locale approach has all the disadvantages of global variables.

To make software usable by, or expandable to, different languages and cultures is a tricky design problem - you have to, at the minimum, do things like storing all your text, both for prompts and errors, in some kind of database and refer to it by its key, everywhere. You cannot simply assume that, because a number represents a monetary value, it is Yen, or Australian Dollar, or whatever - you may have to convert it first, from its currency to the currency that you want to display it as, and only then can you worry about the format that you want to display it in.

In all of this, as I see it, the locale approach addresses only a small part, and solves very little.

Why is it still being defended and touted as if it were 42? *

- Hendrik

* the answer to life, the universe, and everything. (Douglas Adams' Hitchhiker books)
Re: Rough draft: Proposed format specifier for a thousands separator
Paul Rubin http://phr...@nospam.invalid wrote:

> Paul Rubin http://phr...@nospam.invalid writes:
>> '%.3K' % 1234567 = 1.235K # K = 1000
>> '%.:3Ki' % 1234567 = 1.206K # K = 1024
>
> I meant 1.235M and 1.177M, of course.

I went tilt like a slot machine long before I noticed... :-)

- Hendrik
Re: Rough draft: Proposed format specifier for a thousands separator
Raymond Hettinger wrote: No doubt that you're skeptical of anything you didn't already know ;-) I'm a CPA, was a 15 year division controller for a Fortune 500 company, and an auditor for an international accounting firm. Believe me when I say it is the norm in finance. Besides, it seems like you're arguing that thousands separators aren't needed anywhere at all and have doubts about their utility. Pick-up your pocket calculator and take a look. Look at your paycheck or your bank statement. My current bank and my previous bank use 2 ways to write numbers: 1. a decimal comma, and a space (or half-space or any other appropriate small whitespace) as a thousands separator 2. written full out in words (including the currency names) Invoices (not from these banks) often use a point as the thousands separator (although that's wrong according to some national standards, it's probably okay according to accounting standards...). The second formatting (full words) is a legal requirement on certain financial legal documents here (and I can imagine in other countries too?). Anybody working on a PEP about implementing a 'w' (for wordy?) formatting type? ;-) -- JanC -- http://mail.python.org/mailman/listinfo/python-list
Re: Rough draft: Proposed format specifier for a thousands separator
On Sat, 14 Mar 2009 08:20:21 -, Hendrik van Rooyen m...@microcorp.co.za wrote: Tim Rowe digil.com wrote: 8 - . If Finance users and non-professional programmers find the locale approach to be frustrating, arcane and non-obvious then by all means propose a way of making it simpler and clearer, but not a bodge that will increase the amount of bad software in the world. I do not follow the reasoning behind this. It seems to be based on an assumption that the locale approach is some sort of holy grail that solves these problems, and that anybody who does not like or use it is automatically guilty of writing crap code. No account seems to be taken of the fact that the locale approach is a global one that forces uniformity on everything done on a PC or by a user. Like unicode, locales should make using your computer with your own cultural settings a one-time configuration, and make using your computer in another setting possible. By and large they do this. Like unicode, locales fail in as much as they make cross-cultural usage difficult. Unlike unicode, there is a lot of failure in the standard locale library, which is almost entirely the fault of the standard C locale library it uses. Nobody's defending the implementation, as far as I've noticed (which isn't saying much at the moment, but still...). A bit of poking around in the cheese shop suggests that Babel (http://www.babel.edgewall.org/) would be better, and Babel with a context manager would be better yet. On the other hand, we have a small addition to format strings. Unfortunately it's a small addition that doesn't feel terribly natural in a mini-language that already runs the risk of looking like line noise when you pull the stops out. Not meaning the term particularly unkindly, it is a bodge; it's quick and dirty, syntactic saccharin rather than sugar for doing one particular thing for one particular interest group, and which looks deceptively like the right thing to do for everyone else. 
That's a bad thing to do. If we ever do get round to fixing localisation (i.e. making overriding bits of locales easy), it becomes a feature that's automatically present that we have to discourage normal programmers from using despite its apparent usefulness.

Frankly, I'd much rather fix the locale system and extend the format syntax to override the default locale. Perhaps something like

    financial = Locale(group_sep=",", grouping=[3])
    print("my number is {0:10n:financial}".format(1234567))

It's hard to think of a way of extending % format strings to cope with this that won't look utterly horrid, though!

-- Rhodri James *-* Wildebeeste Herder to the Masses
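To make the idea of an overriding locale object concrete, here is a rough sketch of what such a Locale might do. Everything in it is hypothetical: the class, its constructor arguments beyond group_sep and grouping, and the format_number() method are illustrative assumptions, not an existing API and not part of any proposal in this thread. The grouping list is read the way LC_NUMERIC reads it: group sizes from the right, with the last size repeating.

```python
class Locale:
    """Hypothetical stand-in for the Locale object sketched in the thread."""

    def __init__(self, group_sep=",", decimal_point=".", grouping=(3,)):
        if not grouping or any(size <= 0 for size in grouping):
            raise ValueError("grouping needs one or more positive sizes")
        self.group_sep = group_sep
        self.decimal_point = decimal_point
        self.grouping = list(grouping)

    def group(self, digits):
        """Split a digit string into groups, working from the right."""
        sizes = list(self.grouping)
        size = sizes.pop(0)
        groups = []
        while len(digits) > size:
            groups.append(digits[-size:])
            digits = digits[:-size]
            if sizes:  # the last size repeats once the list runs out
                size = sizes.pop(0)
        groups.append(digits)
        return self.group_sep.join(reversed(groups))

    def format_number(self, value, width=0, precision=0):
        """Format value with this locale's separators, right-aligned."""
        sign = "-" if value < 0 else ""
        text = "{0:.{1}f}".format(abs(value), precision)
        whole, _, frac = text.partition(".")
        out = sign + self.group(whole)
        if frac:
            out += self.decimal_point + frac
        return out.rjust(width)


financial = Locale(group_sep=",", grouping=[3])
print("my number is " + financial.format_number(1234567, width=10))
```

The object carries only the "what do the separators look like" half; whether to group, and the field width, still come from the caller, which is the split between format and locale that the thread keeps returning to.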
Re: Rough draft: Proposed format specifier for a thousands separator
Rhodri James wrote:

[snip]

> Frankly, I'd much rather fix the locale system and extend the format syntax to override the default locale. Perhaps something like
>
>     financial = Locale(group_sep=",", grouping=[3])
>     print("my number is {0:10n:financial}".format(1234567))
>
> It's hard to think of a way of extending % format strings to cope with this that won't look utterly horrid, though!

The problem with your example is that it magically looks for the locale name financial in the current namespace.

Perhaps the name should be registered somewhere like this:

    locale.predefined["financial"] = Locale(group_sep=",", grouping=[3])
    print("my number is {0:10n:financial}".format(1234567))
Re: Rough draft: Proposed format specifier for a thousands separator
2009/3/14 Hendrik van Rooyen m...@microcorp.co.za:

> No account seems to be taken of the fact that the locale approach is a global one that forces uniformity on everything done on a PC or by a user.

Not so. Under .NET, for instance, the global settings will give you a default CultureInfo class, but you can create your own CultureInfo classes for other cultures in your program and use them in place of the default.

> So when you want to make a report in a format that would suit what your foreign visitors are used to, do you have to change your server's locale, and change it back again afterwards, or what ?

No, you create a local locale and use that. There are essentially three possible levels I can see for this:

- Programs that will only ever be used in one locale, known in advance. They can have the locale hard-wired into the program. No special support is needed for this. It's pretty easy to write a function to format a number to a hard-wired locale. I've done it in Pascal and FORTH and it was easy-peasy, so I can't imagine it's going to be a big deal in Python. If it's such a big deal for accountants to write this code, if they ask in this forum how to do it somebody will almost certainly supply a function that takes a float and returns a formatted string within a few minutes. It might even be you or me.

- Programs that may be used in any unchanging locale. The existing locale support is built for this case.

- Programs that need to operate across locales. This can either be managed by switching global locales (which you rightly deprecate) or by managing alternate locales within the program.

> The locale approach has all the disadvantages of global variables.

No, it has all the advantages of global constants used as overridable defaults for local variables.

> To make software usable by, or expandable to, different languages and cultures is a tricky design problem - you have to, at the minimum, do things like storing all your text, both for prompts and errors, in some kind of database and refer to it by its key, everywhere. You cannot simply assume, that because a number represents a monetary value, that it is Yen, or Australian Dollar, or whatever - you may have to convert it first, from its currency, to the currency that you want to display it as, and only then can you worry about the format that you want to display it in.

Nothing in the proposal being considered addresses any of that.

-- Tim Rowe
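For the first of the three levels above, a function "that takes a float and returns a formatted string" might look roughly like this. It is only a sketch: the name format_amount and the chosen convention (space for thousands, comma for decimals) are assumptions made for illustration, hard-wired in the way the message describes.

```python
def format_amount(value, places=2):
    """Format a number with a hard-wired convention, e.g. '1 234 567,89'."""
    group_sep = " "      # assumed convention: space between thousands
    decimal_point = ","  # assumed convention: decimal comma

    text = "{0:.{1}f}".format(abs(value), places)
    whole, _, frac = text.partition(".")

    # Peel off the short leading group, then take the rest in threes.
    head = len(whole) % 3
    groups = [whole[:head]] if head else []
    groups += [whole[i:i + 3] for i in range(head, len(whole), 3)]

    result = group_sep.join(groups)
    if frac:
        result += decimal_point + frac
    return ("-" if value < 0 else "") + result


print(format_amount(1234567.891))
```

It does nothing for the second and third levels: the convention cannot be changed without editing the function.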
Re: Rough draft: Proposed format specifier for a thousands separator
On Sun, 15 Mar 2009 19:00:43 -, MRAB goo...@mrabarnett.plus.com wrote: Rhodri James wrote: [snip] Frankly, I'd much rather fix the locale system and extend the format syntax to override the default locale. Perhaps something like financial = Locale(group_sep=,, grouping=[3]) print(my number is {0:10n:financial}.format(1234567)) It's hard to think of a way of extending % format strings to cope with this that won't look utterly horrid, though! The problem with your example is that it magically looks for the locale name financial in the current namespace. True, to an extent. The counter-argument of Is it so much more magical than '{keyword}' looking up the object in the parameter list suggests a less magical approach would be to make the locale a parameter itself: print(my number is {0:10n:{1}}.format(1234567, financial) Perhaps the name should be registered somewhere like this: locale.predefined[financial] = Locale(group_sep=,, grouping=[3]) print(my number is {0:10n:financial}.format(1234567)) I'm not sure that I don't think that *more* magical than my first stab! Regardless of the exact syntax, do you think that being able to specify an overriding locale object (and let's wave our hands over what one of those is too) is the right approach? -- Rhodri James *-* Wildebeeste Herder to the Masses -- http://mail.python.org/mailman/listinfo/python-list
Re: Rough draft: Proposed format specifier for a thousands separator
Rhodri James wrote: On Sun, 15 Mar 2009 19:00:43 -, MRAB goo...@mrabarnett.plus.com wrote: Rhodri James wrote: [snip] Frankly, I'd much rather fix the locale system and extend the format syntax to override the default locale. Perhaps something like financial = Locale(group_sep=,, grouping=[3]) print(my number is {0:10n:financial}.format(1234567)) It's hard to think of a way of extending % format strings to cope with this that won't look utterly horrid, though! The problem with your example is that it magically looks for the locale name financial in the current namespace. True, to an extent. The counter-argument of Is it so much more magical than '{keyword}' looking up the object in the parameter list suggests a less magical approach would be to make the locale a parameter itself: print(my number is {0:10n:{1}}.format(1234567, financial) The field name can be an integer or an identifier, so the locale could be too, provided that you know where to look it up! financial = Locale(group_sep=,, grouping=[3]) print(my number is {0:10n:{fin}}.format(1234567, fin=financial)) Then again, shouldn't that be: fin = Locale(group_sep=,, grouping=[3]) print(my number is {0:{fin}}.format(1234567, fin=financial)) Perhaps the name should be registered somewhere like this: locale.predefined[financial] = Locale(group_sep=,, grouping=[3]) print(my number is {0:10n:financial}.format(1234567)) I'm not sure that I don't think that *more* magical than my first stab! Regardless of the exact syntax, do you think that being able to specify an overriding locale object (and let's wave our hands over what one of those is too) is the right approach? -- http://mail.python.org/mailman/listinfo/python-list
Re: Rough draft: Proposed format specifier for a thousands separator
[Lie Ryan]
> My proposition is: make the format specifier a simpler API to locale aware

You do know that we already have one, right? That's what the existing "n" specifier does.

Raymond
Re: Rough draft: Proposed format specifier for a thousands separator
John Nagle na...@animats.com wrote:

> Yes. In COBOL, one writes
>
>     PICTURE $999,999,999.99
>
> which is way ahead of most of the later approaches.

That was fixed width. For zero suppression:

    PIC ,$$$,$99.99

This will format 1000 as $1,000.00

For fixed width zero suppression:

    PIC $ZZZ,ZZZ,Z99.99

gives a fixed width field:

    $      1,000.00

With a fixed width font, this will line the column up, so that the decimals are under each other.

- Hendrik
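For comparison, the fixed-width, zero-suppressed picture can be approximated with the format mini-language. This is a sketch that assumes an interpreter accepting the ',' option discussed in this thread (Python 2.7 or 3.1 and later); the helper name picture is made up here, and the width of 14 is chosen to match the 14 positions after the dollar sign in $ZZZ,ZZZ,Z99.99.

```python
def picture(amount, width=14):
    """Right-align a comma-grouped amount in a fixed field after a '$'."""
    return "$" + format(amount, ">{0},.2f".format(width))


# Decimal points line up because every field has the same width.
for amount in (1000, 1234567.5, 0.5):
    print(picture(amount))
```

Unlike the COBOL picture, the spec string says nothing about where the commas go; it only switches grouping on.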
Re: Rough draft: Proposed format specifier for a thousands separator
Hendrik van Rooyen wrote:

> Ulrich Eckhardt eck...aser.com wrote:
>> IOW, why not explicitly say what you want using keyword arguments with defaults instead of inventing an IMHO cryptic, read-only mini-language? Seriously, the problem I see with this proposal is that its aim to be as short as possible actually makes the resulting format specifications unreadable. Could you even guess what 8T.,1f should mean if you had not written this?
>
> +1
>
> Look back in history, and see how COBOL did it with the PICTURE - dead easy and easily understandable. Compared to that, even the C printf stuff and python's % are incomprehensible.
>
> - Hendrik

Seeing how many people complained that the proposal is unreadable (although it tries to be simple by not including too many features), why not go all the way to unreadability and teach people to always use some sort of convenience function and never use the microlanguage except in very simple cases (or extremely complex cases, in which case you might actually be better served by writing your own formatting function).

Hypothetical code using a convenience function and the microlanguage could look like this:

    >>> num = 213210.3242
    >>> fmt = create_format(sep='-', decsep='@')
    >>> print fmt
    50|\/|3_v3ry_R34D4|3L3_C0D3
    >>> '{0!{1}}'.format(num, fmt)
    '213-...@3242'
Re: Rough draft: Proposed format specifier for a thousands separator
[Lie Ryan]
> A hypothetical code using conv function and the microlanguage could look like this:
>
>     >>> num = 213210.3242
>     >>> fmt = create_format(sep='-', decsep='@')
>     >>> print fmt
>     50|\/|3_v3ry_R34D4|3L3_C0D3
>     >>> '{0!{1}}'.format(num, fmt)
>     '213-...@3242'

LOL, it's like APL all over again ;-)

FWIW, the latest version of the proposal is dirt simple:

    >>> format(1234567, 'd')     # what we have now
    '1234567'
    >>> format(1234567, ',d')    # proposed new option
    '1,234,567'
    >>> format(1234.5, '.2f')    # what we have now
    '1234.50'
    >>> format(1234.5, ',.2f')   # proposed new option
    '1,234.50'

The proposal is roughly: If you want commas in the output, put a comma in the format string. It's not rocket science.

What is rocket science is what you have to do now to achieve the same effect. If someone finds the above to be baffling, how the heck are they going to do the same thing using the locale module?

Raymond
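The four calls above can be run as a script on any interpreter that accepts the ',' option (Python 2.7 or 3.1 and later). The sketch below repeats them and adds the same spec used inside str.format(), since the mini-language is shared between the two; the variable names are only for illustration.

```python
# The four calls from the message, gathered into one script.
plain_int = format(1234567, "d")      # existing behaviour
grouped_int = format(1234567, ",d")   # comma requests grouping
plain_float = format(1234.5, ".2f")   # existing behaviour
grouped_float = format(1234.5, ",.2f")

# The same spec works after the colon in a str.format() field.
sentence = "my number is {0:,d}".format(1234567)

for text in (plain_int, grouped_int, plain_float, grouped_float, sentence):
    print(text)
```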
Re: Rough draft: Proposed format specifier for a thousands separator
Raymond Hettinger wrote:

> [Lie Ryan]
>> A hypothetical code using conv function and the microlanguage could look like this:
>>
>>     >>> num = 213210.3242
>>     >>> fmt = create_format(sep='-', decsep='@')
>>     >>> print fmt
>>     50|\/|3_v3ry_R34D4|3L3_C0D3
>>     >>> '{0!{1}}'.format(num, fmt)
>>     '213-...@3242'
>
> LOL, it's like APL all over again ;-)
>
> FWIW, the latest version of the proposal is dirt simple:
>
>     >>> format(1234567, 'd')     # what we have now
>     '1234567'
>     >>> format(1234567, ',d')    # proposed new option
>     '1,234,567'
>     >>> format(1234.5, '.2f')    # what we have now
>     '1234.50'
>     >>> format(1234.5, ',.2f')   # proposed new option
>     '1,234.50'

would it break anything to also allow

    >>> format(1234567, 'd')     # what we have now
    '1234567'
    >>> format(1234567, '.d')    # proposed new option
    '1.234.567'
    >>> format(1234.5, ',2f')    # proposed new option
    '1234,50'
    >>> format(1234.5, '.,2f')   # proposed new option
    '1.234,50'

because that would support a moderate chunk of the non-english speaking users and seems like a natural extension.

(i'm still not sure this is that great an idea - if you think using a locale is rocket science then perhaps your excess energy would be better spent making locale easier, rather than tweaking this behaviour for a subset of users?)

andrew
Re: Rough draft: Proposed format specifier for a thousands separator
Raymond Hettinger pyt...@rcn.com writes: The proposal is roughly: If you want commas in the output, put a comma in the format string. It's not rocket science. What if you want to change the separator? Europeans usually use periods instead of commas: one thousand = 1.000. -- http://mail.python.org/mailman/listinfo/python-list
Re: Rough draft: Proposed format specifier for a thousands separator
[andrew cooke]
> would it break anything to also allow
>
>     >>> format(1234567, 'd')     # what we have now
>     '1234567'
>     >>> format(1234567, '.d')    # proposed new option
>     '1.234.567'
>     >>> format(1234.5, ',2f')    # proposed new option
>     '1234,50'
>     >>> format(1234.5, '.,2f')   # proposed new option

Yes, that's allowed too! The separators can be any one of COMMA, SPACE, DOT, UNDERSCORE, or NON-BREAKING-SPACE.
Re: Rough draft: Proposed format specifier for a thousands separator
[Paul Rubin] What if you want to change the separator? Europeans usually use periods instead of commas: one thousand = 1.000. That is supported also. -- http://mail.python.org/mailman/listinfo/python-list
Re: Rough draft: Proposed format specifier for a thousands separator
2009/3/12 Raymond Hettinger pyt...@rcn.com:

> If anyone here is interested, here is a proposal I posted on the python-ideas list. The idea is to make numbering formatting a little easier with the new format() builtin in Py2.6 and Py3.0: http://docs.python.org/library/string.html#formatspec

As far as I can see you're proposing an amendment to *encourage* writing code that is not locale aware, with the amendment itself being locale specific, which surely has to be a regressive move in the 21st century. Frankly, I'd sooner see it made /harder/ to write code that is not locale aware (warnings, like FxCop gives on .NET code?) than /easier/. Perhaps that's because I'm British, not American, and I'm sick of having date fields get the date wrong because the programmer thinks the USA is the world. It makes me sympathetic to the problems caused to others by programmers who think the English-speaking world is the world.

By the way, to others who think that 123,456.7 and 123.456,7 are the only conventions in common use in the West, no they're not. 123 456.7 is in common use in engineering, at least in Europe, precisely to reduce (though not eliminate) problems caused by dot and comma confusion.

-- Tim Rowe
Re: Rough draft: Proposed format specifier for a thousands separator
On Friday, 13 March 2009, Raymond Hettinger wrote:

> [Paul Rubin]
>> What if you want to change the separator? Europeans usually use periods instead of commas: one thousand = 1.000.
>
> That is supported also.

Do you support just a fixed set of separators, or anything? How about this:

    (Switzerland) 12'000.99
    or spacing:   12 000.99

-- Wolfgang
Re: Rough draft: Proposed format specifier for a thousands separator
On Mar 13, 7:06 am, Tim Rowe digi...@gmail.com wrote:

> 2009/3/12 Raymond Hettinger pyt...@rcn.com:
>> If anyone here is interested, here is a proposal I posted on the python-ideas list. The idea is to make numbering formatting a little easier with the new format() builtin in Py2.6 and Py3.0: http://docs.python.org/library/string.html#formatspec
>
> As far as I can see you're proposing an amendment to *encourage* writing code that is not locale aware, with the amendment itself being locale specific, which surely has to be a regressive move in the 21st century. Frankly, I'd sooner see it made /harder/ to write code that is not locale aware (warnings, like FxCop gives on .NET code?) than /easier/. Perhaps that's because I'm British, not American, and I'm sick of having date fields get the date wrong because the programmer thinks the USA is the world. It makes me sympathetic to the problems caused to others by programmers who think the English-speaking world is the world.
>
> By the way, to others who think that 123,456.7 and 123.456,7 are the only conventions in common use in the West, no they're not. 123 456.7 is in common use in engineering, at least in Europe, precisely to reduce (though not eliminate) problems caused by dot and comma confusion.
>
> -- Tim Rowe

I lived in three different countries and in school used a blank as the thousands separator to avoid confusion with the multiply operator. I think this proposal is more for debugging big numbers and meant mostly for programmers' eyes. We are already using the dot rather than the comma as the decimal separator in our programming languages, so one more Americanism won't kill us.

I am leaning towards proposal 1 now, just to avoid the thousand variations that will be requested because of this, making the implementation unnecessarily complex. I can always use the 3 replacement hack (conveniently documented in the PEP).

+1 for Nick's proposal
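The "3 replacement hack" mentioned above can be sketched as follows, assuming an interpreter that accepts the ',' option (Python 2.7 or 3.1 and later). The helper name swap_separators is made up for this example, and the placeholder "X" is safe only because formatted numbers never contain that letter.

```python
def swap_separators(text):
    """Exchange ',' and '.' in an already formatted number."""
    # Park the commas on a placeholder, move the dot, then restore.
    return text.replace(",", "X").replace(".", ",").replace("X", ".")


american = format(1234567.891, ",.2f")
european = swap_separators(american)

# A single replace is enough when only the thousands separator changes.
spaced = format(1234567, ",d").replace(",", " ")

print(american, european, spaced)
```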
Re: Rough draft: Proposed format specifier for a thousands separator
2009/3/13 prueba...@latinmail.com: I think this proposal is more for debugging big numbers and meant mostly for programmers' eyes. We are already using the dot instead of comma decimal separator in our programming languages that one more Americanism won't kill us. If it were for the programmers' eyes then it would be in the code, not in the formatted output. Debugging of big numbers can be done by checking within code, so there's no need to let this escape to the output. And if it's for programmers' eyes then the statement The COMMA is used when a PERIOD is the decimal separator is wrong, at least if it means that the COMMA is the /only/ separator used when a PERIOD is the decimal separator. Ada uses UNDERSCOREs, which can be placed almost anywhere in a numeric literal and are ignored. And if it's mostly for programmers' eyes, why does the motivation state that Adding thousands separators is one of the simplest ways to improve the professional appearance and readability of output exposed to end users? The proposal is clearly for the presentation of numbers to end users, and quite simply is an encouragement to sloppiness in presenting those numbers. If Finance users and non-professional programmers find the locale approach to be frustrating, arcane and non-obvious then by all means propose a way of making it simpler and clearer, but not a bodge that will increase the amount of bad software in the world. -1 for all of the proposals. -- Tim Rowe -- http://mail.python.org/mailman/listinfo/python-list
Re: Rough draft: Proposed format specifier for a thousands separator
Raymond Hettinger wrote: [andrew cooke] would it break anything to also allow format(1234567, 'd') # what we have now '1234567' format(1234567, '.d') # proposed new option '1.234.567' format(1234.5, ',2f') # proposed new option '1234,50' format(1234.5, '.,2f') # proposed new option Yes, that's allowed too! The separators can be any one of COMMA, SPACE, DOT, UNDERSCORE, or NON-BREAKING-SPACE. What if I want other separators? How about this idea: make the format has long format, which is a bit more verbose, flexible, and unambiguous, and the current proposal a short format, which is more concise. The long format would be like this (this is much, much more featureful than the current proposition, I think I might have crossed far beyond the Mayan line): [n|sign signnegative[[, signzero], signpositive] | ] [w|min minwidth[, align[, alignfill]]] [x|max maxwidth[, overflowsign[, overflowalign]]] [s|sep [[...]sepsepwidth]sepsepwidth | ] [dp|decpoint decpoint | ] [ds|decsep widthsep[, widthsep[...]] | ] [b|base base-n[, charset]] [p|prec prec | ] t|type type The feel of long format fmt_string: 'type f' number: 876543213456.98765445 result: 876543213456.98765445 fmt_string: 'decpoint ^ | type f' number: 876543213456.98765445 result: 876543213456^98765445 fmt_string: 'sep 21:3.4 | decpoint , | prec 3 | type f' number: 876543213456.98765445 result: 87654:321.3456,988 fmt_string: 'sep 21:3.4 | decpoint , | prec 3 | type f' number: 876543213456.98765445 result: 87654:321.3456,988 fmt_string: 'sep 21:3.4 | decpoint , | prec 3 | type f' number: 876543213456.98765445 result: 87654:321.3456,988 General Rules: - every field, except type is optional - fields are separated by | (this may change), escape literal | with || - every fields starts with an identifier then a mandatory whitespace - subfields are separated by commas. Each identifier has long and short identifier. 
- Processing precedence is: type, base, prec, sep/decsep, decpoint, sign, min, max Specific rules: - min and max determine the width: min sets the rule for when the resulting string is shorter than minwidth, and max sets the rule for when the resulting string is longer than maxwidth (basically trimming). alignfill is the character or sequence of characters used to pad the resulting string to minwidth; overflowsign is the character added when maxwidth is exceeded and trimming occurs - sep is basically a separator given for each width. The regular Latin number system would be represented as sep 3.3; the leftmost number and separator are repeated. - decsep works similarly to sep - base is the number base; charset is the mapping of digits used to represent the output number in that base. PS: It is not designed to be written by hand, but is meant to be fairly readable PPS: It is fairly modular too
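The general rules of the long format (fields separated by |, each field an identifier followed by whitespace and a value, with long and short identifiers) can be sketched as a small parser. This is only an illustration under stated assumptions: the function name parse_long_format and the ALIASES table are hypothetical, the short names are read off the grammar quoted in the post, the "||" escape is not handled, and no number is actually formatted.

```python
# Short identifier -> long identifier, as read from the grammar in the post.
ALIASES = {
    "n": "sign", "w": "min", "x": "max", "s": "sep", "dp": "decpoint",
    "ds": "decsep", "b": "base", "p": "prec", "t": "type",
}


def parse_long_format(fmt):
    """Split a long-format string into a {field name: raw value} dict.

    Hypothetical sketch: it does not handle the '||' escape and does not
    split subfields on commas.
    """
    fields = {}
    for part in fmt.split("|"):
        part = part.strip()
        if not part:
            continue
        # Identifier, then mandatory whitespace, then the raw value.
        name, _, value = part.partition(" ")
        fields[ALIASES.get(name, name)] = value.strip()
    if "type" not in fields:
        raise ValueError("the type field is mandatory")
    return fields


print(parse_long_format("sep 21:3.4 | decpoint , | prec 3 | type f"))
```

Even this first step shows the cost of the design: a full implementation would still have to interpret each raw value (widths, fill characters, digit charsets) before it could format anything.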
Re: Rough draft: Proposed format specifier for a thousands separator
The separators can be any one of COMMA, SPACE, DOT, UNDERSCORE, or NON-BREAKING-SPACE. What if I want other separators? format(n, ',d').replace(',', yoursep) How about this idea: make the format has long format, which is a bit more verbose, flexible, and unambiguous, and the current proposal a short format, which is more concise. The long format would be like this (this is much, much more featureful than the current proposition, I think I might have crossed far beyond the Mayan line): I concur ;-) Raymond
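The replace() idiom above can be wrapped in a one-line helper. This is a minimal sketch, assuming an interpreter where the ',' option is available (it was later added in Python 2.7 and 3.1); the name group_digits is hypothetical.

```python
def group_digits(n, sep):
    """Format an integer with an arbitrary thousands separator.

    Hypothetical helper: format with the ',' option, then swap the commas.
    """
    return format(n, ",d").replace(",", sep)


print(group_digits(1234567, " "))   # 1 234 567
print(group_digits(1234567, "'"))   # 1'234'567
```

The separator never has to be known to the format specification itself, which is the point of the reply: anything beyond the built-in choice is one replace() call away.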
Re: Rough draft: Proposed format specifier for a thousands separator
Lie Ryan wrote: Hendrik van Rooyen wrote: Ulrich Eckhardt eck...aser.com wrote: Look back in history, and see how COBOL did it with the PICTURE - dead easy and easily understandable. Compared to that, even the C printf stuff and python's % are incomprehensible. - Hendrik Yes. In COBOL, one writes PICTURE $999,999,999.99 which is way ahead of most of the later approaches. John Nagle
Re: Rough draft: Proposed format specifier for a thousands separator
Today's updates to: http://www.python.org/dev/peps/pep-0378/ * Detail issues with the locale module. * Summarize commentary to date. -- Opposition to formatting strings in general (preferring a convenience function or PICTURE clause) -- Opposition to any non-locale aware approach * Add APOSTROPHE and non-breaking SPACE to the list of separators. * Add more links to external references (Babel, Excel, Ada, CommonLisp, COBOL, C-Sharp). * Clarify how proposal II is parsed. Raymond
Re: Rough draft: Proposed format specifier for a thousands separator
Raymond Hettinger wrote: Today's updates to: http://www.python.org/dev/peps/pep-0378/ * Detail issues with the locale module. * Summarize commentary to date. -- Opposition to formatting strings in general (preferring a convenience function or PICTURE clause) -- Opposition to any non-locale aware approach * Add APOSTROPHE and non-breaking SPACE to the list of separators. * Add more links to external references (Babel, Excel, Ada, CommonLisp, COBOL, C-Sharp). * Clarify how proposal II is parsed. I'd just like to make the point that the string methods, e.g. unicode.upper, aren't locale-sensitive, so 'format' shouldn't be either. The string methods could perhaps retain their current behaviour as the default and accept a parameter to make them locale-sensitive. The same could be the case for 'format', so the format string has "." to represent the decimal point and "," to represent the digit separator, and those would be the default, but it could accept a flag ("L"?) to make it locale-sensitive.
Re: Rough draft: Proposed format specifier for a thousands separator
Tim Rowe digi...@gmail.com writes: And if it's mostly for programmers' eyes, why does the motivation state that Adding thousands separators is one of the simplest ways to improve the professional appearance and readability of output exposed to end users? It occurs to me, at least for quantities of data, one of the most useful aids to readability is scaling down the quantity and suffixing it with K (kilo), M (mega), G (giga), etc. This is sometimes done with K=1000 and sometimes with K=1024 (fancy pronunciation kibi rather than kilo, officially abbreviated Ki). Possible formatting: '%.3K' % 1234567 = 1.235K # K = 1000 '%.:3Ki' % 1234567 = 1.206K # K = 1024 The colon (two dots) signifies base two. The i is not part of the format spec, it's just a literal character, to make the standard abbreviation for kibi. -- http://mail.python.org/mailman/listinfo/python-list
Re: Rough draft: Proposed format specifier for a thousands separator
Paul Rubin http://phr...@nospam.invalid writes: '%.3K' % 1234567 = 1.235K # K = 1000 '%.:3Ki' % 1234567 = 1.206K # K = 1024 I meant 1.235M and 1.177M, of course. -- http://mail.python.org/mailman/listinfo/python-list
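With the corrected figures, the scaling idea can be sketched as an ordinary function rather than a new format code. The name scale, its parameters, and the choice of suffix list are assumptions made for illustration; the '%.3K' syntax from the post does not exist in Python.

```python
def scale(n, base=1000, digits=3):
    """Scale n down by powers of base and append a K/M/G/T suffix.

    Hypothetical helper. With base=1024 an 'i' is appended (Ki, Mi, ...).
    """
    suffixes = ["", "K", "M", "G", "T"]
    value = float(n)
    index = 0
    # Divide until the value drops below the base or suffixes run out.
    while abs(value) >= base and index < len(suffixes) - 1:
        value /= base
        index += 1
    suffix = suffixes[index]
    if base == 1024 and suffix:
        suffix += "i"
    return "%.*f%s" % (digits, value, suffix)


print(scale(1234567))             # 1.235M
print(scale(1234567, base=1024))  # 1.177Mi
```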
Re: Rough draft: Proposed format specifier for a thousands separator
Raymond Hettinger wrote: Motivation: Provide a simple, non-locale aware way to format a number with a thousands separator. Adding thousands separators is one of the simplest ways to improve the professional appearance and readability of output exposed to end users. In the finance world, output with commas is the norm. Finance users and non-professional programmers find the locale approach to be frustrating, arcane and non-obvious. It is not the goal to replace locale or to accommodate every possible convention. The goal is to make a common task easier for many users. Raymond, I think there are several problems with the Motivation: The goal is to make a common task easier for many users. Common task, for most people, means formatting numbers to the locale. We should make converting numbers to locale easier to use, as easy as calling a magic function that can convert the current object to the locale representation or as simple as defining a locale ID in the mini language. This proposal, I believe, is for the _less_ common task of formatting a number to a custom format not generally used anywhere else in the world (like formatting a number to form an ipv6 address or formatting a number to html/TeX code[1]). [1] I know one mathematics textbook that uses a superscript negative for negative numbers to distinguish it from the minus sign. In the finance world, output with commas is the norm. I can't cite any source, but I am skeptical with that. And how about the non-finance world? The scientific world? The pure math world? Provide a simple, non-locale aware way to format a number with a thousands separator. As many have pointed out, locale is hard to use; this is an easier approach, but it is a pity that it is not locale aware. If we want to provide non-locale aware formatting, we must make it flexible enough to be the Ultimate Formatter. Otherwise it will just be redundant to locale.
Adding thousands separators is one of the simplest ways to improve the professional appearance and readability of output exposed to end users. There are infinitely many approaches to numbers. One Singaporean textbook uses a half-width space as the thousands separator. One Australian textbook uses a superscript minus for negative numbers (which I believe would require more than Unicode to represent, TeX or PDF perhaps). The accounting world sometimes uses colors and parentheses to denote negative numbers (this requires emitting codes for the layout program: HTML, TeX, PDF). Anything less powerful than my proposed Crossing Mayan line is just a harder alternative for the locale module.
Re: Rough draft: Proposed format specifier for a thousands separator
[Lie Ryan] In the finance world, output with commas is the norm. I can't cite any source, but I am skeptical with that. No doubt that you're skeptical of anything you didn't already know ;-) I'm a CPA, was a 15 year division controller for a Fortune 500 company, and an auditor for an international accounting firm. Believe me when I say it is the norm in finance. Besides, it seems like you're arguing that thousands separators aren't needed anywhere at all and have doubts about their utility. Pick up your pocket calculator and take a look. Look at your paycheck or your bank statement. Check out a publishing style guide. They are somewhat basic. There's a reason that MS Excel and Lotus offered them from day one. Python's format() style was taken directly from C-Sharp, which offers both an n format that is locale sensitive and a non-locale-sensitive variant that specifies a comma. I'm suggesting that we also do both. Random, made-up statistic: 99% of Python scripts are not internationalized, have no need to be internationalized, and have output intended to be used in the script writer's immediate environment. Another issue I have with locale is that you have to find one that matches every specific need. Quick, which one gives you non-breaking spaces for a thousands separator? If you do find such a locale and it happens to be spelled the same way on every platform, is it self-evident in your program that it will in fact print with spaces, or has that become an implicit, behind-the-scenes operation? If later you need to print another number with a different separator, do you have a way to make that happen without breaking the first piece of code you wrote? The locale module has plenty of issues for a programmer to think about: http://docs.python.org/library/locale.html#background-details-hints-tips-and-caveats Besides, lots of people use Python who are not professional programmers.
We should not require them to enter the complicated world of locale just to do a basic formatting task. When I teach Python to pre-college students, there is no way I'm adding locale to the list of things they need to learn to become functional with the language. Sorry for the long post, but I feel like you keep inventing heavy solutions that don't fit well with what we already have. This should be a simple problem -- when writing a number format, how do I specify that I want character X as a thousands separator? The answer to that question should be nothing harder than: add character X to the format string. You're a very creative person, but I don't see Guido accepting any idea that rejects what he has already chosen as the way to format strings. He is no fan of the locale module's API, but it is tightly bound to existing programs and POSIX standards. That greatly limits the options for changing it. I'm sure you can come up with 500 ways of meeting this need (almost none of which meld with Guido's choice to accept PEP 3101 for both 2.6 and 3.0). I'm offering a simple extension to the existing framework that makes the above tasks easy. C-Sharp made essentially the same choice in its design. There's no reason for you to have to use it if you hate it. Cheers, Raymond
Re: Rough draft: Proposed format specifier for a thousands separator
Raymond Hettinger wrote: If anyone here is interested, here is a proposal I posted on the python-ideas list. The idea is to make numbering formatting a little easier with the new format() builtin in Py2.6 and Py3.0: http://docs.python.org/library/string.html#formatspec - Motivation: Provide a simple, non-locale aware way to format a number with a thousands separator. Adding thousands separators is one of the simplest ways to improve the professional appearance and readability of output. In the finance world, output with commas is the norm. Finance users and non-professional programmers find the locale approach to be frustrating, arcane and non-obvious. The locale module presents two other challenges. First, it is a global setting and not suitable for multi-threaded apps that need to serve-up requests in multiple locales. Second, the name of a relevant locale (such as de_DE) can vary from platform to platform or may not be defined at all. The docs for the locale module describe these and many other challenges [1] in detail. It is not the goal to replace the locale module or to accommodate every possible convention. Such tasks are better suited to robust tools like Babel [2] . Instead, our goal is to make a common, everyday task easier for many users. Comments and suggestions are welcome but I draw the line at supporting Mayan numbering conventions ;-) Raymond -- http://mail.python.org/mailman/listinfo/python-list
Re: Rough draft: Proposed format specifier for a thousands separator
Raymond Hettinger wrote: [Lie Ryan] In the finance world, output with commas is the norm. I can't cite any source, but I am skeptical with that. No doubt that you're skeptical of anything you didn't already know ;-) I'm a CPA, was a 15 year division controller for a Fortune 500 company, and an auditor for an international accounting firm. Believe me when I say it is the norm in finance. Besides, it seems like you're arguing that thousands separators aren't needed anywhere at all and have doubts about their utility. Pick up your pocket calculator and take a look. Look at your paycheck or your bank statement. Check out a publishing style guide. They are somewhat basic. There's a reason that MS Excel and Lotus offered them from day one. I have no reason to doubt that output with separators is nice, but I am skeptical that all financial institutions in the world (not just in the US) use commas for their separators. Python's format() style was taken directly from C-Sharp, which offers both an n format that is locale sensitive and a non-locale-sensitive variant that specifies a comma. I'm suggesting that we also do both. I'm fine with that. But no commas; instead, user-definable separators. Random, made-up statistic: 99% of Python scripts are not internationalized, have no need to be internationalized, and have output intended to be used in the script writer's immediate environment. Random, made-up statistic: 95% of which are scripts written for personal/internal use. If you do find such a locale and it happens to be spelled the same way on every platform, is it self-evident in your program that it will in fact print with spaces, or has that become an implicit, behind-the-scenes operation? If later you need to print another number with a different separator, do you have a way to make that happen without breaking the first piece of code you wrote?
Yeah, all data in transmission should be in a locale-independent format; it should only be turned into a locale-aware format just before being shown to the user. That way nothing will break. Since you're an accountant, I am sure you know about Quicken files, which store data in locale format, which IMHO is a very BAD design. Another issue I have with locale is that you have to find one that matches every specific need. Quick, which one gives you non-breaking spaces for a thousands separator? That wasn't the issue. Most programs would either "use the environment's locale and give user configuration to override the locale" or "I don't care, the output is for personal/internal consumption" or "The data only makes sense with certain formatting". I don't see a use case where the programmer would really want to hardcode a locale AND want the output to be exactly like what he sees on the user's machine. The first case (use the environment's locale and give user configuration to override the locale) is for internationalized applications, and is served by locale. The locale module is currently difficult to work with, so I believe we should provide a more accessible way. The second case (I don't care, the output is for personal/internal consumption) is well served by python's default view. The third case (The data only makes sense with certain formatting) is the one that will benefit the most from non-locale aware formatting. But it would require a very powerful formatter. Such use cases include formatting IP addresses, telephone numbers, ID card numbers, etc. My proposition is: either make the format specifier a simpler API for locale-aware formatting or make it capable of serving the third case. I would rather prioritize the former.
Re: Rough draft: Proposed format specifier for a thousands separator
If anyone here is interested, here is a proposal I posted on the python-ideas list. The idea is to make numbering formatting a little easier with the new format() builtin: http://docs.python.org/library/string.html#formatspec Here's a re-post (hopefully without the line wrapping problems in the previous post). Raymond Motivation: Provide a simple, non-locale aware way to format a number with a thousands separator. Adding thousands separators is one of the simplest ways to improve the professional appearance and readability of output exposed to end users. In the finance world, output with commas is the norm. Finance users and non-professional programmers find the locale approach to be frustrating, arcane and non-obvious. It is not the goal to replace locale or to accommodate every possible convention. The goal is to make a common task easier for many users. Research so far: Scanning the web, I've found that thousands separators are usually one of COMMA, PERIOD, SPACE, or UNDERSCORE. The COMMA is used when a PERIOD is the decimal separator. James Knight observed that Indian/Pakistani numbering systems group by hundreds. Ben Finney noted that Chinese group by ten-thousands. Visual Basic and its brethren (like MS Excel) use a completely different style and have ultra-flexible custom format specifiers like: _($* #,##0_). Proposal I (from Nick Coghlan): A comma will be added to the format() specifier mini-language: [[fill]align][sign][#][0][minimumwidth][,][.precision][type] The ',' option indicates that commas should be included in the output as a thousands separator. As with locales which do not use a period as the decimal point, locales which use a different convention for digit separation will need to use the locale module to obtain appropriate formatting. The proposal works well with floats, ints, and decimals. It also allows easy substitution for other separators.
For example: format(n, "6,f").replace(",", "_") This technique is completely general but it is awkward in the one case where the commas and periods need to be swapped: format(n, "6,f").replace(",", "X").replace(".", ",").replace("X", ".") Proposal II (to meet Antoine Pitrou's request): Make both the thousands separator and decimal separator user specifiable but not locale aware. For simplicity, limit the choices to a comma, period, space, or underscore. [[fill]align][sign][#][0][minimumwidth][T[tsep]][dsep precision][type] Examples: format(1234, "8.1f") --> '  1234.0' format(1234, "8,1f") --> '  1234,0' format(1234, "8T.,1f") --> ' 1.234,0' format(1234, "8T .f") --> ' 1 234,0' format(1234, "8d") --> '    1234' format(1234, "8T,d") --> '   1,234' This proposal meets most needs (except for people wanting grouping for hundreds or ten-thousands), but it comes at the expense of being a little more complicated to learn and remember. Also, it makes it more challenging to write custom __format__ methods that follow the format specification mini-language. For the locale module, just the T is necessary in a formatting string since the tool already has procedures for figuring out the actual separators from the local context.
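The awkward comma/period swap from Proposal I can be tried on any interpreter that implements the ',' option (Python 2.7 and 3.1 or later; the code below uses Python 3). The sketch shows the three-step replace from the post next to a single-pass alternative using str.translate; the helper name swap_separators is hypothetical, and the translate variant is a swapped-in technique, not part of the proposal.

```python
def swap_separators(text):
    """Exchange commas and periods in one pass (hypothetical helper)."""
    table = str.maketrans({",": ".", ".": ","})
    return text.translate(table)


n = 1234567.891
us_style = format(n, ",.2f")

# The three-step replace from the proposal, using X as a placeholder.
three_step = us_style.replace(",", "X").replace(".", ",").replace("X", ".")

# The same swap without a placeholder character.
one_step = swap_separators(us_style)

print(us_style)    # 1,234,567.89
print(three_step)  # 1.234.567,89
print(one_step)    # 1.234.567,89
```

The placeholder version relies on X never appearing in the formatted number, which holds for plain numeric output but is worth keeping in mind if the string carries other text.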
Re: Rough draft: Proposed format specifier for a thousands separator
Raymond Hettinger wrote: The idea is to make numbering formatting a little easier with the new format() builtin: http://docs.python.org/library/string.html#formatspec [...] Scanning the web, I've found that thousands separators are usually one of COMMA, PERIOD, SPACE, or UNDERSCORE. The COMMA is used when a PERIOD is the decimal separator. James Knight observed that Indian/Pakistani numbering systems group by hundreds. Ben Finney noted that Chinese group by ten-thousands. IIRC, some cultures use a non-uniform grouping, like e.g. 123 456 78.9. For that, there is also a grouping reserved in the locale (at least in those of C++ IOStreams, that is). Further, and that seems to also be one of your concerns, there are different ways to represent negative numbers like e.g. (123) or -456. Make both the thousands separator and decimal separator user specifiable but not locale aware. For simplicity, limit the choices to a comma, period, space, or underscore. [[fill]align][sign][#][0][minimumwidth][T[tsep]][dsep precision][type] Examples: format(1234, "8.1f") --> '  1234.0' format(1234, "8,1f") --> '  1234,0' format(1234, "8T.,1f") --> ' 1.234,0' format(1234, "8T .f") --> ' 1 234,0' format(1234, "8d") --> '    1234' format(1234, "8T,d") --> '   1,234' How about this? format(1234, 8.1, tsep=",") --> ' 1,234.0' format(1234, 8.1, tsep=".", dsep=",") --> ' 1.234,0' format(123456, tsep=" ", grouping=(3, 2,)) --> '1 234 56' IOW, why not explicitly say what you want using keyword arguments with defaults instead of inventing an IMHO cryptic, read-only mini-language? Seriously, the problem I see with this proposal is that its aim to be as short as possible actually makes the resulting format specifications unreadable. Could you even guess what "8T.,1f" should mean if you had not written this? This proposal meets most needs (except for people wanting grouping for hundreds or ten-thousands), but it comes at the expense of being a little more complicated to learn and remember. Too expensive for my taste.
Uli -- Sator Laser GmbH Geschäftsführer: Thorsten Föcking, Amtsgericht Hamburg HR B62 932 -- http://mail.python.org/mailman/listinfo/python-list
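The keyword-argument style suggested above can be sketched as a standalone function. This is not an existing API: the name format_number and its parameters are hypothetical, and the grouping argument is assumed to follow the locale convention (sizes applied from the right, last size repeating), which differs from the ordering in the '1 234 56' example. Field width is left out to keep the sketch short.

```python
def format_number(value, precision=0, tsep=",", dsep=".", grouping=(3,)):
    """Format a number using explicit keyword arguments.

    Hypothetical sketch. grouping lists group sizes starting from the
    rightmost group; the last size repeats for the remaining digits.
    """
    text = "%.*f" % (precision, abs(value))
    whole, _, frac = text.partition(".")
    sizes = list(grouping)
    groups = []
    while whole:
        # Consume sizes left to right, then keep reusing the last one.
        size = sizes.pop(0) if len(sizes) > 1 else sizes[0]
        groups.append(whole[-size:])
        whole = whole[:-size]
    result = tsep.join(reversed(groups))
    if frac:
        result += dsep + frac
    return ("-" if value < 0 else "") + result


print(format_number(1234, precision=1))                      # 1,234.0
print(format_number(1234, precision=1, tsep=".", dsep=","))  # 1.234,0
print(format_number(1234567, tsep=" ", grouping=(3, 2)))     # 12 34 567
```

A function like this reads clearly at the call site, but it cannot be embedded in a "{0:...}" replacement field, which is the trade-off the thread goes on to discuss.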
Re: Rough draft: Proposed format specifier for a thousands separator
[Ulrich Eckhardt] IOW, why not explicitly say what you want using keyword arguments with defaults instead of inventing an IMHO cryptic, read-only mini-language? That makes sense to me but I don't think that's the way the format() builtin was implemented (see PEP 3101, which was implemented in Py2.6 and 3.0). It is a simple pass-through to a __format__ method for each formattable object. I don't see how keywords would fit in that framework. What is proposed is similar to the locale module's existing n specifier except that this lets you say exactly what you want instead of deferring to the locale settings. The mini-language seems to already be the way of things (just as it is in many other languages including PHP, C, Fortran, and whatnot). I'm just proposing an addition, T, so you can add commas as a thousands separator. Raymond
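The pass-through described above is easy to see with a small class. A minimal sketch: the Money class is hypothetical, and the ',' option it uses is assumed to be available (Python 2.7 and 3.1 or later).

```python
class Money:
    """Hypothetical example class, used only to show the __format__ hook."""

    def __init__(self, amount):
        self.amount = amount

    def __format__(self, spec):
        # format() and str.format() pass the spec string through unchanged;
        # an empty spec falls back to a default chosen for this class.
        return "$" + format(self.amount, spec or ",.2f")


price = Money(1234567.891)
print(format(price, ""))          # $1,234,567.89
print("{0:,.0f}".format(price))   # $1,234,568
```

Everything the object learns about the requested formatting arrives in that single spec string, which is why a mini-language extension fits the existing framework and keyword arguments do not.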
Re: Rough draft: Proposed format specifier for a thousands separator
Raymond Hettinger wrote: [snip] Proposal I (from Nick Coghlan): A comma will be added to the format() specifier mini-language: [[fill]align][sign][#][0][minimumwidth][,][.precision][type] The ',' option indicates that commas should be included in the output as a thousands separator. As with locales which do not use a period as the decimal point, locales which use a different convention for digit separation will need to use the locale module to obtain appropriate formatting. The proposal works well with floats, ints, and decimals. It also allows easy substitution for other separators. For example: format(n, "6,f").replace(",", "_") This technique is completely general but it is awkward in the one case where the commas and periods need to be swapped: format(n, "6,f").replace(",", "X").replace(".", ",").replace("X", ".") Proposal II (to meet Antoine Pitrou's request): Make both the thousands separator and decimal separator user specifiable but not locale aware. For simplicity, limit the choices to a comma, period, space, or underscore. [[fill]align][sign][#][0][minimumwidth][T[tsep]][dsep precision][type] Examples: format(1234, "8.1f") --> '  1234.0' format(1234, "8,1f") --> '  1234,0' format(1234, "8T.,1f") --> ' 1.234,0' format(1234, "8T .f") --> ' 1 234,0' format(1234, "8d") --> '    1234' format(1234, "8T,d") --> '   1,234' This proposal meets most needs (except for people wanting grouping for hundreds or ten-thousands), but it comes at the expense of being a little more complicated to learn and remember. Also, it makes it more challenging to write custom __format__ methods that follow the format specification mini-language. For the locale module, just the T is necessary in a formatting string since the tool already has procedures for figuring out the actual separators from the local context. [snip] I'd probably prefer Proposal I with "."
representing the decimal point and "," representing the grouping (thousands) separator, although I'd add an L flag to indicate that it should use the locale to provide the actual characters to be used and even the number of digits for the grouping: [[fill]align][sign][#][0][minimumwidth][,][.precision][L][type] Examples: Assuming the locale has: decimal point: "," grouping separator: "." grouping spacing: 3 format(123456, "10.1f") --> '  123456.0' format(123456, "10.1Lf") --> ' 123.456,0' format(123456, "10,.1f") --> ' 123,456.0' format(123456, "10,.1Lf") --> ' 123.456,0'
Re: Rough draft: Proposed format specifier for a thousands separator
Ulrich Eckhardt eck...aser.com wrote: IOW, why not explicitly say what you want using keyword arguments with defaults instead of inventing an IMHO cryptic, read-only mini-language? Seriously, the problem I see with this proposal is that its aim to be as short as possible actually makes the resulting format specifications unreadable. Could you even guess what 8T.,1f should mean if you had not written this? +1 Look back in history, and see how COBOL did it with the PICTURE - dead easy and easily understandable. Compared to that, even the C printf stuff and python's % are incomprehensible. - Hendrik -- http://mail.python.org/mailman/listinfo/python-list
Re: Rough draft: Proposed format specifier for a thousands separator
On Mar 12, 9:56 pm, Raymond Hettinger pyt...@rcn.com wrote: [Ulrich Eckhardt] IOW, why not explicitly say what you want using keyword arguments with defaults instead of inventing an IMHO cryptic, read-only mini-language? That makes sense to me but I don't think that's the way the format() builtin was implemented (see PEP 3101 which was implemented Py2.6 and 3.0). It is a simple pass-through to a __format__ method for each formattable object. I don't see how keywords would fit in that framework. What is proposed is similar to locale module's existing n specifier except that this lets you say exactly what you want instead of deferring to the locale settings. The mini-language seems to already be the way of things (just as it is many other languages including PHP, C, Fortran, and whatnot). I'm just proposing an addition T, so you add commas as a thousands separator. ... and why not C (centum) for hundreds (can't have H(ollerith)) and W for wan (the Chinese word for 10 thousand)? -- http://mail.python.org/mailman/listinfo/python-list
Re: Rough draft: Proposed format specifier for a thousands separator
On Mar 12, 3:30 am, Raymond Hettinger pyt...@rcn.com wrote: If anyone here is interested, here is a proposal I posted on the python-ideas list. The idea is to make numbering formatting a little easier with the new format() builtin in Py2.6 and Py3.0: http://docs.python.org/library/string.html#formatspec - Motivation: Provide a simple, non-locale aware way to format a number with a thousands separator. Adding thousands separators is one of the simplest ways to improve the professional appearance and readability of output exposed to end users. In the finance world, output with commas is the norm. Finance users and non-professional programmers find the locale approach to be frustrating, arcane and non-obvious. It is not the goal to replace locale or to accommodate every possible convention. The goal is to make a common task easier for many users. Research so far: Scanning the web, I've found that thousands separators are usually one of COMMA, PERIOD, SPACE, or UNDERSCORE. The COMMA is used when a PERIOD is the decimal separator. James Knight observed that Indian/Pakistani numbering systems group by hundreds. Ben Finney noted that Chinese group by ten-thousands. Visual Basic and its brethren (like MS Excel) use a completely different style and have ultra-flexible custom format specifiers like: _($* #,##0_). Proposal I (from Nick Coghlan]: A comma will be added to the format() specifier mini-language: [[fill]align][sign][#][0][minimumwidth][,][.precision][type] The ',' option indicates that commas should be included in the output as a thousands separator. As with locales which do not use a period as the decimal point, locales which use a different convention for digit separation will need to use the locale module to obtain appropriate formatting. The proposal works well with floats, ints, and decimals. It also allows easy substitution for other separators. 
For example: format(n, "6,f").replace(",", "_") This technique is completely general but it is awkward in the one case where the commas and periods need to be swapped. format(n, "6,f").replace(",", "X").replace(".", ",").replace("X", ".") Proposal II (to meet Antoine Pitrou's request): Make both the thousands separator and decimal separator user specifiable but not locale aware. For simplicity, limit the choices to a comma, period, space, or underscore. [[fill]align][sign][#][0][minimumwidth][T[tsep]][dsep precision][type] Examples: format(1234, "8.1f") --> '  1234.0' format(1234, "8,1f") --> '  1234,0' format(1234, "8T.,1f") --> ' 1.234,0' format(1234, "8T .f") --> ' 1 234,0' format(1234, "8d") --> '    1234' format(1234, "8T,d") --> '   1,234' This proposal meets most needs (except for people wanting grouping for hundreds or ten-thousands), but it comes at the expense of being a little more complicated to learn and remember. Also, it makes it more challenging to write custom __format__ methods that follow the format specification mini-language. For the locale module, just the T is necessary in a formatting string since the tool already has procedures for figuring out the actual separators from the local context. Comments and suggestions are welcome but I draw the line at supporting Mayan numbering conventions ;-) Raymond As far as I am concerned, the simplest version plus a way to swap around commas and periods is all that is needed. The rest can be done using one replace (because the decimal separator is always one of two options). This should cover everywhere but the far east. 80% of cases for 20% of implementation complexity. For example: [[fill]align][sign][#][0][,|.][minimumwidth][.precision][type] format(1234, ".8.1f") --> ' 1.234,0' format(1234, ",8.1f") --> ' 1,234.0'
Re: Rough draft: Proposed format specifier for a thousands separator
On Mar 12, 7:51 am, prueba...@latinmail.com wrote: [...] As far as I am concerned the most simple version plus a way to swap around commas and period is all that is needed.

Thanks for the feedback. FWIW, posted a cleaned-up version of the proposal at http://www.python.org/dev/peps/pep-0378/

Raymond
Re: Rough draft: Proposed format specifier for a thousands separator
Raymond Hettinger pyt...@rcn.com writes:
FWIW, posted a cleaned-up version of the proposal at http://www.python.org/dev/peps/pep-0378/

It would be nice if the PEP included a comparison between the proposed scheme and how it is done in other programs and languages. For example, I think Common Lisp has a feature for formatting thousands. Spreadsheets like Excel probably have something similar. Those programs are pretty well evolved and probably address the important real use cases by now. It might be best to follow an existing example (with adjustments for Pythonification as necessary) to the extent possible.
Re: Rough draft: Proposed format specifier for a thousands separator
[Paul Rubin]
It would be nice if the PEP included a comparison between the proposed scheme and how it is done in other programs and languages.

Good idea. I'm hoping that people will post those here. In my quick research, it looks like many languages offer nothing more than the usual C style % formatting and defer the rest to a locale-aware module.

For example, I think Common Lisp has a feature for formatting thousands.

Do you have more detail?

Spreadsheets like Excel probably have something similar.

I addressed that in the PEP in the section on VB and relatives. Their approach doesn't graft onto our existing approach. They use format specifiers like: _($* #,##0_).

Raymond
Re: Rough draft: Proposed format specifier for a thousands separator
[Paul Rubin]
I think Common Lisp has a feature for formatting thousands.

I found the Common Lisp spec for this and added it to the PEP.

Raymond
Re: Rough draft: Proposed format specifier for a thousands separator
Raymond Hettinger pyt...@rcn.com writes:
In my quick research, it looks like many languages offer nothing more than the usual C style % formatting and defer the rest to a locale-aware module.

Hendrik van Rooyen's mention of Cobol's picture (aka PIC) specifications might be added to the list.

Cautionary tale: I once had a similar idea and suggested including a bastardized version of PIC in an extension language for something I worked on. Another programmer then coded a reasonable PIC subset and we shipped it. It turned out that a number of our users were Cobol experts, and once we had anything like PIC, they expected the weirdest and most obscure features of real Cobol PIC (of which there were quite a few) to work. We ended up having to assign someone a fairly lengthy task of figuring out the Cobol spec and implementing every last damn PIC feature. But I digress.

For example, I think Common Lisp has a feature for formatting thousands.
Do you have more detail?

http://www.cs.cmu.edu/Groups/AI/html/cltl/clm/node200.html gives as an example:

    (format nil "The answer is ~:D." (expt 47 x)) => "The answer is 229,345,007."
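For comparison, the Common Lisp ~:D example above can be sketched in Python using the comma option from Proposal I. This assumes a Python version in which that option is available (3.1 or later), and it assumes x is 5, which is an inference from the quoted output (47 ** 5 is 229345007) rather than something the CLtL page states in the excerpt.

```python
# Assumption: x = 5, inferred from the quoted result 229,345,007.
x = 5

# "{0:,d}" asks for a decimal integer with commas as thousands separators,
# which plays the role of the ~:D directive in the Lisp example.
message = "The answer is {0:,d}.".format(47 ** x)

print(message)  # The answer is 229,345,007.
```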
Re: Rough draft: Proposed format specifier for a thousands separator
Raymond Hettinger pyt...@rcn.com writes:
I found the Common Lisp spec for this and added it to the PEP.

Ah, cool, I simultaneously looked for it and posted about it.
Re: Rough draft: Proposed format specifier for a thousands separator
Raymond Hettinger wrote:
... a generally interesting PEP...

Missing from this PEP: output below the decimal point. Show results for something like:

    format(12345.54321, "15,.5f")  -->  '  12,345.543,21'

Explain the interaction of sizes and lengths (which numbers are digits, which are length [I vote for length on overall, digits on precision]), and what happens with length-4 numbers -- I'd say explicitly that 1000 is shown as 1,000, despite style sheets that prefer 1000 and 10,000.

FWIW, I agree with pruebano: do the simplest easily usable thing, and provide a way to swap the commas and periods. The rest can be ponied in by string processing.

--Scott David Daniels scott.dani...@acm.org
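The width and precision questions above can be checked directly on a Python version where the ',' option exists (3.1 or later is assumed here). This sketch only reports how format() behaves in such a version; it is not a statement of what the draft under discussion specified. There, the width counts the whole output including separators, the precision counts digits after the decimal point, and digits after the decimal point are not grouped.

```python
# Width 15 is the total field length; precision 5 is digits after the point.
wide = format(12345.54321, "15,.5f")

# A four-digit integer still gets its separator.
thousand = format(1000, ",d")

# The comma counts toward the minimum width of 8.
padded = format(1234, "8,d")

print(repr(wide))      # '   12,345.54321'
print(repr(thousand))  # '1,000'
print(repr(padded))    # '   1,234'
```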