Re: PEP 3131: Supporting Non-ASCII Identifiers
Istvan Albert wrote: On May 19, 3:33 am, Martin v. Löwis [EMAIL PROTECTED] wrote: That would be invalid syntax since the third line is an assignment with target identifiers separated only by spaces. Plus, the identifier starts with a number (even though 6 is not DIGIT SIX, but FULLWIDTH DIGIT SIX, it's still of category Nd, and can't start an identifier). Actually both of these issues point to the real problem with this PEP. I knew about them (note that the colon is also missing) but alas I couldn't fix them. My editor could not remove a space or add a colon anymore; it would immediately change the rest of the characters to something crazy. (Of course now someone might feel compelled to state that this is an editor problem, but I digress; the reality is that features need to adapt to reality, all the more so because, had I used a different editor, I'd still be unable to write these characters.) The reality is that the few users who care about having Chinese in their code *will* be using an editor that supports them. -- http://mail.python.org/mailman/listinfo/python-list
Re: PEP 3131: Supporting Non-ASCII Identifiers
On May 17, 5:03 pm, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: On May 16, 6:38 pm, [EMAIL PROTECTED] wrote: Are you worried that some 3rd-party package you have included in your software will have some non-ascii identifiers buried in it somewhere? Surely that is easy to check for? Far easier than checking that it doesn't have some trojan code in it, it seems to me. What do you mean, check for? If, say, numeric starts using math characters (as has been suggested), I'm not exactly going to stop using numeric. It'll still be a lot better than nothing, just slightly less better than it used to be. The PEP explicitly states that no non-ascii identifiers will be permitted in the standard library. The opinions expressed here seem almost unanimous that non-ascii identifiers are a bad idea in any sort of shared public code. Why do you think the occurrence of non-ascii identifiers in Numpy is likely? And I'm often not creating a stack trace procedure, I'm using the built-in python procedure. And I'm often dealing with mailing lists, Usenet, etc where I don't know ahead of time what the other end's display capabilities are, how to fix them if they don't display what I'm trying to send, whether intervening systems will mangle things, etc. I think we all are in this position. I always send plain text mail to mailing lists, people I don't know etc. But that doesn't mean that email software should be constrained to only 7-bit plain text, no attachments! I frequently use such capabilities when they are appropriate. Sure. But when you're talking about maintaining code, there's a very high value to having all the existing tools work with it whether they're wide-character aware or not. I agree. On Windows I often use Notepad to edit python files. (There goes my credibility! :-) So I don't like tab-only indent proposals that assume I can set tabs to be an arbitrary number of spaces. But tab-only indentation would affect every python program and every python programmer. 
In the case of non-ascii identifiers, the potential gains are so big for non-English speakers, and (IMO) the difficulty of working with non-ascii identifiers times the probability of having to work with them so low, that the former clearly outweighs the latter. If your response is, yes, but look at the problems html email, virus-infected attachments etc. cause, the situation is not the same. You have little control over what kind of email people send you but you do have control over what code, libraries, patches, you choose to use in your software. If you want to use ascii-only, do it! Nobody is making you deal with non-ascii code if you don't want to. Yes. But it's not like this makes things so horribly awful that it's worth my time to reimplement large external libraries. I remain at -0 on the proposal; it'll cause some headaches for the majority of current Python programmers, but it may have some benefits to a sizeable minority This is the crux of the matter I think. That non-ascii identifiers will spread like a virus, infecting program after program until every piece of Python code is nothing but a mass of writhing unintelligible non-ascii characters. (OK, maybe I am overstating a little. :-) I (and I think other proponents) don't think this is likely to happen, and the benefits to non-English speakers of being able to write maintainable code far outweigh the very rare case when it does occur. and may help bring in new coders. And it's not going to cause flaming catastrophic death or anything.
Re: PEP 3131: Supporting Non-ASCII Identifiers
@yahoo.com wrote: Perhaps, but the treatment by your mail/news software plus the delightful Google Groups of the original text (which seemed intact in the original, although I don't have the fonts for the content) would suggest that not just social or cultural issues would be involved. The fact my Outlook changed the text is irrelevant for something related to Python. On the contrary, it cuts to the heart of the problem. There are hundreds of tools out there that programmers use, and mailing lists are certainly an incredibly valuable tool--introducing a change that makes code more likely to be silently mangled seems like a negative. In such a case, the Python indentation should be rejected (quite interesting you removed from my post the part mentioning it). I can promise there are Korean groups and there are no problems at all in using Hangul (the Korean writing). Javier - http://www.texytipografia.com
Re: PEP 3131: Supporting Non-ASCII Identifiers
Providing a method that would translate an arbitrary string into a valid Python identifier would be helpful. It would be even more helpful if it could provide a way of converting untranslatable characters. However, I suspect that the translate (normalize?) routine in the unicode module will do.

Not at all. Unicode normalization only unifies different spellings of the same character. For transliteration, no simple algorithm exists, as it generally depends on the language. However, if you just want any kind of ASCII string, you can use the Unicode error handlers (PEP 293). For example, the program

    import unicodedata, codecs

    def namereplace(exc):
        if isinstance(exc, (UnicodeEncodeError, UnicodeTranslateError)):
            s = u""
            for c in exc.object[exc.start:exc.end]:
                # replace each unencodable character by its Unicode name
                s += "N_"+unicode(unicodedata.name(c).replace(" ","_"))+"_"
            return (s, exc.end)
        else:
            raise TypeError("can't handle %s" % exc.__class__.__name__)

    codecs.register_error("namereplace", namereplace)

    print u"Schl\xfcssel".encode("ascii", "namereplace")

prints SchlN_LATIN_SMALL_LETTER_U_WITH_DIAERESIS_ssel.

HTH, Martin
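Editorial sketch: the handler in the message above is Python 2 code. A rough Python 3 equivalent might look like the following. The handler name `name_underscore` is made up for this example (Python 3.5 and later already ship a built-in `namereplace` handler, which produces `\N{...}` escapes instead), and both the `U%04X` fallback for unnamed characters and the hyphen replacement are assumptions added here so that the result can serve as an identifier.

```python
import codecs
import unicodedata


def name_underscore(exc):
    """Replace each unencodable character with N_<UNICODE_NAME>_."""
    if isinstance(exc, (UnicodeEncodeError, UnicodeTranslateError)):
        parts = []
        for ch in exc.object[exc.start:exc.end]:
            # Fall back to the code point when a character has no name.
            name = unicodedata.name(ch, "U%04X" % ord(ch))
            parts.append("N_" + name.replace(" ", "_").replace("-", "_") + "_")
        return ("".join(parts), exc.end)
    raise TypeError("can't handle %s" % type(exc).__name__)


# Hypothetical handler name, chosen to avoid shadowing the built-in one.
codecs.register_error("name_underscore", name_underscore)

result = "Schl\xfcssel".encode("ascii", "name_underscore").decode("ascii")
print(result)  # SchlN_LATIN_SMALL_LETTER_U_WITH_DIAERESIS_ssel
```

As in the original, this only yields some ASCII spelling; it is not a transliteration.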
Re: PEP 3131: Supporting Non-ASCII Identifiers
But you're making a strawman argument by using extended ASCII characters that would work anyhow. How about debugging this (I wonder will it even make it through?) : class 6자회담관련론조 6자회 = 0 6자회담관련 고귀 명=10 That would be invalid syntax since the third line is an assignment with target identifiers separated only by spaces. Plus, the identifier starts with a number (even though 6 is not DIGIT SIX, but FULLWIDTH DIGIT SIX, it's still of category Nd, and can't start an identifier). Regards, Martin
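Editorial sketch: the point about category Nd can be checked in a Python 3 interpreter, where `str.isidentifier()` applies the identifier rules that were eventually adopted. The variable names below are illustrative only.

```python
import unicodedata

fullwidth_six = "\uff16"

print(unicodedata.name(fullwidth_six))      # FULLWIDTH DIGIT SIX
print(unicodedata.category(fullwidth_six))  # Nd
print(unicodedata.category("6"))            # Nd

# A decimal digit of either kind cannot start an identifier ...
starts_with_digit = (fullwidth_six + "자회").isidentifier()
# ... whereas the Hangul letters on their own are acceptable.
starts_with_letter = "자회".isidentifier()

print(starts_with_digit, starts_with_letter)  # False True
```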
Re: PEP 3131: Supporting Non-ASCII Identifiers
On Fri, 18 May 2007 06:28:03 +0200, Martin v. Löwis wrote: [excellent as always exposition by Martin] Thanks, Martin. P.S. Anybody who wants to play with generating visualisations of the PEP, here are the functions I used: [code snippets] Thanks for those functions, too -- I've been exploring with them and am slowly coming to some understanding. -- Richard Hanson To many native-English-speaking developers well versed in other programming environments, Python is *already* a foreign language -- judging by the posts here in c.l.py over the years. ;-)
Re: PEP 3131: Supporting Non-ASCII Identifiers
Martin v. Löwis wrote: I've reported this before, but happily do it again: I have lived many years without knowing what a hub is, and what to pass means if it's not the opposite of to fail. Yet, I have used their technical meanings correctly all these years. I was not speaking of the more general (non-technical) meanings, but of the technical ones. The claim which I challenged was that people learn just the use (syntax) but not the meaning (semantics) of these terms. I think you are actually supporting my argument ;) -- René
Re: PEP 3131: Supporting Non-ASCII Identifiers
Martin v. Löwis wrote: Then get tools that match your working environment. Integration with existing tools *is* something that a PEP should consider. This one does not do that sufficiently, IMO. What specific tools should be discussed, and what specific problems do you expect? Systems that cannot display code parts correctly. I expect problems with unreadable tracebacks, for example. Also: Are existing tools that somehow process Python source code, e.g. to test whether it meets certain criteria (pylint & co.) or to aid in creating documentation (epydoc & co.), fully unicode-ready? -- René
Re: PEP 3131: Supporting Non-ASCII Identifiers
Martin v. Löwis wrote: Python code is written by many people in the world who are not familiar with the English language, or even well-acquainted with the Latin writing system. I believe that there is not a single programmer in the world who doesn't know ASCII. It isn't hard to learn the Latin alphabet and you have to know it anyway to use the keywords and the other ASCII characters to write numbers, punctuation etc. Most non-western alphabets have ASCII transcription rules and contain ASCII as a subset. On the other hand non-ascii identifiers lead to fragmentation and less understanding in the programming world so I don't like them. I also don't like non-ascii domain names where the same arguments apply. Let the data be expressed with Unicode but the logic with ASCII. -- Regards/Gruesse, Peter Maas, Aachen E-mail 'cGV0ZXIubWFhc0B1dGlsb2cuZGU=\n'.decode('base64')
Re: PEP 3131: Supporting Non-ASCII Identifiers
On May 19, 3:33 am, Martin v. Löwis [EMAIL PROTECTED] wrote: That would be invalid syntax since the third line is an assignment with target identifiers separated only by spaces. Plus, the identifier starts with a number (even though 6 is not DIGIT SIX, but FULLWIDTH DIGIT SIX, it's still of category Nd, and can't start an identifier). Actually both of these issues point to the real problem with this PEP. I knew about them (note that the colon is also missing) but alas I couldn't fix them. My editor could not remove a space or add a colon anymore; it would immediately change the rest of the characters to something crazy. (Of course now someone might feel compelled to state that this is an editor problem, but I digress; the reality is that features need to adapt to reality, all the more so because, had I used a different editor, I'd still be unable to write these characters.) i.
Re: PEP 3131: Supporting Non-ASCII Identifiers
Martin v. Löwis [EMAIL PROTECTED] writes: Now I understand it is meaning 12 in Merriam-Webster's dictionary, a) to decline to bid, double, or redouble in a card game, or b) to let something go by without accepting or taking advantage of it. I never thought of it as having that meaning. I thought of it in the sense of going by something without stopping, like I passed a post office on my way to work today.
Re: PEP 3131: Supporting Non-ASCII Identifiers
Martin v. Löwis [EMAIL PROTECTED] writes: Integration with existing tools *is* something that a PEP should consider. This one does not do that sufficiently, IMO. What specific tools should be discussed, and what specific problems do you expect? Emacs, whose unicode support is still pretty weak.
Re: PEP 3131: Supporting Non-ASCII Identifiers
Hendrik van Rooyen [EMAIL PROTECTED] wrote: Now look me in the eye and tell me that you find the mix of proper German and English keywords beautiful. I can't admit that, but I find that using German class and method names is beautiful. The rest around it (keywords and names from the standard library) are not English - they are Python. MvL: (look me in the eye and tell me that def is an English word, or that getattr is one) HvR: LOL - true - but a broken down assembler programmer like me does not use getattr - and def is short for define, and for and while and in are not German. After an intense session of omphaloscopy, I would like another bite at this cherry. I think my problem is something like this - when I see a line of code like: def frobnitz(): I do not actually see the word def - I see something like: define a function with no arguments called frobnitz This expansion process is involuntary, and immediate in my mind. And this is immediately followed by an irritated reaction, like: WTF is frobnitz? What is it supposed to do? What Idiot wrote this? Similarly, when I encounter the word getattr - it is immediately expanded to get attribute and this expansion is kind of dependent on another thing, namely that my mind is in English mode - I refer here to something that only happens rarely, but with devastating effect, experienced only by people who can read more than one language - I am referring to the phenomenon that you look at an unfamiliar piece of writing on say a signboard, with the wrong language switch set in your mind - and you cannot read it, it makes no sense for a second or two - until you kind of step back mentally and have a more deliberate look at it, when it becomes obvious that it's not, say, English, but Afrikaans, or German, or vice versa. So in a sense, I can look you in the eye and assert that def and getattr are in fact English words... 
(for me, that is) I suppose that this one language track - mindedness of mine is why I find the mix of keywords and German or Afrikaans so abhorrent - I cannot really help it, it feels as if I am eating a sandwich, and that I bite on a stone in the bread. - It just jars. Good luck with your PEP - I don't support it, but it is unlikely that the Python-dev crowd and GvR would be swayed much by the opinions of the egregious HvR. Aesthetics aside, I think that the practical maintenance problems (especially remote maintenance) are the rock on which this ship could founder. - Hendrik -- Philip Larkin (English Poet) : They fuck you up, your mom and dad - They do not mean to, but they do. They fill you with the faults they had, and add some extra, just for you.
Re: PEP 3131: Supporting Non-ASCII Identifiers
Sion Arrowsmith [EMAIL PROTECTED] wrote: Hvr: Would not like it at all, for the same reason I don't like re's - It looks like random samples out of alphabet soup to me. What I meant was, would the use of foreign identifiers look so horrible to you if the core language had fewer English keywords? (Perhaps Perl, with its line-noise, was a poor choice of example. Maybe Lisp would be better, but I'm not so sure of my Lisp as to make such an assertion for it.) I suppose it would jar less - but I avoid such languages, as the whole thing kind of jars - I am not on the python group for nothing.. : - ) - Hendrik
Re: PEP 3131: Supporting Non-ASCII Identifiers
Martin v. Löwis [EMAIL PROTECTED] writes: If you doubt the claim, please indicate which of these three aspects you doubt: 1. there are programmers who desire to define classes and functions with names in their native language. 2. those developers find the code clearer and more maintainable than if they had to use English names. 3. code clarity and maintainability is important. I think it can damage clarity and maintainability and if there's so much demand for it then I'd propose this compromise: non-ascii identifiers are allowed but they produce a compiler warning message (including from eval and exec). You can suppress the warning message with a command line option.
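Editorial sketch: the compromise proposed above never became part of the compiler, but its behaviour can be approximated in user code with the standard `ast` and `warnings` modules (Python 3). The category `NonAsciiIdentifierWarning` and the function `warn_on_non_ascii` are hypothetical names invented for this illustration, and a warnings filter stands in for the suggested command-line switch.

```python
import ast
import warnings


class NonAsciiIdentifierWarning(UserWarning):
    """Hypothetical category for this sketch; Python defines no such warning."""


def warn_on_non_ascii(source, filename="<string>"):
    """Parse source and warn for every node that carries a non-ASCII name."""
    for node in ast.walk(ast.parse(source, filename)):
        # Fields that hold identifiers on the common node types.
        for field in ("id", "name", "arg", "attr"):
            value = getattr(node, field, None)
            if isinstance(value, str) and any(ord(ch) > 127 for ch in value):
                warnings.warn(
                    "non-ASCII identifier %r in %s" % (value, filename),
                    NonAsciiIdentifierWarning,
                )


code = "class T\u00fcrstock(object):\n    pass\n"

# Default behaviour of the proposal: the warning is shown.
with warnings.catch_warnings(record=True) as shown:
    warnings.simplefilter("always")
    warn_on_non_ascii(code)

# Opting out: a filter silences the category.
with warnings.catch_warnings(record=True) as suppressed:
    warnings.simplefilter("ignore", NonAsciiIdentifierWarning)
    warn_on_non_ascii(code)

print(len(shown), len(suppressed))  # 1 0
```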
Re: PEP 3131: Supporting Non-ASCII Identifiers
Hallöchen! Martin v. Löwis writes: In [EMAIL PROTECTED], Nick Craig-Wood wrote: My initial reaction is that it would be cool to use all those great symbols. A variable called OHM etc! This is a nice candidate for homoglyph confusion. There's the Greek letter omega (U+03A9) Ω and the SI unit symbol (U+2126) Ω, and I think some omegas in the mathematical symbols area too. Under the PEP, identifiers are converted to normal form NFC, and we have

    py> unicodedata.normalize("NFC", u"\u2126")
    u'\u03a9'

So, OHM SIGN compares equal to GREEK CAPITAL LETTER OMEGA. It can't be confused with it - it is equal to it by the proposed language semantics. So different unicode sequences in the source code can denote the same identifier? Tschö, Torsten. -- Torsten Bronger, aquisgrana, europa vetus Jabber ID: [EMAIL PROTECTED] (See http://ime.webhop.org for ICQ, MSN, etc.)
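Editorial sketch: the answer to the closing question can be observed in Python 3 as released, which normalizes identifiers with NFKC while parsing (the thread discusses NFC; for OHM SIGN both forms give the same result). The variable names are illustrative only.

```python
import unicodedata

ohm_sign = "\u2126"     # OHM SIGN
greek_omega = "\u03a9"  # GREEK CAPITAL LETTER OMEGA

# As plain strings, the two characters differ ...
differ_as_strings = ohm_sign != greek_omega
# ... but normalization maps the unit symbol onto the Greek letter.
nfc_equal = unicodedata.normalize("NFC", ohm_sign) == greek_omega

# Assigning to the OHM SIGN spelling binds the normalized name.
namespace = {}
exec(ohm_sign + " = 50", namespace)
stored_under_omega = greek_omega in namespace
stored_under_ohm = ohm_sign in namespace

print(differ_as_strings, nfc_equal)          # True True
print(stored_under_omega, stored_under_ohm)  # True False
```

So two different code point sequences in the source do end up denoting one identifier.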
Re: PEP 3131: Supporting Non-ASCII Identifiers
Martin v. Löwis [EMAIL PROTECTED] wrote: 3) Is or will there be a definitive and exhaustive listing (with bitmap representations of the glyphs to avoid the font issues) of the glyphs that the PEP 3131 would allow in identifiers? (Does this question even make sense?) As for the list I generated in HTML: It might be possible to make it include bitmaps instead of HTML character references, but doing so is a licensing problem, as you need a license for a font that has all these characters. If you want to lookup a specific character, I recommend to go to the Unicode code charts, at http://www.unicode.org/charts/ My understanding is also that there are several east-asian characters that display quite differently depending on whether you are in Japan, Taiwan or mainland China. So much differently that for example a Japanese person will not be able to recognize a character rendered in the Taiwanese or mainland Chinese way. -- Thomas Bellman, Lysator Computer Club, Linköping University, Sweden Adde parvum parvo magnus acervus erit ! bellman @ lysator.liu.se (From The Mythical Man-Month) ! Make Love -- Nicht Wahr!
Re: PEP 3131: Supporting Non-ASCII Identifiers
Long and interesting discussion with different points of view. Personally, even if the PEP goes ahead (and it's accepted), I'll continue to use identifiers as I do currently. But I understand those who want to be able to use characters from their own language. * for people who are not expert developers (non-pros, or in a learning context), to be able to use names that have meaning, and for pro developers wanting to give a clear domain-specific meaning - mainly for languages not based on Latin characters, where the problem must be exacerbated. They can already use unicode in strings (including documentation ones). * for exchanging with other programming languages having such identifiers... when they are really used (I include binding of table/column names in relational databases). * (not read, but I think present) this will allow developers to lock the code so that it could not be easily taken/delocalized anywhere by anybody. In the discussion I've seen that the problem of mixing chars having different unicode numbers but the same representation (e.g. omega) is resolved (use of a unicode attribute linked to representation, AFAIU). I've seen (on fclp) a post about speed; it should be verified, but I'm not sure we will lose speed with unicode identifiers. On unicode editing, we have in 2007 enough correct editors supporting unicode (I configure my Windows/Linux editors to use utf-8 by default). I share the concern about being able to read code from a project which may use such identifiers (I don't read Cyrillic, nor kanji or Hindi), but this will just give freedom to users. This can be a pain for me in some cases, but is this a valid argument for forbidding it to other people who feel the need? IMHO what we should have if the PEP goes on: * reworking of tracebacks to have a general option (like -T) to ensure tracebacks print only pure ascii, to avoid encoding problems when displaying errors on terminals. 
* a possibility to specify for modules that they must *define* only ascii-based names, like a from __future__ import asciionly. To be able to enforce this policy in projects which request it. * and, as many wrote, enforce that standard Python libraries use only ascii identifiers.
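Editorial sketch: no -T option exists, but the ASCII-only traceback wished for in the list above can be approximated in Python 3 by formatting the traceback and encoding it with the standard `backslashreplace` error handler. The helper name `ascii_traceback` and the demo function are made up for this illustration.

```python
import traceback


def ascii_traceback(exc):
    """Format exc with its traceback, escaping every non-ASCII character."""
    text = "".join(traceback.format_exception(type(exc), exc, exc.__traceback__))
    return text.encode("ascii", "backslashreplace").decode("ascii")


# Demo: a function with a non-ASCII name that raises.
source = "def T\u00fcrstock():\n    raise ValueError('kaputt')\n"
namespace = {}
exec(compile(source, "<demo>", "exec"), namespace)

try:
    namespace["T\u00fcrstock"]()
except ValueError as exc:
    report = ascii_traceback(exc)

print(report)  # the function name appears as T\xfcrstock
```

The escaped name stays unambiguous on a terminal that cannot display the original character, at the cost of readability for people who can.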
Re: PEP 3131: Supporting Non-ASCII Identifiers
Hallöchen! Laurent Pointal writes: [...] Personally, even if the PEP goes ahead (and it's accepted), I'll continue to use identifiers as I do currently. [...] Me too (mostly), although I do like the PEP. While many people have pointed out possible issues of the PEP, only few have tried to estimate its actual impact. I don't think that it will do harm to Python code because the programmers will know when it's appropriate to use it. The potential trouble is too obvious to be ignored accidentally. And in the case of a bad programmer, you have more serious problems than flawed identifier names, really. But for private utilities for example, such identifiers are really a nice thing to have. The same is true for teaching in some cases. And the small simulation program in my thesis would have been better with some α and φ. At least, the program would be closer to the equations in the text then. [...] * a possibility to specify for modules that they must *define* only ascii-based names, like a from __future__ import asciionly. To be able to enforce this policy in projects which request it. Please don't. We're all adults. If a maintainer is really concerned about such a thing, he should write a trivial program that ensures it. After all, there are some other coding guidelines too that could be enforced this way but aren't, for good reason. Tschö, Torsten. -- Torsten Bronger, aquisgrana, europa vetus Jabber ID: [EMAIL PROTECTED] (See http://ime.webhop.org for ICQ, MSN, etc.)
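Editorial sketch: the "trivial program" suggested above could look roughly like this in Python 3, using the standard `tokenize` module. The function name `non_ascii_names` is invented for the example; it reports identifiers only, so non-ASCII text in strings and comments is left alone.

```python
import io
import tokenize


def non_ascii_names(source):
    """Return (line number, name) for every identifier with non-ASCII characters."""
    found = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME and any(ord(ch) > 127 for ch in tok.string):
            found.append((tok.start[0], tok.string))
    return found


sample = 'label = "T\u00fcr"  # non-ASCII in a string is fine\n\u03b1 = 0.5\n'
print(non_ascii_names(sample))  # [(2, 'α')]
```

A project that wants an ASCII-only policy could run such a check over its files in a commit hook or test suite.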
Re: PEP 3131: Supporting Non-ASCII Identifiers
Hendrik van Rooyen wrote: I suppose that this one language track - mindedness of mine is why I find the mix of keywords and German or Afrikaans so abhorrent - I cannot really help it, it feels as if I am eating a sandwich, and that I bite on a stone in the bread. - It just jars. Please come to Vienna and learn the local slang. You would be surprised how beautiful and expressive a language mixed up of a lot of very different languages can be. Same for music. It's the secret of success of the music from Vienna. It's just a mix up of all the different cultures once living in a big multicultural kingdom. A mix up of Python key words and German identifiers feels very natural for me. I live in cultural diversity and richness and love it. Gregor
Re: PEP 3131: Supporting Non-ASCII Identifiers
On May 17, 2:30 pm, Gregor Horvath [EMAIL PROTECTED] wrote: Is there any difference for you in debugging these code snippets? class Türstock(object): Of course there is, how do I type the ü ? (I can copy/paste for example, but that gets old quick). But you're making a strawman argument by using extended ASCII characters that would work anyhow. How about debugging this (I wonder will it even make it through?) : class 6자회담관련론조 6자회 = 0 6자회담관련 고귀 명=10 (I don't know what it means, just copied over some words from a Japanese news site, but the first thing it did was mess up my editor, which would not type the colon anymore) i.
Re: PEP 3131: Supporting Non-ASCII Identifiers
Istvan Albert [EMAIL PROTECTED] wrote: How about debugging this (I wonder will it even make it through?) : class 6??? 6?? = 0 6? ?? ?=10 This question is more or less what a Korean who doesn't speak English would ask if he had to debug a program written in English. (I don't know what it means, just copied over some words from a Japanese news site, A Japanese speaking Korean, it seems. :-) Javier -- http://www.texytipografia.com
Re: PEP 3131: Supporting Non-ASCII Identifiers
On 18 May, 18:42, Javier Bezos [EMAIL PROTECTED] wrote: Istvan Albert [EMAIL PROTECTED] wrote: How about debugging this (I wonder will it even make it through?) : class 6??? 6?? = 0 6? ?? ?=10 This question is more or less what a Korean who doesn't speak English would ask if he had to debug a program written in English. Perhaps, but the treatment by your mail/news software plus the delightful Google Groups of the original text (which seemed intact in the original, although I don't have the fonts for the content) would suggest that not just social or cultural issues would be involved. It's already more difficult than it ought to be to explain to people why they have trouble printing text to the console, for example, and if one considers issues with badly configured text editors putting the wrong character values into programs, even if Python complains about it, there's still going to be some explaining to do. One thing that some people already dislike about Python is the editing discipline required. Although I don't have much time for people whose coding skills involve random edits using badly configured editors, trashing the indentation and the appearance of the code (regardless of the language involved), we do need to consider the need to bring people up to speed gracefully by encouraging the proper use of tools, and so on, all without making it seem really difficult and discouraging people from learning the language. Paul
Re: PEP 3131: Supporting Non-ASCII Identifiers
Istvan Albert wrote: On May 17, 2:30 pm, Gregor Horvath [EMAIL PROTECTED] wrote: Is there any difference for you in debugging these code snippets? class Türstock(object): Of course there is, how do I type the ü ? (I can copy/paste for example, but that gets old quick). I doubt that you can debug the code without Unicode chars. It seems that you do not understand German and therefore you do not know what the purpose of this program is. Can you tell me if there is an error in the snippet without Unicode? I would refuse to try to debug a program that I do not understand. Avoiding Unicode does not help a bit in this regard. Gregor
Re: PEP 3131: Supporting Non-ASCII Identifiers
Paul Boddie wrote: Perhaps, but the treatment by your mail/news software plus the delightful Google Groups of the original text (which seemed intact in the original, although I don't have the fonts for the content) would suggest that not just social or cultural issues would be involved. I do not see the point. Whether my editor or newsreader displays the text correctly or not makes no difference to me, since I do not understand a word of it anyway. It's a meaningless stream of bits for me. It's safe to assume that for people who find this meaningful, their setup will display it correctly. Otherwise they could not work with their computer anyway. Until now I have not found a single computer in my German domain that cannot display: ß. Gregor
Re: PEP 3131: Supporting Non-ASCII Identifiers
This question is more or less what a Korean who doesn't speak English would ask if he had to debug a program written in English. Perhaps, but the treatment by your mail/news software plus the delightful Google Groups of the original text (which seemed intact in the original, although I don't have the fonts for the content) would suggest that not just social or cultural issues would be involved. The fact my Outlook changed the text is irrelevant for something related to Python. And just remember how Google mangled the indentation of Python code some time ago. This was a technical issue which has been solved, and no doubt my laziness (I didn't switch to Unicode) won't prevent non-ASCII identifiers from being properly shown in general. Javier - http://www.texytipografia.com
Re: PEP 3131: Supporting Non-ASCII Identifiers
On May 18, 1:47 pm, Javier Bezos [EMAIL PROTECTED] wrote: This question is more or less what a Korean who doesn't speak English would ask if he had to debug a program written in English. Perhaps, but the treatment by your mail/news software plus the delightful Google Groups of the original text (which seemed intact in the original, although I don't have the fonts for the content) would suggest that not just social or cultural issues would be involved. The fact my Outlook changed the text is irrelevant for something related to Python. On the contrary, it cuts to the heart of the problem. There are hundreds of tools out there that programmers use, and mailing lists are certainly an incredibly valuable tool--introducing a change that makes code more likely to be silently mangled seems like a negative. Of course, there are other benefits to the PEP, so I'm only barely opposed. But dismissing the fact that Outlook and other quite common tools may have severe problems with code seems naive (or disingenuous, but I don't think that's the case here).
Re: PEP 3131: Supporting Non-ASCII Identifiers
Gregor Horvath wrote: Paul Boddie wrote: Perhaps, but the treatment by your mail/news software plus the delightful Google Groups of the original text (which seemed intact in the original, although I don't have the fonts for the content) would suggest that not just social or cultural issues would be involved. I do not see the point. Whether my editor or newsreader displays the text correctly or not makes no difference to me, since I do not understand a word of it anyway. It's a meaningless stream of bits for me. But if your editor doesn't even bother to preserve those bits correctly, it makes a big difference. When 6자회담관련론조 becomes 6??? because someone's tool did the equivalent of unicode_obj.encode("iso-8859-1", "replace"), then the stream of bits really does become meaningless. (We'll see if the former identifier even resembles what I've just pasted later on, or whether it resembles the latter.) It's safe to assume that for people who find this meaningful, their setup will display it correctly. Otherwise they could not work with their computer anyway. Sure, it's all about editor discipline or tool discipline just as I wrote. I'm in favour of the PEP, generally, but I worry about the long explanations required when people find that their programs are now ill-formed because someone made a quick edit in a bad editor. Paul
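Editorial sketch: the mangling described above is easy to reproduce in Python 3, where the `unicode_obj` of the message is simply a `str`. The variable names are illustrative only.

```python
identifier = "6자회담관련론조"

# A tool that silently re-encodes to Latin-1 keeps the ASCII digit and
# turns each of the seven Hangul syllables into a question mark.
mangled = identifier.encode("iso-8859-1", "replace")
print(mangled)  # b'6???????'

# Decoding cannot bring the original characters back.
restored = mangled.decode("iso-8859-1")
print(restored == identifier)  # False
```

Every identifier that differs only in its non-Latin-1 characters collapses to the same run of question marks, which is why the damage cannot be undone afterwards.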
Re: PEP 3131: Supporting Non-ASCII Identifiers
Istvan Albert: But you're making a strawman argument by using extended ASCII characters that would work anyhow. How about debugging this (I wonder will it even make it through?) : class 6자회담관련론조 6자회 = 0 6자회담관련 고귀 명=10 That would be invalid syntax since the third line is an assignment with target identifiers separated only by spaces. Neil
Re: PEP 3131: Supporting Non-ASCII Identifiers
On May 13, 9:44 am, Martin v. Löwis [EMAIL PROTECTED] wrote: PEP 1 specifies that PEP authors need to collect feedback from the community. As the author of PEP 3131, I'd like to encourage comments to the PEP included below, either here (comp.lang.python), or to [EMAIL PROTECTED] In summary, this PEP proposes to allow non-ASCII letters as identifiers in Python. If the PEP is accepted, the following identifiers would also become valid as class, function, or variable names: Löffelstiel, changé, ошибка, or 売り場 (hoping that the latter one means counter). I notice that Guido has approved it, so I'm looking at what it would take to support it for Python FIT. The actual issue (for me) is translating labels for cell columns (and similar) into Python identifiers. After looking at the firestorm, I've come to the conclusion that the old methods need to be retained not only for backwards compatibility but also for people who want to translate existing fixtures. The guidelines in PEP 3131 for standard library code appear to be adequate for code that's going to be contributed to the community. I will most likely emphasize those in my documentation. Providing a method that would translate an arbitrary string into a valid Python identifier would be helpful. It would be even more helpful if it could provide a way of converting untranslatable characters. However, I suspect that the translate (normalize?) routine in the unicode module will do. John Roth Python FIT
Re: PEP 3131: Supporting Non-ASCII Identifiers
Sion Arrowsmith [EMAIL PROTECTED] wrote: Hendrik van Rooyen wrote: I still don't like the thought of the horrible mix of foreign identifiers and English keywords, coupled with the English sentence construction. How do you think you'd feel if Python had less in the way of (conventionally used) English keywords/builtins. Like, say, Perl? Would not like it at all, for the same reason I don't like re's - It looks like random samples out of alphabet soup to me. - Hendrik -- http://mail.python.org/mailman/listinfo/python-list
Re: PEP 3131: Supporting Non-ASCII Identifiers
Gregor Horvath [EMAIL PROTECTED] wrote: Hendrik van Rooyen wrote: It is not so much for technical reasons as for aesthetic ones - I find reading a mix of languages horrible, and I am kind of surprised by the strength of my own reaction. This is a matter of taste. I agree - and about perceptions of quality. Of what is good, and not good. - If you haven't yet, read Robert Pirsig's book: Zen and the Art of Motorcycle Maintenance In some programs I use German identifiers (not unicode). I and others like the mix. My customers can understand the code better. (They are only reading it) I can sympathise a little bit with a customer who tries to read code. Why that should be necessary, I cannot understand - does the stuff not work to the extent that the customer feels he has to help you? You do not talk as if you are incompetent, so I see no reason why the customer should want to meddle in what you have written, unless he is paying you to train him to program, and as Eric Brunel has pointed out, this mixing of languages is all right in a training environment. Beautiful is better than ugly Correct. But why do you think you should impose your taste on all of us? You misjudge me - the OP asked if I would use the feature, and I am speaking for myself when I explain why I would not use it. With this logic you should all drive Alfa Romeos! Actually no - this is not about logic - my post clearly stated that I was talking about feelings. And the only logic that applies to feelings is the incontrovertible fact that they exist, and that it makes good logical sense to acknowledge them, and to take that into account in one's actions. And as far as Alfas go - we have found here that they are rather soft - our dirt roads destroy them in no time. :-( - Hendrik -- http://mail.python.org/mailman/listinfo/python-list
Re: PEP 3131: Supporting Non-ASCII Identifiers
Gabriel Genellina [EMAIL PROTECTED] wrote: - Someone proposed using escape sequences of some kind, supported by editor plugins, so there is no need to modify the parser. I'm not sure whether my suggestion below is the same as or a variation on this. - Refactoring tools should let you rename foreign identifiers into ASCII only. A possible modification to the PEP would be to permit identifiers to also include \u and \U escape sequences (as some other languages already do). Then you could have a script easily (and reversibly) convert all identifiers to ascii or indeed any other encoding or subset of unicode using escapes only for the unrepresentable characters. I think this would remove several of the objections: such as being unable to tell at a glance whether someone is trying to spoof your variable names, or being unable to do minor maintenance on code using character sets which your editor doesn't support: you just run the script which would be included with every copy of Python to restrict the character set of the source files to whatever character set you feel happy with. The script should also be able to convert unrepresentable characters in strings and comments (although that last operation wouldn't be guaranteed reversible). Of course it doesn't do anything for the objection about such identifiers being ugly, but you can't have everything. -- http://mail.python.org/mailman/listinfo/python-list
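The conversion script suggested above could be as small as the following sketch. It works on raw text and makes no attempt to tell identifiers from strings or comments, so the round trip is only exact when the original file contains no literal \u or \U sequences of its own. The function names are invented here, and the escaped form is the notation proposed in the message, not something the Python parser accepts in identifiers.

```python
import re


def to_ascii_escapes(source):
    """Replace every non-ASCII character with a \\uXXXX or \\UXXXXXXXX escape."""
    def escape(match):
        code = ord(match.group())
        return "\\u%04x" % code if code <= 0xFFFF else "\\U%08x" % code
    return re.sub(r"[^\x00-\x7f]", escape, source)


def from_ascii_escapes(source):
    """Undo to_ascii_escapes(); also expands escapes that were already there."""
    def unescape(match):
        return chr(int(match.group(1) or match.group(2), 16))
    return re.sub(r"\\U([0-9a-fA-F]{8})|\\u([0-9a-fA-F]{4})", unescape, source)


original = "h\u00f6he = 1\n"
escaped = to_ascii_escapes(original)
print(escaped, end="")
print(from_ascii_escapes(escaped) == original)
```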
Re: PEP 3131: Supporting Non-ASCII Identifiers
Hendrik van Rooyen wrote: I can sympathise a little bit with a customer who tries to read code. Why that should be necessary, I cannot understand - does the stuff not work to the extent that the customer feels he has to help you? You do not talk as if you are incompetent, so I see no reason why the customer should want to meddle in what you have written, unless he is paying you to train him to program, and as Eric Brunel has pointed out, this mixing of languages is all right in a training environment. That is highly domain- and customer-specific logic that the customer knows best. (For example variation logic of window and door manufacturers) He has to understand the code, so that he can verify it's correct. We are in fact developing it together. Some customers are even coding this logic themselves. Some of them are not fluent in English, especially not in the computer domain. Translating the logic into documentation is a waste of time if the code is self-documenting and easy to grasp. (As Python usually is) But the code can only be self-documenting if it is written in the domain-specific language of the customer. Sometimes these are words that are not even used in general German. Even in German, different customers name the same thing with different words. Talking and coding in the language of the customer is a huge benefit. Gregor -- http://mail.python.org/mailman/listinfo/python-list
PEP 3131: Supporting Non-ASCII Identifiers
Hi All, In summary, this PEP proposes to allow non-ASCII letters as identifiers in Python. First of all, I would like to congratulate Martin on having started one of the most active threads (flame wars? :-D ) in python-list history. Scanning the list from January 2000 to now, this is the current Python Premier League for Posts (truncated at position 10):
01) merits of Lisp vs Python | 832 | Mark Tarver | December-2006
02) For review: PEP 308 - If-then-else expression | 728 | Guido van Rossum | February-2003
03) Python syntax in Lisp and Scheme | 665 | mike420 at ziplip.com | October-2003
04) Python from Wise Guy's Viewpoint | 495 | mike420 at ziplip.com | October-2003
05) Microsoft Hatred FAQ | 478 | Xah Lee | October-2005
06) Why is Python popular, while Lisp and Scheme aren't? | 430 | Oleg | November-2002
07) Xah Lee's Unixism | 397 | Pascal Bourguignon | August-2004
08) PEP 285: Adding a bool type | 361 | Guido van Rossum | March-2002
09) Jargons of Info Tech industry | 350 | Xah Lee | August-2005
10) PEP 3131: Supporting Non-ASCII Identifiers | 326 | Martin v. Lowis | May-2007
(It may come out garbled in the mail, so for those interested I attach the results in a small text file which contains the first 50 positions.) It has been generated with a simple Python script: you can find it at the end of the message. It's slow as a turtle (mainly because of the use of urllib and the sloppiness of my internet connection yesterday evening), but it works. I will gladly accept any suggestions for improving the script, as I am only an amateur Python programmer. So, please provide feedback, e.g. perhaps by answering these questions: - should non-ASCII identifiers be supported? why? +1, obviously. As an external observer, it has been extremely interesting to follow all the discussions that this PEP raised.
It has also been funny: reading some of the posts, it seemed to me that my grandmother knows more about Unicode than some of the conclusions drawn there would suggest :-D :-D . But keep them coming, they are a valuable resource for low-skilled programmers like me, there is always something new to learn, really. - would you use them if it was possible to do so? in what cases? I will for my personal projects and for our internal applications that will not go public. As for the usual objection: I think your argument about isolated projects is flawed. It is not at all unusual for code that was never intended to be public, whose authors would have sworn that it will never ever need to be read by anyone except themselves, to surprisingly go public at some point in the future. raise NoWayItWillGoPublicError For Public Domain code, I will surely stick with the standard coding style we have right now. I thought we were all adults here. I can just imagine what would happen if we all gathered around a table for a Python dinner: as soon as the PEP 3131 discussion popped up, we would start throwing food at each other like puckish 5-year-old boys :-D :-D Andrea. Imagination Is The Only Weapon In The War Against Reality. http://xoomer.virgilio.it/infinity77/
*** Python Premier League For Posts ***
POST | SCORE | AUTHOR | DATE
01) merits of Lisp vs Python | 832 | Mark Tarver | December-2006
02) For review: PEP 308 - If-then-else expression | 728 | Guido van Rossum | February-2003
03) Python syntax in Lisp and Scheme | 665 | mike420 at ziplip.com | October-2003
04) Python from Wise Guy's Viewpoint | 495 | mike420 at ziplip.com | October-2003
05) Microsoft Hatred FAQ | 478 | Xah Lee | October-2005
06) Why is Python popular, while Lisp and Scheme aren't? | 430 | Oleg | November-2002
07) Xah Lee's Unixism | 397 | Pascal Bourguignon | August-2004
08) PEP 285: Adding a bool type | 361 | Guido van Rossum | March-2002
09) Jargons of Info Tech industry | 350 | Xah Lee | August-2005
10) PEP 3131: Supporting Non-ASCII Identifiers | 326 | Martin v. Löwis | May-2007
11) Xah's Edu Corner: What is Expressiveness in a Computer Langu | 306 | Xah Lee | March-2006
12
Re: PEP 3131: Supporting Non-ASCII Identifiers
PEP 3131 uses a similar definition to C# except that PEP 3131 disallows formatting characters (category Cf). See section 9.4.2 of http://www.ecma-international.org/publications/standards/Ecma-334.htm UAX#31 discusses formatting characters in 2.2, and recognizes that there might be good reasons to allow (and ignore) them; however, it recommends against doing so except in special cases. So I decided to disallow them. Regards, Martin -- http://mail.python.org/mailman/listinfo/python-list
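To see which characters the formatting category covers, the standard `unicodedata` module can report them. This is only a small sketch; the helper name `format_characters` is made up, and it reports category Cf characters rather than deciding whether a particular Python version accepts them, since that depends on the Unicode data the interpreter ships with.

```python
import unicodedata


def format_characters(name):
    """List (position, character name) for every category-Cf character."""
    return [
        (i, unicodedata.name(ch, "U+%04X" % ord(ch)))
        for i, ch in enumerate(name)
        if unicodedata.category(ch) == "Cf"
    ]


# An invisible LEFT-TO-RIGHT MARK hidden between two ordinary letters.
print(format_characters("a\u200eb"))
print(format_characters("ab"))
```

Such characters are invisible in most editors, which is one practical argument for rejecting them in identifiers.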
Re: PEP 3131: Supporting Non-ASCII Identifiers
Now look me in the eye and tell me that you find the mix of proper German and English keywords beautiful. I can't admit that, but I find that using German class and method names is beautiful. The rest around it (keywords and names from the standard library) are not English - they are Python. (look me in the eye and tell me that def is an English word, or that getattr is one) Regards, Martin -- http://mail.python.org/mailman/listinfo/python-list
Re: PEP 3131: Supporting Non-ASCII Identifiers
A possible modification to the PEP would be to permit identifiers to also include \u and \U escape sequences (as some other languages already do). Several languages do that (e.g. C and C++), but I deliberately left this out, as I cannot see it working in a practical way. Also, it could be added later as another extension if there is an actual need. I think this would remove several of the objections: such as being unable to tell at a glance whether someone is trying to spoof your variable names, If you are willing to run a script on the patch you receive, you can perform that check even without having support for the \u syntax in the language - either you convert to the \u notation, and then check manually (converting back if all is fine), or you have an automated check (e.g. at commit time) that checks for conformance to the style guide. or being unable to do minor maintenance on code using character sets which your editor doesn't support: you just run the script which would be included with every copy of Python to restrict the character set of the source files to whatever character set you feel happy with. The script should also be able to convert unrepresentable characters in strings and comments (although that last operation wouldn't be guaranteed reversible). Again, if it's reversible, you don't need support for it in the language. You convert to your editor's supported Unicode subset, edit, then convert back. However, I somewhat doubt that the case "my editor cannot display my source code" is likely to occur: if the editor cannot display it, you likely have a ban on those characters anyway. Regards, Martin -- http://mail.python.org/mailman/listinfo/python-list
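The automated style-guide check mentioned above can be built on the standard `tokenize` module, so that non-ASCII text in strings and comments is not flagged. This is a sketch for Python 3.7 or later (it uses `str.isascii`); the function name `non_ascii_identifiers` and the idea of calling it from a commit hook are assumptions made for illustration.

```python
import io
import tokenize


def non_ascii_identifiers(source):
    """Return (line number, name) for every identifier with non-ASCII characters."""
    found = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        # NAME tokens cover identifiers and keywords; comments and
        # string literals have their own token types and are skipped.
        if tok.type == tokenize.NAME and not tok.string.isascii():
            found.append((tok.start[0], tok.string))
    return found


sample = "h\u00f6he = 1\nbreite = 2  # K\u00e4se\nname = 'L\u00f6ffel'\n"
for line_no, name in non_ascii_identifiers(sample):
    print("line %d: %s" % (line_no, name))
```

A project that bans such identifiers could reject a patch whenever the returned list is not empty.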
Re: PEP 3131: Supporting Non-ASCII Identifiers
Martin v. Löwis wrote: I can't admit that, but I find that using German class and method names is beautiful. The rest around it (keywords and names from the standard library) are not English - they are Python. (look me in the eye and tell me that def is an English word, or that getattr is one) He's got a point (a small one though). For example: - self (can be changed though) - is - with - isinstance - try Regards, Björn -- BOFH excuse #435: Internet shut down due to maintenance -- http://mail.python.org/mailman/listinfo/python-list
Re: PEP 3131: Supporting Non-ASCII Identifiers
Consequently, Python's keywords and even the standard library can exist with names being just symbols for many people. I already told that on the py3k list: Until a week ago, I didn't know why pass was chosen for the no action statement - with all my English knowledge, I still could not understand why the opposite of fail should mean no action. Still, I have been using pass for more than 10 years now, without ever questioning what it means in English, and I've successfully used it as a token. Except for the first draft of Das Python-Buch, where I, from memory, thought the statement should be skip; I remembered it had four letters, and meant go to the next line. Now I understand it is meaning 12 in Merriam-Webster's dictionary, a) to decline to bid, double, or redouble in a card game, or b) to let something go by without accepting or taking advantage of it. Regards, Martin -- http://mail.python.org/mailman/listinfo/python-list
Re: PEP 3131: Supporting Non-ASCII Identifiers
IMO, the burden of proof is on you. If this PEP has the potential to introduce another hindrance for code-sharing, the supporters of this PEP should be required to provide a damn good reason for doing so. So far, you have failed to do that, in my opinion. All you have presented are vague notions of rare and isolated use-cases. The PEP explicitly states what the damn good reason is: Such developers often desire to define classes and functions with names in their native languages, rather than having to come up with an (often incorrect) English translation of the concept they want to name. So the reason is that with this PEP, code clarity and readability will become better. It's the same reason as for many other features introduced into Python recently, e.g. the with statement. If you doubt the claim, please indicate which of these three aspects you doubt: 1. there are programmers who desire to define classes and functions with names in their native language. 2. those developers find the code clearer and more maintainable than if they had to use English names. 3. code clarity and maintainability are important. Regards, Martin -- http://mail.python.org/mailman/listinfo/python-list
Re: PEP 3131: Supporting Non-ASCII Identifiers
You could say the same about Python standard library and keywords then. Shouldn't these also have to be translated? One can even push things a little further: I don't know about the languages used in the countries you mention, but for example, a simple construction like 'if condition do something' will look weird to a Japanese (the Japanese language has a post-fix feel: the equivalent of the 'if' is put after the condition). So why enforce an English-like sentence structure? The Python syntax does not use an English-like sentence structure. In English, a statement follows the pretty strict sequence of subject, predicate, object (SPO). In Python, statements don't have a subject; some don't even have a verb (e.g. assignments). Regardless, this PEP does not propose to change the syntax of the language, because doing so would cause technical problems - unlike the proposed PEP, which does not cause any technical problems to the language implementation whatsoever (and only slight technical problems to editors, which aren't worse than the ones caused by PEP 263). You have a point here. When learning to program, or when programming for fun without any intention to do something serious, it may be better to have a language supporting native characters in identifiers. My problem is: if you allow these, how can you prevent them from going public someday? You can't, and you shouldn't. What you can prevent is that the code enters *your* project. I cannot see why you want to censor what code other people publish. Regards, Martin -- http://mail.python.org/mailman/listinfo/python-list
Re: PEP 3131: Supporting Non-ASCII Identifiers
Now look me in the eye and tell me that you find the mix of proper German and English keywords beautiful. I can't admit that, but I find that using German class and method names is beautiful. The rest around it (keywords and names from the standard library) are not English - they are Python. (look me in the eye and tell me that def is an English word, or that getattr is one) Regards, Martin LOL - true - but a broken down assembler programmer like me does not use getattr - and def is short for define, and for and while and in are not German. Looks like you have stirred up a hornets nest... - Hendrik -- http://mail.python.org/mailman/listinfo/python-list
Re: PEP 3131: Supporting Non-ASCII Identifiers
Hendrik van Rooyen [EMAIL PROTECTED] wrote: Sion Arrowsmith [EMAIL PROTECTED] wrote: Hendrik van Rooyen wrote: I still don't like the thought of the horrible mix of foreign identifiers and English keywords, coupled with the English sentence construction. How do you think you'd feel if Python had less in the way of (conventionally used) English keywords/builtins. Like, say, Perl? Would not like it at all, for the same reason I don't like re's - It looks like random samples out of alphabet soup to me. What I meant was, would the use of foreign identifiers look so horrible to you if the core language had fewer English keywords? (Perhaps Perl, with its line-noise, was a poor choice of example. Maybe Lisp would be better, but I'm not so sure of my Lisp as to make such an assertion for it.) -- \S -- [EMAIL PROTECTED] -- http://www.chaos.org.uk/~sion/ Frankly I have no feelings towards penguins one way or the other -- Arthur C. Clarke her nu becomeþ se bera eadward ofdun hlæddre heafdes bæce bump bump bump -- http://mail.python.org/mailman/listinfo/python-list
Re: PEP 3131: Supporting Non-ASCII Identifiers
So, please provide feedback, e.g. perhaps by answering these questions: - should non-ASCII identifiers be supported? why? I think the biggest argument against this PEP is how little similar features are used in other languages and how poorly they are supported by third party utilities. Your PEP gives very little thought to how the change would affect the standard Python library. Are non-ASCII identifiers going to be poorly supported in Python's own library and utilities? For other languages (in particular Java), one challenge is that you don't know the source encoding - it's neither fixed, nor is it given in the source code file itself. Instead, the environment has to provide the source encoding, and that makes it difficult to use. The JDK javac uses the encoding from the locale, which is nonsensical if you check out source from a repository. Eclipse has solved the problem: you can specify source encoding on a per-project basis, and it uses that encoding consistently in the editor and when running the compiler. For Python, this problem was solved long ago: PEP 263 allows the source encoding to be specified within the file, and there was always a default encoding. The default encoding will change to UTF-8 in Python 3. IDLE has been supporting PEP 263 from the beginning, and several other editors support it as well. Not sure what other tools you have in mind, and what problems you expect. Regards, Martin -- http://mail.python.org/mailman/listinfo/python-list
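How a tool can recover the encoding from the file itself, as PEP 263 intends, is easy to show on Python 3, where `tokenize.detect_encoding` implements the lookup. A minimal sketch, with made-up sample sources:

```python
import io
import tokenize

# A source file that declares its encoding in the first line (PEP 263).
declared = b"# -*- coding: latin-1 -*-\nname = 'L\xf6ffelstiel'\n"
# A source file without a declaration falls back to the default.
undeclared = b"x = 1\n"

encoding, _ = tokenize.detect_encoding(io.BytesIO(declared).readline)
default, _ = tokenize.detect_encoding(io.BytesIO(undeclared).readline)

print(encoding)  # normalized name of the declared encoding
print(default)   # the Python 3 default
print(declared.decode(encoding).splitlines()[1])
```

An editor or checker that performs the same lookup does not need the encoding supplied by the environment, which is the difference from the Java situation described above.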
Re: PEP 3131: Supporting Non-ASCII Identifiers
René Fleschenberg wrote: Stefan Behnel wrote: Then get tools that match your working environment. Integration with existing tools *is* something that a PEP should consider. This one does not do that sufficiently, IMO. What specific tools should be discussed, and what specific problems do you expect? Regards, Martin -- http://mail.python.org/mailman/listinfo/python-list
Re: PEP 3131: Supporting Non-ASCII Identifiers
In the code I was looking at, identifiers were allowed to use non-ASCII characters. For whatever reason, the programmers chose not to use non-ASCII identifiers even though they had no problem using non-ASCII characters in comments. One possible reason is that the tools processing the program would not know correctly what encoding the source file is in, and would fail when they guessed the encoding incorrectly. For comments, that is not a problem, as an incorrect encoding guess has no impact on the meaning of the program (if the compiler is able to read over the comment in the first place). Another possible reason is that the programmers were unsure whether non-ASCII identifiers are allowed. Regards, Martin -- http://mail.python.org/mailman/listinfo/python-list
Re: PEP 3131: Supporting Non-ASCII Identifiers
After 175 replies (and counting), the only thing that is clear is the controversy around this PEP. Most people are very strongly for or against it, with little middle ground in between. I'm not saying that every change must meet 100% acceptance, but here there is definitely a strong opposition to it. Accepting this PEP would upset lots of people as it seems, and it's interesting that quite a few are not even native English speakers. I believe there is a lot of middle ground, but those people don't speak up. I interviewed about 20 programmers (none of them Python users), and most took the position "I might not use it myself, but it surely can't hurt having it, and there surely are people who would use it". 2 people were strongly in favor, and 3 were strongly opposed. Of course, those people wouldn't take a lot of effort to defend their position in a usenet group. So it is no surprise that the majority of the responses come from people with strong feelings either way. Regards, Martin -- http://mail.python.org/mailman/listinfo/python-list
Re: PEP 3131: Supporting Non-ASCII Identifiers
However, what I want to see is how people deal with such issues when sharing their code: what are their experiences and what measures do they mandate to make it all work properly? You can see some discussions about various IDEs mandating UTF-8 as the default encoding, along with UTF-8 being the required encoding for various kinds of special Java configuration files. I believe the problem is solved when everybody uses Eclipse. You can set a default encoding for all Java source files in a project, and you check the project file into your source repository. Eclipse both provides the editor and drives the compiler, and does so in a consistent way. Yes, it should reduce confusion at a technical level. But what about the tools, the editors, and so on? If every computing environment had decent UTF-8 support, wouldn't it be easier to say that everything has to be in UTF-8? For both Python and Java, it's too much historical baggage already. When source encodings were introduced to Python, allowing UTF-8 only was already proposed. People rejected it at the time, because a) they had source files which weren't encoded in UTF-8, and were afraid of breaking them, and b) their editors would not support UTF-8. So even with Python 3, UTF-8 is *just* the default encoding. I would hope that all Python IDEs, over time, learn about this default; until then, users may have to manually configure their IDEs and editors. With a default of UTF-8, it's still simpler than with PEP 263: you can say that .py files are UTF-8, and your editor will guess incorrectly only if there is an encoding declaration other than UTF-8. Regards, Martin -- http://mail.python.org/mailman/listinfo/python-list
Re: PEP 3131: Supporting Non-ASCII Identifiers
I claim that this is *completely unrealistic*. When learning Python, you *do* learn the actual meanings of English terms like open, exception, if and so on if you did not know them before. It would be extremely foolish not to do so. Having taught students for many years now, I can report that this is most certainly *not* the case. Many people only ever learn the technical meaning of some term, and never grasp the English meaning. They could look into a dictionary, but they would rather read the documentation. I've reported this before, but happily do it again: I have lived many years without knowing what a hub is, and what to pass means if it's not the opposite of to fail. Yet, I have used their technical meanings correctly all these years. Regards, Martin -- http://mail.python.org/mailman/listinfo/python-list
Re: PEP 3131: Supporting Non-ASCII Identifiers
Martin v. Löwis wrote: I've reported this before, but happily do it again: I have lived many years without knowing what a hub is, and what to pass means if it's not the opposite of to fail. Yet, I have used their technical meanings correctly all these years. That's not only true for computer terms. In the German Viennese slang there are a lot of Italian, French, Hungarian, Czech, Hebrew and Serbo-Croatian words. Nobody knows the exact meaning in their original language (nor does the vast majority actually speak those languages), but all are used in the correct original context. Gregor -- http://mail.python.org/mailman/listinfo/python-list
Re: PEP 3131: Supporting Non-ASCII Identifiers
On May 16, 8:49 pm, Gregor Horvath [EMAIL PROTECTED] wrote: [EMAIL PROTECTED] wrote: 2) Create a way to internationalize the standard library (and possibly the language keywords, too). Ideally, create a general standardized way to internationalize code, possibly similar to how people internationalize strings today. Why? Or more accurately, why before adopting the PEP? The library is very usable by non-English speakers as long as there is documentation in their native language. It would be Microsoft once translated their VBA to foreign languages. I didn't use it because I was used to English code. If I program in mixed cultural contexts I have to use the lowest common denominator. Mixing the symbols of the programming language is confusing. Yup, I agree wholeheartedly. So do almost all the other people who have responded in this thread. In public code, open source code, code being worked on by people from different countries, English is almost always the best choice. Nothing in the PEP interferes with or prevents this. The PEP only allows non-ASCII identifiers when they are appropriate: in code that is unlikely ever to be touched by people who don't know that language. (Obviously any language feature can be misused, but peer pressure, documentation, and education have been very effective in preventing such misuse. There is no reason they shouldn't be effective here too.) And yes, some code will be developed in a single-language environment and then be found to be useful to a wider audience. It's not the end of the world. It is no worse than when code written with a single-language UI becomes public -- it will get fixed so that it meets the standards for an internationally collaborative project. Seems to me that replacing identifiers with English ones is fairly trivial, isn't it? One can identify identifiers by parsing the program and replace them from a prepared table of replacements.
This seems much easier than fixing comments and docstrings, which needs to be done by hand. But the comment/docstring problem exists now and has nothing to do with the PEP. A long time ago, at the age of 12, I learned programming using English computer books. Then there were no German books at all. It was not easy. It would have been completely impossible if our school system had not been wise enough to teach us English early. I think millions of people are handicapped because of this. Any step to improve this is a good step for all of us. No doubt there are a lot of talents wasted because of this wall. I agree that anyone who wants to be a programmer is well advised to learn English. I would also advise anyone who wants to be a programmer to go to college. But I have met very good programmers who were not college graduates, and although I don't know any non-English speakers, I am sure there are very good programmers who don't know English. There is a big difference between encouraging someone to do something, and taking steps to make them do something. A lot of the English-only rhetoric in this thread seems very reminiscent of arguments a decade+ ago regarding wide characters and unicode, and other i18n support. Computing is ASCII-based, we don't need all this crap, and besides, it doubles the memory used by strings! English is good enough. Except of course that it wasn't. When technology demands that people adapt to it, it loses. When technology adapts to the needs of people, it wins. The fundamental question is whether language designers, or the people writing the code, should be the ones to decide what language identifiers are most appropriate for their program. Do language designers, all of whom are English speakers, have the wisdom to decide for programmers all over the world, and for years to come, that they must learn English to use Python effectively? And if they do, will the people affected agree, or will they choose a different language?
-- http://mail.python.org/mailman/listinfo/python-list
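The table-driven replacement described in the message above (find identifiers by tokenizing the program, then substitute from a prepared table) can be sketched with the standard `tokenize` module. The function name `rename_identifiers` and the sample table are invented for illustration; the sketch leaves strings, comments, and docstrings alone, which matches the point that those still need fixing by hand.

```python
import io
import tokenize


def rename_identifiers(source, table):
    """Replace identifiers listed in `table`, leaving strings and comments alone."""
    lines = source.splitlines(keepends=True)
    tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
    # Work backwards so that earlier column offsets stay valid after an edit.
    for tok in reversed(tokens):
        if tok.type == tokenize.NAME and tok.string in table:
            row, col = tok.start
            line = lines[row - 1]
            end = col + len(tok.string)
            lines[row - 1] = line[:col] + table[tok.string] + line[end:]
    return "".join(lines)


table = {"h\u00f6he": "height", "breite": "width", "fl\u00e4che": "area"}
german = (
    "h\u00f6he = 2\n"
    "breite = 3\n"
    "fl\u00e4che = h\u00f6he * breite  # h\u00f6he mal breite\n"
)
print(rename_identifiers(german, table), end="")
```

The table itself still has to be written by someone who understands both languages; only the substitution is mechanical.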
Re: PEP 3131: Supporting Non-ASCII Identifiers
On May 16, 11:09 pm, Gregor Horvath [EMAIL PROTECTED] wrote: [EMAIL PROTECTED] wrote: On May 16, 12:54 pm, Gregor Horvath [EMAIL PROTECTED] wrote: Istvan Albert wrote: So the solution is to forbid Chinese XP ? Who said anything like that? It's just an example of surprising and unexpected difficulties that may arise even when doing trivial things, and that proponents do not seem to want to admit to. Should computer programming only be easily accessible to a small fraction of privileged individuals who had the luck to be born in the correct countries? Should the unfounded and maybe xenophilous fear of losing power and control of a small number of those already privileged be a guide for development? Now that right there is your problem. You are reading a lot more into this than you should. Losing power, xenophilus(?) fear, privileged individuals, just step back and think about it for a second, it's a PEP and people have different opinions, it is very unlikely that there is some generic sinister agenda that one must be subscribed to i. -- http://mail.python.org/mailman/listinfo/python-list
Re: PEP 3131: Supporting Non-ASCII Identifiers
I'd suggest restricting identifiers under the rules of UTS-39, profile 2, Highly Restrictive. This limits mixing of scripts in a single identifier; you can't mix Hebrew and ASCII, for example, which prevents problems with mixing right to left and left to right scripts. Domain names have similar restrictions. That sounds interesting, however, I cannot find the document you refer to. In TR 39 (also called Unicode Technical Standard #39), at http://unicode.org/reports/tr39/ there is no mention of numbered profiles, or Highly Restrictive. Looking at the document, it seems 3.1., General Security Profile for Identifiers might apply. IIUC, xidmodifications.txt would have to be taken into account. I'm not quite sure what that means; apparently, a number of characters (listed as restricted) should not be used in identifiers. OTOH, it also adds HYPHEN-MINUS and KATAKANA MIDDLE DOT - which surely shouldn't apply to Python identifiers, no? (at least HYPHEN-MINUS already has a meaning in Python, and cannot possibly be part of an identifier). Also, mixed-script detection might be considered, but it is not clear to me how to interpret the algorithm in section 5, plus it says that this is just one of the possible algorithms. Finally, Confusable Detection is difficult to perform on a single identifier - it seems you need two of them to find out whether they are confusable. In any case, I added this as an open issue to the PEP. Regards, Martin -- http://mail.python.org/mailman/listinfo/python-list
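To give a feel for what mixed-script detection means, here is a deliberately crude sketch. It is not the algorithm from UTS #39: the Python standard library does not expose the Unicode Script property, so the sketch assumes that the first word of a character's Unicode name (LATIN, CYRILLIC, HEBREW, ...) is a good enough stand-in. The function name `scripts_in` is made up.

```python
import unicodedata


def scripts_in(identifier):
    """Approximate the set of scripts used, based on Unicode character names."""
    scripts = set()
    for ch in identifier:
        # ASCII digits and the underscore are shared by all scripts.
        if ch == "_" or (ch.isascii() and ch.isdigit()):
            continue
        name = unicodedata.name(ch, "")
        # Assumption: the first word of the name approximates the script.
        scripts.add(name.split(" ")[0] if name else "UNKNOWN")
    return scripts


print(scripts_in("paypal"))
# The second letter below is CYRILLIC SMALL LETTER A, not a Latin "a".
print(scripts_in("p\u0430ypal"))
```

The approximation is too strict for real use: ordinary Japanese such as 売り場 mixes ideographs and hiragana and would be reported as two scripts, which is the kind of combination a proper profile has to allow.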
Re: PEP 3131: Supporting Non-ASCII Identifiers
On May 17, 9:07 am, Martin v. Löwis [EMAIL PROTECTED] wrote: up. I interviewed about 20 programmers (none of them Python users), and most took the position "I might not use it myself, but it surely can't hurt having it, and there surely are people who would use it". Typically when you ask people about esoteric features that seemingly don't affect them but might be useful to someone, the majority will say yes. It's simply common courtesy; it's not like they have to do anything. At the same time it takes some mental effort to analyze and understand all the implications of a feature, and without taking that effort something will always beat nothing. After the first time that your programmer friends need to fix a trivial bug in a piece of code that does not display correctly in the terminal I can assure you that their mellow acceptance will turn to something entirely different. i. -- http://mail.python.org/mailman/listinfo/python-list
Re: PEP 3131: Supporting Non-ASCII Identifiers
On May 17, 4:56 am, Martin v. Löwis [EMAIL PROTECTED] wrote: ... (look me in the eye and tell me that def is an English word, or that getattr is one) That's not quite fair. They are not English words, but they are derived from English and have a mnemonic value to English speakers that they don't (or only accidentally) have for non-English speakers. -- http://mail.python.org/mailman/listinfo/python-list
Re: PEP 3131: Supporting Non-ASCII Identifiers
Istvan Albert wrote: After the first time that your programmer friends need to fix a trivial bug in a piece of code that does not display correctly in the terminal I can assure you that their mellow acceptance will turn to something entirely different. Is there any difference for you in debugging these code snippets?

    class Türstock(object):
        höhe = 0
        breite = 0
        tiefe = 0

        def _get_fläche(self):
            return self.höhe * self.breite

        fläche = property(_get_fläche)

    #---

    class Tuerstock(object):
        hoehe = 0
        breite = 0
        tiefe = 0

        def _get_flaeche(self):
            return self.hoehe * self.breite

        flaeche = property(_get_flaeche)

I can tell you that for me and for my costumers this makes a big difference. Whether this PEP gets accepted or not I am going to use German identifiers and you have to be frightened to death by that fact ;-) Gregor -- http://mail.python.org/mailman/listinfo/python-list
Re: PEP 3131: Supporting Non-ASCII Identifiers
Istvan Albert wrote: On May 17, 9:07 am, Martin v. Löwis [EMAIL PROTECTED] wrote: up. I interviewed about 20 programmers (none of them Python users), and most took the position "I might not use it myself, but it surely can't hurt having it, and there surely are people who would use it". Typically when you ask people about esoteric features that seemingly don't affect them but might be useful to someone, the majority will say yes. It's simply common courtesy; it's not like they have to do anything. At the same time it takes some mental effort to analyze and understand all the implications of a feature, and without taking that effort something will always beat nothing. Indeed. For example, getattr() and friends now have to accept Unicode arguments, and presumably to canonicalize correctly to avoid errors, and treat equivalent Unicode and ASCII names as the same (question: if two strings compare equal, do they refer to the same name in a namespace?). After the first time that your programmer friends need to fix a trivial bug in a piece of code that does not display correctly in the terminal I can assure you that their mellow acceptance will turn to something entirely different. And pretty quickly, too. If anyone but Martin were the author of the PEP I'd have serious doubts, but if he thinks it's worth proposing there's at least a chance that it will eventually be implemented. regards Steve -- Steve Holden+1 571 484 6266 +1 800 494 3119 Holden Web LLC/Ltd http://www.holdenweb.com Skype: holdenweb http://del.icio.us/steve.holden -- Asciimercial - Get on the web: Blog, lens and tag your way to fame!! holdenweb.blogspot.comsquidoo.com/pythonology tagged items: del.icio.us/steve.holden/python All these services currently offer free registration! -- Thank You for Reading -- http://mail.python.org/mailman/listinfo/python-list
Re: PEP 3131: Supporting Non-ASCII Identifiers
Gregor Horvath wrote: Istvan Albert wrote: After the first time that your programmer friends need to fix a trivial bug in a piece of code that does not display correctly in the terminal I can assure you that their mellow acceptance will turn to something entirely different. Is there any difference for you in debugging these code snippets?

    class Türstock(object):
        höhe = 0
        breite = 0
        tiefe = 0

        def _get_fläche(self):
            return self.höhe * self.breite

        fläche = property(_get_fläche)

    #---

    class Tuerstock(object):
        hoehe = 0
        breite = 0
        tiefe = 0

        def _get_flaeche(self):
            return self.hoehe * self.breite

        flaeche = property(_get_flaeche)

I can tell you that for me and for my costumers this makes a big difference. So you are selling to the clothing market? [I think you meant customers. God knows I have no room to be snitty about other people's typos. Just thought it might raise a smile]. Whether this PEP gets accepted or not I am going to use German identifiers and you have to be frightened to death by that fact ;-) That's fine - they will be at least as meaningful to you as my English ones would be to your countrymen who don't speak English. I think we should remember that while programs are about communication there's no requirement for (most of) them to be universally comprehensible. regards Steve -- http://mail.python.org/mailman/listinfo/python-list
Re: PEP 3131: Supporting Non-ASCII Identifiers
At the same time it takes some mental effort to analyze and understand all the implications of a feature, and without taking that effort something will always beat nothing. Indeed. For example, getattr() and friends now have to accept Unicode arguments, and presumably to canonicalize correctly to avoid errors, and treat equivalent Unicode and ASCII names as the same (question: if two strings compare equal, do they refer to the same name in a namespace?). Actually, that is not an issue: In Python 3, there is no data type for ASCII string anymore, so all __name__ attributes and __dict__ keys are Unicode strings - regardless of whether this PEP gets accepted or not (which it just did). Regards, Martin -- http://mail.python.org/mailman/listinfo/python-list
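On the "do two equal strings name the same thing" question above, a small experiment may help. It is only a sketch of how a current CPython 3 interpreter behaves, not a statement about the planned implementation discussed here: identifiers in source code are normalized to NFKC by the parser (as the PEP specifies), while strings passed to getattr()/setattr() are used as given. The names composed, decomposed and Box are invented for the demonstration.

```python
import unicodedata

composed = "h\u00f6he"      # 'höhe' with a single precomposed ö
decomposed = "ho\u0308he"   # 'höhe' written as o + COMBINING DIAERESIS

# The two spellings look the same but are different strings.
print(composed == decomposed)

# Identifiers in source code are normalized (NFKC) while parsing, so
# assigning through the decomposed spelling stores the composed key.
namespace = {}
exec(decomposed + " = 1", namespace)
print(composed in namespace, decomposed in namespace)


class Box:
    pass


# Strings handed to setattr()/getattr() are not normalized for you.
box = Box()
setattr(box, composed, 2)
print(hasattr(box, decomposed))
print(getattr(box, unicodedata.normalize("NFKC", decomposed)))
```

So strings that compare equal do refer to the same name, and code that builds attribute names at run time from external input may want to normalize them itself.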
Re: PEP 3131: Supporting Non-ASCII Identifiers
On May 17, 2:30 pm, Gregor Horvath [EMAIL PROTECTED] wrote: Istvan Albert wrote: After the first time that your programmer friends need to fix a trivial bug in a piece of code that does not display correctly in the terminal I can assure you that their mellow acceptance will turn to something entirely different. Is there any difference for you in debugging these code snippets? class Türstock(object): [snip] class Tuerstock(object): After finding a platform where those are different, I have to say yes. Absolutely. In my normal setup they both display as class Tuerstock (three letters 'T' 'u' 'e' starting the class name). If, say, an exception was raised, it'd be fruitless for me to grep or search for Tuerstock in the first one, and I might wind up wasting a fair amount of time if a user emailed that to me before realizing that the stack trace was just wrong. Even if I had extended character support, there's no guarantee that all the users I'm supporting do. If they do, there's no guarantee that some intervening email system (or whatever) won't munge things. With the second one, all my standard tools would work fine. My users' setups will work with it. And there's a much higher chance that all the intervening systems will work with it. -- http://mail.python.org/mailman/listinfo/python-list
Re: PEP 3131: Supporting Non-ASCII Identifiers
Martin v. Löwis: ... regardless of whether this PEP gets accepted or not (which it just did). Which version can we expect this to be implemented in? Neil -- http://mail.python.org/mailman/listinfo/python-list
Re: PEP 3131: Supporting Non-ASCII Identifiers
Neil Hodgson wrote: Martin v. Löwis: ... regardless of whether this PEP gets accepted or not (which it just did). Which version can we expect this to be implemented in? The PEP says 3.0, and the planned implementation also targets that release. Regards, Martin -- http://mail.python.org/mailman/listinfo/python-list
Re: PEP 3131: Supporting Non-ASCII Identifiers
On May 16, 6:38 pm, [EMAIL PROTECTED] wrote: On May 16, 11:41 am, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: Christophe wrote: snip... Who displays stack frames? Your code. Whose code includes unicode identifiers? Your code. Whose fault is it to create a stack trace display procedure that cannot handle unicode? You. Thanks but no--I work with a _lot_ of code I didn't write, and looking through stack traces from 3rd party packages is not uncommon. Are you worried that some 3rd-party package you have included in your software will have some non-ascii identifiers buried in it somewhere? Surely that is easy to check for? Far easier than checking that it doesn't have some trojan code in it, it seems to me. What do you mean, check for? If, say, numeric starts using math characters (as has been suggested), I'm not exactly going to stop using numeric. It'll still be a lot better than nothing, just slightly less better than it used to be. And I'm often not creating a stack trace procedure, I'm using the built-in python procedure. And I'm often dealing with mailing lists, Usenet, etc where I don't know ahead of time what the other end's display capabilities are, how to fix them if they don't display what I'm trying to send, whether intervening systems will mangle things, etc. I think we all are in this position. I always send plain text mail to mailing lists, people I don't know etc. But that doesn't mean that email software should be constrained to only 7-bit plain text, no attachments! I frequently use such capabilities when they are appropriate. Sure. But when you're talking about maintaining code, there's a very high value to having all the existing tools work with it whether they're wide-character aware or not. If your response is, yes, but look at the problems html email, virus infected, attachments etc cause, the situation is not the same. 
You have little control over what kind of email people send you but you do have control over what code, libraries, patches, you choose to use in your software. If you want to use ascii-only, do it! Nobody is making you deal with non-ascii code if you don't want to. Yes. But it's not like this makes things so horribly awful that it's worth my time to reimplement large external libraries. I remain at -0 on the proposal; it'll cause some headaches for the majority of current Python programmers, but it may have some benefits to a sizeable minority and may help bring in new coders. And it's not going to cause flaming catastrophic death or anything. -- http://mail.python.org/mailman/listinfo/python-list
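For what "easy to check for" could look like in practice, here is a minimal sketch using the standard tokenize module of a Python 3 interpreter. The function name non_ascii_identifiers and the sample source are invented for this illustration; it reports NAME tokens only, so non-ASCII text in comments and string literals is deliberately ignored.

```python
import io
import tokenize


def non_ascii_identifiers(source):
    """Return (line number, name) pairs for identifiers with non-ASCII characters.

    Sketch only: 'source' is the text of one module, already decoded.
    """
    found = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME and any(ord(ch) > 127 for ch in tok.string):
            found.append((tok.start[0], tok.string))
    return found


sample = "class T\u00fcrstock:\n    h\u00f6he = 0\n    breite = 0\n"
for lineno, name in non_ascii_identifiers(sample):
    print(lineno, name)
```

Running something like this over a third-party package tells you whether it uses such identifiers; it does not, of course, answer the objection above about what to do when a library you depend on does.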
Re: PEP 3131: Supporting Non-ASCII Identifiers
On Sun, 13 May 2007 17:44:39 +0200, Martin v. Löwis wrote: The syntax of identifiers in Python will be based on the Unicode standard annex UAX-31 [1]_, with elaboration and changes as defined below. Within the ASCII range (U+0001..U+007F), the valid characters for identifiers are the same as in Python 2.5. This specification only introduces additional characters from outside the ASCII range. For other characters, the classification uses the version of the Unicode Character Database as included in the ``unicodedata`` module. The identifier syntax is ``ID_Start ID_Continue*``. ``ID_Start`` is defined as all characters having one of the general categories uppercase letters (Lu), lowercase letters (Ll), titlecase letters (Lt), modifier letters (Lm), other letters (Lo), letter numbers (Nl), plus the underscore (XXX what are stability extensions listed in UAX 31). ``ID_Continue`` is defined as all characters in ``ID_Start``, plus nonspacing marks (Mn), spacing combining marks (Mc), decimal number (Nd), and connector punctuations (Pc). [...] .. [1] http://www.unicode.org/reports/tr31/ First, to Martin: Thanks for writing this PEP. While I have been reading both sides of this debate and finding both sides reasonable and understandable in the main, I have several questions which seem to not have been raised in this thread so far. Currently, in Python 2.5, identifiers are specified as starting with an upper- or lowercase letter or underscore ('_') with the following characters of the identifier also optionally being a numerical digit (0...9). This current state seems easy to remember even if felt restrictive by many. Contrawise, the referenced document UAX-31 is a bit obscure to me (which is not eased by the fact that various browsers render non-ASCII characters differently or not at all depending on the setup and font sets available). Further, a cursory perusing of the unicodedata module seems to refer me back to the Unicode docs. 
I note that UAX-31 seems to allow ideographs as ``ID_Start``, for example. From my relative state of ignorance, several questions come to mind: 1) Will this allow me to use, say, a right-arrow glyph (if I can find one) to start my identifier? 2) Could an ``ID_Continue`` be used as an ``ID_Start`` if using a RTL (reversed or mirrored) identifier? (Probably not, but I don't know.) 3) Is or will there be a definitive and exhaustive listing (with bitmap representations of the glyphs to avoid the font issues) of the glyphs that the PEP 3131 would allow in identifiers? (Does this question even make sense?) I have long programmed in RPL and have appreciated being able to use, say, a right arrow symbol to start a name of a function (e.g., -R or -HMS where the '-' is a single, right-arrow glyph).[1] While it is not clear that identifiers I may wish to use would still be prohibited under PEP 3131, I vote: +0 __ [1] RPL (HP's Dr. William Wickes' language and environment circa the 1980s) allows for a few specific non-ASCII glyphs as the start of a name. I have solved my problem with my Python appliance computer project by having up to three representations for my names: Python 2.x acceptable names as the actual Python identifier, a Unicode text display exposed to the end user, and also if needed, a bitmap display exposed to the end user. So -- IAGNI. :-) -- Richard Hanson -- http://mail.python.org/mailman/listinfo/python-list
Re: PEP 3131: Supporting Non-ASCII Identifiers
Martin v. Löwis wrote: Neil Hodgson wrote: Martin v. Löwis: ... regardless of whether this PEP gets accepted or not (which it just did). Which version can we expect this to be implemented in? The PEP says 3.0, and the planned implementation also targets that release. Can we take it this change *won't* be backported to the 2.X series? regards Steve -- http://mail.python.org/mailman/listinfo/python-list
Re: PEP 3131: Supporting Non-ASCII Identifiers
"Martin v. Löwis" [EMAIL PROTECTED] wrote: One possible reason is that the tools processing the program would not know correctly what encoding the source file is in, and would fail when they guessed the encoding incorrectly. For comments, that is not a problem, as an incorrect encoding guess has no impact on the meaning of the program (if the compiler is able to read over the comment in the first place). Possibly. One Java program I remember had Japanese comments encoded in Shift-JIS. Will Python be better here? Will it support the source code encodings that programmers around the world expect? Another possible reason is that the programmers were unsure whether non-ASCII identifiers are allowed. If that's the case, I'm not sure how you can improve on that in Python. There are lots of possible reasons why all these programmers around the world who want to use non-ASCII identifiers end up not using them. One is simply that very few people ever really want to do so. However, if you're to assume that they do, then you should look at the existing practice in other languages to find out what they did right and what they did wrong. You don't have to speculate. Ross Ridge -- l/ // Ross Ridge -- The Great HTMU [oo][oo] [EMAIL PROTECTED] -()-/()/ http://www.csclub.uwaterloo.ca/~rridge/ db // -- http://mail.python.org/mailman/listinfo/python-list
Re: PEP 3131: Supporting Non-ASCII Identifiers
[EMAIL PROTECTED] wrote: With the second one, all my standard tools would work fine. My users' setups will work with it. And there's a much higher chance that all the intervening systems will work with it. Please fix your setup. This is the 21st Century. Unicode is the default in Python 3000. Wake up before it is too late for you. Gregor -- http://mail.python.org/mailman/listinfo/python-list
Re: PEP 3131: Supporting Non-ASCII Identifiers
Currently, in Python 2.5, identifiers are specified as starting with an upper- or lowercase letter or underscore ('_') with the following characters of the identifier also optionally being a numerical digit (0...9). This current state seems easy to remember even if felt restrictive by many. Contrawise, the referenced document UAX-31 is a bit obscure to me It's actually very easy. The basic principle will stay: the first character must be a letter or an underscore, followed by letters, underscores, and digits. The question really is "what is a letter?", "what is an underscore?", "what is a digit?" 1) Will this allow me to use, say, a right-arrow glyph (if I can find one) to start my identifier? No. A right-arrow (such as U+2192, RIGHTWARDS ARROW) is a symbol (general category Sm: Symbol, Math). See http://unicode.org/Public/UNIDATA/UCD.html for a list of general category values, and http://unicode.org/Public/UNIDATA/UnicodeData.txt for a textual description of all characters. Now, there is a special case in that Unicode supports combining modifier characters, i.e. characters that are not characters themselves, but modify previous characters, to add diacritical marks to letters. Unicode has great flexibility in applying these, to form characters that are not supported themselves. Among those, there is U+20D7, COMBINING RIGHT ARROW ABOVE, which is of general category Mn, Mark, Nonspacing. In PEP 3131, such marks may not appear as the first character (since they need to modify a base character), but as subsequent characters. This allows you to form identifiers such as v⃗ (which should render as a small letter v, with a vector arrow on top). 2) Could an ``ID_Continue`` be used as an ``ID_Start`` if using a RTL (reversed or mirrored) identifier? (Probably not, but I don't know.) Unicode, and this PEP, always uses logical order, not rendering order. What matters is in what order the characters appear in the source code string. 
RTL languages do pose a challenge, in particular since bidirectional algorithms apparently aren't implemented correctly in many editors. 3) Is or will there be a definitive and exhaustive listing (with bitmap representations of the glyphs to avoid the font issues) of the glyphs that the PEP 3131 would allow in identifiers? (Does this question even make sense?) It makes sense, but it is difficult to implement. The PEP already links to a non-normative list that is exhaustive for Unicode 4.1. Future Unicode versions may add additional characters, so a list that is exhaustive now might not be in the future. The Unicode consortium promises stability, meaning that what is an identifier now won't be reclassified as a non-identifier in the future, but the reverse is not true, as new code points get assigned. As for the list I generated in HTML: It might be possible to make it include bitmaps instead of HTML character references, but doing so is a licensing problem, as you need a license for a font that has all these characters. If you want to look up a specific character, I recommend going to the Unicode code charts, at http://www.unicode.org/charts/ Notice that an HTML page that includes individual bitmaps for all characters would take *ages* to load. Regards, Martin P.S. Anybody who wants to play with generating visualisations of the PEP, here are the functions I used:

    import unicodedata

    def isnorm(c):
        # True if the character is unchanged by NFC normalization
        return unicodedata.normalize("NFC", c) == c

    def start(c):
        # Characters that may start an identifier
        if not isnorm(c):
            return False
        # General categories listed for ID_Start in the PEP
        if unicodedata.category(c) in ('Lu', 'Ll', 'Lt', 'Lm', 'Lo', 'Nl'):
            return True
        if c == u'_':
            return True
        # Additional individual start characters
        if c in u"\u2118\u212E\u309B\u309C":
            return True
        return False

    def cont_only(c):
        # Characters that may continue, but not start, an identifier
        if not isnorm(c):
            return False
        if unicodedata.category(c) in ('Mn', 'Mc', 'Nd', 'Pc'):
            return True
        # Additional code point range allowed as continuation characters
        if 0x1369 <= ord(c) <= 0x1371:
            return True
        return False

    def cont(c):
        return start(c) or cont_only(c)

The isnorm() aspect excludes characters from the list which change under NFC. 
This excludes a few compatibility characters which are allowed in source code, but become indistinguishable from their canonical form semantically. -- http://mail.python.org/mailman/listinfo/python-list
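The answers to questions 1 and 2 above can be checked interactively. The following is a small sketch for a Python 3 interpreter, where str.isidentifier() exists; the exact rules it applies are those of the interpreter you run it on, which may differ in detail from the draft discussed in this thread. The constant names are invented here.

```python
import unicodedata

RIGHT_ARROW = "\u2192"          # RIGHTWARDS ARROW
COMBINING_ARROW = "\u20d7"      # COMBINING RIGHT ARROW ABOVE
FULLWIDTH_SIX = "\uff16"        # FULLWIDTH DIGIT SIX

# General categories as reported by the bundled Unicode database
print(unicodedata.category(RIGHT_ARROW))       # a math symbol
print(unicodedata.category(COMBINING_ARROW))   # a nonspacing mark
print(unicodedata.category(FULLWIDTH_SIX))     # a decimal digit

# A symbol cannot start (or appear in) an identifier
print((RIGHT_ARROW + "R").isidentifier())

# A combining mark may follow a letter, but may not come first
print(("v" + COMBINING_ARROW).isidentifier())
print((COMBINING_ARROW + "v").isidentifier())

# A digit, fullwidth or not, cannot start an identifier
print((FULLWIDTH_SIX + "x").isidentifier())
```

Note that the order tested is the logical order of the characters in the string, which is the order that matters according to the reply above, regardless of how an editor renders them.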
Re: PEP 3131: Supporting Non-ASCII Identifiers
Possibly. One Java program I remember had Japanese comments encoded in Shift-JIS. Will Python be better here? Will it support the source code encodings that programmers around the world expect? It's not a question of "will it". It does today, starting from Python 2.3. Another possible reason is that the programmers were unsure whether non-ASCII identifiers are allowed. If that's the case, I'm not sure how you can improve on that in Python. It will change on its own over time. "Not allowed" could mean "not permitted by policy". Indeed, the PEP explicitly mandates a policy that bans non-ASCII characters from source (whether in identifiers or comments) for Python itself, and encourages other projects to define similar policies. What projects pick up such a policy, or pick a different policy (e.g. all comments must be in Korean) remains to be seen. Then, programmers will not be sure whether the language and the tools allow it. For Python, it will be supported from 3.0, so people will be worried initially whether their code needs to run on older Python versions. When Python 3.5 comes along, people hopefully have lost interest in supporting 2.x, so they will start using 3.x features, including this one. Now, it may be tempting to say "ok, so let's wait until 3.5, if people won't use it before anyway". That is trick logic: if we add it only to 3.5, people won't be using it before 4.0. *Any* new feature takes several years to get into wide acceptance, but years pass surprisingly fast. There are lots of possible reasons why all these programmers around the world who want to use non-ASCII identifiers end up not using them. One is simply that very few people ever really want to do so. However, if you're to assume that they do, then you should look at the existing practice in other languages to find out what they did right and what they did wrong. You don't have to speculate. That's indeed how this PEP came about. 
There were early adopters, like Java, then experience gained from it (resulting in PEP 263, implemented in Python 2.3 on the Python side, and resulting in UAX#39 on the Unicode consortium side), and that experience now flows into PEP 3131. If you think I speculated in reasoning why people did not use the feature in Java: sorry for expressing myself unclearly. I know for a fact that the reasons I suggested were actual reasons given by actual people. I'm just not sure whether this was an exhaustive list (because I did not interview every programmer in the world), and what statistical relevance each of these reasons had (because I did not conduct scientific research to gain statistically relevant data on usage of non-ASCII identifiers in different regions of the world). Regards, Martin -- http://mail.python.org/mailman/listinfo/python-list
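As an illustration of the PEP 263 mechanism mentioned above, here is a sketch of how a source file with a Shift-JIS encoding declaration and a Japanese comment is handled by a Python 3 interpreter. The file content is built in memory purely for the demonstration; tokenize.detect_encoding() is the standard-library helper that reads the declaration, and the variable names are invented.

```python
import io
import tokenize

# A tiny module: encoding declaration, a Japanese comment, one assignment.
text = (
    "# -*- coding: shift_jis -*-\n"
    "# \u65e5\u672c\u8a9e\u306e\u30b3\u30e1\u30f3\u30c8\n"
    "x = 1\n"
)
raw = text.encode("shift_jis")   # the bytes as they would sit on disk

# Read the declared encoding from the first lines of the byte stream.
encoding, _first_lines = tokenize.detect_encoding(io.BytesIO(raw).readline)
print(encoding)

# With the declared encoding the comment decodes back correctly...
decoded = raw.decode(encoding)
print(decoded == text)

# ...and the compiler honours the declaration when given the raw bytes.
namespace = {}
exec(compile(raw, "<demo>", "exec"), namespace)
print(namespace["x"])
```

Tools that ignore the declaration and guess an encoding are the ones that run into the failure mode described earlier in the thread.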
Re: PEP 3131: Supporting Non-ASCII Identifiers
[EMAIL PROTECTED] wrote: On May 15, 3:28 pm, René Fleschenberg [EMAIL PROTECTED] wrote: We all know what the PEP is about (we can read). The point is: If we do not *need* non-English/ASCII identifiers, we do not need the PEP. If the PEP does not solve an actual *problem* and still introduces some potential for *new* problems, it should be rejected. So far, the problem seems to just not exist. The burden of proof is on those who support the PEP. It *does* solve a huge problem: I have to use degenerate French, with spelling mistakes, or pick from a small subset of words, in order to use only ASCII. I'm limited in my expression, and I resent this every day! This is true, even if commercial French programmers don't object the PEP because they have to use English in their own work. This is something I really cannot understand. It's an everyday problem, for millions of people! And yes, sometimes I publish code (rarely), even if it uses French identifiers, because someone looking for a real solution *does* prefer an existing solution to nothing. -- Pierre -- http://mail.python.org/mailman/listinfo/python-list
Re: PEP 3131: Supporting Non-ASCII Identifiers
Steven D'Aprano wrote: But they aren't new risks and problems, that's the point. So far, every single objection raised ALREADY EXISTS in some form or another. No. The problem "The traceback shows function names having characters that do not display on most systems' screens", for example, does not exist today, to the best of my knowledge. And "in some form or another" basically means that the PEP would create more possibilities for things to go wrong. That things can already go wrong today does not mean that it does not matter if we create more occasions where things can go wrong even worse. There's all this hysteria about the problems the proposed change will cause, but those problems already exist. When was the last time a Black Hat tried to smuggle in bad code by changing an identifier from xyz0 to xyzO? Agreed, I don't think intended malicious use of the proposed feature would be a big problem. I think it is not. I think that the problem only really applies to very isolated use-cases. Like the 5.5 billion people who speak no English. No. The X people who speak no English and program in Python. I think X actually is very low (close to zero), because programming in Python virtually does require you to know some English, whether you can use non-ASCII characters in identifiers or not. It is naive to believe that you can program in Python without understanding any English once you can use your native characters in identifiers. That will not happen. Please understand that: You basically *must* know some English to program in Python, and the reason for that is not that you cannot use non-ASCII identifiers. I admit that there may be occasions where you have domain-specific terms that are hard to translate into English for a programmer. But is it really not feasible to use an ASCII transliteration in these cases? This does not seem to have been such a big problem so far, or else we would have seen more discussions about it, I think. 
So isolated that they do not justify a change to mainline Python. If someone thinks that non-ASCII identifiers are really needed, he could maintain a special Python branch that supports them. I doubt that there would be a lot of demand for it. Maybe so. But I guarantee with a shadow of a doubt that if the change were introduced, people would use it -- even if right now they say they don't want it. Well, that is exactly what I would like to avoid ;-) -- René -- http://mail.python.org/mailman/listinfo/python-list
Re: PEP 3131: Supporting Non-ASCII Identifiers
Steven D'Aprano wrote: Any program that uses non-English identifiers in Python is bound to become gibberish, since it *will* be cluttered with English identifiers all over the place anyway, whether you like it or not. It won't be gibberish to the people who speak the language. Hmmm, did you read my posting? In my experience, it will. I wonder: is English an acquired language for you? -- René -- http://mail.python.org/mailman/listinfo/python-list
Re: PEP 3131: Supporting Non-ASCII Identifiers
Gregor Horvath wrote: If comments are allowed to be non-English, then why are identifiers not? I don't need to be able to type in the exact characters of a comment in order to properly change the code, and if a comment does not display on my screen correctly, I am not fscked as badly as when an identifier does not display (e.g. in a traceback). -- René -- http://mail.python.org/mailman/listinfo/python-list
Re: PEP 3131: Supporting Non-ASCII Identifiers
René Fleschenberg wrote: today, to the best of my knowledge. And "in some form or another" basically means that the PEP would create more possibilities for things to go wrong. That things can already go wrong today does not mean that it does not matter if we create more occasions where things can go wrong even worse. Following this logic we should not add any new features at all, because all of them can go wrong and can be used the wrong way. I love Python because it does not dictate how to do things. I do not need an ASCII-Dictator, I can judge myself when to use this feature and when to avoid it, like any other feature. Gregor -- http://mail.python.org/mailman/listinfo/python-list
Re: PEP 3131: Supporting Non-ASCII Identifiers
[EMAIL PROTECTED] wrote: I'm not sure how you conclude that no problem exists. - Meaningful identifiers are critical in creating good code. I agree. - Non-English speakers cannot create or understand English identifiers, hence can't create good code nor easily grok existing code. I agree that this is a problem, but please understand that this problem is _not_ solved by allowing non-ASCII identifiers! Considering the vastly greater number of non-English speakers in the world, who are thus unable to use Python effectively, seems like a problem to me. Yes, but this problem is not really addressed by the PEP. If you want to do something about this: 1) Translate documentation. 2) Create a way to internationalize the standard library (and possibly the language keywords, too). Ideally, create a general standardized way to internationalize code, possibly similar to how people internationalize strings today. When that is done, non-ASCII identifiers could become useful. But of course, doing that might create a host of other problems. That all programmers know enough English to create and understand English identifiers is currently speculation or based on tiny personally observed samples. It is based on a look at the current Python environment. You do *at least* have the problem that the standard library uses English names. This assumes that there is documentation in the native language that is good enough (i.e. almost as good as the official one), which I can tell is not the case for German. -- René -- http://mail.python.org/mailman/listinfo/python-list
Re: PEP 3131: Supporting Non-ASCII Identifiers
After reading the whole thread, and based on my experience (I'm Italian, English is not my native language) Martin v. Löwis wrote: - should non-ASCII identifiers be supported? yes - why? Years ago I read C code written by a Turkish guy, and all identifiers were transliterations of Arabic (Persian? don't know) words. What did I understand of this code? Nothing. 0 (zero ;) ). Not a word. Would it have been different if it had used Unicode identifiers? Not at all. - would you use them if it was possible to do so? yes -- ()_() | NN KAPISCO XK' CELLHAVETE T'ANNTO CN ME SL | + (o.o) | XK' SKRIVO 1 P'HO VELLOCE MA HALL'ORA DITTELO | +---+ 'm m' | KE SIETE VOI K CI HAVVETE PROBBLEMI NO PENSATECI | O | (___) | HE SENZA RANKORI CIA | raffaele punto salmaso at gmail punto com -- http://mail.python.org/mailman/listinfo/python-list
Re: PEP 3131: Supporting Non-ASCII Identifiers
Gregor Horvath wrote: René Fleschenberg wrote: today, to the best of my knowledge. And "in some form or another" basically means that the PEP would create more possibilities for things to go wrong. That things can already go wrong today does not mean that it does not matter if we create more occasions where things can go wrong even worse. Following this logic we should not add any new features at all, because all of them can go wrong and can be used the wrong way. No, that does not follow from my logic. What I say is: When thinking about whether to add a new feature, the potential benefits should be weighed against the potential problems. I see some potential problems with this PEP and very few potential benefits. I love Python because it does not dictate how to do things. I do not need an ASCII-Dictator, I can judge for myself when to use this feature and when to avoid it, like any other feature. *That* logic can be used to justify the introduction of *any* feature. -- René -- http://mail.python.org/mailman/listinfo/python-list
Re: PEP 3131: Supporting Non-ASCII Identifiers
On Tue, 15 May 2007 17:35:11 +0200, Stefan Behnel [EMAIL PROTECTED] wrote: Eric Brunel wrote: On Tue, 15 May 2007 15:57:32 +0200, Stefan Behnel In-house developers are rather for this PEP as they see the advantage of expressing concepts in the way the non-techies talk about it. No: I *am* an in-house developer. The argument is not public/open-source against private/industrial. As I said in some of my earlier posts, any code can pass through many people in its life, people not having the same language. I dare to say that starting a project today in any other language than english is almost irresponsible: the chances that it will get at least read by people not talking the same language as the original coders are very close to 100%, even if it always stays private. Ok, so I'm an Open-Source guy who happens to work in-house. And I'm a supporter of PEP 3131. I admit that I was simplifying in my round-up. :) But I would say that irresponsible is a pretty self-centered word in this context. Can't you imagine that those who take the irresponsible decisions of working on (and starting) projects in another language than English are maybe as responsible as you are when you take the decision of starting a project in English, but in a different context? It all depends on the specific constraints of the project, i.e. environment, developer skills, domain, ... The more complex an application domain, the more important is clear and correct domain terminology. And software developers just don't have that. They know their own domain (software development with all those concepts, languages and keywords), but there is a reason why they develop software for those who know the complex professional domain in detail but do not know how to develop software. And it's a good idea to name things in a way that is consistent with those who know the professional domain. 
That's why keywords are taken from the domain of software development and identifiers are taken (mostly) from the application domain. And that's why I support PEP 3131. You keep eluding the question: even if the decisions made at the project start seem quite sensible *at that time*, if the project ends up maintained in Korea, you *will have* to translate all your identifiers to something displayable, understandable and typable by (almost) anyone, a.k.a ASCII-English... Since - as I already said - I'm quite convinced that any application bigger than the average quick-n-dirty throwable script is highly likely to end up in a different country than its original coders', you'll end up losing the time you appeared to have gained in the beginning. That's what I called irresponsible (even if I admit that the word was a bit strong...). Anyway, concerning the PEP, I've finally put some water in my wine as we say in French, and I'm not so strongly against it now... Not for the reasons you give (so we can continue our flame war on this ;-) ), but mainly considering Python's usage in a learning context: this is a valid reason why non-ASCII identifiers should be supported. I just wish I'll get a '--ascii-only' switch on my Python interpreter (or any other means to forbid non-ASCII identifiers and/or strings and/or comments). -- python -c print ''.join([chr(154 - ord(c)) for c in 'U(17zX(%,5.zmz5(17l8(%,5.Z*(93-965$l7+-']) -- http://mail.python.org/mailman/listinfo/python-list
Re: PEP 3131: Supporting Non-ASCII Identifiers
René Fleschenberg wrote: [EMAIL PROTECTED] wrote: I'm not sure how you conclude that no problem exists. - Meaningful identifiers are critical in creating good code. I agree. - Non-English speakers cannot create or understand English identifiers, hence can't create good code nor easily grok existing code. I agree that this is a problem, but please understand that this problem is _not_ solved by allowing non-ASCII identifiers! Well, as I said before, there are three major differences between the stdlib and keywords on one hand and identifiers on the other hand. Ignoring arguments does not make them any less true. So, the problem is partly tackled by the people who face it by writing degenerate transliterations and language mixes in identifiers, but it would be *solved* by means of the language if Unicode identifiers were available. Stefan -- http://mail.python.org/mailman/listinfo/python-list
Re: PEP 3131: Supporting Non-ASCII Identifiers
In [EMAIL PROTECTED], Stefan Behnel wrote: René Fleschenberg wrote: We all know what the PEP is about (we can read). The point is: If we do not *need* non-English/ASCII identifiers, we do not need the PEP. If the PEP does not solve an actual *problem* and still introduces some potential for *new* problems, it should be rejected. So far, the problem seems to just not exist. The burden of proof is on those who support the PEP. The main problem here seems to be proving the need for something to people who do not need it themselves. So, if a simple "but I need it because a, b, c" is not enough, what good is any further proof? Maybe all the (potential) programmers that can't understand English and would benefit from the ability to use non-ASCII characters in identifiers could step up and take part in this debate. In an English-speaking newsgroup… =:o) There are potential users of Python who don't know much English or no English at all. This includes kids, old people, people from countries that have letters that are not that easy to transliterate like European languages, people who just want to learn Python for fun or to customize their applications like office suites or GIS software with a Python scripting option. Some people here seem to think the user base is or should be only from the computer science domain. Yes, if you are a programming professional it may be mandatory to be able to write English identifiers, comments and documentation, but there are not just programming professionals out there. Ciao, Marc 'BlackJack' Rintsch -- http://mail.python.org/mailman/listinfo/python-list
Re: PEP 3131: Supporting Non-ASCII Identifiers
René Fleschenberg wrote: Gregor Horvath wrote: If comments are allowed to be non-English, then why are identifiers not? I don't need to be able to type in the exact characters of a comment in order to properly change the code, and if a comment does not display on my screen correctly, I am not fscked as badly as when an identifier does not display (e.g. in a traceback). Then get tools that match your working environment. Stefan -- http://mail.python.org/mailman/listinfo/python-list
Re: PEP 3131: Supporting Non-ASCII Identifiers
Martin v. Lowis wrote: Lorenzo Gatti wrote: Not providing an explicit listing of allowed characters is inexcusable sloppiness. That is a deliberate part of the specification. It is intentional that it does *not* specify a precise list, but instead defers that list to the version of the Unicode standard used (in the unicodedata module). Ok, maybe you considered listing characters but you earnestly decided to follow an authority; but this reliance on the Unicode standard is not a merit: it defers to an external entity (UAX 31 and the Unicode database) a foundation of Python syntax. The obvious purpose of Unicode Annex 31 is defining a framework for parsing the identifiers of arbitrary programming languages, it's only, in its own words, specifications for recommended defaults for the use of Unicode in the definitions of identifiers and in pattern-based syntax. It suggests an orderly way to add tens of thousands of exotic characters to programming language grammars, but it doesn't prove it would be wise to do so. You seem to like Unicode Annex 31, but keep in mind that: - it has very limited resources (only the Unicode standard, i.e. lists and properties of characters, and not sensible programming language design, software design, etc.) - it is culturally biased in favour of supporting as much of the Unicode character set as possible, disregarding the practical consequences and assuming without discussion that programming language designers want to do so - it is also culturally biased towards the typical Unicode patterns of providing well explained general algorithms, ensuring forward compatibility, and relying on existing Unicode standards (in this case, character types) rather than introducing new data (but the character list of Table 3 is unavoidable); the net result is caring even less for actual usage. The XML standard is an example of how listings of large parts of the Unicode character set can be provided clearly, exactly and (almost) concisely. 
And, indeed, this is now recognized as one of the bigger mistakes of the XML recommendation: they provide an explicit list, and fail to consider characters that are unassigned. In XML 1.1, they try to address this issue, by now allowing unassigned characters in XML names even though it's not certain yet what those characters mean (until they are assigned). XML 1.1 is, for practical purposes, not used except by mistake. I challenge you to show me XML languages or documents of some importance that need XML 1.1 because they use non-ASCII names. XML 1.1 is supported by many tools and standards because of buzzword compliance, enthusiastic obedience to the W3C and low cost of implementation, but this doesn't mean that its features are an improvement over XML 1.0. ``ID_Continue`` is defined as all characters in ``ID_Start``, plus nonspacing marks (Mn), spacing combining marks (Mc), decimal number (Nd), and connector punctuations (Pc). Am I the first to notice how unsuitable these characters are? Probably. Nobody in the Unicode consortium noticed, but what do they know about suitability of Unicode characters... Don't be silly. These characters are suitable for writing text, not for use in identifiers; the fact that UAX 31 allows them merely proves how disconnected from actual programming language needs that document is. In typical word processing, what characters are used is the editor's problem and the only thing that matters is the correctness of the printed result; program code is much more demanding, as it needs to do more (exact comparisons, easy reading...) with less (straightforward keyboard inputs and monospaced fonts instead of complex input systems and WYSIWYG graphical text). The only way to work with program text successfully is limiting its complexity. Hard to input characters, hard to see characters, ambiguities and uncertainty in the sequence of characters, sets of hard to distinguish glyphs and similar problems are unacceptable. 
It seems I'm not the first to notice a lot of Unicode characters that are unsuitable for identifiers. Appendix I of the XML 1.1 standard recommends to avoid variation selectors, interlinear annotations (I missed them...), various decomposable characters, and names which are nonsensical, unpronounceable, hard to read, or easily confusable with other names. The whole appendix I is a clear admission of self-defeat, probably the result of committee compromises. Do you think you could do better? Regards, Lorenzo Gatti -- http://mail.python.org/mailman/listinfo/python-list
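For readers who want to look at the character classes under dispute, here is a minimal sketch using the standard `unicodedata` module and `str.isidentifier()` on a Python 3 interpreter. It is not taken from the PEP or from any post in this thread; the helper `describe` and the sample characters are chosen purely for illustration, and the results depend on the Unicode database version bundled with the interpreter.

```python
import unicodedata

# General categories that the quoted definition adds to ID_Start
# to form ID_Continue: Mn, Mc, Nd and Pc.
CONTINUE_ONLY_CATEGORIES = {"Mn", "Mc", "Nd", "Pc"}


def describe(ch):
    """Return (code point, general category, name) for one character.

    Hypothetical helper, written only for this illustration.
    """
    return (
        f"U+{ord(ch):04X}",
        unicodedata.category(ch),
        unicodedata.name(ch, "<unnamed>"),
    )


# One sample character per disputed category (illustrative picks).
samples = [
    "\u0301",  # COMBINING ACUTE ACCENT
    "\u0903",  # DEVANAGARI SIGN VISARGA
    "\uff16",  # FULLWIDTH DIGIT SIX
    "\u203f",  # UNDERTIE
]

for ch in samples:
    code, category, name = describe(ch)
    print(code, category, name, category in CONTINUE_ONLY_CATEGORIES)

# Such characters may continue an identifier but may not start one.
print("x\uff16".isidentifier())
print("\uff16x".isidentifier())
```

Whether allowing these classes is wise is exactly what the posts above disagree on; the sketch only shows how to inspect them.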
Re: PEP 3131: Supporting Non-ASCII Identifiers
[EMAIL PROTECTED] wrote: Steven D'Aprano wrote: I would find it useful to be able to use non-ASCII characters for heavily mathematical programs. There would be a closer correspondence between the code and the mathematical equations if one could write D(u*p) instead of delta(mu*pi). Just as one risk here: When reading the above on Google Groups, it showed up as "if one could write ?(u*p)..." When quoting it for response, it showed up as "could write D(u*p)". I'm sure that the symbol you used was neither a capital letter d nor a question mark. Using identifiers that are so prone to corruption when posting in a rather popular forum seems dangerous to me--and I'd guess that a lot of source code highlighters, email lists, etc. have similar problems. I'd even be surprised if some programming tools didn't have similar problems. So, it was Google Groups that continuously corrupted the good UTF-8 posts by force-converting them to ISO-8859-1? Of course, there's also the possibility that it is a problem on *your* side so, to be fair, I've launched Google Groups and looked for this thread. And of course the result was that Steven's post displayed perfectly. I didn't try to reply to it of course, no need to clutter that thread any more than it already is. -- Δ(µ*π) -- http://mail.python.org/mailman/listinfo/python-list
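The kind of corruption discussed here can be reproduced in a few lines. This is only a sketch of one plausible mechanism (forcing text into ISO-8859-1, which has no Greek letters); the thread does not establish what any particular web interface or mail client actually did, and the sample string is an assumption standing in for the original expression.

```python
# Assumed sample: GREEK CAPITAL LETTER DELTA, SMALL MU, SMALL PI.
text = "\u0394(\u03bc*\u03c0)"

# A correct UTF-8 round trip preserves the text.
utf8_bytes = text.encode("utf-8")
assert utf8_bytes.decode("utf-8") == text

# Forcing the text into ISO-8859-1 replaces every Greek letter,
# because that charset cannot represent them.
forced = text.encode("iso-8859-1", errors="replace")
print(forced)

# Reading the UTF-8 bytes as if they were ISO-8859-1 never fails,
# but yields mojibake instead of the intended characters.
misread = utf8_bytes.decode("iso-8859-1")
print(repr(misread))
```

Either failure happens somewhere in the display or transport chain, which is why both sides of the argument can observe different results for the same post.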
Re: PEP 3131: Supporting Non-ASCII Identifiers
On Tue, 15 May 2007 21:07:30 +0200, Pierre Hanser [EMAIL PROTECTED] wrote: Hello, I work for a large phone maker, and for a long time we thought, very arrogantly, our phones would be ok for the whole world. After all, using a phone uses so few words, and some of them were even replaced with pictograms! Everybody should be able to understand appel, bis, renvoi, mévo, ... Nowadays we make Chinese, Korean, Japanese talking phones. Because we can do it, because graphics are cheaper than they were, because it augments our market. (Also because some markets require it.) See the analogy? Absolutely not: you're talking about internationalization of the user interface here, not about the code. There are quite simple ways to ensure users will see the displays in their own language, even if the source code is the same for everyone. But your source code will not automagically translate itself to the language of the guy who'll have to maintain it or make it evolve. So the analogy actually seems to work backwards: if you want any coder to be able to read/understand/edit your code, just don't write it in your own language... -- python -c print ''.join([chr(154 - ord(c)) for c in 'U(17zX(%,5.zmz5(17l8(%,5.Z*(93-965$l7+-']) -- http://mail.python.org/mailman/listinfo/python-list
Re: PEP 3131: Supporting Non-ASCII Identifiers
Stefan Behnel wrote: Then get tools that match your working environment. Integration with existing tools *is* something that a PEP should consider. This one does not do that sufficiently, IMO. -- René -- http://mail.python.org/mailman/listinfo/python-list
Re: PEP 3131: Supporting Non-ASCII Identifiers
[EMAIL PROTECTED] wrote: I even sometimes read code snippets on email lists and websites from my handheld, which is sadly still memory-limited enough that I'm really unlikely to install anything approaching a full set of Unicode fonts. One of the arguments against this PEP was that it seemed to be impossible to find either transliterated identifiers in code or native identifiers in Java code using a web search. So it is very unlikely that you will need to upgrade your handheld as it is very unlikely for you to stumble into such code. Stefan -- http://mail.python.org/mailman/listinfo/python-list
Re: PEP 3131: Supporting Non-ASCII Identifiers
Stefan Behnel wrote: - Non-English speakers cannot create or understand English identifiers, hence can't create good code nor easily grok existing code. I agree that this is a problem, but please understand that this problem is _not_ solved by allowing non-ASCII identifiers! Well, as I said before, there are three major differences between the stdlib and keywords on one hand and identifiers on the other hand. Ignoring arguments does not make them any less true. BTW: Please stop replying to my postings by e-mail (in Thunderbird, use "Reply" instead of "Reply to all"). I agree that keywords are a different matter in many respects, but the only difference between stdlib interfaces and other interfaces is that the stdlib interfaces are part of the stdlib. That's it. You are still ignoring the fact that, contrary to what has been suggested in this thread, it is _not_ possible to write German or Chinese Python without cluttering it up with many, many English terms. It's not only the stdlib, but also many, many third-party libraries. Show me one real Python program that is feasibly written without throwing in tons of English terms. Now, very special environments (what I called "rare and isolated" earlier) like special learning environments for children are a different matter. It should be ok if you have to use a specially patched Python branch there, or have to use an interpreter option that enables the suggested behaviour. For general programming, it is IMO a bad idea. -- René -- http://mail.python.org/mailman/listinfo/python-list
Re: PEP 3131: Supporting Non-ASCII Identifiers
Marc 'BlackJack' Rintsch wrote: There are potential users of Python who don't know much English or no English at all. This includes kids, old people, people from countries that have letters that are not that easy to transliterate like European languages, people who just want to learn Python for fun or to customize their applications like office suites or GIS software with a Python scripting option. Make it an interpreter option that can be turned on for those cases. -- René -- http://mail.python.org/mailman/listinfo/python-list
Re: PEP 3131: Supporting Non-ASCII Identifiers
Eric Brunel wrote: reason why non-ASCII identifiers should be supported. I just wish I'll get a '--ascii-only' switch on my Python interpreter (or any other means to forbid non-ASCII identifiers and/or strings and/or comments). I could certainly live with that as it would be the right way around. Support Unicode by default, but allow those who require the lowest common denominator to enforce it. Stefan -- http://mail.python.org/mailman/listinfo/python-list
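No '--ascii-only' interpreter switch is specified anywhere in this thread, so the following is only a sketch of how such a check could be approximated outside the interpreter on Python 3.7 or later, using the standard `tokenize` module. The function name `non_ascii_identifiers` and the sample source are made up for this illustration.

```python
import io
import tokenize


def non_ascii_identifiers(source):
    """Yield (line, column, name) for every NAME token that is not pure ASCII.

    Hypothetical helper: it scans identifiers and keywords only, so
    non-ASCII strings and comments are deliberately left alone.
    """
    readline = io.StringIO(source).readline
    for tok in tokenize.generate_tokens(readline):
        if tok.type == tokenize.NAME and not tok.string.isascii():
            yield tok.start[0], tok.start[1], tok.string


# Assumed sample: a German function name next to the stdlib's open().
example = "def \u00f6ffnen(pfad):\n    return open(pfad)\n"

for line, column, name in non_ascii_identifiers(example):
    print(f"line {line}, column {column}: {name!r}")
```

A project that wants the lowest common denominator could run a check like this in its own tooling, which leaves the interpreter default to be decided separately.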
Re: PEP 3131: Supporting Non-ASCII Identifiers
Stefan Behnel wrote: *Your* logic can be used to justify dropping *any* feature. No. I am considering both the benefits and the problems. You just happen to not like the outcome of my considerations [again, please don't reply by e-mail, I read the NG]. -- René -- http://mail.python.org/mailman/listinfo/python-list
Re: PEP 3131: Supporting Non-ASCII Identifiers
On Wed, 16 May 2007 02:14:58 +0200, Steven D'Aprano [EMAIL PROTECTED] wrote: On Tue, 15 May 2007 09:09:30 +0200, Eric Brunel wrote: Joke aside, this just means that I won't ever be able to program math in ADA, because I have absolutely no idea on how to do a 'pi' character on my keyboard. Maybe you should find out then? Personal ignorance is never an excuse for rejecting technology. My personal ignorance is fine, thank you; how is yours?: there is no keyboard *on Earth* allowing to type *all* characters in the whole Unicode set. So my keyboard may just happen to provide no means at all to type a greek 'pi', as it doesn't provide any to type Chinese, Japanese, Korean, Russian, Hebrew, or whatever character set that is not in usage in my country. And so are all keyboards all over the world. Have I made my point clear or do you require some more explanations? -- python -c print ''.join([chr(154 - ord(c)) for c in 'U(17zX(%,5.zmz5(17l8(%,5.Z*(93-965$l7+-']) -- http://mail.python.org/mailman/listinfo/python-list
Re: PEP 3131: Supporting Non-ASCII Identifiers
René Fleschenberg wrote: Stefan Behnel wrote: - Non-English speakers cannot create or understand English identifiers, hence can't create good code nor easily grok existing code. I agree that this is a problem, but please understand that this problem is _not_ solved by allowing non-ASCII identifiers! Well, as I said before, there are three major differences between the stdlib and keywords on one hand and identifiers on the other hand. Ignoring arguments does not make them any less true. I agree that keywords are a different matter in many respects, but the only difference between stdlib interfaces and other interfaces is that the stdlib interfaces are part of the stdlib. That's it. You are still ignoring the fact that, contrary to what has been suggested in this thread, it is _not_ possible to write German or Chinese Python without cluttering it up with many, many English terms. It's not only the stdlib, but also many, many third-party libraries. Show me one real Python program that is feasibly written without throwing in tons of English terms. Now, very special environments (what I called "rare and isolated" earlier) like special learning environments for children are a different matter. It should be ok if you have to use a specially patched Python branch there, or have to use an interpreter option that enables the suggested behaviour. For general programming, it is IMO a bad idea. Ok, let me put it differently. You *do not* design Python's keywords. You *do not* design the stdlib. You *do not* design the concepts behind all that. You *use* them as they are. So you can simply take the identifiers they define and use them the way the docs say. You do not have to understand these names, they don't have to be words, they don't have to mean anything to you. They are just tools. Even if you do not understand English, they will not get in your way. You just learn them. But you *do* design your own software. You *do* design its concepts. You *do* design its APIs.
You *do* choose its identifiers. And you want them to be clear and telling. You want them to match your (or your clients') view of the application. You do not care about the naming of the tools you use inside. But you do care about clarity and readability in *your own software*. See the little difference here? Stefan -- http://mail.python.org/mailman/listinfo/python-list
Re: PEP 3131: Supporting Non-ASCII Identifiers
René Fleschenberg wrote: Marc 'BlackJack' Rintsch wrote: There are potential users of Python who don't know much English or no English at all. This includes kids, old people, people from countries that have letters that are not that easy to transliterate like European languages, people who just want to learn Python for fun or to customize their applications like office suites or GIS software with a Python scripting option. Make it an interpreter option that can be turned on for those cases. No. Make ASCII-only an interpreter option that can be turned on for the cases where it is really required. Stefan -- http://mail.python.org/mailman/listinfo/python-list
Re: PEP 3131: Supporting Non-ASCII Identifiers
René Fleschenberg wrote: Gregor Horvath wrote: René Fleschenberg wrote: today, to the best of my knowledge. And "in some form or another" basically means that the PEP would create more possibilities for things to go wrong. That things can already go wrong today does not mean that it does not matter if we create more occasions where things can go wrong even worse. Following this logic we should not add any new features at all, because all of them can go wrong and can be used the wrong way. No, that does not follow from my logic. What I say is: When thinking about whether to add a new feature, the potential benefits should be weighed against the potential problems. I see some potential problems with this PEP and very few potential benefits. I love Python because it does not dictate how to do things. I do not need an ASCII-Dictator, I can judge for myself when to use this feature and when to avoid it, like any other feature. *That* logic can be used to justify the introduction of *any* feature. *Your* logic can be used to justify dropping *any* feature. Stefan -- http://mail.python.org/mailman/listinfo/python-list
Re: PEP 3131: Supporting Non-ASCII Identifiers
On May 15, 11:25 pm, Stefan Behnel [EMAIL PROTECTED] wrote: René Fleschenberg wrote: Javier Bezos wrote: But having, for example, things like open() from the stdlib in your code and then öffnen() as a name for functions/methods written by yourself is just plain silly. It makes the code inconsistent and ugly without significantly improving the readability for someone who speaks German but not English. Agreed. I always use English names (more or less :-)), but this is not what the PEP is about. We all know what the PEP is about (we can read). The point is: If we do not *need* non-English/ASCII identifiers, we do not need the PEP. If the PEP does not solve an actual *problem* and still introduces some potential for *new* problems, it should be rejected. So far, the problem seems to just not exist. The burden of proof is on those who support the PEP. The main problem here seems to be proving the need for something to people who do not need it themselves. So, if a simple "but I need it because a, b, c" is not enough, what good is any further proof? Stefan For what it's worth, I can only speak English (bad English schooling!) and I'm definitely +1 on the PEP. Anyone using tools from the last 5 years can handle UTF-8. Cheers, Ben -- http://mail.python.org/mailman/listinfo/python-list
Re: PEP 3131: Supporting Non-ASCII Identifiers
René Fleschenberg wrote: I love Python because it does not dictate how to do things. I do not need an ASCII-Dictator, I can judge for myself when to use this feature and when to avoid it, like any other feature. *That* logic can be used to justify the introduction of *any* feature. No. That logic can only be used to justify the introduction of a feature that brings freedom. Who are we to dictate to the whole Python world how to spell an identifier? Gregor -- http://mail.python.org/mailman/listinfo/python-list
Re: PEP 3131: Supporting Non-ASCII Identifiers
Ben wrote: On May 15, 11:25 pm, Stefan Behnel [EMAIL PROTECTED] wrote: Rene Fleschenberg wrote: Javier Bezos wrote: But having, for example, things like open() from the stdlib in your code and then o:ffnen() as a name for functions/methods written by yourself is just plain silly. It makes the code inconsistent and ugly without significantly improving the readability for someone who speaks German but not English. Agreed. I always use English names (more or less :-)), but this is not what the PEP is about. We all know what the PEP is about (we can read). The point is: If we do not *need* non-English/ASCII identifiers, we do not need the PEP. If the PEP does not solve an actual *problem* and still introduces some potential for *new* problems, it should be rejected. So far, the problem seems to just not exist. The burden of proof is on those who support the PEP. The main problem here seems to be proving the need for something to people who do not need it themselves. So, if a simple "but I need it because a, b, c" is not enough, what good is any further proof? Stefan For what it's worth, I can only speak English (bad English schooling!) and I'm definitely +1 on the PEP. Anyone using tools from the last 5 years can handle UTF-8. The falsehood of the last sentence is why I'm moderately against this PEP. Even examples within this thread don't display correctly on several of the machines I have access to (all of which are less than 5 year old OS/browser environments). It strikes me as similar to the arguments for quoted-printable in the early 1990s, claiming that everyone can view it or will be able to soon--and even a decade _after_ everyone could deal with latin1 just fine it was still causing massive headaches. -- http://mail.python.org/mailman/listinfo/python-list
Re: PEP 3131: Supporting Non-ASCII Identifiers
Christophe wrote: [EMAIL PROTECTED] wrote: Steven D'Aprano wrote: I would find it useful to be able to use non-ASCII characters for heavily mathematical programs. There would be a closer correspondence between the code and the mathematical equations if one could write D(u*p) instead of delta(mu*pi). Just as one risk here: When reading the above on Google Groups, it showed up as "if one could write ?(u*p)..." When quoting it for response, it showed up as "could write D(u*p)". I'm sure that the symbol you used was neither a capital letter d nor a question mark. Using identifiers that are so prone to corruption when posting in a rather popular forum seems dangerous to me--and I'd guess that a lot of source code highlighters, email lists, etc. have similar problems. I'd even be surprised if some programming tools didn't have similar problems. So, it was Google Groups that continuously corrupted the good UTF-8 posts by force-converting them to ISO-8859-1? Of course, there's also the possibility that it is a problem on *your* side Well, that's part of the point, isn't it? It seems incredibly naive to me to think that you could use whatever symbol was intended and have it show up, and the "well, fix your machine!" argument doesn't fly. A lot of the time programmers have to look at stack traces on end users' machines (whatever they may be) to help debug. They have to look at code on the (GUI-less) production servers over a terminal link. They have to use all kinds of environments where they can't install the latest and greatest fonts. Promoting code that becomes very hard to read and debug in real situations seems like a sound negative to me. -- http://mail.python.org/mailman/listinfo/python-list