Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-20 Thread Christophe Cavalaria
Istvan Albert wrote:

 On May 19, 3:33 am, Martin v. Löwis [EMAIL PROTECTED] wrote:
 
 That would be invalid syntax since the third line is an assignment
  with target identifiers separated only by spaces.

 Plus, the identifier starts with a number (even though 6 is not DIGIT
 SIX, but FULLWIDTH DIGIT SIX, it's still of category Nd, and can't
 start an identifier).
 
 Actually both of these issues point to the real problem with this PEP.
 
 I knew about them (note that the colon is also missing) alas I
 couldn't fix them.
 My editor could not remove a space or add a colon anymore, it
 would immediately change the rest of the characters to something
 crazy.
 
 (Of course now someone might feel compelled to state that this is an
 editor problem but I digress, the reality is that features need to
 adapt to reality, moreso had I used a different editor I'd be still
 unable to write these characters).

The reality is that the few users who care about having Chinese in their
code *will* be using an editor that supports them.

-- 
http://mail.python.org/mailman/listinfo/python-list

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-20 Thread rurpy
On May 17, 5:03 pm, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:
 On May 16, 6:38 pm, [EMAIL PROTECTED] wrote:
  Are you worried that some 3rd-party package you have
  included in your software will have some non-ascii identifiers
  buried in it somewhere?  Surely that is easy to check for?
  Far easier than checking that it doesn't have some trojan
  code in it, it seems to me.

 What do you mean, check for?  If, say, numeric starts using math
 characters (as has been suggested), I'm not exactly going to stop
 using numeric.  It'll still be a lot better than nothing, just
 slightly less better than it used to be.

The PEP explicitly states that no non-ascii identifiers
will be permitted in the standard library.  The opinions
expressed here seem almost unanimous that non-ascii
identifiers are a bad idea in any sort of shared public
code.  Why do you think the occurrence of non-ascii
identifiers in Numpy is likely?

   And I'm often not creating a stack trace procedure, I'm using the
   built-in python procedure.
 
   And I'm often dealing with mailing lists, Usenet, etc where I don't
   know ahead of time what the other end's display capabilities are, how
   to fix them if they don't display what I'm trying to send, whether
   intervening systems will mangle things, etc.
 
  I think we all are in this position.  I always send plain
  text mail to mailing lists, people I don't know etc.  But
  that doesn't mean that email software should be constrained
  to only 7-bit plain text, no attachments!  I frequently use
  such capabilities when they are appropriate.

 Sure.  But when you're talking about maintaining code, there's a very
 high value to having all the existing tools work with it whether
 they're wide-character aware or not.

I agree.  On Windows I often use Notepad to edit
python files.  (There goes my credibility! :-)
So I don't like tab-only indent proposals that assume
I can set tabs to be an arbitrary number of spaces.
But tab-only indentation would affect every python
program and every python programmer.

In the case of non-ascii identifiers, the potential
gains are so big for non-English speakers, and (IMO)
the difficulty of working with non-ascii identifiers
times the probability of having to work with them,
so low, that the former clearly outweighs the latter.

  If your response is, yes, but look at the problems html
  email, virus infected, attachments etc cause, the situation
  is not the same.  You have little control over what kind of
  email people send you but you do have control over what
  code, libraries, patches, you choose to use in your
  software.
 
  If you want to use ascii-only, do it!  Nobody is making
  you deal with non-ascii code if you don't want to.

 Yes.  But it's not like this makes things so horribly awful that it's
 worth my time to reimplement large external libraries.  I remain at -0
 on the proposal;

 it'll cause some headaches for the majority of
 current Python programmers, but it may have some benefits to a
 sizeable minority

This is the crux of the matter I think.  That
non-ascii identifiers will spread like a virus, infecting
program after program until every piece of Python code
is nothing but a mass of writhing unintelligible
non-ascii characters.  (OK, maybe I am overstating a little. :-)

I (and I think other proponents) don't think this is
likely to happen, and the benefits to non-English
speakers of being able to write maintainable code far
outweigh the very rare case when it does occur.

 and may help bring in new coders.  And it's not
 going to cause flaming catastrophic death or anything.



Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-19 Thread Javier Bezos
@yahoo.com wrote:

  Perhaps, but the treatment by your mail/news software plus the
  delightful Google Groups of the original text (which seemed intact in
  the original, although I don't have the fonts for the content) would
  suggest that not just social or cultural issues would be involved.

 The fact my Outlook changed the text is irrelevant
 for something related to Python.

 On the contrary, it cuts to the heart of the problem.  There are
 hundreds of tools out there that programmers use, and mailing lists
 are certainly an incredibly valuable tool--introducing a change that
 makes code more likely to be silently mangled seems like a negative.

In such a case, the Python indentation should be
rejected (quite interesting you removed from my
post the part mentioning it). I can promise there
are Korean groups and there are no problems at
all in using Hangul (the Korean writing).

Javier
-
http://www.texytipografia.com 



Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-19 Thread Martin v. Löwis
 Providing a method that would translate an arbitrary string into a
 valid Python identifier would be helpful. It would be even more
 helpful if it could provide a way of converting untranslatable
 characters. However, I suspect that the translate (normalize?) routine
 in the unicode module will do.

Not at all. Unicode normalization only unifies different spellings
of the same character.

For transliteration, no simple algorithm exists, as it generally depends
on the language. However, if you just want any kind of ASCII string,
you can use the Unicode error handlers (PEP 293). For example, the
program

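# Python 2 code, as posted in 2007
import unicodedata, codecs

def namereplace(exc):
    # Only encode/translate errors carry the offending slice of text.
    if isinstance(exc,
                  (UnicodeEncodeError, UnicodeTranslateError)):
        s = u""
        # Spell out each unencodable character by its Unicode name.
        for c in exc.object[exc.start:exc.end]:
            s += u"N_" + unicode(unicodedata.name(c).replace(" ", "_")) + u"_"
        # Return the replacement and the position at which to resume.
        return (s, exc.end)
    else:
        raise TypeError("can't handle %s" % exc.__class__.__name__)

codecs.register_error("namereplace", namereplace)

print u"Schl\xfcssel".encode("ascii", "namereplace")

prints SchlN_LATIN_SMALL_LETTER_U_WITH_DIAERESIS_ssel.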
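
For anyone trying this on Python 3, a rough equivalent might look like
the sketch below. It is an adaptation, not the code from the post: the
handler name name_underscore is made up for this example, since
Python 3.5 and later already provide a built-in "namereplace" handler
that emits \N{...} escapes instead.

```python
import codecs
import unicodedata

def name_underscore(exc):
    """Replace each unencodable character with N_<UNICODE_NAME>_."""
    if isinstance(exc, (UnicodeEncodeError, UnicodeTranslateError)):
        bad = exc.object[exc.start:exc.end]
        repl = "".join(
            "N_" + unicodedata.name(c).replace(" ", "_") + "_" for c in bad
        )
        # Tell the codec what to substitute and where to resume encoding.
        return (repl, exc.end)
    raise TypeError("can't handle %s" % type(exc).__name__)

# "name_underscore" is a hypothetical handler name chosen for this sketch.
codecs.register_error("name_underscore", name_underscore)

result = "Schl\xfcssel".encode("ascii", "name_underscore")
print(result.decode("ascii"))
```

Characters without a Unicode name make unicodedata.name() raise
ValueError, and names that contain hyphens would need further cleanup
before the result could be used as an identifier.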

HTH,
Martin


Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-19 Thread Martin v. Löwis
 But you're making a strawman argument by using extended ASCII
 characters that would work anyhow. How about debugging this (I wonder
 will it even make it through?) :

 class 6자회담관련론조
6자회 = 0
6자회담관련 고귀 명=10
 
That would be invalid syntax since the third line is an assignment
 with target identifiers separated only by spaces.

Plus, the identifier starts with a number (even though 6 is not DIGIT
SIX, but FULLWIDTH DIGIT SIX, it's still of category Nd, and can't
start an identifier).
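
The category claim can be checked from Python itself. The following is
a small sketch for Python 3 (where the PEP ended up being implemented);
the variable names are made up, and the characters are written as
escapes so that mail software cannot mangle them.

```python
import unicodedata

fullwidth_six = "\uff16"   # FULLWIDTH DIGIT SIX
hangul_ja = "\uc790"       # a Hangul syllable, a letter as far as Unicode is concerned

print(unicodedata.name(fullwidth_six))      # FULLWIDTH DIGIT SIX
print(unicodedata.category(fullwidth_six))  # Nd

# A character of category Nd may continue an identifier but not start one.
starts_with_digit = fullwidth_six + hangul_ja
digit_in_middle = hangul_ja + fullwidth_six
print(starts_with_digit.isidentifier())  # False
print(digit_in_middle.isidentifier())    # True
```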

Regards,
Martin

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-19 Thread Richard Hanson
On Fri, 18 May 2007 06:28:03 +0200, Martin v. Löwis wrote:

[excellent as always exposition by Martin]

Thanks, Martin. 

 P.S. Anybody who wants to play with generating visualisations
 of the PEP, here are the functions I used:

[code snippets]

Thanks for those functions, too -- I've been exploring with them and
am slowly coming to some understanding.

 -- Richard Hanson

To many native-English-speaking developers well versed in other
programming environments, Python is *already* a foreign language --
judging by the posts here in c.l.py over the years. ;-)
__



Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-19 Thread René Fleschenberg
Martin v. Löwis wrote:
 I've reported this before, but happily do it again: I have lived many
 years without knowing what a "hub" is, and what "to pass" means if
 it's not the opposite of "to fail". Yet, I have used their technical
 meanings correctly all these years.

I was not speaking of the more general (non-technical) meanings, but of
the technical ones. The claim which I challenged was that people learn
just the use (syntax) but not the meaning (semantics) of these
terms. I think you are actually supporting my argument ;)

-- 
René

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-19 Thread René Fleschenberg
Martin v. Löwis wrote:
 Then get tools that match your working environment.
 Integration with existing tools *is* something that a PEP should
 consider. This one does not do that sufficiently, IMO.
 
 What specific tools should be discussed, and what specific problems
 do you expect?

Systems that cannot display code parts correctly. I expect problems with
unreadable tracebacks, for example.

Also: Are existing tools that somehow process Python source code e.g. to
test whether it meets certain criteria (pylint & co) or to aid in
creating documentation (epydoc & co) fully unicode-ready?

-- 
René

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-19 Thread Peter Maas
Martin v. Löwis wrote:
 Python code is written by many people in the world who are not familiar
 with the English language, or even well-acquainted with the Latin
 writing system.

I believe that there is not a single programmer in the world who doesn't
know ASCII. It isn't hard to learn the Latin alphabet and you have to know
it anyway to use the keywords and the other ASCII characters to write numbers,
punctuation etc. Most non-western alphabets have ASCII transcription rules
and contain ASCII as a subset. On the other hand non-ascii identifiers
lead to fragmentation and less understanding in the programming world so I
don't like them. I also don't like non-ascii domain names where the same
arguments apply.

Let the data be expressed with Unicode but the logic with ASCII.

-- 
Regards/Gruesse,

Peter Maas, Aachen
E-mail 'cGV0ZXIubWFhc0B1dGlsb2cuZGU=\n'.decode('base64')

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-19 Thread Istvan Albert
On May 19, 3:33 am, Martin v. Löwis [EMAIL PROTECTED] wrote:

 That would be invalid syntax since the third line is an assignment
  with target identifiers separated only by spaces.

 Plus, the identifier starts with a number (even though 6 is not DIGIT
 SIX, but FULLWIDTH DIGIT SIX, it's still of category Nd, and can't
 start an identifier).

Actually both of these issues point to the real problem with this PEP.

I knew about them (note that the colon is also missing) alas I
couldn't fix them.
My editor could not remove a space or add a colon anymore, it
would immediately change the rest of the characters to something
crazy.

(Of course now someone might feel compelled to state that this is an
editor problem but I digress, the reality is that features need to
adapt to reality, moreso had I used a different editor I'd be still
unable to write these characters).

i.



Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-18 Thread Paul Rubin
Martin v. Löwis [EMAIL PROTECTED] writes:
 Now I understand it is meaning 12 in Merriam-Webster's dictionary,
 a) to decline to bid, double, or redouble in a card game, or b)
 to let something go by without accepting or taking
 advantage of it.

I never thought of it as having that meaning.  I thought of it in the
sense of going by something without stopping, like "I passed a post
office on my way to work today".


Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-18 Thread Paul Rubin
Martin v. Löwis [EMAIL PROTECTED] writes:
  Integration with existing tools *is* something that a PEP should
  consider. This one does not do that sufficiently, IMO.
 What specific tools should be discussed, and what specific problems
 do you expect?

Emacs, whose unicode support is still pretty weak.


Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-18 Thread Hendrik van Rooyen

Hendrik van Rooyen [EMAIL PROTECTED] wrote:

 
 
   Now look me in the eye and tell me that you find
   the mix of proper German and English keywords
   beautiful.
  
  I can't admit that, but I find that using German
  class and method names is beautiful. The rest around
  it (keywords and names from the standard library)
  are not English - they are Python.
  
MvL:
  (look me in the eye and tell me that "def" is
  an English word, or that "getattr" is one)
  
 
HvR:
 LOL - true - but a broken down assembler programmer like me
 does not use getattr - and def is short for define, and for and while
 and in are not German.

After an intense session of omphaloscopy, I would like another bite 
at this cherry.

I think my problem is something like this - when I see a line of code
like:

def frobnitz():

I do not actually see the word "def" - I see something like:

define a function with no arguments called frobnitz

This expansion process is involuntary, and immediate in my mind.

And this is immediately followed by an irritated reaction, like:

WTF is frobnitz? What is it supposed to do? What Idiot wrote this?

Similarly, when I encounter the word "getattr" - it is immediately
expanded to "get attribute" and this expansion is kind of
dependent on another thing, namely that my mind is in English
mode - I refer here to something that only happens rarely, but
with devastating effect, experienced only by people who can read
more than one language - I am referring to the phenomenon that you 
look at an unfamiliar piece of writing on say a signboard, with the 
wrong language switch set in your mind - and you cannot read it,
it makes no sense for a second or two - until you kind of step back 
mentally and have a more deliberate look at it, when it becomes
obvious that it's not, say, English, but Afrikaans, or German, or vice
versa.

So in a sense, I can look you in the eye and assert that "def" and
"getattr" are in fact English words...  (for me, that is)

I suppose that this one language track - mindedness of mine
is why I find the mix of keywords and German or Afrikaans so 
abhorrent - I cannot really help it, it feels as if I am eating a 
sandwich, and that I bite on a stone in the bread. - It just jars.

Good luck with your PEP - I don't support it, but it is unlikely
that the Python-dev crowd and GvR would be swayed much
by the opinions of the egregious HvR.

Aesthetics aside, I think that the practical maintenance problems
(especially remote maintenance) are the rock on which this
ship could founder.

- Hendrik

--
Philip Larkin (English Poet) :
They fuck you up, your mom and dad -
They do not mean to, but they do.
They fill you with the faults they had,
and add some extra, just for you.




Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-18 Thread Hendrik van Rooyen
Sion Arrowsmith [EMAIL PROTECTED] wrote:

Hvr:
Would not like it at all, for the same reason I don't like re's -
It looks like random samples out of alphabet soup to me.

What I meant was, would the use of foreign identifiers look so
horrible to you if the core language had fewer English keywords?
(Perhaps Perl, with its line-noise, was a poor choice of example.
Maybe Lisp would be better, but I'm not so sure of my Lisp as to
make such an assertion for it.)

I suppose it would jar less - but I avoid such languages, as the whole
thing kind of jars - I am not on the python group for nothing..

: - )

- Hendrik



Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-18 Thread Paul Rubin
Martin v. Löwis [EMAIL PROTECTED] writes:
 If you doubt the claim, please indicate which of these three aspects
 you doubt:
 1. there are programmers who desire to define classes and functions
with names in their native language.
 2. those developers find the code clearer and more maintainable than
if they had to use English names.
 3. code clarity and maintainability is important.

I think it can damage clarity and maintainability and if there's so
much demand for it then I'd propose this compromise: non-ascii
identifiers are allowed but they produce a compiler warning message
(including from eval and exec).  You can suppress the warning message
with a command line option.


Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-18 Thread Torsten Bronger
Hallöchen!

Martin v. Löwis writes:

 In [EMAIL PROTECTED], Nick Craig-Wood
 wrote:
 
 My initial reaction is that it would be cool to use all those
 great symbols.  A variable called OHM etc!
 
 This is a nice candidate for homoglyph confusion.  There's the
 Greek letter omega (U+03A9) Ω and the SI unit symbol (U+2126) Ω,
 and I think some omegas in the mathematical symbols area too.

 Under the PEP, identifiers are converted to normal form NFC, and
 we have

 py> unicodedata.normalize("NFC", u"\u2126")
 u'\u03a9'

 So, OHM SIGN compares equal to GREEK CAPITAL LETTER OMEGA. It can't
 be confused with it - it is equal to it by the proposed language
 semantics.

So different unicode sequences in the source code can denote the
same identifier?
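
A quick experiment suggests the answer is yes. The sketch below is for
Python 3, which (as far as I know) settled on NFKC rather than the NFC
quoted above; for this particular pair of characters the result is the
same. The variable names are only illustrative.

```python
import unicodedata

ohm_sign = "\u2126"      # OHM SIGN
greek_omega = "\u03a9"   # GREEK CAPITAL LETTER OMEGA

# As plain strings the two code points differ...
print(ohm_sign == greek_omega)                                # False
# ...but normalization maps one onto the other.
print(unicodedata.normalize("NFC", ohm_sign) == greek_omega)  # True

# Assign through one spelling, then look the name up through the other.
namespace = {}
exec(ohm_sign + " = 50", namespace)
print(greek_omega in namespace)  # True: the normalized name was stored
print(ohm_sign in namespace)     # False
```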

Tschö,
Torsten.

-- 
Torsten Bronger, aquisgrana, europa vetus
  Jabber ID: [EMAIL PROTECTED]
  (See http://ime.webhop.org for ICQ, MSN, etc.)

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-18 Thread Thomas Bellman
Martin v. Löwis [EMAIL PROTECTED] wrote:

 3) Is or will there be a definitive and exhaustive listing (with
 bitmap representations of the glyphs to avoid the font issues) of the
 glyphs that the PEP 3131 would allow in identifiers? (Does this
 question even make sense?)

 As for the list I generated in HTML: It might be possible to
 make it include bitmaps instead of HTML character references,
 but doing so is a licensing problem, as you need a license
 for a font that has all these characters. If you want to
 look up a specific character, I recommend going to the Unicode
 code charts, at

 http://www.unicode.org/charts/

My understanding is also that there are several East Asian
characters that display quite differently depending on whether
you are in Japan, Taiwan or mainland China.  So much differently
that for example a Japanese person will not be able to recognize
a character rendered in the Taiwanese or mainland Chinese way.


-- 
Thomas Bellman,   Lysator Computer Club,   Linköping University,  Sweden
Adde parvum parvo magnus acervus erit   ! bellman @ lysator.liu.se
  (From The Mythical Man-Month)   ! Make Love -- Nicht Wahr!

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-18 Thread Laurent Pointal
Long and interesting discussion with different points of view.

Personally, even if the PEP goes ahead (and is accepted), I'll continue to use
identifiers as I do currently. But I understand those who want to be able to
use characters from their own language.

* for people who are not expert developers (non-pros, or in a learning
context), to be able to use names that have meaning, and for pro developers
wanting to give a clear domain-specific meaning - mainly for languages not
based on Latin characters, where the problem must be exacerbated.
They can already use unicode in strings (including documentation ones).

* for exchanging with other programming languages having such identifiers...
when they are really used (I include binding of table/column names in
relational databases).

* (not read, but I think present) this will allow developers to lock the
code so that it could not be easily taken/delocalized anywhere by anybody.


In the discussion I've seen that the problem of mixing characters that have
different unicode numbers but the same representation (e.g. omega) is resolved
(use of a unicode attribute linked to representation, AFAIU).

I've seen (on fclp) a post about speed; it should be verified, but I'm not sure
we will lose speed with unicode identifiers.

On the unicode editing, we have in 2007 enough correct editors supporting
unicode (I configure my Windows/Linux editors to use utf-8 by default).


I share the concern about being able to read code from a project which may use
such identifiers (I don't read Cyrillic, nor kanji or Hindi), but this will
just give freedom to users.

This can be a pain for me in some cases, but is this a valid argument for
forbidding it for other people who feel the need?


IMHO what we should have if the PEP goes ahead:

* a rework of tracebacks to have a general option (like -T) to ensure
tracebacks print only pure ASCII, to avoid encoding problems when
displaying errors on terminals.

* a possibility to specify for modules that they must *define* only
ascii-based names, like a "from __future__ import asciionly". To be able to
enforce this policy in projects which request it.

* and, as many wrote, enforce that standard Python libraries use only ascii
identifiers.
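
The ASCII-only traceback idea can be approximated in user code today,
without a new interpreter option. The sketch below is for Python 3 and
is only an illustration: ascii_traceback is a made-up helper, not an
existing API, and it relies on the standard "backslashreplace" error
handler to escape anything outside ASCII.

```python
import traceback

def ascii_traceback(exc):
    """Format exc and its traceback using ASCII characters only."""
    lines = traceback.format_exception(type(exc), exc, exc.__traceback__)
    # Non-ASCII characters become \uXXXX escapes instead of raising
    # UnicodeEncodeError on a terminal that cannot show them.
    return "".join(lines).encode("ascii", "backslashreplace").decode("ascii")

try:
    exec("\u03b1", {})   # look up an undefined name: GREEK SMALL LETTER ALPHA
except NameError as exc:
    report = ascii_traceback(exc)

print(report)
```

The escaped form is uglier than the original name, but it survives any
terminal or log file, and it can be mapped back to the identifier.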





Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-18 Thread Torsten Bronger
Hallöchen!

Laurent Pointal writes:

 [...]

 Personally, even if the PEP goes ahead (and is accepted), I'll continue
 to use identifiers as I do currently. [...]

Me too (mostly), although I do like the PEP.  While many people have
pointed out possible issues of the PEP, only few have tried to
estimate its actual impact.  I don't think that it will do harm to
Python code because the programmers will know when it's appropriate
to use it.  The potential trouble is too obvious to be ignored
accidentally.  And in the case of a bad programmer, you have more
serious problems than flawed identifier names, really.

But for private utilities for example, such identifiers are really a
nice thing to have.  The same is true for teaching in some cases.
And the small simulation program in my thesis would have been better
with some α and φ.  At least, the program would be closer to the
equations in the text then.

 [...]

 * a possibility to specify for modules that they must *define*
 only ascii-based names, like a "from __future__ import asciionly". To
 be able to enforce this policy in projects which request it.

Please don't.  We're all adults.  If a maintainer is really
concerned about such a thing, he should write a trivial program that
ensures it.  After all, there are some other coding guidelines too
that could be enforced this way but aren't, for good reason.
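
Such a program could be as short as the following sketch (Python 3).
The function name non_ascii_names and the sample source are invented
for illustration; the approach simply walks the token stream from the
standard tokenize module and reports NAME tokens that contain
characters outside ASCII.

```python
import io
import tokenize

def non_ascii_names(source):
    """Return (line, column, name) for every non-ASCII identifier in source."""
    found = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME and any(ord(ch) > 127 for ch in tok.string):
            found.append((tok.start[0], tok.start[1], tok.string))
    return found

# Hypothetical sample: one Greek identifier, one Greek letter inside a string.
sample = "radius = 2\n\u03c6 = 0.5\nlabel = '\u03b1 inside a string is fine'\n"
for line, col, name in non_ascii_names(sample):
    print("line %d, column %d: %r" % (line, col, name))
```

Only the identifier on line 2 is reported; non-ASCII text inside
string literals and comments is left alone, which is probably what a
maintainer with an ASCII-only policy would want.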

Tschö,
Torsten.

-- 
Torsten Bronger, aquisgrana, europa vetus
  Jabber ID: [EMAIL PROTECTED]
  (See http://ime.webhop.org for ICQ, MSN, etc.)

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-18 Thread Gregor Horvath
Hendrik van Rooyen wrote:

 I suppose that this one language track - mindedness of mine
 is why I find the mix of keywords and German or Afrikaans so 
 abhorrent - I cannot really help it, it feels as if I am eating a 
 sandwich, and that I bite on a stone in the bread. - It just jars.

Please come to Vienna and learn the local slang.
You would be surprised how beautiful and expressive a language mixed up 
of a lot of very different languages can be. Same for music. It's the 
secret of success of the music from Vienna. It's just a mix up of all 
the different cultures once living in a big multicultural kingdom.

A mix up of Python key words and German identifiers feels very natural 
for me. I live in cultural diversity and richness and love it.

Gregor


Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-18 Thread Istvan Albert
On May 17, 2:30 pm, Gregor Horvath [EMAIL PROTECTED] wrote:

 Is there any difference for you in debugging these code snippets?

 class Türstock(object):

Of course there is, how do I type the ü ? (I can copy/paste for
example, but that gets old quick).

But you're making a strawman argument by using extended ASCII
characters that would work anyhow. How about debugging this (I wonder
will it even make it through?) :

class 6자회담관련론조
   6자회 = 0
   6자회담관련 고귀 명=10


(I don't know what it means, just copied over some words from a
Japanese news site, but the first thing it did was mess up my editor,
which would not type the colon anymore)

i.


Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-18 Thread Javier Bezos
Istvan Albert [EMAIL PROTECTED] wrote:

 How about debugging this (I wonder will it even make it through?) :

 class 6???
6?? = 0
   6? ?? ?=10

This question is more or less what a Korean who doesn't
speak English would ask if he had to debug a program
written in English.

 (I don't know what it means, just copied over some words
 from a Japanese news site,

A Japanese speaking Korean, it seems. :-)

Javier
--
http://www.texytipografia.com 



Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-18 Thread Paul Boddie
On 18 Mai, 18:42, Javier Bezos [EMAIL PROTECTED] wrote:
 Istvan Albert [EMAIL PROTECTED] wrote:

  How about debugging this (I wonder will it even make it through?) :

  class 6???

 6?? = 0
6? ?? ?=10

 This question is more or less what a Korean who doesn't
 speak English would ask if he had to debug a program
 written in English.

Perhaps, but the treatment by your mail/news software plus the
delightful Google Groups of the original text (which seemed intact in
the original, although I don't have the fonts for the content) would
suggest that not just social or cultural issues would be involved.
It's already more difficult than it ought to be to explain to people
why they have trouble printing text to the console, for example, and
if one considers issues with badly configured text editors putting the
wrong character values into programs, even if Python complains about
it, there's still going to be some explaining to do.

One thing that some people already dislike about Python is the
editing discipline required. Although I don't have much time for
people whose coding skills involve random edits using badly
configured editors, trashing the indentation and the appearance of the
code (regardless of the language involved), we do need to consider the
need to bring people up to speed gracefully by encouraging the
proper use of tools, and so on, all without making it seem really
difficult and discouraging people from learning the language.

Paul



Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-18 Thread Gregor Horvath
Istvan Albert wrote:
 On May 17, 2:30 pm, Gregor Horvath [EMAIL PROTECTED] wrote:
 
 Is there any difference for you in debugging these code snippets?
 
 class Türstock(object):
 
 Of course there is, how do I type the ü ? (I can copy/paste for
 example, but that gets old quick).
 

I doubt that you can debug the code without Unicode chars. It seems that
you do not understand German and therefore you do not know what the
purpose of this program is.
Can you tell me if there is an error in the snippet without Unicode?

I would refuse to try to debug a program that I do not understand.
Avoiding Unicode does not help a bit in this regard.

Gregor

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-18 Thread Gregor Horvath
Paul Boddie wrote:

 Perhaps, but the treatment by your mail/news software plus the
 delightful Google Groups of the original text (which seemed intact in
 the original, although I don't have the fonts for the content) would
 suggest that not just social or cultural issues would be involved.

I do not see the point.
Whether my editor or newsreader displays the text correctly or not makes no
difference for me, since I do not understand a word of it anyway. It's a
meaningless stream of bits for me.
It's safe to assume that for people who find this meaningful
their setup will display it correctly. Otherwise they could not work
with their computer anyway.

Until now I have not found a single computer in my German domain that
cannot display: ß.

Gregor


Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-18 Thread Javier Bezos

 This question is more or less what a Korean who doesn't
 speak English would ask if he had to debug a program
 written in English.

 Perhaps, but the treatment by your mail/news software plus the
 delightful Google Groups of the original text (which seemed intact in
 the original, although I don't have the fonts for the content) would
 suggest that not just social or cultural issues would be involved.

The fact my Outlook changed the text is irrelevant
for something related to Python. And just remember
how Google mangled the indentation of Python code
some time ago. This was a technical issue which has
been solved, and no doubt my laziness (I didn't
switch to Unicode) won't prevent non-ASCII identifiers
from being properly shown in general.

Javier
-
http://www.texytipografia.com





Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-18 Thread [EMAIL PROTECTED]
On May 18, 1:47 pm, Javier Bezos [EMAIL PROTECTED] wrote:
  This question is more or less what a Korean who doesn't
  speak English would ask if he had to debug a program
  written in English.

  Perhaps, but the treatment by your mail/news software plus the
  delightful Google Groups of the original text (which seemed intact in
  the original, although I don't have the fonts for the content) would
  suggest that not just social or cultural issues would be involved.

 The fact my Outlook changed the text is irrelevant
 for something related to Python.

On the contrary, it cuts to the heart of the problem.  There are
hundreds of tools out there that programmers use, and mailing lists
are certainly an incredibly valuable tool--introducing a change that
makes code more likely to be silently mangled seems like a negative.

Of course, there are other benefits to the PEP, so I'm only barely
opposed.  But dismissing the fact that Outlook and other quite common
tools may have severe problems with code seems naive (or disingenuous,
but I don't think that's the case here).



Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-18 Thread Paul Boddie
Gregor Horvath wrote:
 Paul Boddie wrote:

  Perhaps, but the treatment by your mail/news software plus the
  delightful Google Groups of the original text (which seemed intact in
  the original, although I don't have the fonts for the content) would
  suggest that not just social or cultural issues would be involved.

 I do not see the point.
 If my editor or newsreader does display the text correctly or not is no
 difference for me, since I do not understand a word of it anyway. It's a
 meaningless stream of bits for me.

But if your editor doesn't even bother to preserve those bits
correctly, it makes a big difference. When 6자회담관련론조 becomes 6???
because someone's tool did the equivalent of
unicode_obj.encode("iso-8859-1", "replace"), then the stream of bits
really does become meaningless. (We'll see if the former identifier
even resembles what I've just pasted later on, or whether it resembles
the latter.)
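
A minimal sketch of the mangling described above, written for Python 3 where every str is Unicode. The identifier is the one from the thread; the variable names are only illustrative.

```python
# Identifier taken from the example discussed in this thread.
identifier = "6자회담관련론조"

# Latin-1 has no Hangul, so "replace" turns each such character into "?".
mangled = identifier.encode("iso-8859-1", "replace")
print(mangled)

# UTF-8 can represent every character, so this round trip is lossless.
restored = identifier.encode("utf-8").decode("utf-8")
print(restored == identifier)
```

Once the question marks are written back to disk, nothing in the file records which characters they replaced.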

 It's safe to assume that for people who are finding this meaningful
 their setup will display it correctly. Otherwise they could not work
 with their computer anyway.

Sure, it's all about editor discipline or tool discipline just as
I wrote. I'm in favour of the PEP, generally, but I worry about the
long explanations required when people find that their programs are
now ill-formed because someone made a quick edit in a bad editor.

Paul

-- 
http://mail.python.org/mailman/listinfo/python-list

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-18 Thread Neil Hodgson
Istvan Albert:

 But you're making a strawman argument by using extended ASCII
 characters that would work anyhow. How about debugging this (I wonder
 will it even make it through?) :
 
 class 6자회담관련론조
6자회 = 0
6자회담관련 고귀 명=10

That would be invalid syntax since the third line is an assignment 
with target identifiers separated only by spaces.
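
As a rough illustration, str.isidentifier() in Python 3 can be used to test candidate names against the identifier rules. The sample strings below are assumptions modelled on the example above, with the leading digit written as FULLWIDTH DIGIT SIX.

```python
import unicodedata

fullwidth_six = "\uff16"  # FULLWIDTH DIGIT SIX
print(unicodedata.name(fullwidth_six), unicodedata.category(fullwidth_six))

# Hypothetical candidates modelled on the example in this thread.
starts_with_digit = fullwidth_six + "자회"
digit_later = "자회" + fullwidth_six
with_space = "자회담관련 고귀"

for candidate in (starts_with_digit, digit_later, with_space):
    # A digit may continue an identifier but not start one,
    # and a space always ends the identifier.
    print(repr(candidate), candidate.isidentifier())
```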

Neil
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-18 Thread John Roth
On May 13, 9:44 am, Martin v. Löwis [EMAIL PROTECTED] wrote:
 PEP 1 specifies that PEP authors need to collect feedback from the
 community. As the author of PEP 3131, I'd like to encourage comments
 to the PEP included below, either here (comp.lang.python), or to
 [EMAIL PROTECTED]

 In summary, this PEP proposes to allow non-ASCII letters as
 identifiers in Python. If the PEP is accepted, the following
 identifiers would also become valid as class, function, or
 variable names: Löffelstiel, changé, ошибка, or 売り場
 (hoping that the latter one means counter).

I notice that Guido has approved it, so I'm looking at what it would
take to support it for Python FIT. The actual issue (for me) is
translating labels for cell columns (and similar) into Python
identifiers. After looking at the firestorm, I've come to the
conclusion that the old methods need to be retained not only for
backwards compatibility but also for people who want to translate
existing fixtures.

The guidelines in PEP 3131 for standard library code appear to be
adequate for code that's going to be contributed to the community. I
will most likely emphasize those in my documentation.

Providing a method that would translate an arbitrary string into a
valid Python identifier would be helpful. It would be even more
helpful if it could provide a way of converting untranslatable
characters. However, I suspect that the translate (normalize?) routine
in the unicode module will do.
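
One possible shape for such a helper, as a sketch only: it assumes Python 3, uses unicodedata.normalize with NFKC (the form PEP 3131 applies to identifiers), and replaces anything that cannot appear in an identifier with an underscore. The name label_to_identifier and the underscore policy are hypothetical, not part of Python FIT.

```python
import keyword
import re
import unicodedata


def label_to_identifier(label):
    """Turn a free-form column label into a valid Python 3 identifier."""
    # NFKC is the normalization form PEP 3131 applies to identifiers.
    text = unicodedata.normalize("NFKC", label.strip())
    # Keep characters that may continue an identifier; replace the rest.
    chars = [ch if ("_" + ch).isidentifier() else "_" for ch in text]
    name = re.sub(r"_+", "_", "".join(chars)).strip("_")
    # Guard against empty results and names starting with a digit.
    if not name.isidentifier():
        name = "_" + name
    # Keywords pass isidentifier() but cannot be used as names.
    if keyword.iskeyword(name):
        name += "_"
    return name


print(label_to_identifier("Total price (EUR)"))
print(label_to_identifier("2nd column"))
```

The mapping is lossy (different labels can collapse to the same name), so a real fixture loader would probably need to detect collisions.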

John Roth
Python FIT


-- 
http://mail.python.org/mailman/listinfo/python-list

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Hendrik van Rooyen

Sion Arrowsmith [EMAIL PROTECTED] wrote:


Hendrik van Rooyen wrote:

I still don't like the thought of the horrible mix of foreign
identifiers and English keywords, coupled with the English 
sentence construction.

How do you think you'd feel if Python had less in the way of
(conventionally used) English keywords/builtins. Like, say, Perl?

Would not like it at all, for the same reason I don't like re's -
It looks like random samples out of alphabet soup to me.

- Hendrik


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Hendrik van Rooyen


Gregor Horvath [EMAIL PROTECTED] wrote:


 Hendrik van Rooyen schrieb:
 
  It is not so much for technical reasons as for aesthetic 
  ones - I find reading a mix of languages horrible, and I am
  kind of surprised by the strength of my own reaction.
 
 This is a matter of taste.

I agree - and about perceptions of quality. Of what is good, 
and not good. - If you haven't yet, read Robert Pirsig's book:
Zen and the Art of Motorcycle Maintenance

 In some programs I use German identifiers (not unicode). I and others 
 like the mix. My customers can understand the code better. (They are 
 only reading it)
 

I can sympathise a little bit with a customer who tries to read code.
Why that should be necessary, I cannot understand - does the stuff
not work to the extent that the customer feels he has to help you?
You do not talk as if you are incompetent, so I see no reason why 
the customer should want to meddle in what you have written, unless
he is paying you to train him to program, and as Eric Brunel has 
pointed out, this mixing of languages is all right in a training environment.

  
  Beautiful is better than ugly
 
 Correct.
 But why do you think you should enforce your taste to all of us?

You misjudge me - the OP asked if I would use the feature, and I am 
speaking for myself when I explain why I would not use it.

 
 With this logic you should all drive Alfa Romeos!
 

Actually no - this is not about logic - my post clearly stated
that I was talking about feelings.  And the only logic that applies 
to feelings is the incontrovertible fact that they exist, and that it
makes good logical sense to acknowledge them, and to take that
into account in one's actions.

And as far as Alfa's go - we have found here that they are rather 
soft - our dirt roads destroy them in no time.  : - (

- Hendrik


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Duncan Booth
Gabriel Genellina [EMAIL PROTECTED] wrote:

 - Someone proposed using escape sequences of some kind, supported by  
 editor plugins, so there is no need to modify the parser.

I'm not sure whether my suggestion below is the same as or a variation
on this. 

 
 - Refactoring tools should let you rename foreign identifiers into
 ASCII  only.

A possible modification to the PEP would be to permit identifiers to
also include \u and \U escape sequences (as some other
languages already do). Then you could have a script easily (and
reversibly) convert all identifiers to ascii or indeed any other
encoding or subset of unicode using escapes only for the unrepresentable
characters. 
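
A sketch of the reversible conversion suggested here, assuming a \uXXXX / \UXXXXXXXX notation for the escapes. Python itself does not accept such escapes inside identifiers; the two function names are hypothetical and only show that the mapping can be undone.

```python
import re


def escape_identifier(name):
    """Replace every non-ASCII character with a \\uXXXX or \\UXXXXXXXX escape."""
    parts = []
    for ch in name:
        code = ord(ch)
        if code < 128:
            parts.append(ch)
        elif code <= 0xFFFF:
            parts.append("\\u%04x" % code)
        else:
            parts.append("\\U%08x" % code)
    return "".join(parts)


def unescape_identifier(escaped):
    """Undo escape_identifier()."""
    pattern = r"\\U([0-9a-fA-F]{8})|\\u([0-9a-fA-F]{4})"
    return re.sub(pattern, lambda m: chr(int(m.group(1) or m.group(2), 16)), escaped)


for name in ("Löffelstiel", "changé", "ошибка", "売り場"):
    escaped = escape_identifier(name)
    print(escaped, unescape_identifier(escaped) == name)
```

Because an identifier can never contain a backslash, the escaped form cannot be confused with an ordinary name, which is what makes the round trip safe.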

I think this would remove several of the objections: such as being
unable to tell at a glance whether someone is trying to spoof your
variable names, or being unable to do minor maintenance on code using
character sets which your editor doesn't support: you just run the
script which would be included with every copy of Python to restrict the
character set of the source files to whatever character set you feel
happy with. The script should also be able to convert unrepresentable
characters in strings and comments (although that last operation
wouldn't be guaranteed reversible). 

Of course it doesn't do anything for the objection about such
identifiers being ugly, but you can't have everything.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Gregor Horvath
Hendrik van Rooyen schrieb:

 I can sympathise a little bit with a customer who tries to read code.
 Why that should be necessary, I cannot understand - does the stuff
 not work to the extent that the customer feels he has to help you?
 You do not talk as if you are incompetent, so I see no reason why 
 the customer should want to meddle in what you have written, unless
 he is paying you to train him to program, and as Eric Brunel has 
 pointed out, this mixing of languages is all right in a training environment.

That is highly domain- and customer-specific logic, which the 
customer knows best. (For example, the variation logic of window and door 
manufacturers.)
He has to understand the code, so that he can verify it's correct.
We are in fact developing it together.
Some customers are even coding this logic themselves. Some of them are 
not fluent in English, especially not in the computer domain.

Translating the logic into documentation is a waste of time if the 
code is self-documenting and easy to grasp. (As Python usually is.) But 
the code can only be self documenting if it is written in the domain 
specific language of the customer. Sometimes these are words that are 
not even used in general German. Even in German different customers are 
naming the same thing with different words. Talking and coding in the 
language of the customer is a huge benefit.

Gregor
-- 
http://mail.python.org/mailman/listinfo/python-list


PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Andrea Gavana

Hi All,


In summary, this PEP proposes to allow non-ASCII letters as
identifiers in Python.


First of all, I would like to congratulate Martin on having started
one of the most active threads (flame wars? :- D ) in the python-list
history. Scanning the list from January 2000 to now, this is the
current Python Premier League for Posts (truncated at position 10):

01) merits of Lisp vs Python                             |  832 | Mark Tarver           | December-2006
02) For review: PEP 308 - If-then-else expression        |  728 | Guido van Rossum      | February-2003
03) Python syntax in Lisp and Scheme                     |  665 | mike420 at ziplip.com | October-2003
04) Python from Wise Guy's Viewpoint                     |  495 | mike420 at ziplip.com | October-2003
05) Microsoft Hatred FAQ                                 |  478 | Xah Lee               | October-2005
06) Why is Python popular, while Lisp and Scheme aren't? |  430 | Oleg                  | November-2002
07) Xah Lee's Unixism                                    |  397 | Pascal Bourguignon    | August-2004
08) PEP 285: Adding a bool type                          |  361 | Guido van Rossum      | March-2002
09) Jargons of Info Tech industry                        |  350 | Xah Lee               | August-2005
10) PEP 3131: Supporting Non-ASCII Identifiers           |  326 | Martin v. Lowis       | May-2007

(It may come screwed up in the mail, so for those interested I attach
the results in a small text file which contains the first 50
positions).
It has been generated with a simple Python script: you can find it at
the end of the message. It's slow as a turtle (mainly because of the
use of urllib and the sloppiness of my internet connection yesterday
evening), but it works. I will obviously accept any suggestions
for improvements to the script, as I am only an amateur Python
programmer.


So, please provide feedback, e.g. perhaps by answering these
questions:
- should non-ASCII identifiers be supported? why?


+1, obviously. As an external observer, it has been extremely
interesting to follow all the discussions that this PEP raised. It has
also been funny, as some of the conclusions drawn in the posts made it
seem that my grandmother knows more about Unicode than their authors :-D :-D .
But keep them coming, they are a valuable resource for low-skilled
programmers like me, there is always something new to learn, really.


- would you use them if it was possible to do so? in what cases?


I will for my personal projects and for our internal applications that
will not go public. As for the usual objection:


I think your argument about isolated projects is flawed. It is not at
all unusual for code that was never intended to be public, whose authors
would have sworn that it would never ever need to be read by anyone
except themselves, to surprisingly go public at some point in the future.


raise NoWayItWillGoPublicError

For Public Domain code, I will surely stick with the standard coding
style we have right now. I thought we were all adults here.
I can just imagine what would happen if we all gathered around
a table for a Python dinner: as soon as the PEP 3131 discussion popped up,
we would start throwing food at each other like puckish 5-year-old
boys :-D :-D

Andrea.

Imagination Is The Only Weapon In The War Against Reality.
http://xoomer.virgilio.it/infinity77/
***********************************
* Python Premier League For Posts *
***********************************

POST                                                             | SCORE | AUTHOR                | DATE

01) merits of Lisp vs Python                                     |  832  | Mark Tarver           | December-2006
02) For review: PEP 308 - If-then-else expression                |  728  | Guido van Rossum      | February-2003
03) Python syntax in Lisp and Scheme                             |  665  | mike420 at ziplip.com | October-2003
04) Python from Wise Guy's Viewpoint                             |  495  | mike420 at ziplip.com | October-2003
05) Microsoft Hatred FAQ                                         |  478  | Xah Lee               | October-2005
06) Why is Python popular, while Lisp and Scheme aren't?         |  430  | Oleg                  | November-2002
07) Xah Lee's Unixism                                            |  397  | Pascal Bourguignon    | August-2004
08) PEP 285: Adding a bool type                                  |  361  | Guido van Rossum      | March-2002
09) Jargons of Info Tech industry                                |  350  | Xah Lee               | August-2005
10) PEP 3131: Supporting Non-ASCII Identifiers                   |  326  | "Martin v. Löwis"     | May-2007
11) Xah's Edu Corner: What is Expressiveness in a Computer Langu |  306  | Xah Lee               | March-2006
12

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Martin v. Löwis
PEP 3131 uses a similar definition to C# except that PEP 3131
 disallows formatting characters (category Cf). See section 9.4.2 of
 http://www.ecma-international.org/publications/standards/Ecma-334.htm

UAX#31 discusses formatting characters in 2.2, and recognizes that
there might be good reasons to allow (and ignore) them; however,
it recommends against doing so except in special cases.

So I decided to disallow them.
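
The effect can be checked in Python 3 with a small sketch. SOFT HYPHEN is used here as one example of a category Cf character; the candidate name is hypothetical.

```python
import unicodedata

soft_hyphen = "\u00ad"  # SOFT HYPHEN, a formatting character
candidate = "Löffel" + soft_hyphen + "stiel"

print(unicodedata.category(soft_hyphen))
# The invisible formatting character makes the name invalid.
print(candidate.isidentifier())
print("Löffelstiel".isidentifier())
```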

Regards,
Martin
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Martin v. Löwis
 Now look me in the eye and tell me that you find
 the mix of proper German and English keywords
 beautiful.

I can't admit that, but I find that using German
class and method names is beautiful. The rest around
it (keywords and names from the standard library)
are not English - they are Python.

(look me in the eye and tell me that def is
an English word, or that getattr is one)

Regards,
Martin
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Martin v. Löwis
 A possible modification to the PEP would be to permit identifiers to
 also include \u and \U escape sequences (as some other
 languages already do).

Several languages do that (e.g. C and C++), but I deliberately left
this out, as I cannot see this work in a practical way. Also,
it could be added later as another extension if there is an actual
need.

 I think this would remove several of the objections: such as being
 unable to tell at a glance whether someone is trying to spoof your
 variable names,

If you are willing to run a script on the patch you receive, you
can perform that check even without having support for the \u
syntax in the language - either you convert to the \u notation,
and then check manually (converting back if all is fine), or you
have an automated check (e.g. at commit time) that checks for
conformance to the style guide.
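
Such an automated check could look roughly like the following sketch, which uses the standard tokenize module in Python 3.7 or later. The function name non_ascii_identifiers is hypothetical, and wiring it into a commit hook is left out.

```python
import io
import tokenize


def non_ascii_identifiers(source):
    """Return (line number, name) pairs for identifiers outside ASCII."""
    found = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        # Only NAME tokens are checked; strings and comments are ignored.
        if tok.type == tokenize.NAME and not tok.string.isascii():
            found.append((tok.start[0], tok.string))
    return found


sample = "zähler = 1\nprint(zähler)\ncount = 2  # Zähler\n"
print(non_ascii_identifiers(sample))
```

A project that bans such identifiers could reject a patch whenever the returned list is non-empty.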

 or being unable to do minor maintenance on code using
 character sets which your editor doesn't support: you just run the
 script which would be included with every copy of Python to restrict the
 character set of the source files to whatever character set you feel
 happy with. The script should also be able to convert unrepresentable
 characters in strings and comments (although that last operation
 wouldn't be guaranteed reversible). 

Again, if it's reversible, you don't need support for it in the
language. You convert to your editor's supported Unicode subset,
edit, then convert back.

However, I somewhat doubt that this case ("my editor cannot display
my source code") is likely to occur: if the editor cannot display
it, you likely have a ban on those characters, anyway.

Regards,
Martin
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Bjoern Schliessmann
Martin v. Löwis wrote:
 I can't admit that, but I find that using German
 class and method names is beautiful. The rest around
 it (keywords and names from the standard library)
 are not English - they are Python.
 
 (look me in the eye and tell me that def is
 an English word, or that getattr is one)

He's got a point (a small one though). For example:

- self (can be changed though)
- is 
- with
- isinstance
- try

Regards,


Björn

-- 
BOFH excuse #435:

Internet shut down due to maintenance

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Martin v. Löwis
 Consequently, Python's keywords and even the standard library can
 exist with names being just symbols for many people.

I already said this on the py3k list: until a week ago, I didn't know
why "pass" was chosen for the "no action" statement - with all my
English knowledge, I still could not understand why the opposite
of "fail" should mean "no action".

Still, I have been using pass for more than 10 years now, without
ever questioning what it means in English, and I've successfully
used it as a token. Except for the first draft of Das Python-Buch,
where I, from memory, thought the statement should be skip;
I remembered it had four letters, and meant go to the next line.

Now I understand it is meaning 12 in Merriam-Webster's dictionary,
a) to decline to bid, double, or redouble in a card game, or b)
to let something go by without accepting or taking
advantage of it.

Regards,
Martin
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Martin v. Löwis
 IMO, the burden of proof is on you. If this PEP has the potential to
 introduce another hindrance for code-sharing, the supporters of this PEP
 should be required to provide a damn good reason for doing so. So far,
 you have failed to do that, in my opinion. All you have presented are
 vague notions of rare and isolated use-cases.

The PEP explicitly states what the damn good reason is: Such developers
often desire to define classes and functions with names in their native
languages, rather than having to come up with an (often incorrect)
English translation of the concept they want to name.

So the reason is that with this PEP, code clarity and readability will
become better. It's the same reason as for many other features
introduced into Python recently, e.g. the with statement.

If you doubt the claim, please indicate which of these three aspects
you doubt:
1. there are programmers who desire to define classes and functions
   with names in their native language.
2. those developers find the code clearer and more maintainable than
   if they had to use English names.
3. code clarity and maintainability is important.

Regards,
Martin
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Martin v. Löwis
 You could say the same about Python standard library and keywords then.
 Shouldn't these also have to be translated? One can even push things a
 little further: I don't know about the languages used in the countries
 you mention, but for example, a simple construction like 'if condition
 do something' will look weird to a Japanese (the Japanese language has
 a post-fix feel: the equivalent of the 'if' is put after the
 condition). So why enforce an English-like sentence structure?

The Python syntax does not use an English-like sentence structure.
In English, a statement follows the pretty strict sequence of subject,
predicate, object (SPO). In Python, statements don't have a subject;
some don't even have a verb (e.g. assignments).

Regardless, this PEP does not propose to change the syntax of the
language, because doing so would cause technical problems - unlike
the proposed PEP, which does not cause any technical problems to
the language implementation whatsoever (and only slight technical
problems to editors, which aren't worse than the ones caused by
PEP 263).

 You have a point here. When learning to program, or when programming for
 fun without any intention to do something serious, it may be better to
 have a language supporting native characters in identifiers. My
 problem is: if you allow these, how can you prevent them from going
 public someday?

You can't, and you shouldn't. What you can prevent is that the code
enters *your* project. I cannot see why you want to censor what code
other people publish.

Regards,
Martin
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Hendrik van Rooyen


  Now look me in the eye and tell me that you find
  the mix of proper German and English keywords
  beautiful.
 
 I can't admit that, but I find that using German
 class and method names is beautiful. The rest around
 it (keywords and names from the standard library)
 are not English - they are Python.
 
 (look me in the eye and tell me that def is
 an English word, or that getattr is one)
 
 Regards,
 Martin

LOL - true - but a broken down assembler programmer like me
does not use getattr - and def is short for define, and for and while
and in are not German.

Looks like you have stirred up a hornets nest...

- Hendrik


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Sion Arrowsmith
Hendrik van Rooyen [EMAIL PROTECTED] wrote:
Sion Arrowsmith [EMAIL PROTECTED] wrote:
Hendrik van Rooyen wrote:
I still don't like the thought of the horrible mix of foreign
identifiers and English keywords, coupled with the English 
sentence construction.
How do you think you'd feel if Python had less in the way of
(conventionally used) English keywords/builtins. Like, say, Perl?
Would not like it at all, for the same reason I don't like re's -
It looks like random samples out of alphabet soup to me.

What I meant was, would the use of foreign identifiers look so
horrible to you if the core language had fewer English keywords?
(Perhaps Perl, with its line-noise, was a poor choice of example.
Maybe Lisp would be better, but I'm not so sure of my Lisp as to
make such an assertion for it.)

-- 
\S -- [EMAIL PROTECTED] -- http://www.chaos.org.uk/~sion/
   Frankly I have no feelings towards penguins one way or the other
-- Arthur C. Clarke
   her nu becomeþ se bera eadward ofdun hlæddre heafdes bæce bump bump bump
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Martin v. Löwis
 So, please provide feedback, e.g. perhaps by answering these
 questions:
 - should non-ASCII identifiers be supported? why?
 
 I think the biggest argument against this PEP is how little similar
 features are used in other languages and how poorly they are supported
 by third party utilities.  Your PEP gives very little thought to how
 the change would affect the standard Python library.  Are non-ASCII
 identifiers going to be poorly supported in Python's own library and
 utilities?

For other languages (in particular Java), one challenge is that
you don't know the source encoding - it's neither fixed, nor is
it given in the source code file itself.

Instead, the environment has to provide the source encoding, and that
makes it difficult to use. The JDK javac uses the encoding from the
locale, which is non-sensical if you check-out source from a
repository. Eclipse has solved the problem: you can specify source
encoding on a per-project basis, and it uses that encoding
consistently in the editor and when running the compiler.

For Python, this problem was solved long ago: PEP 263 allows you to
specify the source encoding within the file, and there was
always a default encoding. The default encoding will change to
UTF-8 in Python 3.
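
As a sketch of how a tool can read that declaration, the standard tokenize.detect_encoding function in Python 3 looks for a PEP 263 comment in the first two lines and otherwise falls back to UTF-8. The two sample sources are made up for illustration.

```python
import io
import tokenize

# Hypothetical source with a PEP 263 declaration on the first line.
declared = "# -*- coding: iso-8859-1 -*-\nname = 'Löwis'\n".encode("iso-8859-1")
# Hypothetical source without any declaration.
undeclared = b"name = 'Lowis'\n"

declared_encoding, _ = tokenize.detect_encoding(io.BytesIO(declared).readline)
default_encoding, _ = tokenize.detect_encoding(io.BytesIO(undeclared).readline)

print(declared_encoding)
print(default_encoding)
```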

IDLE has been supporting PEP 263 from the beginning, and several
other editors support it as well. Not sure what other tools
you have in mind, and what problems you expect.

Regards,
Martin
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Martin v. Löwis
René Fleschenberg schrieb:
 Stefan Behnel schrieb:
 Then get tools that match your working environment.
 
 Integration with existing tools *is* something that a PEP should
 consider. This one does not do that sufficiently, IMO.

What specific tools should be discussed, and what specific problems
do you expect?

Regards,
Martin
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Martin v. Löwis
 In the code I was looking at identifiers were allowed to use non-ASCII
 characters.  For whatever reason, the programmers chose not to use non-ASCII
 identifiers even though they had no problem using non-ASCII characters
 in comments.

One possible reason is that the tools processing the program would not
know correctly what encoding the source file is in, and would fail
when they guessed the encoding incorrectly. For comments, that is not
a problem, as an incorrect encoding guess has no impact on the meaning
of the program (if the compiler is able to read over the comment
in the first place).
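
A small sketch of what a wrong guess does to an identifier, assuming a file saved as UTF-8 and read as ISO-8859-1. The name is hypothetical.

```python
name = "zähler"  # hypothetical identifier
utf8_bytes = name.encode("utf-8")

# A tool that wrongly assumes Latin-1 sees two characters where "ä" was.
wrong_guess = utf8_bytes.decode("iso-8859-1")
print(wrong_guess)
print(wrong_guess == name)
```

In a comment the garbled text is only ugly; in an identifier it is a different name, so the program changes meaning or stops compiling.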

Another possible reason is that the programmers were unsure whether
non-ASCII identifiers are allowed.

Regards,
Martin
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Martin v. Löwis
 After 175 replies (and counting), the only thing that is clear is the
 controversy around this PEP. Most people are very strong for or
 against it, with little middle ground in between. I'm not saying that
 every change must meet 100% acceptance, but here there is definitely a
 strong opposition to it. Accepting this PEP would upset lots of people
 as it seems, and it's interesting that quite a few are not even native
 english speakers.

I believe there is a lot of middle ground, but those people don't speak
up. I interviewed about 20 programmers (none of them Python users), and
most took the position I might not use it myself, but it surely
can't hurt having it, and there surely are people who would use it.
2 people were strongly in favor, and 3 were strongly opposed.

Of course, those people wouldn't take a lot of effort to defend their
position in a usenet group. So that the majority of the responses
comes from people with strong feelings either way is no surprise.

Regards,
Martin
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Martin v. Löwis
 However, what I want to see is how people deal with such issues when
 sharing their code: what are their experiences and what measures do
 they mandate to make it all work properly? You can see some
 discussions about various IDEs mandating UTF-8 as the default
 encoding, along with UTF-8 being the required encoding for various
 kinds of special Java configuration files. 

I believe the problem is solved when everybody uses Eclipse.
You can set a default encoding for all Java source files in a project,
and you check the project file into your source repository.
Eclipse both provides the editor and drives the compiler, and
does so in a consistent way.

 Yes, it should reduce confusion at a technical level. But what about
 the tools, the editors, and so on? If every computing environment had
 decent UTF-8 support, wouldn't it be easier to say that everything has
 to be in UTF-8? 

For both Python and Java, it's too much historical baggage already.
When source encodings were introduced to Python, allowing UTF-8
only was already proposed. People rejected it at the time, because
a) they had source files which weren't encoded in UTF-8, and
   were afraid of breaking them, and
b) their editors would not support UTF-8.

So even with Python 3, UTF-8 is *just* the default default encoding.
I would hope that all Python IDEs, over time, learn about this
default; until then, users may have to manually configure their
IDEs and editors. With a default of UTF-8, it's still simpler than
with PEP 263: you can say that .py files are UTF-8, and your
editor will guess incorrectly only if there is an encoding
declaration other than UTF-8.

Regards,
Martin
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Martin v. Löwis
 I claim that this is *completely unrealistic*. When learning Python, you
 *do* learn the actual meanings of English terms like open,
 exception, if and so on if you did not know them before. It would be
 extremely foolish not to do so.

Having taught students for many years now, I can report that this is
most certainly *not* the case. Many people only ever learn the technical
meaning of some term, and never grasp the English meaning. They could
look into a dictionary, but they rather read the documentation.

I've reported this before, but happily do it again: I have lived many
years without knowing what a hub is, and what to pass means if
it's not the opposite of to fail. Yet, I have used their technical
meanings correctly all these years.

Regards,
Martin
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Gregor Horvath
Martin v. Löwis schrieb:

 I've reported this before, but happily do it again: I have lived many
 years without knowing what a hub is, and what to pass means if
 it's not the opposite of to fail. Yet, I have used their technical
 meanings correctly all these years.

That's not only true for computer terms.
In the German Viennese slang there are a lot of Italian, French, 
Hungarian, Czech, Hebrew and Serbo-Croatian words. Nobody knows the exact 
meaning in their original language (nor does the vast majority actually 
speak those languages), but all are used in the correct original context.

Gregor
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread rurpy
On May 16, 8:49 pm, Gregor Horvath [EMAIL PROTECTED] wrote:
 [EMAIL PROTECTED] schrieb:
  2) Create a way to internationalize the standard library (and possibly
  the language keywords, too). Ideally, create a general standardized way
  to internationalize code, possibly similar to how people
  internationalize strings today.
 
  Why?  Or more accurately why before adopting the PEP?
  The library is very usable by non-english speakers as long as
  there is documentation in their native language.  It would be

 Microsoft once translated their VBA to foreign languages.
 I didn't use it because I was used to English code.
 If I program in mixed cultural contexts I have to use the lowest common
 denominator. Mixing the symbols of the programming language is confusing.

Yup, I agree wholeheartedly.  So do almost all
the other people who have responded in this thread.
In public code, open source code, code being worked
on by people from different countries, English is almost
always the best choice.

Nothing in the PEP interferes with or prevents this.
The PEP only allows non-ascii identifiers, when they
are appropriate: in code that is unlikely ever to
be touched by people who don't know that language.
(Obviously any language feature can be misused
but peer-pressure, documentation, and education
have been very effective in preventing such misuse.
There is no reason they shouldn't be effective
here too.)

And yes, some code will be developed in a single
language environment and then be found to be useful
to a wider audience.  It's not the end of the world.
It is no worse than when code written with a single
language UI becomes public -- it will get
fixed so that it meets the standards for an internationally
collaborative project.  Seems to me that replacing
identifiers with English ones is fairly trivial,
isn't it?  One can identify identifiers by parsing
the program and replace them from a prepared table
of replacements.  This seems much easier than fixing
comments and docstrings, which needs to be done by
hand.  But the comment/docstring problem exists now
and has nothing to do with the PEP.

 Long time ago at the age of 12 I learned programming using English
 Computer books. Then there were no German books at all. It was not easy.
 It would have been completely impossible if our school system had not
 been wise enough to teach us English early.

 I think millions of people are handicapped because of this.
 Any step to improve this, is a good step for all of us. In no doubt
 there are a lot of talents wasted because of this wall.

I agree that anyone who wants to be a programmer is
well advised to learn English.  I would also advise
anyone who wants to be a programmer to go to college.
But I have met very good programmers who were not
college graduates, and although I don't know any
non-English speakers, I am sure there are very good
programmers who don't know English.

There is a big difference between encouraging someone
to do something, and taking steps to make them do
something.

A lot of the English-only rhetoric in this thread seems
very reminiscent of arguments a decade or more ago regarding
wide characters and Unicode, and other i18n support:
"Computing is ASCII-based, we don't need all this
crap, and besides, it doubles the memory used by strings!
English is good enough."  Except of course that it wasn't.

When technology demands that people adapt to it, it loses.
When technology adapts to the needs of people, it wins.

The fundamental question is whether language designers,
or the people writing the code, should be the ones to
decide what language identifiers are most appropriate
for their program.  Do language designers, all of whom
are English speakers, have the wisdom to decide for
programmers all over the world, and for years to come,
that they must learn English to use Python effectively?
And if they do, will the people affected agree, or
will they choose a different language?

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Istvan Albert
On May 16, 11:09 pm, Gregor Horvath [EMAIL PROTECTED] wrote:
 [EMAIL PROTECTED] wrote:

  On May 16, 12:54 pm, Gregor Horvath [EMAIL PROTECTED] wrote:
  Istvan Albert wrote:

  So the solution is to forbid Chinese XP ?

Who said anything like that? It's just an example of surprising and
unexpected difficulties that may arise even when doing trivial things,
and that proponents do not seem to want to admit to.

 Should computer programming only be easy accessible to a small fraction
 of privileged individuals who had the luck to be born in the correct
 countries?

 Should the unfounded and maybe xenophilous fear of loosing power and
 control of a small number of those already privileged be a guide for
 development?

Now that right there is your problem. You are reading a lot more into
this than you should. Losing power, xenophilus(?) fear, privileged
individuals,

just step back and think about it for a second, it's a PEP and people
have different opinions, it is very unlikely that there is some
generic sinister agenda that one must be subscribed to

i.





Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Martin v. Löwis
 I'd suggest restricting identifiers under the rules of UTS-39,
 profile 2, Highly Restrictive.  This limits mixing of scripts
 in a single identifier; you can't mix Hebrew and ASCII, for example,
 which prevents problems with mixing right to left and left to right
 scripts.  Domain names have similar restrictions.

That sounds interesting; however, I cannot find the document
you refer to. In TR 39 (also called Unicode Technical Standard #39),
at http://unicode.org/reports/tr39/ there is no mention
of numbered profiles, or of "Highly Restrictive".

Looking at the document, it seems section 3.1, "General Security
Profile for Identifiers", might apply. IIUC, xidmodifications.txt would
have to be taken into account.

I'm not quite sure what that means; apparently, a number of
characters (listed as restricted) should not be used in
identifiers. OTOH, it also adds HYPHEN-MINUS and KATAKANA
MIDDLE DOT - which surely shouldn't apply to Python
identifiers, no? (at least HYPHEN-MINUS already has a meaning
in Python, and cannot possibly be part of an identifier).

Also, mixed-script detection might be considered, but it is
not clear to me how to interpret the algorithm in section
5, plus it says that this is just one of the possible
algorithms.
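
To make the idea of mixed-script detection concrete, here is a rough
Python 3 sketch. It is NOT the algorithm from TR 39: the unicodedata
module does not expose the Unicode Script property, so the sketch
guesses the script from the first word of each letter's character name.
The function names are made up for this illustration.

```python
import unicodedata

def scripts_used(identifier):
    """Crudely guess which scripts the letters of identifier come from."""
    scripts = set()
    for ch in identifier:
        # Only letters are considered; digits, underscores and combining
        # marks are shared between scripts and are ignored here.
        if unicodedata.category(ch).startswith("L"):
            # First word of the name, e.g. LATIN, CYRILLIC, HEBREW.
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts

def is_mixed_script(identifier):
    """True if the letters of identifier appear to span several scripts."""
    return len(scripts_used(identifier)) > 1

plain = is_mixed_script("paypal")        # all Latin letters
spoof = is_mixed_script("p\u0430ypal")   # second letter is Cyrillic
```

A real implementation would need the Script property data from the
Unicode Character Database and the rules for the Common and Inherited
scripts, which this heuristic only approximates.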

Finally, Confusable Detection is difficult to perform on
a single identifier - it seems you need two of them to
find out whether they are confusable.

In any case, I added this as an open issue to the PEP.

Regards,
Martin


Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Istvan Albert
On May 17, 9:07 am, Martin v. Löwis [EMAIL PROTECTED] wrote:

 up. I interviewed about 20 programmers (none of them Python users), and
 most took the position I might not use it myself, but it surely
 can't hurt having it, and there surely are people who would use it.

Typically when you ask people about esoteric features that seemingly
don't affect them but might be useful to someone, the majority will
say yes. It's simply common courtesy; it's not like they have to do
anything.

At the same time it takes some mental effort to analyze and understand
all the implications of a feature, and without taking that effort
something will always beat nothing.

After the first time that your programmer friends need to fix a trivial
bug in a piece of code that does not display correctly in the terminal
I can assure you that their mellow acceptance will turn to something
entirely different.

i.







Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread rurpy
On May 17, 4:56 am, Martin v. Löwis [EMAIL PROTECTED] wrote:
...
 (look me in the eye and tell me that def is
 an English word, or that getattr is one)

That's not quite fair.  They are not English
words, but they are derived from English and
have a mnemonic value to English speakers that
they don't (or only accidentally) have for
non-English speakers.



Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Gregor Horvath
Istvan Albert wrote:
 
 After the first time that your programmer friends need fix a trivial
 bug in a piece of code that does not display correctly in the terminal
 I can assure you that their mellow acceptance will turn to something
 entirely different.
 

Is there any difference for you in debugging these code snippets?

class Türstock(object):
   höhe = 0
   breite = 0
   tiefe = 0

   def _get_fläche(self):
      return self.höhe * self.breite

   fläche = property(_get_fläche)

#---

class Tuerstock(object):
   hoehe = 0
   breite = 0
   tiefe = 0

   def _get_flaeche(self):
      return self.hoehe * self.breite

   flaeche = property(_get_flaeche)


I can tell you that for me and for my costumers this makes a big difference.

Whether this PEP gets accepted or not I am going to use German 
identifiers and you have to be frightened to death by that fact ;-)

Gregor


Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Steve Holden
Istvan Albert wrote:
 On May 17, 9:07 am, Martin v. Löwis [EMAIL PROTECTED] wrote:
 
 up. I interviewed about 20 programmers (none of them Python users), and
 most took the position I might not use it myself, but it surely
 can't hurt having it, and there surely are people who would use it.
 
 Typically when you ask people about esoteric features that seemingly
 don't affect them but might be useful to someone, the majority will
 say yes. Its simply common courtesy, its is not like they have to do
 anything.
 
 At the same time it takes some mental effort to analyze and understand
 all the implications of a feature, and without taking that effort
 something will always beat nothing.
 
Indeed. For example, getattr() and friends now have to accept Unicode 
arguments, and presumably to canonicalize correctly to avoid errors, and 
treat equivalent Unicode and ASCII names as the same (question: if two 
strings compare equal, do they refer to the same name in a namespace?).

 After the first time that your programmer friends need fix a trivial
 bug in a piece of code that does not display correctly in the terminal
 I can assure you that their mellow acceptance will turn to something
 entirely different.
 
And pretty quickly, too.  If anyone but Martin were the author of the 
PEP I'd have serious doubts, but if he thinks it's worth proposing 
there's at least a chance that it will eventually be implemented.

regards
  Steve
-- 
Steve Holden+1 571 484 6266   +1 800 494 3119
Holden Web LLC/Ltd   http://www.holdenweb.com
Skype: holdenweb  http://del.icio.us/steve.holden
-- Asciimercial -
Get on the web: Blog, lens and tag your way to fame!!
holdenweb.blogspot.comsquidoo.com/pythonology
tagged items: del.icio.us/steve.holden/python
All these services currently offer free registration!
-- Thank You for Reading 



Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Steve Holden
Gregor Horvath wrote:
 Istvan Albert wrote:
 After the first time that your programmer friends need fix a trivial
 bug in a piece of code that does not display correctly in the terminal
 I can assure you that their mellow acceptance will turn to something
 entirely different.

 
 Is there any difference for you in debugging these code snippets?
 
 class Türstock(object):
    höhe = 0
    breite = 0
    tiefe = 0
 
    def _get_fläche(self):
       return self.höhe * self.breite
 
    fläche = property(_get_fläche)
 
 #---
 
 class Tuerstock(object):
    hoehe = 0
    breite = 0
    tiefe = 0
 
    def _get_flaeche(self):
       return self.hoehe * self.breite
 
    flaeche = property(_get_flaeche)
 
 
 I can tell you that for me and for my costumers this makes a big difference.
 
So you are selling to the clothing market? [I think you meant 
customers. God knows I have no room to be snitty about other people's 
typos. Just thought it might raise a smile].

 Whether this PEP gets accepted or not I am going to use German 
 identifiers and you have to be frightened to death by that fact ;-)
 
That's fine - they will be at least as meaningful to you as my English 
ones would be to your countrymen who don't speak English.

I think we should remember that while programs are about communication 
there's no requirement for (most of) them to be universally comprehensible.

regards
  Steve


Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Martin v. Löwis
 At the same time it takes some mental effort to analyze and understand
 all the implications of a feature, and without taking that effort
 something will always beat nothing.

 Indeed. For example, getattr() and friends now have to accept Unicode
 arguments, and presumably to canonicalize correctly to avoid errors, and
 treat equivalent Unicode and ASCII names as the same (question: if two
 strings compare equal, do they refer to the same name in a namespace?).

Actually, that is not an issue: In Python 3, there is no data type for
ASCII string anymore, so all __name__ attributes and __dict__ keys
are Unicode strings - regardless of whether this PEP gets accepted
or not (which it just did).
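
A small Python 3 sketch of what this means in practice, and of the
canonicalization question raised above. It assumes the behaviour of
released Python 3 versions, where the compiler converts identifiers to
normal form NFKC while getattr() and friends use the string exactly as
given. The class and variable names are only for illustration.

```python
import unicodedata

class Türstock(object):
    höhe = 2

door = Türstock()

# Attribute names are ordinary str objects, so getattr() accepts them.
height = getattr(door, "höhe")
class_name = Türstock.__name__

# The compiler normalizes identifiers: the ligature U+FB01 in the source
# text below ends up as the two letters "fi" in the namespace key.
namespace = {}
exec("\ufb01le = 1", namespace)
has_plain_key = "file" in namespace

# getattr()/hasattr() do not normalize: a decomposed spelling
# (o followed by COMBINING DIAERESIS) is a different string.
decomposed = "ho\u0308he"
composed = unicodedata.normalize("NFKC", decomposed)
found_decomposed = hasattr(door, decomposed)
found_composed = hasattr(door, composed)
```

So two strings that compare equal always name the same attribute, but
two spellings that merely look the same need not compare equal unless
both have been normalized first.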

Regards,
Martin


Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread [EMAIL PROTECTED]
On May 17, 2:30 pm, Gregor Horvath [EMAIL PROTECTED] wrote:
 Istvan Albert wrote:

  After the first time that your programmer friends need to fix a trivial
  bug in a piece of code that does not display correctly in the terminal
  I can assure you that their mellow acceptance will turn to something
  entirely different.

 Is there any difference for you in debugging these code snippets?

 class Türstock(object):
[snip]
 class Tuerstock(object):

After finding a platform where those are different, I have to say
yes.  Absolutely.  In my normal setup they both display as class
Tuerstock (three letters 'T' 'u' 'e' starting the class name).  If,
say, an exception was raised, it'd be fruitless for me to grep or
search for Tuerstock in the first one, and I might wind up wasting a
fair amount of time if a user emailed that to me before realizing that
the stack trace was just wrong.  Even if I had extended character
support, there's no guarantee that all the users I'm supporting do.
If they do, there's no guarantee that some intervening email system
(or whatever) won't munge things.

With the second one, all my standard tools would work fine.  My users'
setups will work with it.  And there's a much higher chance that all
the intervening systems will work with it.
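
One possible mitigation for the stack-trace concern, sketched in
Python 3: escape everything outside ASCII before a traceback is mailed
or logged, so that the names stay unambiguous and searchable even if
some intervening system cannot handle them. The helper name
ascii_safe_traceback is hypothetical.

```python
import traceback

def ascii_safe_traceback(exc):
    """Format exc as a traceback in which non-ASCII text is escaped."""
    text = "".join(
        traceback.format_exception(type(exc), exc, exc.__traceback__))
    # backslashreplace turns e.g. U+00E4 into the four characters \xe4.
    return text.encode("ascii", "backslashreplace").decode("ascii")

class Türstock(object):
    höhe = 0

    def fläche(self):
        raise ValueError("höhe must be positive")

try:
    Türstock().fläche()
except ValueError as exc:
    report = ascii_safe_traceback(exc)
```

The escaped form is not pretty, but a reader can still search the
source for the exact identifier, which a lossy transliteration such as
"Tuerstock" does not allow.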



Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Neil Hodgson
Martin v. Löwis:

 ... regardless of whether this PEP gets accepted
 or not (which it just did).

Which version can we expect this to be implemented in?

Neil


Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Martin v. Löwis
Neil Hodgson wrote:
 Martin v. Löwis:
 
 ... regardless of whether this PEP gets accepted
 or not (which it just did).
 
Which version can we expect this to be implemented in?

The PEP says 3.0, and the planned implementation also targets
that release.

Regards,
Martin


Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread [EMAIL PROTECTED]
On May 16, 6:38 pm, [EMAIL PROTECTED] wrote:
 On May 16, 11:41 am, [EMAIL PROTECTED] [EMAIL PROTECTED]
 wrote:

  Christophe wrote:
 snip...
   Who displays stack frames? Your code. Whose code includes unicode
   identifiers? Your code. Whose fault is it to create a stack trace
   display procedure that cannot handle unicode? You.

  Thanks but no--I work with a _lot_ of code I didn't write, and looking
  through stack traces from 3rd party packages is not uncommon.

 Are you worried that some 3rd-party package you have
 included in your software will have some non-ascii identifiers
 buried in it somewhere?  Surely that is easy to check for?
 Far easier than checking that it doesn't have some trojan
 code in it, it seems to me.

What do you mean, check for?  If, say, numeric starts using math
characters (as has been suggested), I'm not exactly going to stop
using numeric.  It'll still be a lot better than nothing, just
slightly less better than it used to be.
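
For reference, the kind of check being discussed could look roughly
like this in Python 3. It only reports which identifiers in a piece of
source contain non-ASCII characters; whether to keep using the package
is a separate decision. The function name is made up for this sketch.

```python
import io
import tokenize

def non_ascii_identifiers(source):
    """Return the sorted identifiers in source that are not pure ASCII."""
    names = set()
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        # Only NAME tokens count; comments and string literals may
        # contain non-ASCII text without affecting identifiers.
        if tok.type == tokenize.NAME and any(ord(ch) > 127 for ch in tok.string):
            names.add(tok.string)
    return sorted(names)

sample = (
    "höhe = 1\n"
    "breite = 2  # Kommentar über höhe\n"
    "label = 'größe'\n"
)
found = non_ascii_identifiers(sample)
```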

  And I'm often not creating a stack trace procedure, I'm using the
  built-in python procedure.

  And I'm often dealing with mailing lists, Usenet, etc where I don't
  know ahead of time what the other end's display capabilities are, how
  to fix them if they don't display what I'm trying to send, whether
  intervening systems will mangle things, etc.

 I think we all are in this position.  I always send plain
 text mail to mailing lists, people I don't know etc.  But
 that doesn't mean that email software should be constrained
 to only 7-bit plain text, no attachments!  I frequently use
 such capabilities when they are appropriate.

Sure.  But when you're talking about maintaining code, there's a very
high value to having all the existing tools work with it whether
they're wide-character aware or not.

 If your response is, yes, but look at the problems HTML
 email, virus-infected attachments, etc. cause, the situation
 is not the same.  You have little control over what kind of
 email people send you but you do have control over what
 code, libraries, patches, you choose to use in your
 software.

 If you want to use ascii-only, do it!  Nobody is making
 you deal with non-ascii code if you don't want to.

Yes.  But it's not like this makes things so horribly awful that it's
worth my time to reimplement large external libraries.  I remain at -0
on the proposal; it'll cause some headaches for the majority of
current Python programmers, but it may have some benefits to a
sizeable minority and may help bring in new coders.  And it's not
going to cause flaming catastrophic death or anything.



Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Richard Hanson
On Sun, 13 May 2007 17:44:39 +0200, Martin v. Löwis wrote:

 The syntax of identifiers in Python will be based on the Unicode
 standard annex UAX-31 [1]_, with elaboration and changes as defined
 below.

 Within the ASCII range (U+0001..U+007F), the valid characters for
 identifiers are the same as in Python 2.5.  This specification only
 introduces additional characters from outside the ASCII range.  For
 other characters, the classification uses the version of the Unicode
 Character Database as included in the ``unicodedata`` module.

 The identifier syntax is ``ID_Start ID_Continue*``.

 ``ID_Start`` is defined as all characters having one of the general
 categories uppercase letters (Lu), lowercase letters (Ll), titlecase
 letters (Lt), modifier letters (Lm), other letters (Lo), letter numbers
 (Nl), plus the underscore (XXX what are stability extensions listed in
 UAX 31).

 ``ID_Continue`` is defined as all characters in ``ID_Start``, plus
 nonspacing marks (Mn), spacing combining marks (Mc), decimal number
 (Nd), and connector punctuations (Pc).


 [...]

.. [1] http://www.unicode.org/reports/tr31/

First, to Martin: Thanks for writing this PEP.

While I have been reading both sides of this debate and finding both
sides reasonable and understandable in the main, I have several
questions which seem to not have been raised in this thread so far. 

Currently, in Python 2.5, identifiers are specified as starting with
an upper- or lowercase letter or underscore ('_') with the following
characters of the identifier also optionally being a numerical digit
(0...9).
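
Stated as a pattern, the Python 2.5 rule is a letter or underscore
followed by any number of letters, digits, or underscores, all within
ASCII. A minimal sketch; the names below are illustrative, and reserved
words are not excluded:

```python
import re

# ASCII-only identifier syntax as described for Python 2.5.
ASCII_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*\Z")

def is_ascii_identifier(name):
    """True if name follows the ASCII-only identifier rule."""
    return ASCII_IDENTIFIER.match(name) is not None

ok = is_ascii_identifier("_get_flaeche")
starts_with_digit = is_ascii_identifier("6pack")
non_ascii = is_ascii_identifier("fl\u00e4che")
```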

This current state seems easy to remember even if felt restrictive by
many.

Contrariwise, the referenced document UAX-31 is a bit obscure to me
(which is not eased by the fact that various browsers render non-ASCII
characters differently or not at all depending on the setup and font
sets available). Further, a cursory perusal of the unicodedata module
seems to refer me back to the Unicode docs.

I note that UAX-31 seems to allow ideographs as ``ID_Start``, for
example. From my relative state of ignorance, several questions come
to mind:

1) Will this allow me to use, say, a right-arrow glyph (if I can
find one) to start my identifier? 

2) Could an ``ID_Continue`` be used as an ``ID_Start`` if using a RTL
(reversed or mirrored) identifier? (Probably not, but I don't know.)

3) Is or will there be a definitive and exhaustive listing (with
bitmap representations of the glyphs to avoid the font issues) of the
glyphs that the PEP 3131 would allow in identifiers? (Does this
question even make sense?)

I have long programmed in RPL and have appreciated being able to use,
say, a right arrow symbol to start a name of a function (e.g., -R
or -HMS where the '-' is a single, right-arrow glyph).[1]

While it is not clear that identifiers I may wish to use would still
be prohibited under PEP 3131, I vote:

 +0

__
[1] RPL (HP's Dr. William Wickes' language and environment circa the
1980s) allows for a few specific non-ASCII glyphs as the start of a
name. I have solved my problem with my Python appliance computer
project by having up to three representations for my names: Python 2.x
acceptable names as the actual Python identifier, a Unicode text
display exposed to the end user, and also if needed, a bitmap display
exposed to the end user. So -- IAGNI. :-)

-- 
Richard Hanson



Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Steve Holden
Martin v. Löwis wrote:
 Neil Hodgson wrote:
 Martin v. Löwis:

 ... regardless of whether this PEP gets accepted
 or not (which it just did).
Which version can we expect this to be implemented in?
 
 The PEP says 3.0, and the planned implementation also targets
 that release.
 
Can we take it this change *won't* be backported to the 2.X series?

regards
  Steve


Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Ross Ridge
"Martin v. Löwis" [EMAIL PROTECTED] wrote:
One possible reason is that the tools processing the program would not
know correctly what encoding the source file is in, and would fail
when they guessed the encoding incorrectly. For comments, that is not
a problem, as an incorrect encoding guess has no impact on the meaning
of the program (if the compiler is able to read over the comment
in the first place).

Possibly.  One Java program I remember had Japanese comments encoded
in Shift-JIS.  Will Python be better here?  Will it support the source
code encodings that programmers around the world expect?

Another possible reason is that the programmers were unsure whether
non-ASCII identifiers are allowed.

If that's the case, I'm not sure how you can improve on that in Python.

There are lots of possible reasons why all these programmers around
the world who want to use non-ASCII identifiers end up not using them.
One is simply that very few people ever really want to do so.  However,
if you're to assume that they do, then you should look at the existing
practice in other languages to find out what they did right and what
they did wrong.  You don't have to speculate.

Ross Ridge

-- 
 l/  //   Ross Ridge -- The Great HTMU
[oo][oo]  [EMAIL PROTECTED]
-()-/()/  http://www.csclub.uwaterloo.ca/~rridge/ 
 db  //   


Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Gregor Horvath
[EMAIL PROTECTED] wrote:

 With the second one, all my standard tools would work fine.  My user's
 setups will work with it.  And there's a much higher chance that all
 the intervening systems will work with it.
 

Please fix your setup.
This is the 21st Century. Unicode is the default in Python 3000.
Wake up before it is too late for you.

Gregor


Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Martin v. Löwis
 Currently, in Python 2.5, identifiers are specified as starting with
 an upper- or lowercase letter or underscore ('_') with the following
 characters of the identifier also optionally being a numerical digit
 (0...9).
 
 This current state seems easy to remember even if felt restrictive by
 many.
 
 Contrawise, the referenced document UAX-31 is a bit obscure to me

It's actually very easy. The basic principle will stay: the first
character must be a letter or an underscore, followed by letters,
underscores, and digits.

The question really is: what is a letter? What is an underscore?
What is a digit?

 1) Will this allow me to use, say, a right-arrow glyph (if I can
 find one) to start my identifier? 

No. A right-arrow (such as U+2192, RIGHTWARDS ARROW) is a symbol
(general category Sm: Symbol, Math). See

http://unicode.org/Public/UNIDATA/UCD.html

for a list of general category values, and

http://unicode.org/Public/UNIDATA/UnicodeData.txt

for a textual description of all characters.

Now, there is a special case in that Unicode supports combining
modifier characters, i.e. characters that are not characters
themselves, but modify previous characters, to add diacritical
marks to letters. Unicode has great flexibility in applying these,
to form characters that are not supported themselves. Among those,
there is U+20D7, COMBINING RIGHT ARROW ABOVE, which is of general
category Mn, Mark, Nonspacing.

In PEP 3131, such marks may not appear as the first character
(since they need to modify a base character), but may appear as
subsequent characters. This allows you to form identifiers such as
v⃗ (which should render as a small letter v, with a vector
arrow on top).
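
These two cases can be checked interactively. The sketch below assumes
a Python 3 interpreter, where str.isidentifier() applies the identifier
rules as finally implemented; the constant names are illustrative.

```python
import unicodedata

RIGHT_ARROW = "\u2192"      # RIGHTWARDS ARROW
COMBINING_ARROW = "\u20d7"  # COMBINING RIGHT ARROW ABOVE

arrow_category = unicodedata.category(RIGHT_ARROW)       # a symbol
mark_category = unicodedata.category(COMBINING_ARROW)    # a nonspacing mark

# A symbol cannot start (or continue) an identifier.
arrow_name_ok = (RIGHT_ARROW + "R").isidentifier()

# A combining mark may follow a base letter but may not come first.
vector_v_ok = ("v" + COMBINING_ARROW).isidentifier()
mark_first_ok = (COMBINING_ARROW + "v").isidentifier()
```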

 2) Could an ``ID_Continue`` be used as an ``ID_Start`` if using a RTL
 (reversed or mirrored) identifier? (Probably not, but I don't know.)

Unicode, and this PEP, always uses logical order, not rendering order.
What matters is in what order the characters appear in the source code
string.

RTL languages do pose a challenge, in particular since bidirectional
algorithms apparently aren't implemented correctly in many editors.

 3) Is or will there be a definitive and exhaustive listing (with
 bitmap representations of the glyphs to avoid the font issues) of the
 glyphs that the PEP 3131 would allow in identifiers? (Does this
 question even make sense?)

It makes sense, but it is difficult to implement. The PEP already
links to a non-normative list that is exhaustive for Unicode 4.1.
Future Unicode versions may add additional characters, so
a list that is exhaustive now might not be in the future. The
Unicode consortium promises stability, meaning that what is an
identifier now won't be reclassified as a non-identifier in the
future, but the reverse is not true, as new code points get
assigned.

As for the list I generated in HTML: It might be possible to
make it include bitmaps instead of HTML character references,
but doing so is a licensing problem, as you need a license
for a font that has all these characters. If you want to
look up a specific character, I recommend going to the Unicode
code charts, at

http://www.unicode.org/charts/

Notice that an HTML page that includes individual bitmaps
for all characters would take *ages* to load.

Regards,
Martin

P.S. Anybody who wants to play with generating visualisations
of the PEP, here are the functions I used:

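import unicodedata

def isnorm(c):
    # True if the character is unchanged by NFC normalization.
    return unicodedata.normalize("NFC", c) == c

def start(c):
    # Can c start an identifier?
    if not isnorm(c):
        return False
    # General categories that make up ID_Start in the PEP.
    if unicodedata.category(c) in ('Lu', 'Ll', 'Lt', 'Lm', 'Lo', 'Nl'):
        return True
    if c == u'_':
        return True
    # Extra start characters kept for stability (Other_ID_Start).
    if c in u"\u2118\u212E\u309B\u309C":
        return True
    return False

def cont_only(c):
    # Can c appear in an identifier, but not as its first character?
    if not isnorm(c):
        return False
    if unicodedata.category(c) in ('Mn', 'Mc', 'Nd', 'Pc'):
        return True
    # Ethiopic digits one to nine (Other_ID_Continue).
    if 0x1369 <= ord(c) <= 0x1371:
        return True
    return False

def cont(c):
    return start(c) or cont_only(c)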

The isnorm() aspect excludes characters from the list which
change under NFC. This excludes a few compatibility characters
which are allowed in source code, but become indistinguishable
from their canonical form semantically.
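
As a concrete instance of a character that changes under NFC, consider
ANGSTROM SIGN (U+212B), which normalizes to LATIN CAPITAL LETTER A WITH
RING ABOVE (U+00C5). A small Python 3 sketch, with a helper name chosen
only for this illustration:

```python
import unicodedata

def changes_under_nfc(ch):
    """True if NFC normalization maps ch to something else."""
    return unicodedata.normalize("NFC", ch) != ch

ANGSTROM_SIGN = "\u212b"
A_WITH_RING = "\u00c5"

angstrom_changes = changes_under_nfc(ANGSTROM_SIGN)
ring_changes = changes_under_nfc(A_WITH_RING)
normalized = unicodedata.normalize("NFC", ANGSTROM_SIGN)
```

Both characters usually render identically, which is why only the
canonical one is kept in the generated list.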

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-17 Thread Martin v. Löwis
 Possibly.  One Java program I remember had Japanese comments encoded
 in Shift-JIS.  Will Python be better here?  Will it support the source
 code encodings that programmers around the world expect?

It's not a question of will it. It does today, starting from Python 2.3.
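
For illustration, here is how a Shift-JIS encoded source with a PEP 263
style encoding declaration can be handled. The sketch is written for
Python 3 (the thread itself refers to the Python 2.3 implementation)
and assumes a build that includes the Japanese codecs; the sample
source text is invented.

```python
import io
import tokenize

# A tiny source file with a coding declaration on its first line,
# stored as Shift-JIS bytes.
source_bytes = (
    "# -*- coding: shift_jis -*-\n"
    "# 日本語のコメント\n"
    "greeting = 'こんにちは'\n"
).encode("shift_jis")

# tokenize.detect_encoding() reads the declaration the way the compiler does.
encoding, _ = tokenize.detect_encoding(io.BytesIO(source_bytes).readline)

# compile() accepts the raw bytes and honours the declaration.
namespace = {}
exec(compile(source_bytes, "<shift_jis demo>", "exec"), namespace)
greeting = namespace["greeting"]
```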

 Another possible reason is that the programmers were unsure whether
 non-ASCII identifiers are allowed.
 
 If that's the case, I'm not sure how you can improve on that in Python.

It will change on its own over time. "Not allowed" could mean "not
permitted by policy". Indeed, the PEP explicitly mandates a policy
that bans non-ASCII characters from source (whether in identifiers or
comments) for Python itself, and encourages other projects to define
similar policies. What projects pick up such a policy, or pick a
different policy (e.g. all comments must be in Korean) remains to
be seen.

Then, programmers will not be sure whether the language and the tools
allow it. For Python, it will be supported from 3.0, so people will
be worried initially whether their code needs to run on older Python
versions. When Python 3.5 comes along, people hopefully have lost
interest in supporting 2.x, so they will start using 3.x features,
including this one.

Now, it may be tempting to say "OK, so let's wait until 3.5, if people
won't use it before anyway". That is trick logic: if we add it only
to 3.5, people won't be using it before 4.0. *Any* new feature
takes several years to get into wide acceptance, but years pass
surprisingly fast.

 There are lots of possible reasons why all these programmers around
 the world who want to use non-ASCII identifiers end-up not using them.
 One is simply that very few people ever really want to do so.  However,
 if you're to assume that they do, then you should look at the existing
 practice in other languages to find out what they did right and what
 they did wrong.  You don't have to speculate.

That's indeed how this PEP came about. There were early adopters, like
Java, then experience gained from it (resulting in PEP 263, implemented
in Python 2.3 on the Python side, and resulting in UAX#39 on the Unicode
consortium side), and that experience now flows into PEP 3131.

If you think I speculated in reasoning why people did not use the
feature in Java: sorry for expressing myself unclearly. I know for
a fact that the reasons I suggested were actual reasons given by
actual people. I'm just not sure whether this was an exhaustive
list (because I did not interview every programmer in the world),
and what statistical relevance each of these reasons had (because
I did not conduct scientific research to gain statistically
relevant data on usage of non-ASCII identifiers in different
regions of the world).

Regards,
Martin


Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-16 Thread Pierre Hanser
[EMAIL PROTECTED] a écrit :
 On May 15, 3:28 pm, René Fleschenberg [EMAIL PROTECTED] wrote:
 We all know what the PEP is about (we can read). The point is: If we do
 not *need* non-English/ASCII identifiers, we do not need the PEP. If the
 PEP does not solve an actual *problem* and still introduces some
 potential for *new* problems, it should be rejected. So far, the
 problem seems to just not exist. The burden of proof is on those who
 support the PEP.


It *does* solve a huge problem: I have to use degenerate French, with
spelling mistakes, or pick from a small subset of words, to use
only ASCII. I'm limited in my expression, and I resent this
every day!

This is true even if commercial French programmers don't object
to the PEP because they have to use English in their own work. This
is something I really cannot understand.

It's an everyday problem, for millions of people!

And yes, sometimes I publish code (rarely), even if it uses French
identifiers, because someone looking for a real solution *does*
prefer an existing solution to nothing.


-- 
Pierre


Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-16 Thread René Fleschenberg
Steven D'Aprano schrieb:
 But they aren't new risks and problems, that's the point. So far, every 
 single objection raised ALREADY EXISTS in some form or another. 

No. The problem "The traceback shows function names having characters
that do not display on most systems' screens", for example, does not exist
today, to the best of my knowledge. And "in some form or another"
basically means that the PEP would create more possibilities for things
to go wrong. That things can already go wrong today does not mean that
it does not matter if we create more occasions where things can go wrong
even worse.

 There's 
 all this hysteria about the problems the proposed change will cause, but 
 those problems already exist. When was the last time a Black Hat tried to 
 smuggle in bad code by changing an identifier from xyz0 to xyzO?

Agreed, I don't think intended malicious use of the proposed feature
would be a big problem.

 I think it is not. I think that the problem only really applies to very
 isolated use-cases. 
 
 Like the 5.5 billion people who speak no English.

No. The X people who speak no English and program in Python. I think X
actually is very low (close to zero), because programming in Python
virtually does require you to know some English, whether you can use
non-ASCII characters in identifiers or not. It is naive to believe that
you can program in Python without understanding any English once you can
use your native characters in identifiers. That will not happen. Please
understand that: You basically *must* know some English to program in
Python, and the reason for that is not that you cannot use non-ASCII
identifiers.

I admit that there may be occasions where you have domain-specific terms
that are hard to translate into English for a programmer. But is it
really not feasible to use an ASCII transliteration in these cases? This
does not seem to have been such a big problem so far, or else we would
have seen more discussions about it, I think.

 So isolated that they do not justify a change to
 mainline Python. If someone thinks that non-ASCII identifiers are really
 needed, he could maintain a special Python branch that supports them. I
 doubt that there would be a lot of demand for it.
 
 Maybe so. But I guarantee without a shadow of a doubt that if the change 
 were introduced, people would use it -- even if right now they say they 
 don't want it.

Well, that is exactly what I would like to avoid ;-)

-- 
René

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-16 Thread René Fleschenberg
Steven D'Aprano schrieb:
 Any program that uses non-English identifiers in Python is bound to
 become gibberish, since it *will* be cluttered with English identifiers
 all over the place anyway, whether you like it or not.
 
 It won't be gibberish to the people who speak the language.

Hmmm, did you read my posting? In my experience, it will. I wonder: is
English an acquired language for you?

-- 
René

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-16 Thread René Fleschenberg
Gregor Horvath schrieb:
 If comments are allowed to be non-English, then why are identifiers not?

I don't need to be able to type in the exact characters of a comment in
order to properly change the code, and if a comment does not display on
my screen correctly, I am not fscked as badly as when an identifier
does not display (e.g. in a traceback).

-- 
René

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-16 Thread Gregor Horvath
René Fleschenberg schrieb:

 today, to the best of my knowledge. And "in some form or another"
 basically means that the PEP would create more possibilities for things
 to go wrong. That things can already go wrong today does not mean that
 it does not matter if we create more occasions where things can go wrong
 even worse.

Following this logic we should not add any new features at all, because 
all of them can go wrong and can be used the wrong way.

I love Python because it does not dictate how to do things.
I do not need an ASCII dictator; I can judge for myself when to use this 
feature and when to avoid it, like any other feature.

Gregor

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-16 Thread René Fleschenberg
[EMAIL PROTECTED] schrieb:
 I'm not sure how you conclude that no problem exists.
 - Meaningful identifiers are critical in creating good code.

I agree.

 - Non-english speakers can not create or understand
   english identifiers hence can't create good code nor
   easily grok existing code.

I agree that this is a problem, but please understand that this problem is
_not_ solved by allowing non-ASCII identifiers!

 Considering the vastly greater number of non-English
 speakers in the world, who are thus unable to use
 Python effectively, this seems like a problem to me.

Yes, but this problem is not really addressed by the PEP. If you want to
do something about this:
1) Translate documentation.
2) Create a way to internationalize the standard library (and possibly
the language keywords, too). Ideally, create a general standardized way
to internationalize code, possibly similar to how people
internationalize strings today.

When that is done, non-ASCII identifiers could become useful. But of
course, doing that might create a host of other problems.

 That all programmers know enough English to create and
 understand English identifiers is currently speculation or
 based on tiny, personally observed samples.

It is based on a look at the current Python environment. You do *at
least* have the problem that the standard library uses English names.
This assumes that there is documentation in the native language that is
good enough (i.e. almost as good as the official one), which I can tell
is not the case for German.

-- 
René

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-16 Thread Raffaele Salmaso
After reading the whole thread, and based on my experience (I'm Italian; 
English is not my native language):

Martin v. Löwis wrote:

 - should non-ASCII identifiers be supported?
yes
 - why?
Years ago I read C code written by a Turkish guy, and all the identifiers 
were transliterations of Arabic (Persian? I don't know) words.
What did I understand of this code? Nothing. 0 (zero ;) ). Not a word.
Would it have been different if it had used Unicode identifiers? Not at all.

 - would you use them if it was possible to do so?
yes

-- 
()_() | NN KAPISCO XK' CELLHAVETE T'ANNTO CN ME SL | +
(o.o) | XK' SKRIVO 1 P'HO VELLOCE MA HALL'ORA DITTELO  | +---+
'm m' | KE SIETE VOI K CI HAVVETE PROBBLEMI NO PENSATECI   |  O  |
(___) | HE SENZA RANKORI CIA   |
raffaele punto salmaso at gmail punto com

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-16 Thread René Fleschenberg
Gregor Horvath schrieb:
 René Fleschenberg schrieb:
 
 today, to the best of my knowledge. And "in some form or another"
 basically means that the PEP would create more possibilities for things
 to go wrong. That things can already go wrong today does not mean that
 it does not matter if we create more occasions where things can go wrong
 even worse.
 
 Following this logic we should not add any new features at all, because
 all of them can go wrong and can be used the wrong way.

No, that does not follow from my logic. What I say is: When thinking
about whether to add a new feature, the potential benefits should be
weighed against the potential problems. I see some potential problems
with this PEP and very few potential benefits.

 I love Python because it does not dictate how to do things.
 I do not need an ASCII dictator; I can judge for myself when to use this
 feature and when to avoid it, like any other feature.

*That* logic can be used to justify the introduction of *any* feature.

-- 
René

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-16 Thread Eric Brunel
On Tue, 15 May 2007 17:35:11 +0200, Stefan Behnel  
[EMAIL PROTECTED] wrote:
 Eric Brunel wrote:
 On Tue, 15 May 2007 15:57:32 +0200, Stefan Behnel
 In-house developers are rather for this PEP as they see the advantage  
 of
 expressing concepts in the way the non-techies talk about it.

 No: I *am* an in-house developer. The argument is not
 public/open-source against private/industrial. As I said in some of my
 earlier posts, any code can pass through many people in its life, people
 not having the same language. I dare to say that starting a project
 today in any other language than english is almost irresponsible: the
 chances that it will get at least read by people not talking the same
 language as the original coders are very close to 100%, even if it
 always stays private.

 Ok, so I'm an Open-Source guy who happens to work in-house. And I'm a
 supporter of PEP 3131. I admit that I was simplifying in my round-up. :)

 But I would say that irresponsible is a pretty self-centered word in  
 this
 context. Can't you imagine that those who take the irresponsible  
 decisions
 of working on (and starting) projects in another language than English  
 are
 maybe as responsible as you are when you take the decision of starting a
 project in English, but in a different context? It all depends on the  
 specific
 constraints of the project, i.e. environment, developer skills, domain,  
 ...

 The more complex an application domain, the more important is clear and
 correct domain terminology. And software developers just don't have  
 that. They
 know their own domain (software development with all those concepts,  
 languages
 and keywords), but there is a reason why they develop software for those  
 who
 know the complex professional domain in detail but do not know how to  
 develop
 software. And it's a good idea to name things in a way that is  
 consistent with
 those who know the professional domain.

 That's why keywords are taken from the domain of software development and
 identifiers are taken (mostly) from the application domain. And that's  
 why I
 support PEP 3131.

You keep eluding the question: even if the decisions made at the project  
start seem quite sensible *at that time*, if the project ends up  
maintained in Korea, you *will have* to translate all your identifiers to  
something displayable, understandable and typable by (almost) anyone,  
a.k.a ASCII-English... Since - as I already said - I'm quite convinced  
that any application bigger than the average quick-n-dirty throwaway  
script is highly likely to end up in a different country than its original  
coders', you'll end up losing the time you appeared to have gained in the  
beginning. That's what I called irresponsible (even if I admit that the  
word was a bit strong...).

Anyway, concerning the PEP, I've finally "put some water in my wine", as we  
say in French, and I'm not so strongly against it now... Not for the  
reasons you give (so we can continue our flame war on this ;-) ), but  
mainly considering Python's usage in a learning context: this is a valid  
reason why non-ASCII identifiers should be supported. I just hope I'll get  
a '--ascii-only' switch on my Python interpreter (or any other means to  
forbid non-ASCII identifiers and/or strings and/or comments).
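
For what it's worth, a check like this could also live outside the interpreter. The following is only a sketch of one possible "other means", built on the standard tokenize module; the function name non_ascii_identifiers and the sample source are invented for illustration, and no such interpreter switch is implied to exist.

```python
import io
import tokenize

def non_ascii_identifiers(source: str):
    """Return (line number, name) for every NAME token with non-ASCII characters."""
    found = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME and not tok.string.isascii():
            found.append((tok.start[0], tok.string))
    return found

# Hypothetical source text to scan.
sample = "größe = 3\nsize = größe + 1\n"
print(non_ascii_identifiers(sample))
```

A project with an ASCII-only policy could run something like this over its files before accepting a change; strings and comments would need separate handling.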
-- 
python -c print ''.join([chr(154 - ord(c)) for c in  
'U(17zX(%,5.zmz5(17l8(%,5.Z*(93-965$l7+-'])


Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-16 Thread Stefan Behnel
René Fleschenberg wrote:
 [EMAIL PROTECTED] schrieb:
 I'm not sure how you conclude that no problem exists.
 - Meaningful identifiers are critical in creating good code.
 
 I agree.
 
 - Non-english speakers can not create or understand
   english identifiers hence can't create good code nor
   easily grok existing code.
 
 I agree that this is a problem, but please understand that this problem is
 _not_ solved by allowing non-ASCII identifiers!

Well, as I said before, there are three major differences between the stdlib
and keywords on one hand and identifiers on the other hand. Ignoring arguments
does not make them any less true.

So, the problem is partly tackled by the people who face it by writing
degenerated transliterations and language mix in identifiers, but it would be
*solved* by means of the language if Unicode identifiers were available.

Stefan

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-16 Thread Marc 'BlackJack' Rintsch
In [EMAIL PROTECTED], Stefan Behnel wrote:

 René Fleschenberg wrote:
 We all know what the PEP is about (we can read). The point is: If we do
 not *need* non-English/ASCII identifiers, we do not need the PEP. If the
 PEP does not solve an actual *problem* and still introduces some
 potential for *new* problems, it should be rejected. So far, the
 problem seems to just not exist. The burden of proof is on those who
 support the PEP.
 
 The main problem here seems to be proving the need of something to people who
 do not need it themselves. So, if a simple "but I need it because a, b, c" is
 not enough, what good is any further proof?

Maybe all the (potential) programmers that can't understand english and
would benefit from the ability to use non-ASCII characters in identifiers
could step up and take part in this debate.  In an english speaking
newsgroup…  =:o)

There are potential users of Python who don't know much english or no
english at all.  This includes kids, old people, people from countries
that have letters that are not that easy to transliterate like european
languages, people who just want to learn Python for fun or to customize
their applications like office suites or GIS software with a Python
scripting option.

Some people here seem to think the user base is or should be only from the
computer science domain.  Yes, if you are a programming professional it
may be mandatory to be able to write english identifiers, comments and
documentation, but there are not just programming professionals out there.

Ciao,
Marc 'BlackJack' Rintsch

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-16 Thread Stefan Behnel
René Fleschenberg wrote:
 Gregor Horvath schrieb:
 If comments are allowed to be non-English, then why are identifiers not?
 
 I don't need to be able to type in the exact characters of a comment in
 order to properly change the code, and if a comment does not display on
 my screen correctly, I am not fscked as badly as when an identifier
 does not display (e.g. in a traceback).

Then get tools that match your working environment.

Stefan

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-16 Thread gatti
Martin v. Lowis wrote:
 Lorenzo Gatti wrote:
 Not providing an explicit listing of allowed characters is inexcusable
 sloppiness.

 That is a deliberate part of the specification. It is intentional that
 it does *not* specify a precise list, but instead defers that list
 to the version of the Unicode standard used (in the unicodedata
 module).

Ok, maybe you considered listing characters but you earnestly decided
to follow an authority; but this reliance on the Unicode standard is
not a merit: it defers to an external entity (UAX 31 and the Unicode
database) a foundation of Python syntax.
The obvious purpose of Unicode Annex 31 is defining a framework for
parsing the identifiers of arbitrary programming languages; it's only,
in its own words, "specifications for recommended defaults for the use
of Unicode in the definitions of identifiers and in pattern-based
syntax". It suggests an orderly way to add tens of thousands of exotic
characters to programming language grammars, but it doesn't prove it
would be wise to do so.
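
To make the "defers to the Unicode database" point concrete, here is a small sketch assuming a Python 3 interpreter, where str.isidentifier() answers according to whatever Unicode version the interpreter's unicodedata module ships with. The candidate names are invented examples.

```python
import unicodedata

# The answer below depends on this database version, not on a list in the grammar.
print("Unicode database version:", unicodedata.unidata_version)

# "\uff16pack" starts with FULLWIDTH DIGIT SIX (category Nd).
candidates = ["größe", "π", "\uff16pack", "a b"]
results = {name: name.isidentifier() for name in candidates}
for name, ok in results.items():
    print(repr(name), ok)
```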

You seem to like Unicode Annex 31, but keep in mind that:
- it has very limited resources (only the Unicode standard, i.e. lists
and properties of characters, and not sensible programming language
design, software design, etc.)
- it is culturally biased in favour of supporting as much of the
Unicode character set as possible, disregarding the practical
consequences and assuming without discussion that programming language
designers want to do so
- it is also culturally biased towards the typical Unicode patterns of
providing well explained general algorithms, ensuring forward
compatibility, and relying on existing Unicode standards (in this
case, character types) rather than introducing new data (but the
character list of Table 3 is unavoidable); the net result is caring
even less for actual usage.

 The XML standard is an example of how listings of large parts of the
 Unicode character set can be provided clearly, exactly and (almost)
 concisely.

 And, indeed, this is now recognized as one of the bigger mistakes
 of the XML recommendation: they provide an explicit list, and fail
 to consider characters that are unassigned. In XML 1.1, they try
 to address this issue, by now allowing unassigned characters in
 XML names even though it's not certain yet what those characters
 mean (until they are assigned).

XML 1.1 is, for practical purposes, not used except by mistake. I
challenge you to show me XML languages or documents of some importance
that need XML 1.1 because they use non-ASCII names.
XML 1.1 is supported by many tools and standards because of buzzword
compliance, enthusiastic obedience to the W3C and low cost of
implementation, but this doesn't mean that its features are an
improvement over XML 1.0.

 ``ID_Continue`` is defined as all characters in ``ID_Start``, plus
 nonspacing marks (Mn), spacing combining marks (Mc), decimal number
 (Nd), and connector punctuations (Pc).

 Am I the first to notice how unsuitable these characters are?

 Probably. Nobody in the Unicode consortium noticed, but what
 do they know about suitability of Unicode characters...

Don't be silly. These characters are suitable for writing text, not
for use in identifiers; the fact that UAX 31 allows them merely proves
how disconnected from actual programming language needs that document
is.
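
For readers who want to see what the four extra categories contain, here is a minimal sketch using unicodedata.category(). The sample characters are picked by hand as one example per category; the set name CONTINUE_ONLY is made up for illustration.

```python
import unicodedata

# Categories the PEP text adds on top of ID_Start to form ID_Continue.
CONTINUE_ONLY = {"Mn", "Mc", "Nd", "Pc"}

samples = [
    "\u0301",  # COMBINING ACUTE ACCENT
    "\u0903",  # DEVANAGARI SIGN VISARGA
    "\uff16",  # FULLWIDTH DIGIT SIX
    "_",       # LOW LINE
    "a",       # plain letter, for contrast
]

categories = [unicodedata.category(ch) for ch in samples]
for ch, cat in zip(samples, categories):
    print(f"U+{ord(ch):04X}", cat, cat in CONTINUE_ONLY)
```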

In typical word processing, what characters are used is the editor's
problem and the only thing that matters is the correctness of the
printed result; program code is much more demanding, as it needs to do
more (exact comparisons, easy reading...) with less (straightforward
keyboard inputs and monospaced fonts instead of complex input systems
and WYSIWYG graphical text). The only way to work with program text
successfully is limiting its complexity.
Hard-to-input characters, hard-to-see characters, ambiguities and
uncertainty in the sequence of characters, sets of hard-to-distinguish
glyphs and similar problems are unacceptable.

It seems I'm not the first to notice a lot of Unicode characters that
are unsuitable for identifiers. Appendix I of the XML 1.1 standard
recommends avoiding variation selectors, interlinear annotations (I
missed them...), various decomposable characters, and "names which are
nonsensical, unpronounceable, hard to read, or easily confusable with
other names".
The whole appendix I is a clear admission of self-defeat, probably the
result of committee compromises.  Do you think you could do better?

Regards,
Lorenzo Gatti





Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-16 Thread Christophe
[EMAIL PROTECTED] a écrit :
 Steven D'Aprano wrote:
 I would find it useful to be able to use non-ASCII characters for heavily
 mathematical programs. There would be a closer correspondence between the
 code and the mathematical equations if one could write D(u*p) instead of
 delta(mu*pi).
 
 Just as one risk here:
 When reading the above on Google groups, it showed up as "if one could
 write ?(u*p)...".
 When quoting it for response, it showed up as "could write D(u*p)".
 
 I'm sure that the symbol you used was neither a capital letter d nor a
 question mark.
 
 Using identifiers that are so prone to corruption when posting in a
 rather popular forum seems dangerous to me--and I'd guess that a lot
 of source code highlighters, email lists, etc have similar problems.
 I'd even be surprised if some programming tools didn't have similar
 problems.

So, it was google groups that continuously corrupted the good UTF-8 
posts by force converting them to ISO-8859-1?

Of course, there's also the possibility that it is a problem on *your* 
side, so, to be fair, I launched Google Groups and looked for this 
thread. And of course the result was that Steven's post displayed 
perfectly. I didn't try to reply to it, of course; no need to clutter 
that thread any more than it already is.
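
A minimal sketch of how text like this gets damaged when it is forced through ISO-8859-1. It shows two common failure paths and is not a claim about what any particular web interface actually did; the code points chosen for the sample string are an assumption.

```python
# GREEK CAPITAL LETTER DELTA, MICRO SIGN, GREEK SMALL LETTER PI
original = "\u0394(\u00b5*\u03c0)"

# Lossy path: characters that do not exist in ISO-8859-1 become "?".
lossy = original.encode("iso-8859-1", errors="replace").decode("iso-8859-1")

# Mojibake path: correct UTF-8 bytes misread as ISO-8859-1.
mojibake = original.encode("utf-8").decode("iso-8859-1")

print(lossy)
print(len(original), len(mojibake))
```

The micro sign survives the lossy path because it is part of ISO-8859-1; the two Greek letters do not.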

-- 
Δ(µ*π)

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-16 Thread Eric Brunel
On Tue, 15 May 2007 21:07:30 +0200, Pierre Hanser  
[EMAIL PROTECTED] wrote:
 hello

 i work for a large phone maker, and for a long time
 we thought, very arrogantly, our phones would be ok
 for the whole world.

 After all, using a phone takes so few words, and
 some of them were even replaced with pictograms!
 Everybody should be able to understand appel, bis,
 renvoi, mévo, ...

 Nowadays we make Chinese-, Korean- and Japanese-speaking
 phones.

 because we can do it, because graphics are cheaper
 than they were, because it augments our market.
 (also because some markets require it)

 see the analogy?

Absolutely not: you're talking about internationalization of the  
user-interface here, not about the code. There are quite simple ways to  
ensure users will see the displays in their own language, even if the  
source code is the same for everyone. But your source code will not  
automagically translate itself to the language of the guy who'll have to  
maintain it or make it evolve. So the analogy actually seems to work  
backwards: if you want any coder to be able to read/understand/edit your  
code, just don't write it in your own language...
-- 
python -c print ''.join([chr(154 - ord(c)) for c in  
'U(17zX(%,5.zmz5(17l8(%,5.Z*(93-965$l7+-'])

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-16 Thread René Fleschenberg
Stefan Behnel schrieb:
 Then get tools that match your working environment.

Integration with existing tools *is* something that a PEP should
consider. This one does not do that sufficiently, IMO.

-- 
René

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-16 Thread Stefan Behnel
[EMAIL PROTECTED] wrote:
 I even sometimes
 read code snippets on email lists and websites from my handheld, which
 is sadly still memory-limited enough that I'm really unlikely to
 install anything approaching a full set of Unicode fonts.

One of the arguments against this PEP was that it seemed to be impossible to
find either transliterated identifiers in code or native identifiers in Java
code using a web search. So it is very unlikely that you will need to upgrade
your handheld as it is very unlikely for you to stumble into such code.

Stefan


Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-16 Thread René Fleschenberg
Stefan Behnel schrieb:
 - Non-english speakers can not create or understand
   english identifiers hence can't create good code nor
   easily grok existing code.
 I agree that this is a problem, but please understand that this problem is
 _not_ solved by allowing non-ASCII identifiers!
 
 Well, as I said before, there are three major differences between the stdlib
 and keywords on one hand and identifiers on the other hand. Ignoring arguments
 does not make them any less true.

BTW: Please stop replying to my postings by e-mail (in Thunderbird, use
"Reply" instead of "Reply to all").

I agree that keywords are a different matter in many respects, but the
only difference between stdlib interfaces and other interfaces is that
the stdlib interfaces are part of the stdlib. That's it. You are still
ignoring the fact that, contrary to what has been suggested in this
thread, it is _not_ possible to write "German" or "Chinese" Python
without cluttering it up with many, many English terms. It's not only the
stdlib, but also many, many third-party libraries. Show me one real
Python program that is feasibly written without throwing in tons of
English terms.

Now, very special environments (what I called "rare and isolated"
earlier) like special learning environments for children are a different
matter. It should be ok if you have to use a specially patched Python
branch there, or have to use an interpreter option that enables the
suggested behaviour. For general programming, IMO it is a bad idea.

-- 
René

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-16 Thread René Fleschenberg
Marc 'BlackJack' Rintsch schrieb:
 There are potential users of Python who don't know much english or no
 english at all.  This includes kids, old people, people from countries
 that have letters that are not that easy to transliterate like european
 languages, people who just want to learn Python for fun or to customize
 their applications like office suites or GIS software with a Python
 scripting option.

Make it an interpreter option that can be turned on for those cases.

-- 
René

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-16 Thread Stefan Behnel
Eric Brunel wrote:
 reason why non-ASCII identifiers should be supported. I just hope I'll
 get a '--ascii-only' switch on my Python interpreter (or any other means
 to forbid non-ASCII identifiers and/or strings and/or comments).

I could certainly live with that as it would be the right way around. Support
Unicode by default, but allow those who require the lowest common denominator
to enforce it.

Stefan


Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-16 Thread René Fleschenberg
Stefan Behnel schrieb:
 *Your* logic can be used to justify dropping *any* feature.

No. I am considering both the benefits and the problems. You just happen
to not like the outcome of my considerations [again, please don't reply
by E-Mail, I read the NG].

-- 
René

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-16 Thread Eric Brunel
On Wed, 16 May 2007 02:14:58 +0200, Steven D'Aprano  
[EMAIL PROTECTED] wrote:

 On Tue, 15 May 2007 09:09:30 +0200, Eric Brunel wrote:

 Joke aside, this just means that I won't ever be able to program math in
 ADA, because I have absolutely no idea on how to do a 'pi' character on
 my keyboard.

 Maybe you should find out then? Personal ignorance is never an excuse for
 rejecting technology.

My personal ignorance is fine, thank you; how is yours? There is no  
keyboard *on Earth* that can type *all* characters in the whole Unicode  
set. So my keyboard may just happen to provide no means at all to type a  
Greek 'pi', as it doesn't provide any to type Chinese, Japanese, Korean,  
Russian, Hebrew, or whatever character set is not in use in my  
country. And the same goes for all keyboards all over the world.

Have I made my point clear or do you require some more explanations?
-- 
python -c print ''.join([chr(154 - ord(c)) for c in  
'U(17zX(%,5.zmz5(17l8(%,5.Z*(93-965$l7+-'])


Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-16 Thread Stefan Behnel
René Fleschenberg wrote:
 Stefan Behnel schrieb:
 - Non-english speakers can not create or understand
   english identifiers hence can't create good code nor
   easily grok existing code.
 I agree that this is a problem, but please understand that this problem is
 _not_ solved by allowing non-ASCII identifiers!
 Well, as I said before, there are three major differences between the stdlib
 and keywords on one hand and identifiers on the other hand. Ignoring 
 arguments
 does not make them any less true.
 
 I agree that keywords are a different matter in many respects, but the
 only difference between stdlib interfaces and other interfaces is that
 the stdlib interfaces are part of the stdlib. That's it. You are still
 ignoring the fact that, contrary to what has been suggested in this
 thread, it is _not_ possible to write German or Chinese Python
 without cluttering it up with many many English terms. It's not only the
 stdlib, but also many many third party libraries. Show me one real
 Python program that is feasibly written without throwing in tons of
 English terms.
 
 Now, very special environments (what I called rare and isolated
 earlier) like special learning environments for children are a different
 matter. It should be ok if you have to use a specially patched Python
 branch there, or have to use an interpreter option that enables the
 suggested behaviour. For general programming, it IMO is a bad idea.

Ok, let me put it differently.

You *do not* design Python's keywords. You *do not* design the stdlib. You *do
not* design the concepts behind all that. You *use* them as they are. So you
can simply take the identifiers they define and use them the way the docs say.
You do not have to understand these names, they don't have to be words, they
don't have to mean anything to you. They are just tools. Even if you do not
understand English, they will not get in your way. You just learn them.

But you *do* design your own software. You *do* design its concepts. You *do*
design its APIs. You *do* choose its identifiers. And you want them to be
clear and telling. You want them to match your (or your clients') view of the
application. You do not care about the naming of the tools you use inside. But
you do care about clarity and readability in *your own software*.

See the little difference here?

Stefan
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-16 Thread Stefan Behnel
René Fleschenberg wrote:
 Marc 'BlackJack' Rintsch wrote:
 There are potential users of Python who don't know much English or no
 English at all.  This includes kids, old people, people from countries
 that have letters that are not that easy to transliterate like European
 languages, people who just want to learn Python for fun or to customize
 their applications like office suites or GIS software with a Python
 scripting option.
 
 Make it an interpreter option that can be turned on for those cases.

No. Make ASCII-only an interpreter option that can be turned on for the
cases where it is really required.
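
As a rough sketch of what such an ASCII-only check could look like outside
the interpreter: the snippet below uses the standard tokenize module to list
identifiers that contain non-ASCII characters. The function name
non_ascii_identifiers is made up for illustration, it is not an existing
interpreter option or stdlib API, and the code assumes Python 3.7 or later
(for str.isascii).

```python
import io
import tokenize


def non_ascii_identifiers(source):
    """Return (name, line_number) pairs for NAME tokens with non-ASCII characters.

    Hypothetical helper, not part of Python itself. String literals and
    comments are separate token types, so they are not reported.
    """
    found = []
    readline = io.StringIO(source).readline
    for tok in tokenize.generate_tokens(readline):
        if tok.type == tokenize.NAME and not tok.string.isascii():
            found.append((tok.string, tok.start[0]))
    return found


sample = "def öffnen(pfad):\n    return open(pfad)\n"
print(non_ascii_identifiers(sample))
```

A project that really requires ASCII-only code could run something like
this over its sources before accepting them.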

Stefan
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-16 Thread Stefan Behnel
René Fleschenberg wrote:
 Gregor Horvath wrote:
 René Fleschenberg wrote:

 today, to the best of my knowledge. And in some form or another
 basically means that the PEP would create more possibilities for things
 to go wrong. That things can already go wrong today does not mean that
 it does not matter if we create more occasions where things can go wrong
 even worse.
 Following this logic we should not add any new features at all, because
 all of them can go wrong and can be used the wrong way.
 
 No, that does not follow from my logic. What I say is: When thinking
 about whether to add a new feature, the potential benefits should be
 weighed against the potential problems. I see some potential problems
 with this PEP and very little potential benefits.
 
 I love Python because it does not dictate how to do things.
 I do not need an ASCII-Dictator, I can judge myself when to use this
 feature and when to avoid it, like any other feature.
 
 *That* logic can be used to justify the introduction of *any* feature.

*Your* logic can be used to justify dropping *any* feature.

Stefan
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-16 Thread Ben
On May 15, 11:25 pm, Stefan Behnel [EMAIL PROTECTED]
wrote:
 René Fleschenberg wrote:
  Javier Bezos wrote:
  But having, for example, things like open() from the stdlib in your code
  and then öffnen() as a name for functions/methods written by yourself is
  just plain silly. It makes the code inconsistent and ugly without
  significantly improving the readability for someone who speaks German
  but not English.
  Agreed. I always use English names (more or
  less :-)), but this is not what the PEP is about.

  We all know what the PEP is about (we can read). The point is: If we do
  not *need* non-English/ASCII identifiers, we do not need the PEP. If the
  PEP does not solve an actual *problem* and still introduces some
  potential for *new* problems, it should be rejected. So far, the
  problem seems to just not exist. The burden of proof is on those who
  support the PEP.

 The main problem here seems to be proving the need of something to people who
 do not need it themselves. So, if a simple "but I need it because a, b, c" is
 not enough, what good is any further proof?

 Stefan

For what it's worth, I can only speak English (bad English schooling!)
and I'm definitely +1 on the PEP. Anyone using tools from the last 5
years can handle UTF-8.

Cheers,
Ben

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-16 Thread Gregor Horvath
René Fleschenberg wrote:

 I love Python because it does not dictate how to do things.
 I do not need an ASCII-Dictator, I can judge myself when to use this
 feature and when to avoid it, like any other feature.
 
 *That* logic can be used to justify the introduction of *any* feature.
 

No. That logic can only be used to justify the introduction of a feature 
that brings freedom.

Who are we to dictate to the whole Python world how to spell an identifier?

Gregor
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-16 Thread [EMAIL PROTECTED]
Ben wrote:
 On May 15, 11:25 pm, Stefan Behnel [EMAIL PROTECTED]
 wrote:
  René Fleschenberg wrote:
   Javier Bezos wrote:
   But having, for example, things like open() from the stdlib in your code
   and then öffnen() as a name for functions/methods written by yourself is
   just plain silly. It makes the code inconsistent and ugly without
   significantly improving the readability for someone who speaks German
   but not English.
   Agreed. I always use English names (more or
   less :-)), but this is not what the PEP is about.
 
   We all know what the PEP is about (we can read). The point is: If we do
   not *need* non-English/ASCII identifiers, we do not need the PEP. If the
   PEP does not solve an actual *problem* and still introduces some
   potential for *new* problems, it should be rejected. So far, the
   problem seems to just not exist. The burden of proof is on those who
   support the PEP.
 
  The main problem here seems to be proving the need of something to people who
  do not need it themselves. So, if a simple "but I need it because a, b, c" is
  not enough, what good is any further proof?
 
  Stefan

 For what it's worth, I can only speak English (bad English schooling!)
 and I'm definitely +1 on the PEP. Anyone using tools from the last 5
 years can handle UTF-8.

The falsehood of the last sentence is why I'm moderately against this
PEP.  Even examples within this thread don't display correctly on
several of the machines I have access to (all of which run OS/browser
environments less than 5 years old).  It strikes me as similar to the
arguments for quoted-printable in the early 1990s, claiming that
everyone can view it or will be able to soon--and even a decade
_after_ everyone can deal with latin1 just fine it was still causing
massive headaches.
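
To illustrate the kind of display corruption meant here, a minimal sketch in
Python 3. The string below is a made-up stand-in for a non-ASCII expression,
not text taken from any post, and the two failure modes shown (UTF-8 read as
ISO-8859-1, and replacement by "?" in ASCII-only software) are assumptions
about what commonly goes wrong, not a diagnosis of any particular machine.

```python
text = "δ(μ*π)"  # hypothetical example expression using Greek letters

# UTF-8 bytes interpreted as ISO-8859-1: every Greek letter becomes two
# unrelated characters.
mojibake = text.encode("utf-8").decode("iso-8859-1")

# Software limited to ASCII that substitutes "?" for anything it cannot show.
ascii_only = text.encode("ascii", "replace").decode("ascii")

print(repr(mojibake))
print(ascii_only)

# The first kind of damage is reversible as long as the bytes were preserved;
# the second has thrown the information away.
restored = mojibake.encode("iso-8859-1").decode("utf-8")
print(restored == text)
```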

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: PEP 3131: Supporting Non-ASCII Identifiers

2007-05-16 Thread [EMAIL PROTECTED]
Christophe wrote:
 [EMAIL PROTECTED] wrote:
  Steven D'Aprano wrote:
  I would find it useful to be able to use non-ASCII characters for heavily
  mathematical programs. There would be a closer correspondence between the
  code and the mathematical equations if one could write D(u*p) instead of
  delta(mu*pi).
 
  Just as one risk here:
  When reading the above on Google groups, it showed up as if one could
  write ?(u*p)...
  When quoting it for response, it showed up as could write D(u*p).
 
  I'm sure that the symbol you used was neither a capital letter d nor a
  question mark.
 
  Using identifiers that are so prone to corruption when posting in a
  rather popular forum seems dangerous to me--and I'd guess that a lot
  of source code highlighters, email lists, etc have similar problems.
  I'd even be surprised if some programming tools didn't have similar
  problems.

 So, it was google groups that continuously corrupted the good UTF-8
 posts by force converting them to ISO-8859-1?

 Of course, there's also the possibility that it is a problem on *your*
 side

Well, that's part of the point isn't it?  It seems incredibly naive to
me to think that you could use whatever symbol was intended and have
it show up, and the "well, fix your machine!" argument doesn't fly.  A
lot of the time programmers have to look at stack traces on end users'
machines (whatever they may be) to help debug.  They have to look at
code on the (GUI-less) production servers over a terminal link.  They
have to use all kinds of environments where they can't install the
latest and greatest fonts.  Promoting code that becomes very hard to
read and debug in real situations seems like a sound negative to me.
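
For what it's worth, one possible workaround in such environments is to view
the code with its non-ASCII characters escaped. The following is only a
sketch in Python 3; ascii_view and name_non_ascii are hypothetical helper
names, and the sample line is invented for illustration.

```python
import unicodedata


def ascii_view(line):
    """Return the line with each non-ASCII character shown as a backslash escape."""
    return line.encode("ascii", "backslashreplace").decode("ascii")


def name_non_ascii(line):
    """List the Unicode names of the non-ASCII characters in the line, in order."""
    return [unicodedata.name(ch, "UNKNOWN") for ch in line if ord(ch) > 127]


line = "δ = μ * π"  # made-up source line with Greek identifiers
print(ascii_view(line))
print(name_non_ascii(line))
```

The escaped form survives any ASCII-only terminal, but it is arguably no
easier to read, which is rather the point being made above.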

-- 
http://mail.python.org/mailman/listinfo/python-list

