On 5/24/07, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
> Ka-Ping Yee wrote:
> > On Thu, 24 May 2007, Jim Jewett wrote:
> >> So how about
> >> (1) By default, python allows only ASCII.
> >> (2) Additional characters are permitted if they appear in a table
> >>     named on the command line.
> >> These additional characters should be restricted to code
> >> points larger than ASCII (so you can't easily turn "!" into
> >> an ID char), but beyond that, anything goes. If you want to
> >> include punctuation or undefined characters, so be it.

> > +1! This is a fine solution. It is better than the "python -U"
> > option I proposed

> -2. Any solution found must also accommodate users which are
> unaware of the security issue, and just want to use their native
> language for identifiers. So requiring them to change their
> environment or pass additional command line parameters is
> unacceptable.

There is no hope of explaining security; therefore, the defaults
should be relatively safe.

If the default is "anything goes", that isn't safe.

If the default is "ASCII", that is safe, but possibly inconvenient.
How inconvenient depends on how hard it is to make the switch.

Is your concern just that it should be possible to do once (perhaps
at install), rather than on each run? That would probably be OK too,
so long as the default install was ASCII-only, so that *someone* had
to make a decision about what to allow.

I assume that large communities will standardize on a tailored table,
but a first-pass slightly-too-inclusive table is easy enough to create.
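To make rules (1) and (2) concrete, here is a rough sketch of the check I have in mind. It is only an illustration, not a patch: the names sanitize_table and is_allowed_identifier are made up for this message, and the ASCII rule is simply "letters, digits and underscore, not starting with a digit".

```python
import string

# ASCII characters that may start or continue an identifier.
ASCII_START = frozenset(string.ascii_letters + "_")
ASCII_CONTINUE = ASCII_START | frozenset(string.digits)


def sanitize_table(code_points):
    """Drop table entries at or below U+007F, so a table cannot
    turn ASCII punctuation such as "!" into an identifier character."""
    return frozenset(cp for cp in code_points if cp > 0x7F)


def is_allowed_identifier(name, extra=frozenset()):
    """ASCII-only by default; a non-ASCII character is accepted only
    if its code point is in `extra` (the site's chosen table)."""
    if not name:
        return False
    for i, ch in enumerate(name):
        if ord(ch) > 0x7F:
            if ord(ch) not in extra:
                return False
        elif ch not in (ASCII_START if i == 0 else ASCII_CONTINUE):
            return False
    return True
```

With no table, is_allowed_identifier("na\u00efve") is rejected; after extra = sanitize_table([0xEF, ord("!")]) it is accepted, while "a!" is still rejected because the "!" entry was filtered out of the table.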
Here are the Thaana lines from the Unicode Consortium file Scripts.txt:

0780..07A5    ; Thaana # Lo  [38] THAANA LETTER HAA..THAANA LETTER WAAVU
07A6..07B0    ; Thaana # Mn  [11] THAANA ABAFILI..THAANA SUKUN
07B1          ; Thaana # Lo       THAANA LETTER NAA

Though if it were me, I would probably simplify that to

0780..07B1    ; Thaana

Similarly, Devanagari has 15 lines in Scripts.txt, but you could
simplify it to

0901..0939    ; Devanagari
093C..094D    ; Devanagari
0950..0954    ; Devanagari
0958..0963    ; Devanagari
0966..096F    ; Devanagari
097B..097F    ; Devanagari

or even

0901..097F    ; Devanagari and some undefined characters

if you (as a Devanagari speaker) were confident that none of your
characters would be confused with ASCII. (In practice, you might
well want to exclude the Devanagari digits for looking too similar to
ASCII digits with different values, but ... that is a judgment call
for Devanagari speakers to make, so long as they make it explicitly.)

-jJ
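For what it is worth, reading such a table takes only a few lines. The sketch below assumes the Scripts.txt layout quoted above (hex code point or "first..last" range, then ";", with "#" starting a comment); parse_ranges is a hypothetical helper name, not an existing function.

```python
def parse_ranges(text):
    """Return the set of code points listed in Scripts.txt-style lines."""
    allowed = set()
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop trailing comment
        if not line:
            continue
        span = line.split(";", 1)[0].strip()   # "0780..07A5" or "07B1"
        first, _, last = span.partition("..")
        lo = int(first, 16)
        hi = int(last, 16) if last else lo     # single code point
        allowed.update(range(lo, hi + 1))
    return allowed


THAANA = """\
0780..07A5    ; Thaana # Lo  [38] THAANA LETTER HAA..THAANA LETTER WAAVU
07A6..07B0    ; Thaana # Mn  [11] THAANA ABAFILI..THAANA SUKUN
07B1          ; Thaana # Lo       THAANA LETTER NAA
"""

thaana = parse_ranges(THAANA)
simplified = parse_ranges("0780..07B1 ; Thaana")
```

Here the three-line Thaana table and the one-line simplification come out as the same set, since the three ranges are contiguous.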