On 4/17/07, Christian Heimes <[EMAIL PROTECTED]> wrote:
> Neal Norwitz wrote:
> > I don't have any plans, just considering options.  Move them
> > somewhere?  Perhaps, trim the ones that are unused.  In a unicode
> > world, I'm not sure how much some of these make sense.  letters stands
> > out more than others.  I don't know enough about unicode to know if
> > digits or whitespace can be diff.

There are several additional characters in both sets, and plenty of
reasons that a given program might want to use a restricted set.
(Probably those already in string, or else a letters grouping set by
locale.)

> What do you think about replacing the definitions by information from
> the unicode character properties database. The information are available
> somewhere in Python:
> http://docs.python.org/lib/re-syntax.html
> \w ... With LOCALE, it will match the set [0-9_] plus whatever
> characters are defined as alphanumeric for the current locale. If
> UNICODE is set, this will match the characters [0-9_] plus whatever is
> classified as alphanumeric in the Unicode character properties database.

There are reasons to want exactly ASCII.  There are also reasons to
want only "local" letters.  For example, in a French interface, I
might want to include the extra French letters, but not the Greek.

Also note that regex isn't quite the only use of those letters groupings.

-jJ
_______________________________________________
Python-3000 mailing list
http://mail.python.org/mailman/listinfo/python-3000
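
A minimal sketch of the distinctions discussed in this message, written
against present-day Python 3 rather than the 2007 code base. It counts
how many characters Unicode treats as decimal digits or whitespace
compared with the ASCII-only constants in the string module, and then
checks a few words against three notions of "letter": ASCII-only \w,
Unicode \w, and a restricted French set. The names chars_matching,
FRENCH_EXTRAS, FRENCH_LETTERS, only_letters_from and describe are
hypothetical, and the French letter list is illustrative only, not an
authoritative alphabet.

```python
import re
import string
import sys


def chars_matching(predicate):
    """Return every character in the Unicode range for which predicate holds."""
    return [chr(cp) for cp in range(sys.maxunicode + 1) if predicate(chr(cp))]


# Digits and whitespace as Unicode sees them, versus the ASCII-only
# constants in the string module.
unicode_digits = chars_matching(str.isdecimal)
unicode_whitespace = chars_matching(str.isspace)

print("digits:     string has", len(string.digits),
      "and Unicode has", len(unicode_digits))
print("whitespace: string has", len(string.whitespace),
      "and Unicode has", len(unicode_whitespace))

# HYPOTHETICAL restricted alphabet for a French interface: ASCII letters
# plus a hand-picked, illustrative set of accented letters and ligatures.
FRENCH_EXTRAS = "àâæçéèêëîïôœùûüÿÀÂÆÇÉÈÊËÎÏÔŒÙÛÜŸ"
FRENCH_LETTERS = frozenset(string.ascii_letters + FRENCH_EXTRAS)


def only_letters_from(text, alphabet):
    """True if text is non-empty and every character is in alphabet."""
    return bool(text) and all(ch in alphabet for ch in text)


def describe(word):
    """Classify word under three different notions of 'letter'."""
    return {
        "ascii_regex": re.fullmatch(r"\w+", word, re.ASCII) is not None,
        "unicode_regex": re.fullmatch(r"\w+", word) is not None,
        "french_set": only_letters_from(word, FRENCH_LETTERS),
    }


for word in ("ete", "été", "αβγ"):
    print(word, describe(word))
```

The explicit set is one way to get "French but not Greek" without
depending on the process locale. In current Python 3, re.LOCALE is only
accepted for bytes patterns, so a locale-driven \w is not available for
str patterns, and string.letters no longer exists (string.ascii_letters
does).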
