Re: [Python-Dev] Problem with super() usage

2006-07-18 Thread Greg Ewing
Guido van Rossum wrote:
 In the world where cooperative multiple inheritance
 originated (C++), this would be a static error.

I wasn't aware that C++ had anything resembling super().
Is it a recent addition to the language?

--
Greg
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Python Style Sheets ? Re: User's complaints

2006-07-18 Thread Boris Borcic
Josiah Carlson wrote:
 Boris Borcic [EMAIL PROTECTED] wrote:
 Of course, and that's why in my initial post I was talking of transparent
 reversible transforms and central control of styles through the standard.
 Means not to fall into the trap you describe. Or else I would have asked for
 macros ! Are you implying that /no/ measure of language variability can be
 dealt with by such means as standards-controlled reversible transforms ? I
 guess not.
 
 Regardless of the existence of reversible transforms, a user's ability
 to understand and/or maintain code is dependent on the syntax and
 semantics of the language.

If you have an effective isomorphism, that's irrelevant. Everybody works with 
the language she understands. Ever tried a slide rule ?

 In allowing different language variants, one
 is changing the user-understood meaning of a block of code, which
 necessarily increases the burden of programming and maintenance.

Allowing different language variants connected by reversible transforms means
one need not change any user's understood meaning of any block of code. The
user stipulates the language variant she likes and the system translates
back-and-forth from/to distinct variants other users might prefer.

- BB
--
666 ? - 666 ~ .666 ~ 2/3 ~ 1-1/3 ~ tertium non datur ~ the excluded middle
 ~ either with us, or against us



Re: [Python-Dev] logging module broken because of locale

2006-07-18 Thread Mihai Ibanescu
On Mon, Jul 17, 2006 at 03:39:55PM -0400, Mihai Ibanescu wrote:
 Hi,
 
 This is reported on sourceforge:
 
 http://sourceforge.net/tracker/index.php?func=detail&aid=1524081&group_id=5470&atid=105470
 
 I am willing to try and patch the problem, but I'd like to discuss my ideas
 first.
 
 The basic problem is that, in some locale, "INFO".lower() != "info". So,
 initializing a dictionary with keys that appear to be lower-case in English
 is unsafe for other locales.
 
 Now, the other problem is, people can choose to initialize the locale any time
 they want, and can choose to change it too. My initial approach was to create
 the dictionary with strings to which lower() was explicitly called.
 
 But, since priority_names is class-scoped, it gets evaluated at module import
 time, which is way too early in the game (most likely the locale will be set
 afterwards).
 
 Any thoughts on the proper way to approach this?

To follow up on my own email: it looks like, even though in some locale
"INFO".lower() != "info"

u"INFO".lower() == "info" (at least in the Turkish locale).

Is that guaranteed, at least for now (for the current versions of python)?

Thanks,
Misa


Re: [Python-Dev] Python Style Sheets ? Re: User's complaints

2006-07-18 Thread Aahz
Please.  Just end this discussion.  It ain't gonna happen.
-- 
Aahz ([EMAIL PROTECTED])   * http://www.pythoncraft.com/

Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are, by
definition, not smart enough to debug it.  --Brian W. Kernighan


Re: [Python-Dev] Problem with super() usage

2006-07-18 Thread Scott Dial
Greg Ewing wrote:
 Guido van Rossum wrote:
 In the world where cooperative multiple inheritance
 originated (C++), this would be a static error.
 
 I wasn't aware that C++ had anything resembling super().
 Is it a recent addition to the language?
 

It is much more explicit, but you call the function from the 
superclass's namespace:

class B : public A {
public:
    void m(void) {
        A::m(); // The call to my super
    }
};

C++ has no concept of MRO, so super() would be completely ambiguous. 
In fact, if you try to replicate your code in C++ using a generic M 
class (which defines a dummy m method), you'll get such an error from 
your compiler: `M' is an ambiguous base of `C'

This is very much a dynamic language quirk that you can call out to a 
function that may or may not exist, and we should avoid comparing it to 
other languages which don't allow it. I agree with Guido that in python, 
the reasonable fix is to have a superclass which defines an empty method.
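
A minimal sketch of that fix, written in present-day Python 3 syntax. The class names `Root`, `A`, `B` and `C` are made up for illustration. The only point is that a common base defines a do-nothing `m`, so every cooperative `super().m()` call has somewhere to land:

```python
class Root:
    def m(self):
        # Deliberately does no work: this is where the chain of
        # cooperative super() calls terminates.
        return []


class A(Root):
    def m(self):
        return ["A.m"] + super().m()


class B(Root):
    def m(self):
        return ["B.m"] + super().m()


class C(B, A):
    def m(self):
        return ["C.m"] + super().m()


print(C().m())  # walks the MRO: C, B, A, Root
print(B().m())  # B on its own still works, because Root.m exists
```

Because `Root.m` never calls `super()`, no class in the hierarchy needs to know whether it is last in the method resolution order.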

-- 
Scott Dial
[EMAIL PROTECTED]
[EMAIL PROTECTED]


Re: [Python-Dev] Problem with super() usage

2006-07-18 Thread glyph


On Tue, 18 Jul 2006 09:24:57 -0400, Jean-Paul Calderone [EMAIL PROTECTED] 
wrote:
On Tue, 18 Jul 2006 09:10:11 -0400, Scott Dial [EMAIL PROTECTED] wrote:
Greg Ewing wrote:
 Guido van Rossum wrote:
 In the world where cooperative multiple inheritance
 originated (C++), this would be a static error.

 I wasn't aware that C++ had anything resembling super().
 Is it a recent addition to the language?

C++ has no concept of MRO, so super() would be completely ambiguous.

I think this was Greg's point.  Talking about C++ and super() is
nonsensical.

C++ originally specified multiple inheritance, but it wasn't cooperative in
the sense that super is.  In Lisp, though, where cooperative method dispatch
originated, call-next-method does basically the same thing in the case where
there's no next method: it calls no-next-method which signals a generic
error.

http://www.lisp.org/HyperSpec/Body/locfun_call-next-method.html

However, you can write methods for no-next-method, so you can override that
behavior as appropriate.  In Python you might achieve a similar effect using a
hack like the one Greg suggested, but in a slightly more systematic way; using
Python's regular inheritance-based method ordering, of course, not bothering 
with multimethods.  Stand-alone it looks like an awful hack, but with a bit
of scaffolding I think it looks nice enough; at least, it looks like the Lisp
solution, which while potentially ugly, is complete :).

This is just implemented as a function for brevity; you could obviously use a
proxy object with all the features of 'super', including optional self,
method getters, etc.  For cooperative classes you could implement noNextMethod
to always be OK, or to provide an appropriate null value for a type map of
a method's indicated return value ('' for str, 0 for int, None for object, 
etc).

# ---cut here---

def callNextMethod(cls, self, methodName, *a, **k):
    # Look the method up on the next class in the MRO after cls.
    sup = super(cls, self)
    method = getattr(sup, methodName, None)
    if method is not None:
        return method(*a, **k)
    else:
        # Nothing further up the MRO defines it: delegate to the hook.
        return self.noNextMethod(methodName, *a, **k)

class NextMethodHelper(object):
    def noNextMethod(self, methodName, *a, **k):
        # Dispatch to a per-method fallback named noNext_<methodName>.
        return getattr(self, "noNext_" + methodName)(*a, **k)

class A(object):
    def m(self):
        print("A.m")

class B(NextMethodHelper):
    def m(self):
        print("B.m")
        return callNextMethod(B, self, "m")

    def noNext_m(self):
        # it's ok not to have an 'm'!
        print("No next M, but that's OK!")
        return None

class C(B, A):
    def m(self):
        print("C.m")
        return callNextMethod(C, self, "m")


# >>> c = C()
# >>> c.m()
# C.m
# B.m
# A.m
# >>> b = B()
# >>> b.m()
# B.m
# No next M, but that's OK!


Re: [Python-Dev] Python Style Sheets ? Re: User's complaints

2006-07-18 Thread Terry Reedy

Boris Borcic [EMAIL PROTECTED] wrote in message 
news:[EMAIL PROTECTED]
 Allowing different language variants connected by reversible transforms 
 means
 one need not change any user's understood meaning of any block of code. 
 The user
 stipulates the language variant she likes and the system translates
 back-and-forth from/to distinct variants other users might prefer.

Editing systems like this are and should remain 3rd-party add-ons.  This 
discussion belongs on the general python list (clp) or on the discussion 
list of current projects in this area.

tjr





Re: [Python-Dev] Problem with super() usage

2006-07-18 Thread Willem Broekema
On 7/18/06, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:
 C++ originally specified multiple inheritance, but it wasn't cooperative in
 the sense that super is.  In Lisp, though, where cooperative method dispatch
 originated, call-next-method does basically the same thing in the case where
 there's no next method: it calls no-next-method which signals a generic
 error.

Don't forget Lisp's next-method-p, which tests (-p for
predicate) if there is any next method to call. Highly elegant, I'd
say.

 http://www.lisp.org/HyperSpec/Body/locfun_call-next-method.html

http://www.lisp.org/HyperSpec/Body/locfun_next-method-p.html
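
A rough Python analogue of that predicate, as a sketch only. The helper name `next_method_p` and the classes are hypothetical, and Python itself ships nothing by that name. It simply asks whether any class after `cls` in the instance's MRO provides the attribute:

```python
def next_method_p(cls, self, method_name):
    """Return True if a class after `cls` in type(self)'s MRO defines method_name."""
    return hasattr(super(cls, self), method_name)


class A(object):
    def m(self):
        return ["A.m"]


class B(object):
    def m(self):
        result = ["B.m"]
        # Only call up the chain when there is something to call.
        if next_method_p(B, self, "m"):
            result += super(B, self).m()
        return result


class C(B, A):
    pass


print(C().m())  # B.m finds A.m next in C's MRO
print(B().m())  # a plain B has no next method, so the call is skipped
```

The test-before-call style keeps the decision local to the method, at the cost of repeating the check in every cooperative method.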

- Willem


Re: [Python-Dev] Problem with super() usage

2006-07-18 Thread Guido van Rossum
On 7/18/06, Greg Ewing [EMAIL PROTECTED] wrote:
 Guido van Rossum wrote:
  In the world where cooperative multiple inheritance
  originated (C++), this would be a static error.

 I wasn't aware that C++ had anything resembling super().
 Is it a recent addition to the language?

I don't know about pure C++, but the C++ variant (or was it a
library?) that originated the cooperative multiple inheritance ideas does
have a super call. And it is strongly typed as I explained.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] Problem with super() usage

2006-07-18 Thread Guido van Rossum
On 7/18/06, Jean-Paul Calderone [EMAIL PROTECTED] wrote:
 I think this was Greg's point.  Talking about C++ and super() is
 nonsensical.

If you're talking pure C++, yes. But I was talking about a programming
system built on top of C++ implementing cooperative multiple inheritance.

As I am fond of repeating, if the ideas originally came from Lisp, I
don't know -- I got them from this book:

Putting Metaclasses to Work: A New Dimension in Object-Oriented
Programming, by Ira R. Forman and Scott H. Danforth. Addison-Wesley,
1999, ISBN 0-201-43305-2.

It uses C++ (or perhaps a language very much like it), not Lisp. And
it has super.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] Python Style Sheets ? Re: User's complaints

2006-07-18 Thread Guido van Rossum
On 7/17/06, Boris Borcic [EMAIL PROTECTED] wrote:
 Guido van Rossum wrote:
  You must be misunderstanding.

 I don't think so. You appeared to say that the language changes too much 
 because
 everyone wants different changes - that accumulate. I suggested a mechanism
 allowing people to see only the changes they want - or none at all - might be
 devised.

Oh, but don't fool yourself into thinking that that's not a language
change -- and an extremely drastic one at that. Nobody will be able to
read other people's source code any more without first applying the
conversion tool -- which is kind of painful when you're confronted
with a code snippet in email or a book, for example.

Let me rephrase that. I don't believe your proposal solves that
problem, *and* I don't think it's a good idea for other reasons. Am I
clear now? :-)

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] logging module broken because of locale

2006-07-18 Thread Guido van Rossum
Alternatively, does "info".upper() == "INFO" everywhere?

On 7/18/06, Mihai Ibanescu [EMAIL PROTECTED] wrote:
 On Mon, Jul 17, 2006 at 03:39:55PM -0400, Mihai Ibanescu wrote:
  Hi,
 
  This is reported on sourceforge:
 
  http://sourceforge.net/tracker/index.php?func=detail&aid=1524081&group_id=5470&atid=105470
 
  I am willing to try and patch the problem, but I'd like to discuss my ideas
  first.
 
  The basic problem is that, in some locale, "INFO".lower() != "info". So,
  initializing a dictionary with keys that appear to be lower-case in English
  is unsafe for other locales.
 
  Now, the other problem is, people can choose to initialize the locale any
  time they want, and can choose to change it too. My initial approach was to
  create the dictionary with strings to which lower() was explicitly called.
 
  But, since priority_names is class-scoped, it gets evaluated at module
  import time, which is way too early in the game (most likely the locale
  will be set afterwards).
 
  Any thoughts on the proper way to approach this?

 To follow up on my own email: it looks like, even though in some locale
 "INFO".lower() != "info"

 u"INFO".lower() == "info" (at least in the Turkish locale).

 Is that guaranteed, at least for now (for the current versions of python)?

 Thanks,
 Misa



-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] logging module broken because of locale

2006-07-18 Thread Mihai Ibanescu
On Tue, Jul 18, 2006 at 10:19:54AM -0700, Guido van Rossum wrote:
 Alternatively, does "info".upper() == "INFO" everywhere?

Not in the Turkish locale :-(

# begin /tmp/foo.py
import locale

locale.setlocale(locale.LC_ALL, '')

print "info".upper()
print "info".upper() == "INFO"
# end /tmp/foo.py

LANG=tr_TR.UTF-8 python /tmp/foo.py
iNFO
False

Thanks,
Misa


Re: [Python-Dev] logging module broken because of locale

2006-07-18 Thread Guido van Rossum
And u"info".upper()?

On 7/18/06, Mihai Ibanescu [EMAIL PROTECTED] wrote:
 On Tue, Jul 18, 2006 at 10:19:54AM -0700, Guido van Rossum wrote:
  Alternatively, does "info".upper() == "INFO" everywhere?

 Not in the Turkish locale :-(

 # begin /tmp/foo.py
 import locale

 locale.setlocale(locale.LC_ALL, '')

 print "info".upper()
 print "info".upper() == "INFO"
 # end /tmp/foo.py

 LANG=tr_TR.UTF-8 python /tmp/foo.py
 iNFO
 False

 Thanks,
 Misa



-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] logging module broken because of locale

2006-07-18 Thread Martin v. Löwis
Mihai Ibanescu wrote:
 To follow up on my own email: it looks like, even though in some locale
 "INFO".lower() != "info"
 
 u"INFO".lower() == "info" (at least in the Turkish locale).
 
 Is that guaranteed, at least for now (for the current versions of python)?

It's guaranteed for now; unicode.lower is not locale-aware.

Regards,
Martin


Re: [Python-Dev] logging module broken because of locale

2006-07-18 Thread James Y Knight
On Jul 18, 2006, at 1:54 PM, Martin v. Löwis wrote:

 Mihai Ibanescu wrote:
 To follow up on my own email: it looks like, even though in some locale
 "INFO".lower() != "info"

 u"INFO".lower() == "info" (at least in the Turkish locale).

 Is that guaranteed, at least for now (for the current versions of python)?

 It's guaranteed for now; unicode.lower is not locale-aware.

That seems backwards of how it should be ideally: the byte-string  
upper and lower should always do ascii uppering-and-lowering, and the  
unicode ones should do it according to locale. Perhaps that can be  
cleaned up in py3k?

James


Re: [Python-Dev] logging module broken because of locale

2006-07-18 Thread M.-A. Lemburg
James Y Knight wrote:
 On Jul 18, 2006, at 1:54 PM, Martin v. Löwis wrote:
 
 Mihai Ibanescu wrote:
 To follow up on my own email: it looks like, even though in some locale
 "INFO".lower() != "info"

 u"INFO".lower() == "info" (at least in the Turkish locale).

 Is that guaranteed, at least for now (for the current versions of python)?
 It's guaranteed for now; unicode.lower is not locale-aware.
 
 That seems backwards of how it should be ideally: the byte-string  
 upper and lower should always do ascii uppering-and-lowering, and the  
 unicode ones should do it according to locale. Perhaps that can be  
 cleaned up in py3k?

Actually, you've got that backwards ;-) ...

There are no .lower()/.upper() methods for bytes.

The reason these methods are locale aware for 8-bit strings
lies in the fact that we're using the C lib functions, which
are locale setting dependent - with all the drawbacks that
go with it.

The Unicode database OTOH *defines* the upper/lower case mapping in
a locale independent way, so the mappings are guaranteed
to always produce the same results on all platforms.
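
A small check of that claim against Python 3's `str`, which is the Unicode string type there. This is a sketch, not something from the thread. The locale names are assumptions about what may be installed, and any that are missing are skipped:

```python
import locale


def check_case_mapping(locale_names=("C", "tr_TR.UTF-8", "de_DE.UTF-8")):
    """Return {locale_name: bool} for every locale that could be activated."""
    saved = locale.setlocale(locale.LC_CTYPE)  # query the current setting
    results = {}
    try:
        for name in locale_names:
            try:
                locale.setlocale(locale.LC_CTYPE, name)
            except locale.Error:
                continue  # not installed on this machine; skip it
            # str.lower()/str.upper() use the Unicode tables, not the C library.
            results[name] = ("INFO".lower() == "info"
                             and "info".upper() == "INFO")
    finally:
        locale.setlocale(locale.LC_CTYPE, saved)  # restore the caller's locale
    return results


print(check_case_mapping())
```

Every locale that can be activated should map to `True`, including the Turkish one that trips up the locale-dependent 8-bit string methods.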

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Jul 18 2006)
 Python/Zope Consulting and Support ...http://www.egenix.com/
 mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/
 mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/


::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! 


Re: [Python-Dev] logging module broken because of locale

2006-07-18 Thread Guido van Rossum
On 7/18/06, James Y Knight [EMAIL PROTECTED] wrote:
 On Jul 18, 2006, at 1:54 PM, Martin v. Löwis wrote:

  Mihai Ibanescu wrote:
  To follow up on my own email: it looks like, even though in some locale
  "INFO".lower() != "info"
 
  u"INFO".lower() == "info" (at least in the Turkish locale).
 
  Is that guaranteed, at least for now (for the current versions of python)?
 
  It's guaranteed for now; unicode.lower is not locale-aware.

 That seems backwards of how it should be ideally: the byte-string
 upper and lower should always do ascii uppering-and-lowering, and the
 unicode ones should do it according to locale. Perhaps that can be
 cleaned up in py3k?

No, you've got it backwards. 8-bit strings are assumed to be encoded
using the current locale's default encoding so upper and lower behave
locale-dependent. Unicode strings don't need the locale as additional
input for upper and lower; in a different locale you simply use a
different code point. I believe the original issue was that in
Turkish, there are i's with and without dots in both lower and upper
case. I'm guessing that the ASCII code points are used for lowercase
dotted i and uppercase undotted I; code points with the high bit set
are used for the uppercase dotted i and lowercase undotted I (which I
can't easily type here).

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] logging module broken because of locale

2006-07-18 Thread Mihai Ibanescu
On Tue, Jul 18, 2006 at 10:53:23AM -0700, Guido van Rossum wrote:
 And u"info".upper()?

Yepp, that shows the right thing (at least in the several locales I tested,
Turkish included).

It's along the lines of u"INFO".lower() I was proposing in my second post :-)

Misa


Re: [Python-Dev] logging module broken because of locale

2006-07-18 Thread Mihai Ibanescu
On Tue, Jul 18, 2006 at 07:54:28PM +0200, Martin v. Löwis wrote:
 Mihai Ibanescu wrote:
  To follow up on my own email: it looks like, even though in some locale
  "INFO".lower() != "info"
  
  u"INFO".lower() == "info" (at least in the Turkish locale).
  
  Is that guaranteed, at least for now (for the current versions of python)?
 
 It's guaranteed for now; unicode.lower is not locale-aware.

OK, should I write a patch for the logging module to convert the string to
unicode before applying lower()? So far that seems like the way to go.
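
For comparison, here is a sketch of a related approach that avoids both the locale and the unicode conversion: normalize the key at lookup time with an ASCII-only table. This is not the patch proposed above. The names `ascii_lower`, `PRIORITY_NAMES` and `lookup_priority` are hypothetical, and the mapping is only an illustrative subset using the usual syslog numeric levels:

```python
import string

# Maps 'A'-'Z' to 'a'-'z' and leaves every other character untouched,
# so the result cannot depend on the process locale.
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def ascii_lower(s):
    return s.translate(_ASCII_LOWER)


# Illustrative subset; keys are written in lower case once, at import time.
PRIORITY_NAMES = {"debug": 7, "info": 6, "warning": 4, "error": 3}


def lookup_priority(name):
    # Normalization happens at call time, so a locale set after import
    # cannot change which key is looked up.
    return PRIORITY_NAMES[ascii_lower(name)]


print(lookup_priority("INFO"))
print(lookup_priority("Warning"))
```

Since the table only knows the 26 ASCII letters, a name containing a dotted capital I (U+0130) would fail the lookup rather than silently match `info`.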

Maybe this could also be explained in the documentation:

http://docs.python.org/lib/node323.html

I don't think I've seen it in the locale documentation that locale settings do
not affect unicode strings, and that particular page says:

"If, when coding a module for general use, you need a locale independent version
of an operation that is affected by the locale (such as string.lower(), or
certain formats used with time.strftime()), you will have to find a way to do
it without using the standard library routine."

Unicode might be a perfectly acceptable suggestion for others too.

Misa


Re: [Python-Dev] logging module broken because of locale

2006-07-18 Thread Fred L. Drake, Jr.
On Tuesday 18 July 2006 14:52, Mihai Ibanescu wrote:
  Unicode might be a perfectly acceptable suggestion for others too.

Are we still supporting builds that don't include Unicode?  If so, that needs 
to be considered in a patch as well.


  -Fred

-- 
Fred L. Drake, Jr.   fdrake at acm.org


Re: [Python-Dev] logging module broken because of locale

2006-07-18 Thread Martin v. Löwis
James Y Knight wrote:
 That seems backwards of how it should be ideally: the byte-string upper
 and lower should always do ascii uppering-and-lowering, and the unicode
 ones should do it according to locale. Perhaps that can be cleaned up in
 py3k?

Cleaned-up, yes. But it is currently not backwards.

For a byte string, you need an encoding, which comes from the locale.
So for byte strings, case-conversion *has* to be locale-aware (in
principle, making it encoding-aware only would almost suffice, but
there is no universal API for that).

OTOH, for Unicode, due to the unification, case-conversion mostly
does not need to be locale-aware. Nearly all case-conversions are
only script-dependent, not language-dependent. So it is nearly possible
to make case-conversion locale-independent, and that is what Python
provides.

The "nearly" above refers to *very* few exceptions, in *very*
few languages. Most of the details are collected in UAX#21, some
highlights are:
- case conversions are not always reversible
- sometimes, case conversion may convert a single
  character to multiple characters; the canonical
  example is German ß (considered lower-case) -> SS
  (historically, this is just typographical, since there
   is no upper case sharp s in our script)
- sometimes, conversion depends on the position of
  the letter in the word, see Final_Sigma
  in SpecialCasing.txt, or on the subsequent
  combining accents, see Lithuanian More_Above

I believe the unicode.lower behaviour is currently right
for most applications, so it should continue to be the
default. An additional locale-aware version should be added,
but that probably means to incorporate ICU into Python,
to get this and other locale properties right in a
platform-independent fashion.
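
Some of these special cases can be observed directly in later Python 3 releases, whose `str.upper()` and `str.lower()` apply the unconditional and Final_Sigma rules from SpecialCasing.txt (to my knowledge from Python 3.3 onwards, which is an assumption worth checking against your interpreter). The sketch below says nothing about the 2006 `unicode` type discussed in this thread:

```python
sharp_s = "\u00df"                    # LATIN SMALL LETTER SHARP S
upper = sharp_s.upper()
print(upper, len(upper))              # one character becomes two
print(upper.lower() == sharp_s)       # the round trip does not restore it

word = "\u039f\u0394\u039f\u03a3"     # Greek capitals ending in SIGMA
lowered = word.lower()
# The last letter becomes FINAL SIGMA (U+03C2), not the medial U+03C3.
print(hex(ord(lowered[-1])))

dotted_capital_i = "\u0130"           # LATIN CAPITAL LETTER I WITH DOT ABOVE
# Lowercases to 'i' followed by COMBINING DOT ABOVE: two code points.
print([hex(ord(ch)) for ch in dotted_capital_i.lower()])
```

None of this is locale-aware: the language-conditional Turkish and Lithuanian entries are still not applied by these methods.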

Regards,
Martin



Re: [Python-Dev] logging module broken because of locale

2006-07-18 Thread Martin v. Löwis
M.-A. Lemburg wrote:
 The Unicode database OTOH *defines* the upper/lower case mapping in
 a locale independent way, so the mappings are guaranteed
 to always produce the same results on all platforms.

Actually, that isn't the full truth; see UAX#21, which is now an official
part of Unicode 4. It specifies two kinds of case conversion:
simple case conversion, and full case conversion. Python only supports
simple case conversion at the moment. Full case conversion is context
(locale) dependent, and must take into account SpecialCasing.txt.

Regards,
Martin


[Python-Dev] Strategy for converting the decimal module to C

2006-07-18 Thread Raymond Hettinger
I briefly had a chance to look at some of the work being done on a C 
implementation of decimal, and it looks like the approach is following 
the Python version too literally.

Ideally, it should be written as if Python were not involved, with the
appropriate method wrappers added afterwards.  Contexts should be
small structures that include the traps and flags as bitfields.
Likewise, the internal representation of a decimal should be in a simple 
structure using a byte array for the decimal digits -- it should not be 
a Python object.  Internally, the implementation creates many temporary 
decimals for a scratchpad, it makes many copies of contexts, it 
frequently tests components of a context, and the functions frequently 
call each other.  Those operations need to be as cheap as possible -- 
that means no internal tuple objects, no ref counting, and no passing 
variables through tuples of arguments, etc.
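
To make the bitfield idea concrete, here is a Python stand-in for the kind of compact context structure described above. It is only an illustration of the layout, not the C code under discussion. The `Signal` and `Context` classes and the `raise_signal` method are hypothetical, with signal names borrowed from the decimal module:

```python
import enum


class Signal(enum.IntFlag):
    # One bit per condition, so a whole set fits in a single integer.
    INVALID_OPERATION = 1
    DIVISION_BY_ZERO = 2
    INEXACT = 4
    ROUNDED = 8


class Context:
    __slots__ = ("prec", "traps", "flags")  # fixed, struct-like layout

    def __init__(self, prec=28,
                 traps=Signal.INVALID_OPERATION | Signal.DIVISION_BY_ZERO):
        self.prec = prec
        self.traps = traps          # which signals raise an exception
        self.flags = Signal(0)      # which signals have occurred so far

    def raise_signal(self, sig):
        self.flags |= sig           # record the condition: one OR
        if self.traps & sig:        # test the trap: one AND
            raise ArithmeticError(sig.name)


ctx = Context()
ctx.raise_signal(Signal.INEXACT)    # recorded, but not trapped
print(ctx.flags)
```

In C the same shape would be a small struct with two integer fields, which makes copying a context or testing a flag a matter of a few machine instructions.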

I recommend writing most of the module to be independent of the Python C 
API.  After a working implementation is built, grafting on the wrappers 
should be a trivial step.  Unless we stay true to this course, the code
will end up being unnecessarily complex and the performance will be
disappointing.


my-two-cents,


Raymond


P.S.  The dictionary approach to context objects should likely be 
abandoned for the C version.  If the API has to change a bit, then so be it.


Re: [Python-Dev] Strategy for converting the decimal module to C

2006-07-18 Thread Aahz
On Tue, Jul 18, 2006, Raymond Hettinger wrote:

 P.S.  The dictionary approach to context objects should likely be
 abandoned for the C version.  If the API has to change a bit, then so
 be it.

Why do you say that?  The rest I agree with; seems to me that making a
small wrapper for dict access works well, too.
-- 
Aahz ([EMAIL PROTECTED])   * http://www.pythoncraft.com/

Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are, by
definition, not smart enough to debug it.  --Brian W. Kernighan


Re: [Python-Dev] logging module broken because of locale

2006-07-18 Thread M.-A. Lemburg
Martin v. Löwis wrote:
 M.-A. Lemburg wrote:
 The Unicode database OTOH *defines* the upper/lower case mapping in
 a locale independent way, so the mappings are guaranteed
 to always produce the same results on all platforms.
 
 Actually, that isn't the full truth; see UAX#21, which is now official
 part of Unicode 4. It specifies two kinds of case conversion:
 simple case conversion, and full case conversion. Python only supports
 simple case conversion at the moment. Full case conversion is context
 (locale) dependent, and must take into account SpecialCasing.txt.

Right. In fact, some case mappings are not available in the Unicode
database, since that only contains mappings which don't increase or
decrease the length of the Unicode string. A typical example is the
German u'ß'. u'ß'.upper() would have to give u'SS', but instead
returns u'ß'.

However, the point I wanted to make was that these mappings don't depend
on the locale setting of the C lib - you have to explicitly
access the mapping in the context of a locale and/or text.

As an example, here's the definition for the dotted/dotless i's in
Turkish taken from that file
(http://www.unicode.org/Public/UNIDATA/SpecialCasing.txt):


# The entries in this file are in the following machine-readable format:
#
# code; lower ; title ; upper ; (condition_list ;)? # comment
#

...

# I and i-dotless; I-dot and i are case pairs in Turkish and Azeri
# The following rules handle those cases.

0130; 0069; 0130; 0130; tr; # LATIN CAPITAL LETTER I WITH DOT ABOVE
0130; 0069; 0130; 0130; az; # LATIN CAPITAL LETTER I WITH DOT ABOVE

# When lowercasing, remove dot_above in the sequence I + dot_above,
# which will turn into i.
# This matches the behavior of the canonically equivalent I-dot_above

0307; ; 0307; 0307; tr After_I; # COMBINING DOT ABOVE
0307; ; 0307; 0307; az After_I; # COMBINING DOT ABOVE

# When lowercasing, unless an I is before a dot_above, it turns into a
# dotless i.

0049; 0131; 0049; 0049; tr Not_Before_Dot; # LATIN CAPITAL LETTER I
0049; 0131; 0049; 0049; az Not_Before_Dot; # LATIN CAPITAL LETTER I

# When uppercasing, i turns into a dotted capital I

0069; 0069; 0130; 0130; tr; # LATIN SMALL LETTER I
0069; 0069; 0130; 0130; az; # LATIN SMALL LETTER I

# Note: the following case is already in the UnicodeData file.

# 0131; 0131; 0049; 0049; tr; # LATIN SMALL LETTER DOTLESS I


Note how the context of the usage of the code points matters
when doing case-conversions.
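
The record format quoted above is simple enough to parse by hand. The following is a sketch in Python 3. The `CasingRule` tuple and `parse_special_casing_line` function are names invented here, and the code only handles the `code; lower; title; upper; (condition_list;)? # comment` layout shown in the excerpt:

```python
from collections import namedtuple

CasingRule = namedtuple("CasingRule", "code lower title upper conditions")


def _to_str(hex_codes):
    """Turn '0053 0073' into the two-character string 'Ss' ('' if empty)."""
    return "".join(chr(int(h, 16)) for h in hex_codes.split())


def parse_special_casing_line(line):
    """Parse one data line; return None for blank or comment-only lines."""
    data = line.split("#", 1)[0].strip()   # drop the trailing comment
    if not data:
        return None
    fields = [f.strip() for f in data.split(";")]
    if fields[-1] == "":
        fields.pop()                       # piece after the final ';'
    code, lower, title, upper = fields[:4]
    # The optional fifth field holds space-separated conditions,
    # e.g. a language tag and a context such as 'tr After_I'.
    conditions = fields[4].split() if len(fields) > 4 else []
    return CasingRule(_to_str(code), _to_str(lower), _to_str(title),
                      _to_str(upper), conditions)


rule = parse_special_casing_line(
    "0069; 0069; 0130; 0130; tr; # LATIN SMALL LETTER I")
print(rule.conditions, hex(ord(rule.upper)))
```

A locale-aware `upper()` would then pick the rule whose conditions match the active language and the surrounding text, which is exactly the extra input the default mappings do not take.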

To make things even more complicated, there are so called
language tags which can be embedded into the Unicode string,
so the language can also change within a Unicode string.

http://www.unicode.org/reports/tr7/

To get a feeling for what it takes to do locale-aware handling
of Unicode right, have a look at the Locale Data Markup
Language (LDML):

http://www.unicode.org/reports/tr35/

(hey, perhaps Google could contribute support for this to Python ;-)

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] logging module broken because of locale

2006-07-18 Thread Martin v. Löwis
M.-A. Lemburg wrote:
 Right. In fact, some case mappings are not available in the Unicode
 database, since that only contains mappings which don't increase or
 decrease the length of the Unicode string. A typical example is the
 German u'ß'. u'ß'.upper() would have to give u'SS', but instead
 returns u'ß'.

Actually, that is in the Unicode database (SpecialCasing.txt):

00DF; 00DF; 0053 0073; 0053 0053; # LATIN SMALL LETTER SHARP S

 However, the point I wanted to make was that these mappings don't depend
 on the locale setting of the C lib - you have to explicitly
 access the mapping in the context of a locale and/or text.

I don't get that point. SpecialCasing.txt is clearly intended to take
locale context into account. Whether this is the C locale, or some
other locale mechanism, is out of scope of the Unicode specification.
It could be the C locale (and indeed, the C locale implementations
often take the Unicode casing procedure into account these days).
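
As a rough illustration of how such an entry can be consumed, here is a
minimal Python 3 sketch that parses one SpecialCasing.txt record in the
"code; lower; title; upper; (condition_list;)? # comment" format. The
function name parse_special_casing and the dict layout are assumptions
made for this example, not an existing API:

```python
def parse_special_casing(line):
    """Parse one SpecialCasing.txt record; return None for blank/comment lines."""
    # Everything after '#' is a comment.
    data, _, _comment = line.partition("#")
    fields = [field.strip() for field in data.split(";")]
    if len(fields) < 4 or not fields[0]:
        return None

    def to_text(field):
        # A field holds zero or more space-separated hex code points.
        return "".join(chr(int(cp, 16)) for cp in field.split())

    return {
        "code": to_text(fields[0]),
        "lower": to_text(fields[1]),
        "title": to_text(fields[2]),
        "upper": to_text(fields[3]),
        # The optional fifth field lists locale and context conditions.
        "conditions": fields[4].split() if len(fields) > 4 else [],
    }


record = parse_special_casing(
    "00DF; 00DF; 0053 0073; 0053 0053; # LATIN SMALL LETTER SHARP S"
)
```

For the sharp s entry the conditions list is empty, so the mapping
applies regardless of locale; only the entries carrying something like
"tr" or "az After_I" need a locale or text context to be supplied by
the caller.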

Regards,
Martin



Re: [Python-Dev] Strategy for converting the decimal module to C

2006-07-18 Thread Tim Peters
[Raymond Hettinger]
 ...
 If the current approach gets in their way, the C implementers should feel
 free to make an alternate design choice.

I expect they will, eventually.  Converting this to C is a big job,
and at the NFS sprint we settled on an incremental strategy allowing
most of the module to remain written in Python, converting methods to
C one at a time.  Changing the user-visible API is a hard egg to
swallow, and it's unfortunate that the Python code used a dict to hold
flags to begin with.  The dict doesn't just record whether an
exception has occurred, it also counts how many times the exception
occurred.  It's possible that someone, somewhere, has latched on to
that as a feature.


Re: [Python-Dev] Strategy for converting the decimal module to C

2006-07-18 Thread Raymond Hettinger




Aahz wrote:
 On Tue, Jul 18, 2006, Raymond Hettinger wrote:
  P.S.  The dictionary approach to context objects should likely be
  abandoned for the C version.  If the API has to change a bit, then so
  be it.

 Why do you say that?  The rest I agree with; seems to me that making a
 small wrapper for dict access works well, too.
  

I think it was tripping-up the folks working on the C implementation.
Georg can speak to it more directly. IIRC, the issue was that the
context object exposed a dictionary which a user could update directly
and there was no notification back to the surrounding object so it
could update an underlying bitfield representation. If the current
approach gets in their way, the C implementers should feel free to make
an alternate design choice.
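
For what it's worth, the "small wrapper for dict access" idea could look
roughly like the Python 3 sketch below: a dict-like view whose writes go
straight to a bitfield on the owning object, so there is nothing to get
out of sync. All names here (SIGNALS, FlagsView, ToyContext, flag_bits)
are hypothetical and have nothing to do with the actual decimal
sources. Note that a one-bit-per-signal view like this cannot keep the
occurrence counts that the current dict records.

```python
from collections.abc import MutableMapping

# Hypothetical subset of signals, one bit each.
SIGNALS = ("Inexact", "Rounded", "Overflow")
BIT = {name: 1 << position for position, name in enumerate(SIGNALS)}


class FlagsView(MutableMapping):
    """Dict-like view whose reads and writes hit the owner's bitfield."""

    def __init__(self, owner):
        self._owner = owner

    def __getitem__(self, name):
        return bool(self._owner.flag_bits & BIT[name])

    def __setitem__(self, name, value):
        # Every update, including dict-style update(), lands here.
        if value:
            self._owner.flag_bits |= BIT[name]
        else:
            self._owner.flag_bits &= ~BIT[name]

    def __delitem__(self, name):
        raise TypeError("flags cannot be removed, only cleared")

    def __iter__(self):
        return iter(SIGNALS)

    def __len__(self):
        return len(SIGNALS)


class ToyContext:
    """Stand-in for a context object that stores its flags as bits."""

    def __init__(self):
        self.flag_bits = 0
        self.flags = FlagsView(self)
```

With this, ctx.flags["Rounded"] = True sets the corresponding bit in
ctx.flag_bits directly, and ctx.flags.update(...) goes through the same
__setitem__, so no notification mechanism is needed.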


Raymond




Re: [Python-Dev] Strategy for converting the decimal module to C

2006-07-18 Thread Lisandro Dalcin
On 7/18/06, Tim Peters [EMAIL PROTECTED] wrote:
 [Raymond Hettinger]
  ...
  If the current approach gets in their way, the C implementers should feel
  free to make an alternate design choice.

 I expect they will, eventually.  Converting this to C is a big job,
 and at the NFS sprint we settled on an incremental strategy allowing
 most of the module to remain written in Python, converting methods to
 C one at a time.  Changing the user-visible API is a hard egg to
 swallow, and it's unfortunate that the Python code used a dict to hold
 flags to begin with.  The dict doesn't just record whether an
 exception has occurred, it also counts how many times the exception
 occurred.  It's possible that someone, somewhere, has latched on to
 that as a feature.

Why not a 'cDecimal' module instead?


-- 
Lisandro Dalcín
---
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594


Re: [Python-Dev] Strategy for converting the decimal module to C

2006-07-18 Thread Raymond Hettinger




Lisandro Dalcin wrote:
 On 7/18/06, Tim Peters [EMAIL PROTECTED] wrote:
  [Raymond Hettinger]
   ...
   If the current approach gets in their way, the C implementers should feel
   free to make an alternate design choice.

  I expect they will, eventually.  Converting this to C is a big job,
  and at the NFS sprint we settled on an "incremental" strategy allowing
  most of the module to remain written in Python, converting methods to
  C one at a time.  Changing the user-visible API is a hard egg to
  swallow, and it's unfortunate that the Python code used a dict to hold
  "flags" to begin with.  The dict doesn't just record whether an
  exception has occurred, it also counts how many times the exception
  occurred.  It's possible that someone, somewhere, has latched on to
  that as "a feature".

 Why not a 'cDecimal' module instead?
  


Good answer. 

My recommendation is that instead of "eventually", we bite the bullet
now. The development path for decimal can follow that for sets, where
the lessons from sets.py were used to make an improved API for the C
version while leaving the pure Python version unmodified. Realizing
that "it's unfortunate that the Python code used a dict to hold
flags to begin with", the C implementers for decimal should move on to
more sensible choices.

Also, I have to take the blame for encouraging the "incremental"
migration approach. I thought we could get some quick wins on the most
commonly called parts of the module. Now, I can see that this approach
will result in nasty, convoluted C code that falls far short of its
performance potential and is a nightmare to maintain. Looking at some
of the checkins, it is now clear to me that there are substantial
benefits to divorcing the base C implementation of decimal from the
Python C API.


Raymond






Re: [Python-Dev] Strategy for converting the decimal module to C

2006-07-18 Thread Giovanni Bajo
Tim Peters wrote:

 Changing the user-visible API is a hard egg to
 swallow, and it's unfortunate that the Python code used a dict to hold
 flags to begin with.  The dict doesn't just record whether an
 exception has occurred, it also counts how many times the exception
 occurred.  It's possible that someone, somewhere, has latched on to
 that as a feature.

Especially since it was a documented one:

>>> import decimal
>>> help(decimal.Context)
Help on class Context in module decimal:

class Context(__builtin__.object)
 |  Contains the context for a Decimal instance.
[...]
 |  flags  - When an exception is caused, flags[exception] is incremented.
 |   (Whether or not the trap_enabler is set)
 |   Should be reset by user of Decimal instance.
[...]
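
A small Python 3 sketch of how user code typically reads and resets
these flags. Whether flags[exception] holds a running count or just a
true/false value depends on the implementation in use, so the sketch
only relies on truthiness; the variable names are made up for the
example:

```python
import decimal

# A private context so the example does not touch the thread's context.
ctx = decimal.Context(prec=3)
ctx.clear_flags()

# 1/3 cannot be represented exactly in 3 digits, so Inexact is signalled.
# Inexact is not trapped by default, so this sets the flag without raising.
third = ctx.divide(decimal.Decimal(1), decimal.Decimal(3))
inexact_raised = bool(ctx.flags[decimal.Inexact])

# "Should be reset by user of Decimal instance."
ctx.clear_flags()
inexact_after_reset = bool(ctx.flags[decimal.Inexact])
```

Code written like this keeps working if the count is replaced by a
simple flag; code that compares flags[exception] against a specific
number would not.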

-- 
Giovanni Bajo


Re: [Python-Dev] logging module broken because of locale

2006-07-18 Thread Greg Ewing
James Y Knight wrote:

 That seems backwards of how it should be ideally: the byte-string  
 upper and lower should always do ascii uppering-and-lowering, and the  
 unicode ones should do it according to locale. Perhaps that can be  
 cleaned up in py3k?

I would expect bytes objects not to have upper() and lower()
methods at all in Py3k.

--
Greg