On Wednesday, 19 December 2012 at 22:31:42 UTC+1, Ian wrote:
> On Wed, Dec 19, 2012 at 2:18 PM, <wxjmfa...@gmail.com> wrote:
> > latin-1 (iso-8859-1) ? are you sure ?
>
> Yes.
>
> > >>> sys.getsizeof('a')
> > 26
> > >>> sys.getsizeof('ab')
> > 27
> > >>> sys.getsizeof('aé')
> > 39
>
> Compare to:
>
> >>> sys.getsizeof('a\u0100')
> 42
>
> The reason for the difference you posted is that pure ASCII strings
> have a further optimization, which I glossed over and which is purely
> a savings in overhead:
>
> >>> sys.getsizeof('abcde') - sys.getsizeof('a')
> 4
> >>> sys.getsizeof('ábçdê') - sys.getsizeof('á')
> 4
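
For anyone who wants to reproduce the numbers quoted above, here is a minimal sketch. It assumes CPython 3.3 or later (the flexible string representation); the helper name bytes_per_char is made up for this illustration. Absolute sys.getsizeof() values depend on the build (32-bit vs 64-bit) and the Python version, so the sketch only looks at differences between strings of the same kind.

```python
import sys

def bytes_per_char(ch, n=100):
    """Per-character storage for strings made only of `ch` (hypothetical helper).

    Comparing a 1-character string with an (n + 1)-character string of the
    same kind cancels the fixed header, leaving n times the character width.
    """
    return (sys.getsizeof(ch * (n + 1)) - sys.getsizeof(ch)) // n

# Character width by the highest code point in the string (CPython 3.3+).
print(bytes_per_char('a'))           # ASCII            -> 1
print(bytes_per_char('\xe9'))        # Latin-1 ('é')    -> 1
print(bytes_per_char('\u0100'))      # rest of the BMP  -> 2
print(bytes_per_char('\U0001F600'))  # beyond the BMP   -> 4

# Fixed header difference between a pure ASCII string and a Latin-1 string
# of the same length. The exact value is build-dependent, so it is only printed.
header_overhead = sys.getsizeof('a' + chr(0xE9)) - sys.getsizeof('a' + 'b')
print(header_overhead)
```

The first two results being equal is the point made in the quote: a Latin-1 string costs the same per character as an ASCII one, and the extra bytes seen for 'aé' are a one-time header cost, not a per-character one.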
-----

I know all of this, and it is exactly what I explained.

I do not care about this optimization. I'm not an ASCII user, and for a
non-ASCII user this optimization is simply irrelevant.

What should a Python user think when he sees that his strings consume
more memory just because he uses non-ASCII characters, or that his
strings change just because he "uppercases" them?

Unicode is here to serve everybody.

jmf