On 31/03/2013 08:35, jmfauth wrote:
------
Neil Hodgson:
"The counter-problem is that a French document that needs to include
one mathematical symbol (or emoji) outside Latin-1 will double in size
as a Python string."
Serious developers/typographers/users know that you cannot compose
a text in French with "latin-1". This is now also the case with
German (Germany).
---
Neil's comment is correct:
>>> import sys
>>> sys.getsizeof('a' * 1000 + 'z')
1026
>>> sys.getsizeof('a' * 1000 + '€')
2040
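For readers who want to check these size figures on their own build, here is a minimal sketch. It assumes CPython 3.3 or later, where strings use the flexible representation from PEP 393. bytes_per_char is a hypothetical helper, not part of any library, and the fixed header size reported by sys.getsizeof varies by version and platform.

```python
import sys


def bytes_per_char(ch, n=1000):
    """Estimate storage per character for a string made only of `ch`.

    Hypothetical helper: differencing two lengths cancels the fixed
    object header, leaving only the per-character cost.
    Assumes CPython's PEP 393 string layout (3.3 or later).
    """
    short = ch * n
    longer = ch * (2 * n)
    return (sys.getsizeof(longer) - sys.getsizeof(short)) / n


samples = [
    ("ASCII 'a'", "a"),
    ("Latin-1 'é'", "\u00e9"),
    ("BMP '€'", "\u20ac"),
    ("outside the BMP (U+1F600)", "\U0001F600"),
]

for label, ch in samples:
    print(label, bytes_per_char(ch))
```

Under those assumptions the loop would be expected to report 1 byte per character for 'a' and 'é', 2 for '€', and 4 for a character outside the BMP. A single wide character sets the width for the whole string, which is why the second figure in the session roughly doubles.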
This is not really the problem. "Serious users" may
notice, sooner or later, that Python and Unicode are walking in
opposite directions (technically and in spirit).
>>> import timeit
>>> timeit.repeat("'a' * 1000 + 'ẞ'")
[1.1088995672090292, 1.0842266613261913, 1.1010779011941594]
>>> timeit.repeat("'a' * 1000 + 'z'")
[0.6362570846925735, 0.6159128762502917, 0.6200501673623791]
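The timing comparison can be repeated with a small script. This is only a sketch: time_concat is a hypothetical helper, the iteration count is an arbitrary choice, and the numbers depend entirely on the machine and the Python version. The operands are kept in variables because newer CPython versions may fold a constant expression such as 'a' * 1000 + 'z' at compile time, in which case the concatenation itself would not be measured.

```python
import timeit


def time_concat(tail, number=100_000, repeat=3):
    """Time `base + tail`, where base is 1000 ASCII characters.

    Hypothetical helper. The operands are built in `setup` so the
    concatenation happens at run time on every iteration.
    """
    setup = "base = 'a' * 1000; tail = " + repr(tail)
    return timeit.repeat("base + tail", setup=setup,
                         number=number, repeat=repeat)


if __name__ == "__main__":
    for tail in ("z", "\u1e9e"):  # U+1E9E is the capital sharp s
        # The minimum of the repeats is the least noisy figure.
        print(ascii(tail), min(time_concat(tail)))
```

Appending a character that does not fit the current width makes the result use a wider representation, so the ASCII part has to be widened while it is copied. That is one plausible reason for the gap between the two timings.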
(Just an opinion)
jmf
I'm feeling very sorry for this horse; it's been flogged so often it's
down to bare bones.
--
If you're using GoogleCrap™ please read this
http://wiki.python.org/moin/GoogleGroupsPython.
Mark Lawrence