Re: [Python-Dev] len(chr(i)) = 2?

Glyph Lefkowitz Thu, 25 Nov 2010 23:23:25 -0800

On Nov 24, 2010, at 4:03 AM, Stephen J. Turnbull wrote:

> You end up proliferating types that all do the same kind of thing.  Judicious 
> use of inheritance helps, but getting the fundamental abstraction right is 
> hard.  Or at least, Emacs hasn't found it in 20 years of trying.


Emacs hasn't figured out how to do general-purpose iteration in 20 years of 
trying, either.  The easiest way I've found to loop across an arbitrary pile 
of 'stuff' is the CL 'loop' macro, which you're not even supposed to use.  Even 
then, you still have to make the arcane and pointless distinction of using 
'across' or 'in' or 'on'.  Python, on the other hand, has iteration tied up 
nicely in a bow.

I don't know how to respond to the rest of your argument.  Nothing you've said 
has in any way indicated to me why having code-point offsets is a good idea, 
only that people who know C and elisp would rather sling around piles of 
integers than have good abstract types.

For example:

> I think it more likely that markers are very expensive to create and use 
> compared to integers.

What?  When you do 'for x in str' in Python, you are already creating an 
iterator object, which has to store the exact same amount of state that our 
proposed 'marker' or 'character pointer' would have to store.  The proposed 
UTF-8 marker would have to do a tiny bit more work when iterating because it 
would have to combine multibyte characters, but in exchange for that you get to 
skip a whole ton of copying when encoding and decoding.  How is this expensive 
to create and use?  For every application I have ever designed, encountered, or 
can even conjecture about, this would be cheaper.  (Assuming not just a UTF-8 
string type, but one for UTF-16 as well, where native data is in that format 
already.)
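
To make the cost argument concrete, here is a minimal sketch of what such a 
marker might look like.  The name 'Utf8Marker' and its interface are my own 
illustration, not an existing or proposed API.  It assumes the input is valid 
UTF-8, and its only state is a reference to the bytes plus one integer offset, 
which is about what any iterator object carries.

```python
class Utf8Marker:
    """Hypothetical marker: a byte offset into UTF-8 data that advances
    one code point at a time, without decoding the whole buffer."""

    def __init__(self, data: bytes, offset: int = 0):
        self.data = data      # shared reference, never copied
        self.offset = offset  # byte position of the next code point

    def __iter__(self):
        return self

    def __next__(self) -> str:
        if self.offset >= len(self.data):
            raise StopIteration
        lead = self.data[self.offset]
        # The lead byte's high bits give the length of the sequence.
        if lead < 0x80:
            width = 1
        elif lead >> 5 == 0b110:
            width = 2
        elif lead >> 4 == 0b1110:
            width = 3
        elif lead >> 3 == 0b11110:
            width = 4
        else:
            raise ValueError("offset %d is not at a code point boundary"
                             % self.offset)
        # Only these few bytes are sliced out and decoded.
        chunk = self.data[self.offset:self.offset + width]
        self.offset += width
        return chunk.decode("utf-8")


data = "a\u00e9\u20ac".encode("utf-8")
marker = Utf8Marker(data)
codepoints = list(marker)
```

The extra work per step is the lead-byte check and a small slice; nothing in 
it requires converting the whole buffer to another representation first.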

For what it's worth, not wanting to use abstract types in Emacs makes sense to 
me: I've written my share of elisp code, and it is hard to create reasonable 
abstractions in Emacs, because the facilities for defining types and creating 
polymorphic logic are so crude.  It's a lot easier to just assume your 
underlying storage is an array, because at the end of the day you're going to 
need to call some functions on it which care whether it's an array or an alist 
or a list or a vector anyway, so you might as well just say so up front.  But 
in Python we could just call 'mystring.by_character()' or 
'mystring.by_codepoint()' and get an iterator object back and forget about all 
that junk.
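
As a rough sketch of that interface: the 'Text' wrapper below is hypothetical, 
and only the two method names come from the paragraph above.  I'm assuming 
"character" means a base code point plus any combining marks that follow it, 
which is a crude approximation of real grapheme segmentation and not a 
complete implementation of it.

```python
import unicodedata


class Text:
    """Hypothetical string wrapper exposing two iteration styles."""

    def __init__(self, value: str):
        self._value = value

    def by_codepoint(self):
        # One item per code point, exactly as stored in the str.
        return iter(self._value)

    def by_character(self):
        # Assumption: a "character" is a base code point followed by
        # zero or more combining marks (nonzero combining class).
        cluster = ""
        for cp in self._value:
            if cluster and unicodedata.combining(cp) == 0:
                yield cluster
                cluster = cp
            else:
                cluster += cp
        if cluster:
            yield cluster


text = Text("e\u0301a")  # 'e' + COMBINING ACUTE ACCENT, then 'a'
codepoints = list(text.by_codepoint())
characters = list(text.by_character())
```

The caller picks the unit it cares about by name and gets an ordinary 
iterator back either way, without knowing how the text is stored.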

