On 5/2/06, Talin <[EMAIL PROTECTED]> wrote: > As to whether or not flat arrays are a win over a stream-based interface, > I think it really depends on how well the array type supports incremental > growth. I'm mostly familiar with the behavior of STL arrays, which (for > small array sizes) doubles whenever the array runs out of room. The result > is that for an array of size N, built up 1 character at a time, you will > have at most log2(N) memory allocations, and less if you set the initial > capacity of the array to a reasonable guess.
You should look at list.append in 2.5. It uses a similar O(log N) approach but doesn't waste up to 50% space for large lists. My plans would be do something similar for the bytes += implementation. > The reason that I asked about this was that my current 'toy' implementation > of the string formatting PEP (which I posted on this list earlier) uses > character arrays. Yeah, I was curious why you bothered with making a toy implementation efficient. :-) You ought to assume that eventually this will be made efficient; the focus now should be on functionality. > Although if you are supporting surrogates, then __getitem__ and __getslice__ > won't be O(1), will they? That's why I asked about UCS-2, which is what I > use in my work -- which is another way of saying that we've punted on the > issue of surrogates. There are certain streamwise things (like conversion to/from UTF-8) that support surrogates, so I would hesitate to call it UCS-2. But there is no attempt to hide surrogates from __getitem__ or __getslice__. If you access a code point that's a surrogate, you get a surrogate. So it's O(1). -- --Guido van Rossum (home page: http://www.python.org/~guido/) _______________________________________________ Python-3000 mailing list [email protected] http://mail.python.org/mailman/listinfo/python-3000 Unsubscribe: http://mail.python.org/mailman/options/python-3000/archive%40mail-archive.com
