Re: Using libunistring for string comparisons et al

Andy Wingo Wed, 30 Mar 2011 04:14:14 -0700

On Tue 15 Mar 2011 23:49, Mark H Weaver <m...@netris.org> writes:

>> Well, we covered O(1) vs O(n).  To make UTF-8 O(1), you need to store
>> additional indexing information of some sort.  There are various schemes,
>> but, depending on the scheme, you lose some of the memory advantage of
>> UTF-8 vs UTF-32.  You can likely do better than UTF-32, though.
>
> I would prefer to either let our accessors be O(n), or else to create
> the index lazily, i.e. on the first usage of string-ref or string-set!
> In such a scheme, very few strings would include indices, and thus the
> overhead would be minimal.
>
> Anyway, the index overhead can be made arbitrarily small by increasing
> the chunk size.  It is a classic time-space trade-off here.  The chunk
> size could be made larger over the years, as usage of string-ref and
> string-set! becomes less common, and eventually the index stuff could be
> removed entirely.


Though I agree that string-set! should be discouraged -- as Clinger also
thought back in 1984, it seems -- string-ref is still important.  The
only thing that could replace it would be some sort of string cursor /
iteration protocol, and I would prefer for that to be standard (SRFI or
otherwise).
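
To illustrate what a cursor protocol over UTF-8 might look like, here is
a minimal sketch in Python.  It is not Guile code and not any existing
SRFI; the names cursor_start, cursor_next and cursor_ref are hypothetical,
and it assumes the bytes are valid UTF-8.  A cursor is just a byte offset,
so stepping and dereferencing are O(1) without any index.

```python
def cursor_start(data):
    """Return a cursor (a byte offset) at the first code point."""
    return 0


def cursor_next(data, cur):
    """Advance the cursor past one UTF-8 encoded code point."""
    cur += 1
    # Continuation bytes have the bit pattern 10xxxxxx; skip over them.
    while cur < len(data) and data[cur] & 0xC0 == 0x80:
        cur += 1
    return cur


def cursor_ref(data, cur):
    """Return the code point that starts at the cursor."""
    return data[cur:cursor_next(data, cur)].decode("utf-8")


def chars(data):
    """Walk the whole string front to back using only the cursor calls."""
    cur = cursor_start(data)
    while cur < len(data):
        yield cursor_ref(data, cur)
        cur = cursor_next(data, cur)
```

Code written against such a protocol never asks for "character k", which
is why it could stand in for string-ref in loops but not for random
access.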

So let's factor string-ref into the "costs" of a potential switch to
UTF-8, be it in space or in time or whatever.
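
To make that cost concrete, here is a rough sketch, in Python rather than
Guile's C, of the lazily built chunk index described in the quoted text.
The class name Utf8String, the method ref, and the default chunk_size of
32 are assumptions for illustration only, not a description of any real
implementation.  It also assumes valid UTF-8 input.

```python
class Utf8String:
    """UTF-8 backed string with a lazily built code point index (hypothetical)."""

    def __init__(self, text, chunk_size=32):
        self.data = text.encode("utf-8")
        self.chunk_size = chunk_size  # code points covered by one index entry
        self.index = None             # stays None until the first ref()
        self.length = None

    def _build_index(self):
        # One O(n) pass: record the byte offset of every chunk_size-th code point.
        offsets = []
        count = 0
        for pos, byte in enumerate(self.data):
            if byte & 0xC0 != 0x80:   # not a continuation byte, so a code point starts here
                if count % self.chunk_size == 0:
                    offsets.append(pos)
                count += 1
        self.index = offsets
        self.length = count

    def ref(self, k):
        """Return code point k, scanning fewer than chunk_size code points."""
        if self.index is None:
            self._build_index()
        if not 0 <= k < self.length:
            raise IndexError(k)
        # Jump to the start of the chunk, then walk forward within it.
        pos = self.index[k // self.chunk_size]
        for _ in range(k % self.chunk_size):
            pos += 1
            while self.data[pos] & 0xC0 == 0x80:
                pos += 1
        end = pos + 1
        while end < len(self.data) and self.data[end] & 0xC0 == 0x80:
            end += 1
        return self.data[pos:end].decode("utf-8")
```

In this sketch the index costs one offset per chunk_size code points and
each ref() walks fewer than chunk_size code points, so a larger chunk
size trades time for space.  Strings that are never indexed pay only for
two empty fields.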

Andy
-- 
http://wingolog.org/
