On Sun, Feb 17, 2013 at 3:22 PM, Marvin Humphrey <[email protected]> wrote:
> On Sun, Feb 17, 2013 at 12:08 PM, Nick Wellnhofer <[email protected]> wrote:
>> On Feb 17, 2013, at 19:48 , Marvin Humphrey <[email protected]> wrote:
>>> For a number of reasons, Clownfish needs an immutable string type, which I
>>> think should be named Clownfish::String. CharBuf would become a mutable
>>> subclass of String (and would be significantly rarer than it is now).
>>
>> Oh, I'd love to help working on this. One thing I don't like about the
>> current CharBuf implementation is that the way to iterate through strings
>> is rather limiting. The highlighter code in particular could benefit from a
>> few changes.

After giving the matter some thought I've decided to try another approach
first for the most pressing problem I want to solve, which is related to
`const` method invocants.

Iterating through strings seems orthogonal to mutability. What is it that you
find objectionable about the current iteration support?

Bear in mind that one requirement for Clownfish strings going forward is to
support UTF-16 as an internal encoding in addition to UTF-8.

    // Example of wrapping UTF-8 content. We'll have to do something similar
    // for UTF-16 strings, like those in Python. We want to avoid malloc() in
    // wrapper functions both for the sake of speed and to avoid leaking
    // memory during exceptions, so we use alloca() to allocate the object on
    // the stack and then wrap the original string content rather than copy
    // it. (So long as the amount requested is finite and small, alloca() is
    // safe.)
    int32_t
    call_hash_sum(SV *perl_string) {
        STRLEN len;
        char *utf8 = SvPVutf8(perl_string, len);
        void *stack_memory = alloca(ZCB_size()); // sizeof(ZombieCharBuf);
        ZombieCharBuf *wrapper = ZCB_wrap(stack_memory, utf8, len);
        return ZCB_Hash_Sum(wrapper);
    }

CharBuf's iteration facilities were designed to proceed code-point by
code-point in order to work with multiple Unicode encodings. Does that
help to explain why things are as they are?
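
To make the point about code points concrete, here is a minimal sketch. It is
not Clownfish code: utf8_next(), utf16_next() and the two sum functions are
hypothetical names invented for illustration, and the decoders assume
well-formed input with no validation. The idea it shows is that a caller which
only ever sees one code point at a time gets the same result whichever
internal encoding sits underneath.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of Clownfish): decode one code point from
 * well-formed UTF-8 and return the number of bytes consumed. */
size_t
utf8_next(const uint8_t *p, uint32_t *cp) {
    if (p[0] < 0x80) {
        *cp = p[0];
        return 1;
    }
    if ((p[0] & 0xE0) == 0xC0) {
        *cp = ((uint32_t)(p[0] & 0x1F) << 6) | (uint32_t)(p[1] & 0x3F);
        return 2;
    }
    if ((p[0] & 0xF0) == 0xE0) {
        *cp = ((uint32_t)(p[0] & 0x0F) << 12)
              | ((uint32_t)(p[1] & 0x3F) << 6)
              | (uint32_t)(p[2] & 0x3F);
        return 3;
    }
    *cp = ((uint32_t)(p[0] & 0x07) << 18)
          | ((uint32_t)(p[1] & 0x3F) << 12)
          | ((uint32_t)(p[2] & 0x3F) << 6)
          | (uint32_t)(p[3] & 0x3F);
    return 4;
}

/* Hypothetical helper: decode one code point from well-formed UTF-16 and
 * return the number of 16-bit code units consumed. */
size_t
utf16_next(const uint16_t *p, uint32_t *cp) {
    if (p[0] >= 0xD800 && p[0] <= 0xDBFF) {
        /* Surrogate pair: combine the high and low halves. */
        *cp = 0x10000
              + (((uint32_t)(p[0] - 0xD800) << 10)
                 | (uint32_t)(p[1] - 0xDC00));
        return 2;
    }
    *cp = p[0];
    return 1;
}

/* Sum the code points of a UTF-8 buffer; len is in bytes. */
uint32_t
sum_code_points_utf8(const uint8_t *s, size_t len, size_t *count) {
    uint32_t sum = 0;
    size_t i = 0;
    *count = 0;
    while (i < len) {
        uint32_t cp;
        i += utf8_next(s + i, &cp);
        sum += cp;
        (*count)++;
    }
    return sum;
}

/* Sum the code points of a UTF-16 buffer; len is in 16-bit code units. */
uint32_t
sum_code_points_utf16(const uint16_t *s, size_t len, size_t *count) {
    uint32_t sum = 0;
    size_t i = 0;
    *count = 0;
    while (i < len) {
        uint32_t cp;
        i += utf16_next(s + i, &cp);
        sum += cp;
        (*count)++;
    }
    return sum;
}
```

For a string holding U+0061, U+00E9 and U+1F600, both loops visit three code
points, even though the UTF-8 form occupies seven bytes and the UTF-16 form
four code units.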

Marvin Humphrey