On 26/04/2013 01:48, Marvin Humphrey wrote:
Substrings of zombie strings are dangerous, because the buffer belonging to
the parent object may not outlive the substring.
Right. This applies even to zombie strings on their own. Consider assigning a
zombie string to a member var. This is currently done with

    self->var = CB_Clone(zstr);

With immutable strings, we don't have to create a copy and can simply write

    self->var = (String*)INCREF(zstr);

This will break with zombie strings, because the member var would then point
at an object that may be gone once the current stack frame returns.
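
To make the difference concrete, here is a minimal sketch. The `Str` and
`Holder` structs and all function names are hypothetical stand-ins, not the
real Clownfish types; the only point is that a clone survives the stack frame
that created the zombie string, while a bare pointer to it would not.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a refcounted string; not the real layout. */
typedef struct Str {
    size_t refcount;
    char  *ptr;       /* character data, not NUL-terminated */
    size_t size;
    int    owns_buf;  /* 0 for a zombie string that only borrows its buffer */
} Str;

/* Deep copy: the result lives on the heap and owns its own buffer. */
static Str *Str_clone(const Str *other) {
    Str *self = malloc(sizeof(Str));
    assert(self != NULL);
    self->ptr = malloc(other->size ? other->size : 1);
    assert(self->ptr != NULL);
    memcpy(self->ptr, other->ptr, other->size);
    self->size     = other->size;
    self->refcount = 1;
    self->owns_buf = 1;
    return self;
}

/* Drop one reference; only heap strings may ever reach zero here. */
static void Str_decref(Str *self) {
    if (--self->refcount == 0) {
        if (self->owns_buf) { free(self->ptr); }
        free(self);
    }
}

typedef struct Holder { Str *var; } Holder;

/* Wraps a stack buffer in a zombie string and stores it in the holder. */
static void Holder_fill(Holder *holder) {
    char buf[] = "zombie";
    Str  zstr  = { 1, buf, strlen(buf), 0 };  /* lives in this stack frame */

    holder->var = Str_clone(&zstr);           /* safe: independent heap copy */
    /* holder->var = &zstr; zstr.refcount++;     unsafe: dangles after return */
}
```

After `Holder_fill()` returns, `holder->var` still refers to valid memory only
because it was cloned; the commented-out variant would leave it pointing into
a dead stack frame.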
User-defined procedures will encounter ZombieStrings via wrapped callbacks --
if a parameter is `String*`, they'll get a real String with content copied from
the host argument, but if it's `const String*`, they'll get a ZombieString*
wrapping the host string content.
That's a great solution. But this isn't implemented yet, right? It would
also require that String methods can be invoked on const Strings (like
const member functions in C++). Would this work without further changes?
Unless we want to require that `SubString`
operate on non-const String* (like we will for `Inc_RefCount`),
ZStr_SubString() will have to return a fully independent String object which
owns its own buffer.
That shouldn't be a problem. BTW, the INCREF macro should be changed so
it doesn't work with const objects; see the example above.
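
One possible way to get that behavior, sketched under the assumption that a
C11 compiler is available: a `_Generic` selection that lists only non-const
pointer types. The names `INCREF_CHECKED` and `incref_impl` and the struct
layouts are made up for illustration and are not the existing macro.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical object layout; the real structs differ. */
typedef struct Obj { size_t refcount; } Obj;
typedef struct Str { Obj base; const char *ptr; } Str;

/* Takes void* so one helper serves every class whose first member is an Obj. */
static inline void *incref_impl(void *obj) {
    assert(obj != NULL);
    ((Obj*)obj)->refcount++;
    return obj;
}

/* Only non-const pointer types are listed. A `const Str*` argument matches
 * no association, so the macro fails to compile instead of silently casting
 * the const away. */
#define INCREF_CHECKED(obj) \
    _Generic((obj), Obj*: incref_impl, Str*: incref_impl)(obj)

/* Example of what gets rejected at compile time:
 *
 *     const Str *zstr = ...;
 *     self->var = INCREF_CHECKED(zstr);   // error: no matching association
 */
```

The drawback of this sketch is that every class has to be listed in the
macro. Simply dropping the cast and passing the argument to a function that
takes a non-const pointer would at least produce a discarded-qualifier
diagnostic.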
For zombie strings, it's assumed that they don't have to care about the
lifetime of the character buffer. So there are two cases left out:
stack-allocated strings that own a buffer
Can we make that an invalid state and avoid it?
Yes, we'll simply make the assumption that zombie strings never own a
buffer.
I think that's a good approach, but I have an ulterior motive -- I'm hoping
that ultimately we end up with one class handling all encodings, a la
<http://www.python.org/dev/peps/pep-0393/>.
That would only require a member var to store the encoding. But I don't
quite understand the rationale behind this. Does it have to do with the
Python bindings?
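
For reference, PEP 393 stores each string with 1, 2 or 4 bytes per code
point and records which width is in use. A rough sketch of what "a member var
to store the encoding" could look like follows; `FlexStr`, `StrKind` and
`FlexStr_code_point_at` are hypothetical names, not a proposed API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Width of one code point in bytes, in the spirit of PEP 393. */
typedef enum { KIND_1BYTE = 1, KIND_2BYTE = 2, KIND_4BYTE = 4 } StrKind;

/* Hypothetical single string class covering all three widths. */
typedef struct FlexStr {
    StrKind     kind;    /* the extra member var */
    size_t      length;  /* number of code points */
    const void *data;    /* array of uint8_t, uint16_t or uint32_t */
} FlexStr;

/* Fetch the code point at index `tick`, dispatching on the stored width. */
static uint32_t FlexStr_code_point_at(const FlexStr *self, size_t tick) {
    assert(tick < self->length);
    switch (self->kind) {
        case KIND_1BYTE: return ((const uint8_t*)self->data)[tick];
        case KIND_2BYTE: return ((const uint16_t*)self->data)[tick];
        default:         return ((const uint32_t*)self->data)[tick];
    }
}
```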
PS: Is it now true that ZombieStrings can only ever be allocated on the stack,
rather than in static memory? Because if that's the case, I'd favor the
name StackString instead.
In ZombieKeyedHash, they're allocated from a MemPool. Otherwise, all
allocations seem to be from the stack.
Nick