On May 4, 2013, at 01:14, Marvin Humphrey <[email protected]> wrote:

> OK, cool... Since it's primarily a naming change but disruptive, I'd suggest
> the following order of actions:
>
> 1. Grep for `INCREF` and `Inc_RefCount` and make sure all invocations capture
>    the returned reference.
> 2. Implement immutable String.
> 3. Make the naming change.

As a side note, step 3 is quite a bit more than a naming change. All usages of
CB_Nip must be converted to string iterators since CB_Nip mutates the string.
Additionally, all the places where we construct new strings have to be
identified. In those places we have to keep using a CharBuf, followed by
CB_Yield_String once the construction is complete.

> Hmm. Well, the most incremental strategy is to hard-code UTF-8 into String
> for now and the Python bindings can just forego the stack-allocated-string
> optimization until we make up our minds later.

It just occurred to me that there's another problem with the INCREF approach.
Since string iterators INCREF the source string, every zombie string would be
copied as soon as it's iterated. The String methods using zombie iterators
wouldn't be affected, but it would still result in many unnecessary copies of
host strings.

A possible solution would be to do away with stack allocation but to keep
using the host string buffer directly. Before returning to the host language,
we could check whether the refcount of the string is greater than one and copy
the string only then. This scheme wouldn't require changing INCREF semantics.

Nick
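
For illustration, the copy-on-return idea above could look roughly like the following. This is only a minimal sketch under stated assumptions, not Clownfish code: the `Str` struct and the `Str_wrap_host`, `Str_incref`, `Str_decref`, and `Str_release_host` functions are hypothetical stand-ins, and the real String layout and refcounting may well differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a refcounted string (not the real String). */
typedef struct {
    unsigned    refcount;
    const char *ptr;       /* UTF-8 bytes */
    size_t      size;
    bool        owns_buf;  /* false while borrowing the host's buffer */
} Str;

/* Wrap a host string buffer on the heap without copying its contents. */
static Str*
Str_wrap_host(const char *host_buf, size_t size) {
    Str *self = malloc(sizeof(Str));
    assert(self != NULL);
    self->refcount = 1;
    self->ptr      = host_buf;
    self->size     = size;
    self->owns_buf = false;
    return self;
}

/* Plain INCREF: no copying, semantics unchanged. */
static Str*
Str_incref(Str *self) {
    self->refcount++;
    return self;
}

static void
Str_decref(Str *self) {
    if (--self->refcount == 0) {
        if (self->owns_buf) { free((void*)self->ptr); }
        free(self);
    }
}

/* Call just before control returns to the host language.  If someone else
 * still holds a reference, detach from the host buffer by copying it;
 * otherwise no copy is made.  Returns true if a copy was needed. */
static bool
Str_release_host(Str *self) {
    bool copied = false;
    if (self->refcount > 1 && !self->owns_buf) {
        char *copy = malloc(self->size + 1);
        assert(copy != NULL);
        memcpy(copy, self->ptr, self->size);
        copy[self->size] = '\0';
        self->ptr      = copy;
        self->owns_buf = true;
        copied         = true;
    }
    Str_decref(self);  /* drop the reference held by the binding glue */
    return copied;
}
```

In this sketch, a string that is merely iterated and released never gets copied, while one that was retained (for example, stored in a longer-lived object) is copied exactly once, at the point where the host buffer can no longer be trusted to stay alive.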
