> On Nov 3, 2017, at 10:39 AM, Cory Benfield via swift-evolution
> <swift-evolution@swift.org> wrote:
>
> One of Swift’s major advantages as a language is the ease of bridging from
> Swift code to C. This ease makes it possible to utilise the vast body of
> existing code to bootstrap projects, rather than reinventing the world in
> Swift every time we have a problem.
>
> The String type in Swift has some affordances for this use-case. The
> withCString method, the utf8CString property, and the cString(using:)
> functions are all very effective at providing the most-common case: a
> NULL-terminated string suitable for passing into most libc functions.
> However, using any of these affordances will always incur a memory copy, as
> Swift needs to not just ensure that the bytes making up the String are in
> contiguous memory, but also need to append a NULL byte to those strings for C
> safety.
>
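For reference, here is a minimal sketch of the three affordances named above and what each one hands back. The variable names are illustrative only, and whether a given call actually copies depends on the standard library version in use; the point is that every result is NUL-terminated and none of them carries its length.

```swift
import Foundation

let greeting = "héllo"  // 6 bytes in UTF-8 ("é" takes two)

// withCString: the pointer is NUL-terminated and only valid inside the
// closure. It carries no length, so strlen has to walk the bytes again.
let lengthViaCString: Int = greeting.withCString { pointer in
    strlen(pointer)
}

// utf8CString: a ContiguousArray<CChar> that includes the trailing NUL.
let cStringCopy = greeting.utf8CString
let lengthViaCopy = cStringCopy.count - 1  // exclude the terminator

// cString(using:): a Foundation method returning an optional [CChar],
// again with the terminator included.
let foundationCopy = greeting.cString(using: .utf8)
```
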

This is something we’re actively working on. It’s a stretch goal for 4.1, but certainly no promises. In full generality, it’s not always possible, as we support bridged NSStrings with non-contiguous backing storage, but we should ensure all native Swift strings are always contiguous and nul-terminated (hey, it’s just a byte or two). Then, we can discuss APIs to provide zero-copy ways to get the pointer.

> This is a bit frustrating when working with C libraries that accept strings
> in the form of pointer + length, and so do not require NULL-termination, such
> as libicu. In these cases we are always required to incur the overhead of a
> memory copy, even in situations when the underlying String representation is
> contiguous, all in the name of appending a NULL byte we don’t actually need.
> Worse, the pointers provided by those methods are not BufferPointers, so they
> don’t carry their length around with them, requiring that another function
> call be used to determine the length of the pointer.

This is also something we’re actively working on. There’s a branch where we have an “UnsafeString” (name may change), which is just a pointer, a length, and some flags. This is useful for internal usage (implementing existing String APIs in a more performant fashion). Once that’s in, whether and how to surface this construct as API is a needed discussion for swift-evolution.

>
> It would be convenient to have one or more additional functions that allow us
> to get access to a contiguous representation of bytes making up the string
> without appending a NULL byte, as a BufferPointer. The guarantees of these
> functions would be:
>
> 1. If the underlying string is stored in contiguous memory; AND
> 2. It is stored in the encoding the user has requested; THEN
> 3. An UnsafeBufferPointer will be returned that points to the underlying
> storage, without NULL-termination; OTHERWISE
> 4. A new contiguous buffer will be allocated and the string will be copied
> into it, with no NULL-termination.
>
> Of course, I’ve used the word “return” here, but in practice all of these
> functions would be best used as with* style functions that accept trailing
> non-escaping closures.
>

Yup! At the very least, a ‘withUnsafeString’ should be a reasonably orthogonal API to propose (once the aforementioned infrastructure is in place).

> The advantage of these functions is that they avoid unnecessary copying of
> memory in circumstances when the internal String representation was already
> suitable for passing to the C library. In the case of libraries like libicu,
> this halves the number of memory accesses in common-cases (e.g. passing a
> UTF-8 string), which can provide substantial improvements to both performance
> and memory usage on hot code paths.
>
> Does this seem like it’s of interest to anyone else?
>
> Cory
> _______________________________________________
> swift-evolution mailing list
> swift-evolution@swift.org
> https://lists.swift.org/mailman/listinfo/swift-evolution
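
To make the four guarantees in the quoted list concrete, here is a rough sketch of what a with*-style function could look like. It is not a proposed or existing standard library API: the name `withUTF8BufferNoTerminator` is hypothetical, it handles only UTF-8 rather than a caller-chosen encoding, and it assumes a toolchain in which `String.UTF8View` provides `withContiguousStorageIfAvailable` (which postdates this thread).

```swift
extension String {
    /// Hypothetical helper for illustration only.
    /// Calls `body` with the string's UTF-8 bytes and no NUL terminator.
    func withUTF8BufferNoTerminator<R>(
        _ body: (UnsafeBufferPointer<UInt8>) throws -> R
    ) rethrows -> R {
        // Guarantees 1-3: if the storage is already contiguous UTF-8,
        // hand out a buffer pointing straight at it, with no copy.
        if let result = try utf8.withContiguousStorageIfAvailable(body) {
            return result
        }
        // Guarantee 4: otherwise copy the bytes into a fresh contiguous
        // buffer, still without appending a terminator.
        return try ContiguousArray(utf8).withUnsafeBufferPointer(body)
    }
}

// Usage: the buffer carries its own length, so no strlen call is needed.
let byteCount = "héllo".withUTF8BufferNoTerminator { $0.count }
let bytes = "héllo".withUTF8BufferNoTerminator { Array($0) }
```

The closure form keeps the pointer from escaping, which matters on the fast path because the buffer aliases the string’s own storage.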