Hi Charlie,

Thanks for your answer.

> On 7 Feb 2017, at 18:23, Charlie Monroe <[email protected]> wrote:
>
>>
>> On Feb 7, 2017, at 5:56 PM, Florent Bruneau via swift-evolution
>> <[email protected]> wrote:
>>
>> Anyone interested in that subject?
>>
>>> On 31 Jan 2017, at 09:16, Florent Bruneau via swift-evolution
>>> <[email protected]> wrote:
>>>
>>> Hi swift-evolution,
>>>
>>> For the last few weeks, I've been working on introducing some Swift in a
>>> pure-C codebase. While the Clang importer makes the process quite smooth,
>>> there are still some rough edges.
>>>
>>> Here is a (lengthy) proposal resulting from that experience.
>>> Rendered version:
>>> https://gist.github.com/Fruneau/fa83fe87a316514797c1eeaaaa2e5012
>>>
>>> Introduction
>>> =======
>>>
>>> Directly importing C APIs is a core feature of the Swift compiler. In that
>>> process, C pointers are systematically imported as `Unsafe*Pointer` Swift
>>> objects. However, in C we make the distinction between pointers that
>>> reference a single object, and those pointing to an array of objects. In
>>> the case of a single object of type `T`, the Swift compiler should be able
>>> to import the parameter `T *` as an `inout T`, and `T const *` as `T`. Since
>>> the compiler cannot make the distinction between pointer types by itself,
>>> we propose to add a C pointer attribute for that purpose.
>>>
>>> Motivation
>>> =======
>>>
>>> Let us consider the following C API:
>>>
>>> ```c
>>> typedef struct sb_t {
>>>     char * _Nonnull data;
>>>     int len;
>>>     int size;
>>> } sb_t;
>>>
>>> /** Append the string \p str to \p sb. */
>>> void sb_adds(sb_t * _Nonnull sb, const char * _Nonnull str);
>>>
>>> /** Append the content of \p other to \p sb. */
>>> void sb_addsb(sb_t * _Nonnull sb, const sb_t * _Nonnull other);
>>>
>>> /** Returns the amount of available memory of \p sb. */
>>> int sb_avail(const sb_t * _Nonnull sb);
>>> ```
>>>
>>> This is imported in Swift as follows:
>>>
>>> ```swift
>>> struct sb_t {
>>>     var data: UnsafeMutablePointer<Int8>
>>>     var len: Int32
>>>     var size: Int32
>>> }
>>>
>>> func sb_adds(_ sb: UnsafeMutablePointer<sb_t>, _ str: UnsafePointer<Int8>)
>>> func sb_addsb(_ sb: UnsafeMutablePointer<sb_t>, _ other: UnsafePointer<sb_t>)
>>> func sb_avail(_ sb: UnsafePointer<sb_t>) -> Int32
>>> ```
>>>
>>> `sb_adds()` takes two pointers: the first one is supposed to point to a
>>> single object named `sb` that will be mutated in order to add the content
>>> of `str`, which points to a C string. So we have two kinds of pointers: the
>>> first points to a single object, the second to a buffer. But both are
>>> represented using `Unsafe*Pointer`. Swift cannot actually tell the
>>> difference between those two kinds of pointers since the C language provides
>>> no way to express it.
>>>
>>> `sb_addsb()` takes two objects of type `sb_t`. The first is mutated by the
>>> function by appending the content of the second one, which is `const`. The
>>> constness is properly reflected in Swift. However, the usage of the
>>> imported API in Swift might be surprising since Swift requires usage of an
>>> `inout` parameter in order to build an `Unsafe*Pointer` object:
>>>
>>> ```swift
>>> var sb = sb_t(...)
>>> let sb2 = sb_t(...)
>>> sb_addsb(&sb, &sb2) // error: cannot pass immutable value as inout argument: 'sb2' is a 'let' constant
>
> This is because your declaration is const sb_t * _Nonnull other... See
> http://stackoverflow.com/questions/1143262/what-is-the-difference-between-const-int-const-int-const-and-int-const
>
> Change it to "const sb_t * const _Nonnull other" and you get a non-mutable
> pointer and you can use it with let.

Actually, no. In the case of a function argument, the difference between `const sb_t *` and `const sb_t * const` is the same as between a `var` argument and a `let` argument in Swift < 3: it only affects the mutability of the variable inside the function definition, and has no effect on the outside. When imported in Swift, both result in the exact same function prototype.

```c
void sb_addsb(sb_t *self, const sb_t *other);
void sb_addsb2(sb_t *self, const sb_t * const other);
void sb_addsb3(sb_t * const self, const sb_t * const other);
```

```swift
public func sb_addsb(_ self: UnsafeMutablePointer<sb_t>!, _ other: UnsafePointer<sb_t>!)
public func sb_addsb2(_ self: UnsafeMutablePointer<sb_t>!, _ other: UnsafePointer<sb_t>!)
public func sb_addsb3(_ self: UnsafeMutablePointer<sb_t>!, _ other: UnsafePointer<sb_t>!)
```

>
>>> sb_addsb(&sb, sb2) // cannot convert value of type 'sb_t' to expected argument type 'UnsafePointer<sb_t>!'
>
> If the other parameter is const, why not just take in the struct vs. pointer
> to it? Yes, you run into the risk of copying the structure, but since the
> structure (unless it's really small and fits into registers on some
> architectures) gets passed by reference and if the compiler is smart enough
> during optimization, it won't copy it anyway... (At least from what I
> remember reading.)

There are cases where you just cannot pass the structure by value. For example, a structure ending with a variable-size buffer:

```c
struct with_trailing_buf {
    int len;
    char buf[];
};
```

Anyway, the question here isn't really whether we should rewrite our whole code base, but rather: can we make the Clang importer run smoothly even on non-Apple codebases? In C there are certainly as many coding conventions as there are developers, so if we want more developers willing to use safer languages without throwing away their existing code bases, I think we need to make the importer more flexible.

>
>>> var sb3 = sb_t(...)
>>> sb_addsb(&sb, &sb3) // works
>>> ```
>>>
>>> ```swift
>>> sb_avail(&sb2) // cannot convert value of type 'sb_t' to expected argument type 'UnsafePointer<sb_t>!'
>>> ```
>>>
>>>
>>> However, Swift also provides the `swift_name()` attribute that allows
>>> remapping a C function to a Swift method, which includes mapping one of the
>>> parameters to `self:`:
>>>
>>> ```c
>>> __attribute__((swift_name("sb_t.add(self:string:)")))
>>> void sb_adds(sb_t * _Nonnull sb, const char * _Nonnull str);
>>> __attribute__((swift_name("sb_t.add(self:other:)")))
>>> void sb_addsb(sb_t * _Nonnull sb, const sb_t * _Nonnull other);
>>> __attribute__((swift_name("sb_t.avail(self:)")))
>>> int sb_avail(const sb_t * _Nonnull sb);
>>> ```
>
> While I do feel your pain dealing with structs imported from C, nothing is
> stopping you from making an extension of that struct and implementing these
> methods on it... Yes, it's a lot of boilerplate, but it can be in a separate
> file until you migrate your C code into Swift, whereas the suggested
> solution generates so many annotations that it's IMHO unreadable for anyone
> hoping to use the API from pure C...

My problem is exactly the amount of boilerplate. We are still investigating the ability to switch from C to Swift (at least for some part of our codebase), but we cannot afford rewriting the whole code, nor spending months writing overlays for every C library we want to be able to use from Swift.

And, but this is just my opinion, I think that adding qualifiers such as `_Nonnull`, `_Nullable` (and the proposed `_Ref`), in addition to improving interoperability, also helps self-document the APIs, and helps provide safer code even in C (since the qualifiers offer the opportunity for new static analysis heuristics).
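
On the readability concern, one option would be to hide the qualifiers behind macros, so that an annotated header still builds with any C compiler. The following is only a sketch under assumptions: `SB_NONNULL` and `SB_REF` are hypothetical macro names, `SB_REF` expands to nothing because `_Ref` is only proposed, and the exact meaning of "available memory" (one byte reserved for the trailing NUL) is my own choice for the illustration.

```c
#include <assert.h>

/* Hypothetical portability macros (not part of any existing header).
 * Clang understands _Nonnull; other compilers get an empty expansion. */
#if defined(__clang__)
#  define SB_NONNULL _Nonnull
#else
#  define SB_NONNULL
#endif

/* _Ref is only proposed, so this placeholder expands to nothing and
 * merely documents that the pointer designates a single object. */
#define SB_REF

typedef struct sb_t {
    char * SB_NONNULL data;
    int len;   /* bytes used, excluding the trailing NUL */
    int size;  /* bytes allocated */
} sb_t;

/* Returns the amount of available memory of sb.
 * Assumption of this sketch: one byte is kept for the trailing NUL. */
int sb_avail(const sb_t * SB_REF SB_NONNULL sb)
{
    assert(sb->len < sb->size);  /* invariant of a well-formed buffer */
    return sb->size - sb->len - 1;
}
```

Read from pure C, the prototype still says what it needs to say: `sb` is a single, non-null, read-only object. With Clang, passing a literal `NULL` for such a parameter is typically reported at compile time, which is the kind of extra checking I have in mind.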
>
>>>
>>> ```swift
>>> struct sb_t {
>>>     var data: UnsafeMutablePointer<Int8>
>>>     var len: Int32
>>>     var size: Int32
>>>
>>>     mutating func add(string: UnsafePointer<Int8>)
>>>     mutating func add(other: UnsafePointer<sb_t>)
>>>     func avail() -> Int32
>>> }
>>> ```
>>>
>>> With that attribute used, there is no need to convert the parameter mapped
>>> to `self:` to an `Unsafe*Pointer`. As a consequence, we have an improved
>>> API:
>>>
>>> ```swift
>>> sb2.avail() // This time it works!
>>> ```
>>>
>>> But we also have some inconsistent behavior since only `self:` is affected
>>> by this:
>>>
>>> ```swift
>>> sb.add(other: &sb2) // error: cannot pass immutable value as inout argument: 'sb2' is a 'let' constant
>>> sb.add(other: sb2) // cannot convert value of type 'sb_t' to expected argument type 'UnsafePointer<sb_t>!'
>>> ```
>>>
>>>
>>> What we observe here is that mapping an argument to `self:` is enough for
>>> the compiler to be able to change its semantics. As soon as it knows the
>>> pointer is actually the pointer to a single object, it can deal with it
>>> without exposing it as an `Unsafe*Pointer`, making the API safer and less
>>> surprising.
>>>
>>>
>>> Proposed solution
>>> ================
>>>
>>> A new qualifier could be added to inform the compiler that a pointer points
>>> to a single object. Then the Swift compiler could use that new piece of
>>> information to generate APIs that directly use the object type instead of
>>> the pointer type. We propose the introduction of a new qualifier named
>>> `_Ref`, semantically similar to a C++ reference. That is:
>>>
>>> * `_Ref` is applied with the same grammar as the `_Nonnull`, `_Nullable`
>>>   family
>>> * A pointer tagged `_Ref` cannot be used to access more than the single
>>>   pointed object.
>>> * A pointer tagged `_Ref` is non-owning
>>>
>>> Parameters qualified with `_Ref` would then be imported in Swift as follows:
>>>
>>> * `T * _Ref _Nonnull` is imported as `inout T`
>>> * `T * _Ref _Nullable` is imported as `inout T?`
>>> * `T const * _Ref _Nonnull` is imported as `T`
>>> * `T const * _Ref _Nullable` is imported as `T?`
>>>
>>> Example
>>> =======
>>>
>>> In the context of the provided example from the motivation section:
>>>
>>> ```c
>>> typedef struct sb_t {
>>>     char * _Nonnull data;
>>>     int len;
>>>     int size;
>>> } sb_t;
>>>
>>> /** Append the string \p str to \p sb. */
>>> void sb_adds(sb_t * _Ref _Nonnull sb, const char * _Nonnull str);
>>>
>>> /** Append the content of \p other to \p sb. */
>>> void sb_addsb(sb_t * _Ref _Nonnull sb, const sb_t * _Ref _Nonnull other);
>>>
>>> /** Returns the amount of available memory of \p sb. */
>>> int sb_avail(const sb_t * _Ref _Nonnull sb);
>>> ```
>>>
>>> Would be imported as follows:
>>>
>>> ```swift
>>> struct sb_t {
>>>     var data: UnsafeMutablePointer<Int8>
>>>     var len: Int32
>>>     var size: Int32
>>> }
>>>
>>> func sb_adds(_ sb: inout sb_t, _ str: UnsafePointer<Int8>)
>>> func sb_addsb(_ sb: inout sb_t, _ other: sb_t)
>>> func sb_avail(_ sb: sb_t) -> Int32
>>> ```
>>>
>>> Impact on existing code
>>> =================
>>>
>>> This proposal has no impact on existing code since it proposes additive
>>> changes only. However, opting in for the `_Ref` qualifier on APIs already
>>> exposed in Swift will impact the generated code.
>>>
>>> * For `const` pointers, the change is always source-incompatible
>>> * For non-`const` pointers, the change will be source-compatible everywhere
>>>   we use the `&object` syntax to pass the argument from a plain object, but
>>>   will break sources that passed an `Unsafe*Pointer` as argument.
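
For what it's worth, here is roughly what the `_Ref` contract would mean on the C side. This is a minimal sketch, not our actual implementation: `SB_REF` is a hypothetical placeholder macro (expanding to nothing, since `_Ref` does not exist in any compiler), `sb_make()` is a helper invented for the illustration, and the growth policy is deliberately naive.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* _Ref is only proposed; this hypothetical placeholder expands to nothing. */
#define SB_REF

typedef struct sb_t {
    char *data;
    int len;   /* bytes used, excluding the trailing NUL */
    int size;  /* bytes allocated */
} sb_t;

/* Illustrative helper (not in the original API): create an empty,
 * zero-filled buffer. size must be positive. */
sb_t sb_make(int size)
{
    sb_t sb = { calloc((size_t)size, 1), 0, size };
    assert(sb.data != NULL);
    return sb;
}

/* Append the content of other to sb.
 * What SB_REF is meant to promise: only *sb and *other are accessed
 * (never sb[1] or other[-1]), and neither pointer is kept once the
 * function returns. */
void sb_addsb(sb_t * SB_REF sb, const sb_t * SB_REF other)
{
    int needed = sb->len + other->len + 1;

    if (needed > sb->size) {
        /* Naive growth: allocate exactly what is needed. */
        char *grown = realloc(sb->data, (size_t)needed);
        assert(grown != NULL);
        sb->data = grown;
        sb->size = needed;
    }
    memcpy(sb->data + sb->len, other->data, (size_t)other->len);
    sb->len += other->len;
    sb->data[sb->len] = '\0';
}
```

A function written this way reads and writes exactly one `sb_t` through `sb` and reads exactly one through `other`, which is why importing it as `(inout sb_t, sb_t)` would lose nothing.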
>>>
>>>
>>> Alternatives considered
>>> ===================
>>>
>>> We considered using two families of qualifiers instead of `_Ref`:
>>>
>>> - one family to specify the kind of pointer: single object or array
>>> - one family to declare the ownership
>>>
>>> This approach has the clear advantage of being more flexible, however it has
>>> been found to be less expressive. Considering that C APIs should already use
>>> nullability qualifiers on every single pointer, forcing two additional
>>> qualifiers on every pointer would be painful and negatively impact the
>>> readability of the C APIs.
>>>
>>> `_Ref`, on the other hand, is short and leverages a concept already known by
>>> developers, but is also more specific to a particular use case.
>>>
>>>
>>> Discussion
>>> ========
>>>
>>> * Safety: won't this make developers think they are calling safe APIs from
>>>   Swift while the API is actually unsafe?
>>>
>>> There is certainly a risk that a C API makes improper use of `_Ref` (in
>>> particular, breaks the non-owning part of the contract). However, this kind
>>> of safety issue is already present when using the `swift_name()`
>>> attribute on a function and mapping one of the pointer parameters of the
>>> function to `self:`, or when using the nullability qualifiers.
>>>
>>> * What about pointers stored in structures? or pointers returned by
>>>   functions?
>>>
>>> As a qualifier, `_Ref` could also be used on pointers that are not
>>> arguments of a function:
>>>
>>> ```c
>>> typedef struct {
>>>     sb_t * _Ref obj;
>>> } sb_ptr_t;
>>>
>>> sb_t * _Ref sb_get_singleton(void);
>>> ```
>>>
>>> Swift, however, cannot import those as `sb_t` but will still be forced to
>>> use `Unsafe*Pointer<sb_t>` since `sb_t` is a structure and as such is not
>>> stored by reference.
>>>
>>> We could also imagine a standard `Reference<T>` type that would wrap a
>>> pointer to a `T` (and could expose the API of `T` on it).
>>>
>>> * What about function pointers that take a `_Ref` object?
>>>
>>> When an API takes a function pointer whose type includes a `_Ref` qualified
>>> parameter, the qualifier applies:
>>>
>>> ```c
>>> void take_cb(int (*a)(sb_t const * _Ref _Nonnull sb, sb_t * _Ref _Nonnull other));
>>> ```
>>>
>>> ```swift
>>> func cb(sb: sb_t, other: inout sb_t) -> Int32 {
>>>     ...
>>> }
>>>
>>> take_cb(cb)
>>> ```
>>>
>>> Swift guarantees we cannot break the non-owning contract and that we
>>> respect the constness of the parameter. This is safer than using the
>>> `Unsafe*Pointer`-based alternative.
>>>
>>> * Other use cases than Swift's?
>>>
>>> The `_Ref` qualifier could be used by static analysis to check that
>>> functions don't access memory they shouldn't access: as long as some code
>>> manipulates some memory through a `_Ref` qualified pointer, it shouldn't
>>> access memory addresses below that pointer or above that pointer plus the
>>> stride of the type (an exception remains for types ending with a
>>> zero-length array).
>>>
>>> * What about pointers to arrays of objects?
>>>
>>> This is another topic. We could imagine a `_Array` qualifier that could
>>> take an optional length.
>>>
>>> ```c
>>> /* The number of elements is statically known or passed as argument */
>>> int main(int argc, char ** _Array(argc) argv);
>>>
>>> /* The number of elements is unknown. */
>>> int puts(const char * _Array str);
>>> ```
>>> _______________________________________________
>>> swift-evolution mailing list
>>> [email protected]
>>> https://lists.swift.org/mailman/listinfo/swift-evolution
