Hi Charlie,

Thanks for your answer.

> On Feb 7, 2017, at 18:23, Charlie Monroe <[email protected]> wrote:
> 
>> 
>> On Feb 7, 2017, at 5:56 PM, Florent Bruneau via swift-evolution 
>> <[email protected]> wrote:
>> 
>> Anyone interested in that subject?
>> 
>>> On Jan 31, 2017, at 09:16, Florent Bruneau via swift-evolution 
>>> <[email protected]> wrote:
>>> 
>>> Hi swift-evolution, 
>>> 
>>> For the last few weeks, I've been working on introducing some Swift in a 
>>> pure-C codebase. While the Clang importer makes the process quite smooth, 
>>> there are still some rough edges.
>>> 
>>> Here is a (lengthy) proposal resulting from that experience.
>>> Rendered version: 
>>> https://gist.github.com/Fruneau/fa83fe87a316514797c1eeaaaa2e5012
>>> 
>>> Introduction
>>> =======
>>> 
>>> Directly importing C APIs is a core feature of the Swift compiler. In that 
>>> process, C pointers are systematically imported as `Unsafe*Pointer` Swift 
>>> objects. However, in C we make the distinction between pointers that 
>>> reference a single object and those pointing to an array of objects. In 
>>> the case of a single object of type `T`, the Swift compiler should be able 
>>> to import the parameter `T *` as an `inout T`, and `T const *` as `T`. Since 
>>> the compiler cannot make the distinction between pointer types by itself, 
>>> we propose to add an attribute on C pointers for that purpose.
>>> 
>>> Motivation
>>> =======
>>> 
>>> Let's consider the following C API:
>>> 
>>> ```c
>>> typedef struct sb_t {
>>>  char * _Nonnull data;
>>>  int len;
>>>  int size;
>>> } sb_t;
>>> 
>>> /** Append the string \p str to \p sb. */
>>> void sb_adds(sb_t * _Nonnull sb, const char * _Nonnull str);
>>> 
>>> /** Append the content of \p other to \p sb. */
>>> void sb_addsb(sb_t * _Nonnull sb, const sb_t * _Nonnull other);
>>> 
>>> /** Returns the amount of available memory of \p sb. */
>>> int sb_avail(const sb_t * _Nonnull sb);
>>> ```
>>> 
>>> This is imported in Swift as follows:
>>> 
>>> ```swift
>>> struct sb_t {
>>>  var data: UnsafeMutablePointer<Int8>
>>>  var len: Int32
>>>  var size: Int32
>>> }
>>> 
>>> func sb_adds(_ sb: UnsafeMutablePointer<sb_t>, _ str: UnsafePointer<Int8>)
>>> func sb_addsb(_ sb: UnsafeMutablePointer<sb_t>, _ other: 
>>> UnsafePointer<sb_t>)
>>> func sb_avail(_ sb: UnsafePointer<sb_t>) -> Int32
>>> ```
>>> 
>>> `sb_adds()` takes two pointers: the first one is supposed to point to a 
>>> single object named `sb` that will be mutated in order to add the content 
>>> of `str`, which points to a C string. So we have two kinds of pointers: the 
>>> first points to a single object, the second to a buffer. But both are 
>>> represented using `Unsafe*Pointer`. Swift cannot actually tell the 
>>> difference between those two kinds of pointers since the C language provides 
>>> no way to express it.
>>> 
>>> `sb_addsb()` takes two objects of type `sb_t`. The first is mutated by the 
>>> function by appending the content of the second one, which is `const`. The 
>>> constness is properly reflected in Swift. However, the usage of the 
>>> imported API in Swift might be surprising since Swift requires the use of an 
>>> `inout` parameter in order to build an `Unsafe*Pointer` object:
>>> 
>>> ```swift
>>> var sb = sb_t(...)
>>> let sb2 = sb_t(...)
>>> sb_addsb(&sb, &sb2) // error: cannot pass immutable value as inout argument: 'sb2' is a 'let' constant
> 
> This is because your declaration is const sb_t * _Nonnull other... See 
> http://stackoverflow.com/questions/1143262/what-is-the-difference-between-const-int-const-int-const-and-int-const
> 
> Change it to "const sb_t * const _Nonnull other" and you get a non-mutable 
> pointer and you can use it with let.

Actually, no. For a function argument, the difference between 
`const sb_t *` and `const sb_t * const` is the same as between a `var` 
argument and a `let` argument in Swift < 3: it only affects the mutability of 
the variable inside the function definition and has no effect on the outside. 
When imported into Swift, both result in the exact same function prototype.

```c
void sb_addsb(sb_t *self, const sb_t *other);
void sb_addsb2(sb_t *self, const sb_t * const other);
void sb_addsb3(sb_t * const self, const sb_t * const other);
```

```swift
public func sb_addsb(_ self: UnsafeMutablePointer<sb_t>!, _ other: 
UnsafePointer<sb_t>!)
public func sb_addsb2(_ self: UnsafeMutablePointer<sb_t>!, _ other: 
UnsafePointer<sb_t>!)
public func sb_addsb3(_ self: UnsafeMutablePointer<sb_t>!, _ other: 
UnsafePointer<sb_t>!)
```
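
The same thing can be checked on the C side alone. The sketch below is only an 
illustration (the function `sb_len_sum` and the pointer `sb_len_sum_ptr` are 
hypothetical names, not part of our API): C drops top-level qualifiers on 
parameters when forming the function type, so a prototype without `const` on 
the pointer itself and a definition with it declare the very same function.

```c
typedef struct sb_t {
    char *data;
    int len;
    int size;
} sb_t;

/* Prototype: the pointer variables themselves are not const. */
int sb_len_sum(const sb_t *a, const sb_t *b);

/* Definition: top-level const on both parameters. This compiles because
 * C ignores top-level parameter qualifiers in the function type, so this
 * is a definition of the prototype above, not a different function. */
int sb_len_sum(const sb_t * const a, const sb_t * const b)
{
    return a->len + b->len;
}

/* A function pointer type written without the top-level const accepts
 * the function as well. */
static int (* const sb_len_sum_ptr)(const sb_t *, const sb_t *) = sb_len_sum;
```

Since callers cannot observe the difference in C, it seems reasonable that the 
importer has nothing to distinguish either.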

> 
>>> sb_addsb(&sb, sb2) // cannot convert value of type 'sb_t' to expected argument type 'UnsafePointer<sb_t>!'
> 
> If the other parameter is const, why not just take in the struct vs. pointer 
> to it? Yes, you run into the risk of copying the structure, but since the 
> structure (unless it's really small and fits into registers on some 
> architectures) gets passed by reference and if the compiler is smart enough 
> during optimization, it won't copy it anyway... (At least from what I 
> remember reading.)

There are cases where you just cannot pass the structure by value. For 
example, a structure ending with a variable-size buffer:

```c
struct with_trailing_buf {
    int len;
    char buf[];
};
```
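
As a rough sketch of how such a structure is typically used (the helpers 
`wtb_new` and `wtb_len` are hypothetical names made up for this example): the 
object is allocated with extra room for the buffer, and `sizeof` only covers 
the fixed part, so a by-value copy would not carry the trailing bytes along. 
That is why such APIs take a pointer.

```c
#include <stdlib.h>
#include <string.h>

struct with_trailing_buf {
    int len;
    char buf[];
};

/* Hypothetical constructor: allocates the fixed part plus room for the
 * string and its terminating NUL in one block. Returns NULL on failure. */
struct with_trailing_buf *wtb_new(const char *s)
{
    size_t len = strlen(s);
    struct with_trailing_buf *w = malloc(sizeof(*w) + len + 1);

    if (!w) {
        return NULL;
    }
    w->len = (int)len;
    memcpy(w->buf, s, len + 1);
    return w;
}

/* Hypothetical accessor: has to take a pointer, since only the pointed-to
 * allocation contains the bytes of `buf`. */
int wtb_len(const struct with_trailing_buf *w)
{
    return w->len;
}
```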

Anyway, the question here isn't really whether we should rewrite our whole code 
base, but rather: can we make the Clang importer run smoothly even on non-Apple 
codebases? In C there are certainly as many coding conventions as there are 
developers, so if we want more developers to be willing to use safer languages 
without throwing away their existing code bases, I think we need to make the 
importer more flexible.

> 
>>> var sb3 = sb_t(...)
>>> sb_addsb(&sb, &sb3) // works
>>> ```
>>> 
>>> ```swift
>>> sb_avail(&sb2) // cannot convert value of type 'sb_t' to expected argument type 'UnsafePointer<sb_t>!'
>>> ```
>>> 
>>> 
>>> However, Swift also provides the `swift_name()` attribute that allows 
>>> remapping a C function to a Swift method, which includes mapping one of the 
>>> parameters to `self:`:
>>> 
>>> ```c 
>>> __attribute__((swift_name("sb_t.add(self:string:)")))
>>> void sb_adds(sb_t * _Nonnull sb, const char * _Nonnull str);
>>> __attribute__((swift_name("sb_t.add(self:other:)")))
>>> void sb_addsb(sb_t * _Nonnull sb, const sb_t * _Nonnull other);
>>> __attribute__((swift_name("sb_t.avail(self:)")))
>>> int sb_avail(const sb_t * _Nonnull sb);
>>> ```
> 
> While I do feel your pain dealing with structs imported from C, nothing is 
> stopping you from making an extension of that struct and implementing these 
> methods on it... Yes, it's a lot of boilerplate, but it can be in a separate 
> file until you migrate your C code into Swift, whereas the suggested 
> solution generates so many annotations that it's IMHO unreadable for anyone 
> hoping to use the API from pure C...

My problem is exactly the amount of boilerplate. We are still investigating the 
ability to switch from C to Swift (at least for some parts of our codebase), but 
we cannot afford to rewrite the whole code, nor to spend months writing overlays 
for every C library we want to be able to use from Swift. And, though this is 
just my opinion, I think that adding qualifiers such as `_Nonnull`, `_Nullable` 
(and the proposed `_Ref`), in addition to improving interoperability, also helps 
self-document the APIs and helps provide safer code even in C (since the 
qualifiers offer the opportunity for new static analysis heuristics).
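
To keep the annotations readable from pure C, they can be hidden behind 
macros, as is commonly done for the nullability qualifiers. The sketch below 
is only an assumption of how that could look: `SB_NONNULL` and `SB_REF` are 
made-up macro names, `_Ref` is not implemented by any compiler so `SB_REF` 
expands to nothing, and the body of `sb_avail` is a guess at what the function 
computes.

```c
/* Compilers without __has_feature get a fallback that reports "no". */
#ifndef __has_feature
# define __has_feature(x) 0
#endif

#if __has_feature(nullability)
# define SB_NONNULL _Nonnull
#else
# define SB_NONNULL
#endif

/* Placeholder for the proposed `_Ref` qualifier: expands to nothing,
 * so today it only documents the intent. */
#define SB_REF

typedef struct sb_t {
    char * SB_NONNULL data;
    int len;
    int size;
} sb_t;

/* Assumed implementation: available memory is the allocated size minus
 * the bytes already in use. */
static inline int sb_avail(const sb_t * SB_REF SB_NONNULL sb)
{
    return sb->size - sb->len;
}
```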

> 
>>> 
>>> ```swift
>>> struct sb_t {
>>>  var data: UnsafeMutablePointer<Int8>
>>>  var len: Int32
>>>  var size: Int32
>>> 
>>>  mutating func add(string: UnsafePointer<Int8>)
>>>  mutating func add(other: UnsafePointer<sb_t>)
>>>  func avail() -> Int32
>>> }
>>> ```
>>> 
>>> With that attribute used, there is no need to convert the parameter mapped 
>>> to `self:` to an `Unsafe*Pointer`. As a consequence, we have an improved 
>>> API:
>>> 
>>> ```swift
>>> sb2.avail() // This time it works!
>>> ```
>>> 
>>> But we also have some inconsistent behavior since only `self:` is affected 
>>> by this:
>>> 
>>> ```swift
>>> sb.add(other: &sb2)  // error: cannot pass immutable value as inout argument: 'sb2' is a 'let' constant
>>> sb.add(other: sb2) // cannot convert value of type 'sb_t' to expected argument type 'UnsafePointer<sb_t>!'
>>> ```
>>> 
>>> 
>>> What we observe here is that mapping an argument to `self:` is enough for 
>>> the compiler to be able to change its semantics. As soon as it knows the 
>>> pointer is actually the pointer to a single object, it can deal with it 
>>> without exposing it as an `Unsafe*Pointer`, making the API safer and less 
>>> surprising.
>>> 
>>> 
>>> Proposed solution
>>> ================
>>> 
>>> A new qualifier could be added to inform the compiler that a pointer points 
>>> to a single object. The Swift compiler could then use that new piece of 
>>> information to generate APIs that directly use the object type instead of 
>>> the pointer type. We propose the introduction of a new qualifier named 
>>> `_Ref`, semantically similar to a C++ reference. That is:
>>> 
>>> * `_Ref` is applied with the same grammar as the `_Nonnull`, `_Nullable` 
>>> family
>>> * A pointer tagged `_Ref` cannot be used to access more than the single 
>>> pointed object.
>>> * A pointer tagged `_Ref` is non-owning
>>> 
>>> Parameters qualified with `_Ref` would then be imported in Swift as follows:
>>> 
>>> * `T * _Ref _Nonnull` is imported as `inout T`
>>> * `T * _Ref _Nullable` is imported as `inout T?`
>>> * `T const * _Ref _Nonnull` is imported as `T`
>>> * `T const * _Ref _Nullable` is imported as `T?`
>>> 
>>> Example
>>> =======
>>> 
>>> In the context of the provided example from the motivation section:
>>> 
>>> ```c
>>> typedef struct sb_t {
>>>  char * _Nonnull data;
>>>  int len;
>>>  int size;
>>> } sb_t;
>>> 
>>> /** Append the string \p str to \p sb. */
>>> void sb_adds(sb_t * _Ref _Nonnull sb, const char * _Nonnull str);
>>> 
>>> /** Append the content of \p other to \p sb. */
>>> void sb_addsb(sb_t * _Ref _Nonnull sb, const sb_t * _Ref _Nonnull other);
>>> 
>>> /** Returns the amount of available memory of \p sb. */
>>> int sb_avail(const sb_t * _Ref _Nonnull sb);
>>> ```
>>> 
>>> Would be imported as follows:
>>> 
>>> ```swift
>>> struct sb_t {
>>>  var data: UnsafeMutablePointer<Int8>
>>>  var len: Int32
>>>  var size: Int32
>>> }
>>> 
>>> func sb_adds(_ sb: inout sb_t, _ str: UnsafePointer<Int8>)
>>> func sb_addsb(_ sb: inout sb_t, _ other: sb_t)
>>> func sb_avail(_ sb: sb_t) -> Int32
>>> ```
>>> 
>>> Impact on existing code
>>> =================
>>> 
>>> This proposal has no impact on existing code since it proposes additive 
>>> changes only. However, opting in for the `_Ref` qualifier on APIs already 
>>> exposed in Swift will impact the generated code.
>>> 
>>> * For `const` pointers, the change is always source-incompatible
>>> * For non-`const` pointers, the change will be source-compatible everywhere 
>>> we use the `&object` syntax to pass the argument from a plain object, but 
>>> will break sources that passed an `Unsafe*Pointer` as argument.
>>> 
>>> 
>>> Alternatives considered
>>> ===================
>>> 
>>> We considered using two qualifier families instead of `_Ref`:
>>> 
>>> - one family to specify the kind of pointer: single object or array
>>> - one family to declare the ownership
>>> 
>>> This approach has the clear advantage of being more flexible; however, it 
>>> has been found to be less expressive. Considering that C APIs should already 
>>> use nullability qualifiers on every single pointer, forcing two additional 
>>> qualifiers on every pointer would be painful and negatively impact the 
>>> readability of the C APIs.
>>> 
>>> `_Ref`, on the other hand, is short and leverages a concept already known by 
>>> developers, but is also more specific to a particular use case.
>>> 
>>> 
>>> Discussion
>>> ========
>>> 
>>> * Safety: won't this make developers think they are calling safe APIs from 
>>> Swift while the API is actually unsafe?
>>> 
>>> There is certainly a risk that a C API makes improper use of `_Ref` (in 
>>> particular, breaks the non-owning part of the contract). However, this kind 
>>> of safety issue is already present when using the `swift_name()` 
>>> attribute on a function and mapping one of its pointer parameters 
>>> to `self:`, or when using the nullability qualifiers.
>>> 
>>> * What about pointers stored in structures? or pointers returned by 
>>> functions?
>>> 
>>> As a qualifier, `_Ref` could also be used on pointers that are not 
>>> arguments of a function:
>>> 
>>> ```c
>>> typedef struct {
>>>  sb_t * _Ref obj;
>>> } sb_ptr_t;
>>> 
>>> sb_t * _Ref sb_get_singleton(void);
>>> ```
>>> 
>>> Swift, however, cannot import those as `sb_t` but will still be forced to 
>>> use `Unsafe*Pointer<sb_t>` since `sb_t` is a structure and as such is not 
>>> stored by reference.
>>> 
>>> We could also imagine a standard `Reference<T>` type that would wrap a 
>>> pointer to a `T` (and could expose the API of `T` on it).
>>> 
>>> * What about function pointers that take a `_Ref` object?
>>> 
>>> When an API takes a function pointer whose type includes a `_Ref` qualified 
>>> parameter, the qualifier applies:
>>> 
>>> ```c
>>> void take_cb(int (*a)(sb_t const * _Ref _Nonnull sb, sb_t * _Ref _Nonnull 
>>> other));
>>> ```
>>> 
>>> ```swift
>>> func cb(sb: sb_t, other: inout sb_t) -> Int32 {
>>>  ...
>>> }
>>> 
>>> take_cb(cb)
>>> ```
>>> 
>>> Swift guarantees we cannot break the non-owning contract and that we 
>>> respect the constness of the parameter. This is safer than using the 
>>> `Unsafe*Pointer`-based alternative.
>>> 
>>> * Other use cases than Swift's?
>>> 
>>> The `_Ref` qualifier could be used by static analysis to check that 
>>> functions don't access memory they shouldn't access: as long as some code 
>>> manipulates some memory through a `_Ref` qualified pointer, it shouldn't 
>>> access memory addresses below that pointer or above that pointer plus the 
>>> stride of the type (an exception remains for types ending with a 
>>> zero-length array).
>>> 
>>> * What about pointers to arrays of objects?
>>> 
>>> This is another topic. We could imagine a `_Array` qualifier that could 
>>> take an optional length.
>>> 
>>> ```c
>>> /* The number of elements is statically known or passed as argument */
>>> int main(int argc, char ** _Array(argc) argv)
>>> 
>>> /* The number of elements is unknown. */
>>> int puts(const char * _Array str);
>>> ```
>>> _______________________________________________
>>> swift-evolution mailing list
>>> [email protected]
>>> https://lists.swift.org/mailman/listinfo/swift-evolution