> On Oct 7, 2016, at 6:04 PM, Michael Gottesman <mgottes...@apple.com> wrote: >> On Oct 7, 2016, at 5:09 PM, John McCall <rjmcc...@apple.com >> <mailto:rjmcc...@apple.com>> wrote: >>> On Oct 7, 2016, at 2:38 PM, Michael Gottesman <mgottes...@apple.com >>> <mailto:mgottes...@apple.com>> wrote: >>> >>> Attached below is an updated version of the proposal. Again a rendered >>> version is located at: >>> >>> https://gottesmm.github.io/proposals/high-level-arc-memory-operations.html >>> <https://gottesmm.github.io/proposals/high-level-arc-memory-operations.html> >>> >>> Michael >>> >>> ---- >>> >>> # Summary >>> >>> This document proposes: >>> >>> 1. adding the following ownership qualifiers to `load`: `[take]`, `[copy]`, >>> `[borrow]`, `[trivial]`. >>> 2. adding the following ownership qualifiers to `store`: `[init]`, >>> `[assign]`, >>> `[trivial]`. >>> 3. requiring all `load` and `store` operations to have ownership qualifiers. >>> 4. banning the use of `load [trivial]`, `store [trivial]` on memory >>> locations of >>> `non-trivial` type. >>> >>> This will allow for: >>> >>> 1. eliminating optimizer miscompiles that occur due to releases being moved >>> into >>> the region in between a `load`/`retain`, `load`/`release`, >>> `store`/`release`. (For a specific example, see the appendix). >>> 2. explicitly modeling `load [trivial]`/`store [trivial]` as having `unsafe >>> unowned` ownership semantics. This will be enforced via the verifier. >>> 3. more aggressive ARC code motion. >>> >>> # Definitions >>> >>> ## ownership qualified load >>> >>> We propose four different ownership qualifiers for load. Define `load >>> [trivial]` >>> as: >>> >>> %x = load [trivial] %x_ptr : $*Int >>> >>> => >>> >>> %x = load %x_ptr : $*Int >>> >>> A `load [trivial]` can not be used to load values of non-trivial type. >> >> Should we mandate the reverse as well, that e.g. load [copy] cannot be used >> to >> load values of trivial type? That's a little more work for substituting >> cloners, but it >> keeps everything nice and canonical. > > No. I think that in the trivial case, load [copy] optimizes to load [trivial] > as a canonicalization. This is just like applying a retain_value to a trivial > type.
I guess I'm fine with that. >>> Define >>> `load [copy]` as: >>> >>> %x = load [copy] %x_ptr : $*C >>> >>> => >>> >>> %x = load %x_ptr : $*C >>> retain_value %x : $C >>> >>> Then define `load [take]` as: >>> >>> %x = load [take] %x_ptr : $*Builtin.NativeObject >>> >>> => >>> >>> %x = load %x_ptr : $*Builtin.NativeObject >>> >>> **NOTE** `load [take]` implies that the loaded from memory location no >>> longer >>> owns the result object (i.e. a take is a move). Loading from the memory >>> location >>> again without reinitialization is illegal. >>> >>> Next we provide `load [borrow]`: >>> >>> %x = load [borrow] %x_ptr : $*Builtin.NativeObject >>> ... >>> endBorrow(%x, %x_ptr) >>> >>> => >>> >>> %x = load %x_ptr : $*Builtin.NativeObject >>> ... >>> endBorrow(%x, %x_ptr) >>> >>> `load [borrow]` implies that in the region between the `load` and the >>> `endBorrow`, the loaded object must semantically remain alive. >> >> For consistency with other multi-word SIL instructions, this should be >> end_borrow. > > Sure. > >> >> I wonder whether it might make more sense for load [borrow] to be a >> different instruction. >> There's a couple reasons for that first. The first is that it's the only >> load which introduces >> a scope, which is a really big difference structurally. The second is that >> it's the only load >> which returns a non-owned value, which will be a typing difference when we >> record >> ownership in the type system. > > I am fine with a load_borrow. If this is the only change left that you want > can I just send out a proposal with that small change and start implementing. > I am nervous about perfection being the enemy of the good (and I want to > start implementing this weekend if possible *evil smile*). Yes, it's fine to introduce this incrementally. We can discuss the formation rules on scopes concurrently. I think the same theoretical foundation is probably what we'll use for pseudo-linearity, memory-initialization soundness, and so on. John. >> Anyway, I really like that load [borrow] is scoped.. Are you planning to >> include the formation >> restrictions on scopes instructions immediately, or will you do that in a >> later proposal? > > It will be done in a later proposal. We are just trying to set the stage for > verification. > >> >> The requirements we need from scopes are: >> - there has to be a well-defined set of IPs that lie within the scope and >> - there can't be a non-explicit transition into or out of the scope. >> >> One way to get the first condition is to say that there has to be a unique >> scope-ender that >> post-dominates the scope-beginner, but that's a pretty hard restriction for >> SILGen to satisfy >> (as well as the optimizer, I imagine), and it doesn't handle "unreachable" >> or infinite loops. >> We need to allow multiple scope-enders, and we need to allow scope-enders to >> be missing >> in some cases. > > I agree with you here. We definitely want to be able to support multiple > scope-enders. > >> Here's the right formalism, I think: >> >> For all walks W beginning from the entry point of the function: >> For each node B in the CFG which is a scope-beginner: >> Let E be the set of scope-enders for B, and consider just the >> sub-sequence S of nodes >> of W where the node is a member of {B} U E. Then the elements of S at >> even >> positions (starting from 0) must be B, and the elements at odd positions >> must be >> members of E. Furthermore, if the walk ends in a return or throw >> instruction, then >> S must have even length. >> >> Note that for this to be true, all the scope-enders must be dominated by the >> scope-beginner. >> >> It is sufficient to just consider the set of "beeline" paths, i.e. acyclic >> paths ending in either a true >> terminator (a return, throw, or unreachable) or an edge back to a node >> already in the path. >> No such path may include multiple scope-enders for the same scope-beginner. >> If the path ends >> in a return or throw, it must include a matching scope-ender after every >> scope-beginner. If >> it ends in a loop back, then for every scope-beginner in the path, if the >> path contains a scope-ender >> then the loop destination must either precede the scope-beginner or follow >> the scope-ender; >> otherwise, the loop destination must follow the scope-beginner. >> >> Or, as a decision algorithm in Swift for a single scope-beginner: >> >> var blockEntryIsInScope = [Block: Bool]() >> func scan(startingFrom inst: Instruction, isInScope: Bool) { >> if inst is ReturnInst || inst is ThrowInst { >> guard !isInScope else { invalid("ended function while in scope") } >> return >> } >> >> if let term = inst as? TerminatorInst { >> for successor in term.successors { >> guard begin.dominates(successor) else { >> guard !isInScope else { invalid("branch out of scope while in >> scope") } >> continue >> } >> if let cachedValue = blockEntryIsInScope[successor] { >> if cachedValue != isInScope { >> invalid(isInScope ? "branch out of scope while in scope" : >> "branch into scope after exiting scope") >> } >> } else { >> blockEntryIsInScope[successor] = isInScope >> scan(startingFrom: successor.begin, isInScope: isInScope) >> } >> } >> return >> } >> >> if inst.endsScopeOf(begin) { >> guard isInScope else { invalid("ending scope that was already ended") } >> scan(startingFrom: inst.next, isInScope: false) >> } else { >> scan(startingFrom: inst.next, isInScope: isInScope) >> } >> } >> scan(startingFrom: begin, isInScope: true) > > Since this is tangential to the current proposal, can we introduce a side > thread? > >> >> John. >> >>> The `endBorrow` communicates to the optimizer: >>> >>> 1. That the value in `%x_ptr` should not be destroyed before endBorrow. >>> 2. Uses of `%x` should not be sunk past endBorrow since `%x` is only a >>> shallow >>> copy of the value in `%x_ptr` and past that point `%x_ptr` may not remain >>> alive. >>> >>> An example of where this construct is useful is when one has a let binding >>> to a >>> class instance `c` that contains a let field `f`. In that case `c`'s >>> lifetime >>> guarantees `f`'s lifetime meaning that returning `f` over the function call >>> boundary is safe. >>> >>> ## ownership qualified store >>> >>> First define a `store [trivial]` as: >>> >>> store %x to [trivial] %x_ptr : $*Int >>> >>> => >>> >>> store %x to %x_ptr : $*Int >>> >>> The verifier will prevent this instruction from being used on types with >>> non-trivial ownership. Define a `store [assign]` as follows: >>> >>> store %x to [assign] %x_ptr : $*C >>> >>> => >>> >>> %old_x = load %x_ptr : $*C >>> store %new_x to %x_ptr : $*C >>> release_value %old_x : $C >>> >>> *NOTE* `store` is defined as a consuming operation. We also provide >>> `store [init]` in the case where we know statically that there is no >>> previous value in the memory location: >>> >>> store %x to [init] %x_ptr : $*C >>> >>> => >>> >>> store %new_x to %x_ptr : $*C >>> >>> # Implementation >>> >>> ## Goals >>> >>> Our implementation strategy goals are: >>> >>> 1. zero impact on other compiler developers until the feature is fully >>> developed. This implies all work will be done behind a flag. >>> 2. separation of feature implementation from updating passes. >>> >>> Goal 2 will be implemented via a pass that transforms ownership qualified >>> `load`/`store` instructions into unqualified `load`/`store` right after >>> SILGen. >>> >>> ## Plan >>> >>> We begin by adding initial infrastructure for our development. This means: >>> >>> 1. Adding to SILOptions a disabled by default flag called >>> "EnableSILOwnershipModel". This flag will be set by a false by default >>> frontend >>> option called "-enable-sil-ownership-mode". >>> >>> 2. Bots will be brought up to test the compiler with >>> "-enable-sil-ownership-model" set to true. The specific bots are: >>> >>> * RA-OSX+simulators >>> * RA-Device >>> * RA-Linux. >>> >>> The bots will run once a day until the feature is close to completion. >>> Then a >>> polling model will be followed. >>> >>> Now that change isolation is borrow, we develop building blocks for the >>> optimization: >>> >>> 1. Two enums will be defined: `LoadInstOwnershipQualifier`, >>> `StoreInstOwnershipQualifier`. The exact definition of these enums are as >>> follows: >>> >>> enum class LoadOwnershipQualifier { >>> Unqualified, Take, Copy, Borrow, Trivial >>> }; >>> enum class StoreOwnershipQualifier { >>> Unqualified, Init, Assign, Trivial >>> }; >>> >>> *NOTE* `LoadOwnershipQualifier::Unqualified` and >>> `StoreOwnershipQualifier::Unqualified` are only needed for staging >>> purposes. >>> >>> 2. Creating a `LoadInst`, `StoreInst` will be changed to require an >>> ownership >>> qualifier. At this stage, this argument will default to `Unqualified`. >>> "Bare" >>> `load`, `store` when parsed via textual SIL will be considered to be >>> unqualified. This implies that the rest of the compiler will not have to be >>> changed as a result of this step. >>> >>> 3. Support will be added to SIL, IRGen, Serialization, SILPrinting, and SIL >>> Parsing for the rest of the qualifiers. SILGen will not be modified at this >>> stage. >>> >>> 4. A pass called the "OwnershipModelEliminator" will be implemented. It will >>> blow up all `load`, `store` instructions with non `*::Unqualified` >>> ownership >>> into their constituant ARC operations and `*::Unqualified` `load`, >>> `store` >>> insts. >>> >>> 3. An option called "EnforceSILOwnershipMode" will be added to the >>> verifier. If >>> the option is set, the verifier will assert if: >>> >>> a. `load`, `store` operations with trivial ownership are applied to >>> memory >>> locations with non-trivial type. >>> >>> b. `load`, `store` operation with unqualified ownership type are present >>> in >>> the IR. >>> >>> Finally, we wire up the building blocks: >>> >>> 1. If SILOption.EnableSILOwnershipModel is true, then the after SILGen SIL >>> verification will be performed with EnforceSILOwnershipModel set to true. >>> 2. If SILOption.EnableSILOwnershipModel is true, then the pass manager will >>> run >>> the OwnershipModelEliminator pass right after SILGen before the normal >>> pass >>> pipeline starts. >>> 3. SILGen will be changed to emit non-unqualified ownership qualifiers on >>> load, >>> store instructions when the EnableSILOwnershipModel flag is set. We will >>> use >>> the verifier throwing to guarantee that we are not missing any specific >>> cases. >>> >>> Then once all of the bots are green, we change >>> SILOption.EnableSILOwnershipModel >>> to be true by default. After a cooling off period, we move all of the code >>> behind the SILOwnershipModel flag in front of the flag. We do this so we can >>> reuse that flag for further SILOwnershipModel changes. >>> >>> ## Optimizer Changes >>> >>> Since the SILOwnershipModel eliminator will eliminate the ownership >>> qualifiers >>> on load, store instructions right after ownership verification, there will >>> be no >>> immediate affects on the optimizer and thus the optimizer changes can be >>> done in >>> parallel with the rest of the ARC optimization work. >>> >>> But, in the long run, we want to enforce these ownership invariants all >>> throughout the SIL pipeline implying these ownership qualified `load`, >>> `store` >>> instructions must be processed by IRGen, not eliminated by the >>> SILOwnershipModel >>> eliminator. Thus we will need to update passes to handle these new >>> instructions. The main optimizer changes can be separated into the following >>> areas: memory forwarding, dead stores, ARC optimization. In all of these >>> cases, >>> the necessary changes are relatively trivial to respond to. We give a quick >>> taste of two of them: store->load forwarding and ARC Code Motion. >>> >>> ### store->load forwarding >>> >>> Currently we perform store->load forwarding as follows: >>> >>> store %x to %x_ptr : $C >>> ... NO SIDE EFFECTS THAT TOUCH X_PTR ... >>> %y = load %x_ptr : $C >>> use(%y) >>> >>> => >>> >>> store %x to %x_ptr : $C >>> ... NO SIDE EFFECTS THAT TOUCH X_PTR ... >>> use(%x) >>> >>> In a world, where we are using ownership qualified load, store, we have to >>> also >>> consider the ownership implications. *NOTE* Since we are not modifying the >>> store, `store [assign]` and `store [init]` are treated the same. Thus >>> without >>> any loss of generality, lets consider solely `store`. >>> >>> store %x to [assign] %x_ptr : $C >>> ... NO SIDE EFFECTS THAT TOUCH X_PTR ... >>> %y = load [copy] %x_ptr : $C >>> use(%y) >>> >>> => >>> >>> store %x to [assign] %x_ptr : $C >>> ... NO SIDE EFFECTS THAT TOUCH X_PTR ... >>> strong_retain %x >>> use(%x) >>> >>> ### ARC Code Motion >>> >>> If ARC Code Motion wishes to move the ARC semantics of ownership qualified >>> `load`, `store` instructions, it must now consider read/write effects. On >>> the >>> other hand, it will be able to now not consider the side-effects of >>> destructors >>> when moving retain/release operations. >>> >>> ### Normal Code Motion >>> >>> Normal code motion will lose some effectiveness since many of the load/store >>> operations that it used to be able to move now must consider ARC >>> information. We >>> may need to consider running ARC code motion earlier in the pipeline where >>> we >>> normally run Normal Code Motion to ensure that we are able to handle these >>> cases. >>> >>> ### ARC Optimization >>> >>> The main implication for ARC optimization is that instead of eliminating >>> just >>> retains, releases, it must be able to recognize ownership qualified `load`, >>> `store` and set their flags as appropriate. >>> >>> ### Function Signature Optimization >>> >>> Semantic ARC affects function signature optimization in the context of the >>> owned >>> to borrow optimization. Specifically: >>> >>> 1. A `store [assign]` must be recognized as a release of the old value that >>> is >>> being overridden. In such a case, we can move the `release` of the old >>> value >>> into the caller and change the `store [assign]` into a `store [init]`. >>> 2. A `load [copy]` must be recognized as a retain in the callee. Then >>> function >>> signature optimization will transform the `load [copy]` into a `load >>> [borrow]`. This would require the addition of a new `@borrow` return >>> value convention. >>> >>> # Appendix >>> >>> ## Partial Initialization of Loadable References in SIL >>> >>> In SIL, a value of non-trivial loadable type is loaded from a memory >>> location as >>> follows: >>> >>> %x = load %x_ptr : $*S >>> ... >>> retain_value %x_ptr : $S >>> >>> At first glance, this looks reasonable, but in truth there is a hidden >>> drawback: >>> the partially initialized zone in between the load and the retain >>> operation. This zone creates a period of time when an "evil optimizer" could >>> insert an instruction that causes x to be deallocated before the copy is >>> finished being initialized. Similar issues come up when trying to perform a >>> store of a non-trival value into a memory location. >>> >>> Since this sort of partial initialization is allowed in SIL, the optimizer >>> is >>> forced to be overly conservative when attempting to move releases passed >>> retains >>> lest the release triggers a deinit that destroys a value like `%x`. Lets >>> look at >>> two concrete examples that show how semantically providing ownership >>> qualified >>> load, store instructions eliminate this problem. >>> >>> **NOTE** Without any loss of generality, we will speak of values with >>> reference >>> semantics instead of non-trivial values. >>> >>> ## Case Study: Partial Initialization and load [copy] >>> >>> ### The Problem >>> >>> Consider the following swift program: >>> >>> func opaque_call() >>> >>> final class C { >>> var int: Int = 0 >>> deinit { >>> opaque_call() >>> } >>> } >>> >>> final class D { >>> var int: Int = 0 >>> } >>> >>> var GLOBAL_C : C? = nil >>> var GLOBAL_D : D? = nil >>> >>> func useC(_ c: C) >>> func useD(_ d: D) >>> >>> func run() { >>> let c = C() >>> GLOBAL_C = c >>> let d = D() >>> GLOBAL_D = d >>> useC(c) >>> useD(d) >>> } >>> >>> Notice that both `C` and `D` have fixed layouts and separate class >>> hierarchies, >>> but `C`'s deinit has a call to the function `opaque_call` which may write to >>> `GLOBAL_D` or `GLOBAL_C`. Additionally assume that both `useC` and `useD` >>> are >>> known to the compiler to not have any affects on instances of type `D`, `C` >>> respectively and useC assigns `nil` to `GLOBAL_C`. Now consider the >>> following >>> valid SIL lowering for `run`: >>> >>> sil_global GLOBAL_D : $D >>> sil_global GLOBAL_C : $C >>> >>> final class C { >>> var x: Int >>> deinit >>> } >>> >>> final class D { >>> var x: Int >>> } >>> >>> sil @useC : $@convention(thin) () -> () >>> sil @useD : $@convention(thin) () -> () >>> >>> sil @run : $@convention(thin) () -> () { >>> bb0: >>> %c = alloc_ref $C >>> %global_c = global_addr @GLOBAL_C : $*C >>> strong_retain %c : $C >>> store %c to %global_c : $*C >>> (1) >>> >>> %d = alloc_ref $D >>> %global_d = global_addr @GLOBAL_D : $*D >>> strong_retain %d : $D >>> store %d to %global_d : $*D >>> (2) >>> >>> %c2 = load %global_c : $*C >>> (3) >>> strong_retain %c2 : $C >>> (4) >>> %d2 = load %global_d : $*D >>> (5) >>> strong_retain %d2 : $D >>> (6) >>> >>> %useC_func = function_ref @useC : $@convention(thin) (@owned C) -> () >>> apply %useC_func(%c2) : $@convention(thin) (@owned C) -> () >>> (7) >>> >>> %useD_func = function_ref @useD : $@convention(thin) (@owned D) -> () >>> apply %useD_func(%d2) : $@convention(thin) (@owned D) -> () >>> (8) >>> >>> strong_release %d : $D >>> (9) >>> strong_release %c : $C >>> (10) >>> } >>> >>> Lets optimize this function! First we perform the following operations: >>> >>> 1. Since `(2)` is storing to an identified object that can not be >>> `GLOBAL_C`, we >>> can store to load forward `(1)` to `(3)`. >>> 2. Since a retain does not block store to load forwarding, we can forward >>> `(2)` >>> to `(5)`. But lets for the sake of argument, assume that the optimizer >>> keeps >>> such information as an analysis and does not perform the actual >>> load->store >>> forwarding. >>> 3. Even though we do not foward `(2)` to `(5)`, we can still move `(4)` over >>> `(6)` so that `(4)` is right before `(7)`. >>> >>> This yields (using the ' marker to designate that a register has had >>> load-store >>> forwarding applied to it): >>> >>> sil @run : $@convention(thin) () -> () { >>> bb0: >>> %c = alloc_ref $C >>> %global_c = global_addr @GLOBAL_C : $*C >>> strong_retain %c : $C >>> store %c to %global_c : $*C >>> (1) >>> >>> %d = alloc_ref $D >>> %global_d = global_addr @GLOBAL_D : $*D >>> strong_retain %d : $D >>> store %d to %global_d : $*D >>> (2) >>> >>> strong_retain %c : $C >>> (4') >>> %d2 = load %global_d : $*D >>> (5) >>> strong_retain %d2 : $D >>> (6) >>> >>> %useC_func = function_ref @useC : $@convention(thin) (@owned C) -> () >>> apply %useC_func(%c) : $@convention(thin) (@owned C) -> () >>> (7') >>> >>> %useD_func = function_ref @useD : $@convention(thin) (@owned D) -> () >>> apply %useD_func(%d2) : $@convention(thin) (@owned D) -> () >>> (8) >>> >>> strong_release %d : $D >>> (9) >>> strong_release %c : $C >>> (10) >>> } >>> >>> Then by assumption, we know that `%useC` does not perform any releases of >>> any >>> instances of class `D`. Thus `(6)` can be moved past `(7')` and we can then >>> pair >>> and eliminate `(6)` and `(9)` via the rules of ARC optimization using the >>> analysis information that `%d2` and `%d` are th same due to the possibility >>> of >>> performing store->load forwarding. After performing such transformations, >>> `run` >>> looks as follows: >>> >>> sil @run : $@convention(thin) () -> () { >>> bb0: >>> %c = alloc_ref $C >>> %global_c = global_addr @GLOBAL_C : $*C >>> strong_retain %c : $C >>> store %c to %global_c : $*C >>> (1) >>> >>> %d = alloc_ref $D >>> %global_d = global_addr @GLOBAL_D : $*D >>> strong_retain %d : $D >>> store %d to %global_d : $*D >>> >>> %d2 = load %global_d : $*D >>> (5) >>> strong_retain %c : $C >>> (4') >>> %useC_func = function_ref @useC : $@convention(thin) (@owned C) -> () >>> apply %useC_func(%c) : $@convention(thin) (@owned C) -> () >>> (7') >>> >>> %useD_func = function_ref @useD : $@convention(thin) (@owned D) -> () >>> apply %useD_func(%d2) : $@convention(thin) (@owned D) -> () >>> (8) >>> >>> strong_release %c : $C >>> (10) >>> } >>> >>> Now by assumption, we know that `%useD_func` does not touch any instances of >>> class `C` and `%c` does not contain any ivars of type `D` and is final so >>> none >>> can be added. At first glance, this seems to suggest that we can move `(10)` >>> before `(8')` and then pair/eliminate `(4')` and `(10)`. But is this a safe >>> optimization perform? Absolutely Not! Why? Remember that since `useC_func` >>> assigns `nil` to `GLOBAL_C`, after `(7')`, `%c` could have a reference count >>> of 1. Thus `(10)` _may_ invoke the destructor of `C`. Since this destructor >>> calls an opaque function that _could_ potentially write to `GLOBAL_D`, we >>> may be >>> be passing `%d2`, an already deallocated object to `%useD_func`, an illegal >>> optimization! >>> >>> Lets think a bit more about this example and consider this example at the >>> language level. Remember that while Swift's deinit are not asychronous, we >>> do >>> not allow for user level code to create dependencies from the body of the >>> destructor into the normal control flow that has called it. This means that >>> there are two valid results of this code: >>> >>> - Operation Sequence 1: No optimization is performed and `%d2` is passed to >>> `%useD_func`. >>> - Operation Sequence 2: We shorten the lifetime of `%c` before `%useD_func` >>> and >>> a different instance of `$D` is passed into `%useD_func`. >>> >>> The fact that 1 occurs without optimization is just as a result of an >>> implementation detail of SILGen. 2 is also a valid sequence of operations. >>> >>> Given that: >>> >>> 1. As a principle, the optimizer does not consider such dependencies to >>> avoid >>> being overly conservative. >>> 2. We provide constructs to ensure appropriate lifetimes via the usage of >>> constructs such as fix_lifetime. >>> >>> We need to figure out how to express our optimization such that 2 >>> happens. Remember that one of the optimizations that we performed at the >>> beginning was to move `(6)` over `(7')`, i.e., transform this: >>> >>> %d = alloc_ref $D >>> %global_d_addr = global_addr GLOBAL_D : $D >>> %d = load %global_d_addr : $*D (5) >>> strong_retain %d : $D (6) >>> >>> // Call the user functions passing in the instances that we loaded >>> from the globals. >>> %useC_func = function_ref @useC : $@convention(thin) (@owned C) -> () >>> apply %useC_func(%c) : $@convention(thin) (@owned C) -> () >>> (7') >>> >>> into: >>> >>> %global_d_addr = global_addr GLOBAL_D : $D >>> %d2 = load %global_d_addr : $*D (5) >>> >>> // Call the user functions passing in the instances that we loaded >>> from the globals. >>> %useC_func = function_ref @useC : $@convention(thin) (@owned C) -> () >>> apply %useC_func(%c) : $@convention(thin) (@owned C) -> () >>> (7') >>> strong_retain %d2 : $D (6) >>> >>> This transformation in Swift corresponds to transforming: >>> >>> let d = GLOBAL_D >>> useC(c) >>> >>> to: >>> >>> let d_raw = load_d_value(GLOBAL_D) >>> useC(c) >>> let d = take_ownership_of_d(d_raw) >>> >>> This is clearly an instance where we have moved a side-effect in between the >>> loading of the data and the taking ownership of such data, that is before >>> the >>> `let` is fully initialized. What if instead of just moving the retain, we >>> moved >>> the entire let statement? This would then result in the following swift >>> code: >>> >>> useC(c) >>> let d = GLOBAL_D >>> >>> and would correspond to the following SIL snippet: >>> >>> %global_d_addr = global_addr GLOBAL_D : $D >>> >>> // Call the user functions passing in the instances that we loaded >>> from the globals. >>> %useC_func = function_ref @useC : $@convention(thin) (@owned C) -> () >>> apply %useC_func(%c) : $@convention(thin) (@owned C) -> () >>> (7') >>> %d2 = load %global_d_addr : $*D >>> (5) >>> strong_retain %d2 : $D >>> (6) >>> >>> Moving the load with the strong_retain to ensure that the full >>> initialization is >>> performed even after code motion causes our SIL to look as follows: >>> >>> sil @run : $@convention(thin) () -> () { >>> bb0: >>> %c = alloc_ref $C >>> %global_c = global_addr @GLOBAL_C : $*C >>> strong_retain %c : $C >>> store %c to %global_c : $*C >>> (1) >>> >>> %d = alloc_ref $D >>> %global_d = global_addr @GLOBAL_D : $*D >>> strong_retain %d : $D >>> store %d to %global_d : $*D >>> >>> strong_retain %c : $C >>> (4') >>> %useC_func = function_ref @useC : $@convention(thin) (@owned C) -> () >>> apply %useC_func(%c) : $@convention(thin) (@owned C) -> () >>> (7') >>> >>> %d2 = load %global_d : $*D >>> (5) >>> %useD_func = function_ref @useD : $@convention(thin) (@owned D) -> () >>> apply %useD_func(%d2) : $@convention(thin) (@owned D) -> () >>> (8) >>> >>> strong_release %c : $C >>> (10) >>> } >>> >>> Giving us the exact result that we want: Operation Sequence 2! >>> >>> ### Defining load [copy] >>> >>> Given that we wish the load, store to be tightly coupled together, it is >>> natural >>> to express this operation as a `load [copy]` instruction. Lets define the >>> `load >>> [copy]` instruction as follows: >>> >>> %1 = load [copy] %0 : $*C >>> >>> => >>> >>> %1 = load %0 : $*C >>> retain_value %1 : $C >>> >>> Now lets transform our initial example to use this instruction: >>> >>> Notice how now if we move `(7)` over `(3)` and `(6)` now, we get the >>> following SIL: >>> >>> sil @run : $@convention(thin) () -> () { >>> bb0: >>> %c = alloc_ref $C >>> %global_c = global_addr @GLOBAL_C : $*C >>> strong_retain %c : $C >>> store %c to %global_c : $*C >>> (1) >>> >>> %d = alloc_ref $D >>> %global_d = global_addr @GLOBAL_D : $*D >>> strong_retain %d : $D >>> store %d to %global_d : $*D >>> (2) >>> >>> %c2 = load [copy] %global_c : $*C >>> (3) >>> %d2 = load [copy] %global_d : $*D >>> (5) >>> >>> %useC_func = function_ref @useC : $@convention(thin) (@owned C) -> () >>> apply %useC_func(%c2) : $@convention(thin) (@owned C) -> () >>> (7) >>> >>> %useD_func = function_ref @useD : $@convention(thin) (@owned D) -> () >>> apply %useD_func(%d2) : $@convention(thin) (@owned D) -> () >>> (8) >>> >>> strong_release %d : $D >>> (9) >>> strong_release %c : $C >>> (10) >>> } >>> >>> We then perform the previous code motion: >>> >>> sil @run : $@convention(thin) () -> () { >>> bb0: >>> %c = alloc_ref $C >>> %global_c = global_addr @GLOBAL_C : $*C >>> strong_retain %c : $C >>> store %c to %global_c : $*C >>> (1) >>> >>> %d = alloc_ref $D >>> %global_d = global_addr @GLOBAL_D : $*D >>> strong_retain %d : $D >>> store %d to %global_d : $*D >>> (2) >>> >>> %c2 = load [copy] %global_c : $*C >>> (3) >>> %useC_func = function_ref @useC : $@convention(thin) (@owned C) -> () >>> apply %useC_func(%c2) : $@convention(thin) (@owned C) -> () >>> (7) >>> strong_release %d : $D >>> (9) >>> >>> %d2 = load [copy] %global_d : $*D >>> (5) >>> %useD_func = function_ref @useD : $@convention(thin) (@owned D) -> () >>> apply %useD_func(%d2) : $@convention(thin) (@owned D) -> () >>> (8) >>> strong_release %c : $C >>> (10) >>> } >>> >>> We then would like to eliminate `(9)` and `(10)` by pairing them with `(3)` >>> and >>> `(8)`. Can we still do so? One way we could do this is by introducing the >>> `[take]` flag. The `[take]` flag on a `load [take]` says that one is >>> semantically loading a value from a memory location and are taking >>> ownership of >>> the value thus eliding the retain. In terms of SIL this flag is defined as: >>> >>> %x = load [take] %x_ptr : $*C >>> >>> => >>> >>> %x = load %x_ptr : $*C >>> >>> Why do we care about having such a `load [take]` instruction when we could >>> just >>> use a `load`? The reason why is that a normal `load` has unsafe unowned >>> ownership (i.e. it has no implications on ownership). We would like for >>> memory >>> that has non-trivial type to only be able to be loaded via instructions that >>> maintain said ownership. We will allow for casting to trivial types as >>> usual to >>> provide such access if it is required. >>> >>> Thus we have achieved the desired result: >>> >>> sil @run : $@convention(thin) () -> () { >>> bb0: >>> %c = alloc_ref $C >>> %global_c = global_addr @GLOBAL_C : $*C >>> strong_retain %c : $C >>> store %c to %global_c : $*C >>> (1) >>> >>> %d = alloc_ref $D >>> %global_d = global_addr @GLOBAL_D : $*D >>> strong_retain %d : $D >>> store %d to %global_d : $*D >>> (2) >>> >>> %c2 = load [take] %global_c : $*C >>> (3) >>> %useC_func = function_ref @useC : $@convention(thin) (@owned C) -> () >>> apply %useC_func(%c2) : $@convention(thin) (@owned C) -> () >>> (7) >>> >>> %d2 = load [take] %global_d : $*D >>> (5) >>> %useD_func = function_ref @useD : $@convention(thin) (@owned D) -> () >>> apply %useD_func(%d2) : $@convention(thin) (@owned D) -> () >>> (8) >>> } >>> >>> >>>> On Oct 6, 2016, at 3:03 PM, John McCall <rjmcc...@apple.com >>>> <mailto:rjmcc...@apple.com>> wrote: >>>> >>>>> On Oct 5, 2016, at 4:48 PM, Michael Gottesman <mgottes...@apple.com >>>>> <mailto:mgottes...@apple.com>> wrote: >>>>>> On Oct 5, 2016, at 4:40 PM, Michael Gottesman via swift-dev >>>>>> <swift-dev@swift.org <mailto:swift-dev@swift.org>> wrote: >>>>>> >>>>>>> >>>>>>> On Oct 4, 2016, at 1:04 PM, John McCall <rjmcc...@apple.com >>>>>>> <mailto:rjmcc...@apple.com>> wrote: >>>>>>> >>>>>>>> >>>>>>>> On Sep 30, 2016, at 11:54 PM, Michael Gottesman via swift-dev >>>>>>>> <swift-dev@swift.org <mailto:swift-dev@swift.org>> wrote: >>>>>>>> >>>>>>>> The document attached below contains the first "Semantic ARC" mini >>>>>>>> proposal: the High Level ARC Memory Operations Proposal. >>>>>>>> >>>>>>>> An html rendered version of this markdown document is available at the >>>>>>>> following URL: >>>>>>>> >>>>>>>> https://gottesmm.github.io/proposals/high-level-arc-memory-operations.html >>>>>>>> >>>>>>>> <https://gottesmm.github.io/proposals/high-level-arc-memory-operations.html> >>>>>>>> >>>>>>>> ---- >>>>>>>> >>>>>>>> # Summary >>>>>>>> >>>>>>>> This document proposes: >>>>>>>> >>>>>>>> 1. adding the `load_strong`, `store_strong` instructions to SIL. These >>>>>>>> can only >>>>>>>> be used with memory locations of `non-trivial` type. >>>>>>> >>>>>>> I would really like to avoid using the word "strong" here. Under the >>>>>>> current proposal, these instructions will be usable with arbitrary >>>>>>> non-trivial types, not just primitive class references. Even if you >>>>>>> think of an aggregate that happens to contain one or more strong >>>>>>> references as some sort of aggregate strong reference (which is >>>>>>> questionable but not completely absurd), we already have loadable >>>>>>> non-strong class references that this operation would be usable with, >>>>>>> like native unowned references. "load_strong %0 : $*@sil_unowned T" as >>>>>>> an operation yielding a scalar "@sil_unowned T" is ridiculous, and it >>>>>>> will only get more ridiculous when we eventually allow this operation >>>>>>> to work with types that are currently address-only, like weak >>>>>>> references. >>>>>>> >>>>>>> Brainstorming: >>>>>>> >>>>>>> Something like load_copy and store_copy would be a bit unfortunate, >>>>>>> since store_copy doesn't actually copy the source operand and we want >>>>>>> to have a load_copy [take]. >>>>>>> >>>>>>> load_value and store_value seem excessively generic. It's not like >>>>>>> non-trivial types aren't values. >>>>>>> >>>>>>> One question that comes to mind: do we actually need new instructions >>>>>>> here other than for staging purposes? We don't actually need new >>>>>>> instructions for pseudo-linear SIL to work; we just need to say that we >>>>>>> only enforce pseudo-linearity for non-trivial types. >>>>>>> >>>>>>> If we just want the instruction to be explicit about ownership so that >>>>>>> we can easily distinguish these cases, we can make the rule always >>>>>>> explicit, e.g.: >>>>>>> load [take] %0 : $*MyClass >>>>>>> load [copy] %0 : $*MyClass >>>>>>> load [trivial] %0 : $*Int >>>>>>> >>>>>>> store %0 to [initialization] %1 : $*MyClass >>>>>>> store %0 to [assignment] %1 : $*MyClass >>>>>>> store %0 to [trivial] %1 : $*Int >>>>>>> >>>>>>> John. >>>>>> >>>>>> The reason why I originally suggested to go the load_strong route is >>>>>> that we already have load_weak, load_unowned instructions. If I could >>>>>> add a load_strong instruction, then it would make sense to assign an >>>>>> engineer to do a pass over all 3 of these instructions and combine them >>>>>> into 1 load instruction. That is, first transform into a form amenable >>>>>> for canonicalization and then canonicalize all at once. >>>>>> >>>>>> As you pointed out, both load_unowned and load_weak involve >>>>>> representation changes in type (for instance the change of weak pointers >>>>>> to Optional<T>). Such a change would be against the "spirit" of a load >>>>>> instruction to perform such representation changes versus ownership >>>>>> changes. >>>>>> >>>>>> In terms of the properties that we actually want here, what is important >>>>>> is that we can verify that no non-trivially typed values are loaded in >>>>>> an unsafe unowned manner. That can be done also with ownership flags on >>>>>> load/store. >>>>>> >>>>>> Does this sound reasonable: >>>>>> >>>>>> 1. We introduce two enums that define memory ownership changes, one for >>>>>> load and one for store. Both of these enums will contain a [trivial] >>>>>> ownership. >>>>>> 2. We enforce in the verifier that non-trivial types must have a >>>>>> non-trivial ownership modifier on any memory operations that they are >>>>>> involved in. >>>>> >>>>> Sorry for not being explicit. I will not add new instructions, just >>>>> modifiers. Assuming that this is agreeable to you, I am going to prepare >>>>> a quick additional version of the proposal document. >>>> >>>> That sounds great, thanks. >>>> >>>> John. >
_______________________________________________ swift-dev mailing list swift-dev@swift.org https://lists.swift.org/mailman/listinfo/swift-dev