Re: [swift-dev] [semantic-arc][proposal] High Level ARC Memory Operations

Michael Gottesman via swift-dev Wed, 05 Oct 2016 16:50:21 -0700

> On Oct 5, 2016, at 4:40 PM, Michael Gottesman via swift-dev 
> <swift-dev@swift.org> wrote:
> 
>> 
>> On Oct 4, 2016, at 1:04 PM, John McCall <rjmcc...@apple.com> wrote:
>> 
>>> 
>>> On Sep 30, 2016, at 11:54 PM, Michael Gottesman via swift-dev 
>>> <swift-dev@swift.org> wrote:
>>> 
>>> The document attached below contains the first "Semantic ARC" mini 
>>> proposal: the High Level ARC Memory Operations Proposal.
>>> 
>>> An html rendered version of this markdown document is available at the 
>>> following URL:
>>> 
>>> https://gottesmm.github.io/proposals/high-level-arc-memory-operations.html
>>> 
>>> ----
>>> 
>>> # Summary
>>> 
>>> This document proposes:
>>> 
>>> 1. adding the `load_strong`, `store_strong` instructions to SIL. These can
>>>    only be used with memory locations of `non-trivial` type.
>> 
>> I would really like to avoid using the word "strong" here.  Under the 
>> current proposal, these instructions will be usable with arbitrary 
>> non-trivial types, not just primitive class references.  Even if you think 
>> of an aggregate that happens to contain one or more strong references as 
>> some sort of aggregate strong reference (which is questionable but not 
>> completely absurd), we already have loadable non-strong class references 
>> that this operation would be usable with, like native unowned references.  
>> "load_strong %0 : $*@sil_unowned T" as an operation yielding a scalar 
>> "@sil_unowned T" is ridiculous, and it will only get more ridiculous when we 
>> eventually allow this operation to work with types that are currently 
>> address-only, like weak references.
>> 
>> Brainstorming:
>> 
>> Something like load_copy and store_copy would be a bit unfortunate, since 
>> store_copy doesn't actually copy the source operand and we want to have a 
>> load_copy [take].
>> 
>> load_value and store_value seem excessively generic.  It's not like 
>> non-trivial types aren't values.
>> 
>> One question that comes to mind: do we actually need new instructions here 
>> other than for staging purposes?  We don't actually need new instructions 
>> for pseudo-linear SIL to work; we just need to say that we only enforce 
>> pseudo-linearity for non-trivial types.
>> 
>> If we just want the instruction to be explicit about ownership so that we 
>> can easily distinguish these cases, we can make the rule always explicit, 
>> e.g.:
>>   load [take] %0 : $*MyClass
>>   load [copy] %0 : $*MyClass
>>   load [trivial] %0 : $*Int
>> 
>>   store %0 to [initialization] %1 : $*MyClass
>>   store %0 to [assignment] %1 : $*MyClass
>>   store %0 to [trivial] %1 : $*Int
>> 
>> John.
> 
> The reason why I originally suggested to go the load_strong route is that we 
> already have load_weak, load_unowned instructions. If I could add a 
> load_strong instruction, then it would make sense to assign an engineer to do 
> a pass over all 3 of these instructions and combine them into 1 load 
> instruction. That is, first transform into a form amenable for 
> canonicalization and then canonicalize all at once.
> 
> As you pointed out, both load_unowned and load_weak involve representation 
> changes in type (for instance the change of weak pointers to Optional<T>). 
> Performing such representation changes, as opposed to ownership changes, 
> would be against the "spirit" of a load instruction.
> 
> In terms of the properties that we actually want here, what is important is 
> that we can verify that no non-trivially typed values are loaded in an unsafe 
> unowned manner. That can be done also with ownership flags on load/store.
> 
> Does this sound reasonable:
> 
> 1. We introduce two enums that define memory ownership changes, one for load 
> and one for store. Both of these enums will contain a [trivial] ownership.
> 2. We enforce in the verifier that non-trivial types must have a non-trivial 
> ownership modifier on any memory operations that they are involved in.


Sorry for not being explicit. I will not add new instructions, just modifiers. 
Assuming that this is agreeable to you, I am going to prepare a quick 
additional version of the proposal document.
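
A minimal sketch of points 1 and 2 above, written in Swift rather than in the compiler's C++. The names `LoadOwnership`, `StoreOwnership`, `TypeInfo`, `MemoryOp` and `verify` are hypothetical and exist only for illustration; the qualifier spellings follow John's `[take]`/`[copy]`/`[trivial]` and `[initialization]`/`[assignment]`/`[trivial]` examples. Only the rule stated in point 2 is checked; whether a non-trivial qualifier on a trivial type should also be rejected is left open here.

```swift
// Hypothetical model of the two qualifier enums, one per memory operation.
enum LoadOwnership { case trivial, copy, take }
enum StoreOwnership { case trivial, initialization, assignment }

// Stand-in for a SIL type; `isTrivial` is assumed to be known already.
struct TypeInfo {
    let name: String
    let isTrivial: Bool
}

// A load or store together with its ownership qualifier and accessed type.
enum MemoryOp {
    case load(LoadOwnership, TypeInfo)
    case store(StoreOwnership, TypeInfo)
}

// Verifier rule from point 2: a non-trivial type must carry a non-trivial
// qualifier. Trivial types are accepted with any qualifier in this sketch.
func verify(_ op: MemoryOp) -> Bool {
    switch op {
    case let .load(qualifier, type):
        return type.isTrivial || qualifier != .trivial
    case let .store(qualifier, type):
        return type.isTrivial || qualifier != .trivial
    }
}
```

Under this model, `load [trivial] %0 : $*MyClass` is rejected while `load [copy] %0 : $*MyClass` and `store %0 to [trivial] %1 : $*Int` are accepted.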

> 
> Michael
> 
>> 
>>> 2. banning the use of `load`, `store` on values of `non-trivial` type.
>>> 
>>> This will allow for:
>>> 
>>> 1. eliminating optimizer miscompiles that occur due to releases being moved
>>>    into the region in between a `load`/`retain`, `load`/`release`,
>>>    `store`/`release`. (For a specific example, see the appendix).
>>> 2. modeling `load`/`store` as having `unsafe unowned` ownership semantics.
>>>    This will be enforced via the verifier.
>>> 3. more aggressive ARC code motion.
>>> 
>>> # Definitions
>>> 
>>> ## load_strong
>>> 
>>> We propose three different forms of load_strong differentiated via flags.
>>> First define `load_strong` as follows:
>>> 
>>>     %x = load_strong %x_ptr : $*C
>>> 
>>>       =>
>>> 
>>>     %x = load %x_ptr : $*C
>>>     retain_value %x : $C
>>> 
>>> Then define `load_strong [take]` as:
>>> 
>>>     %x = load_strong [take] %x_ptr : $*Builtin.NativeObject
>>> 
>>>       =>
>>> 
>>>     %x = load %x_ptr : $*Builtin.NativeObject
>>> 
>>> **NOTE** `load_strong [take]` implies that the memory location that was
>>> loaded from no longer owns the result object (i.e. a take is a move).
>>> Loading from the memory location again without reinitialization is illegal.
>>> 
>>> Next we provide `load_strong [guaranteed]`:
>>> 
>>>     %x = load_strong [guaranteed] %x_ptr : $*Builtin.NativeObject
>>>     ...
>>>     fixLifetime(%x)
>>> 
>>>       =>
>>> 
>>>     %x = load %x_ptr : $*Builtin.NativeObject
>>>     ...
>>>     fixLifetime(%x)
>>> 
>>> `load_strong [guaranteed]` implies that in the region before the
>>> fixLifetime, the loaded object is guaranteed semantically to remain alive.
>>> The fixLifetime communicates to the optimizer the location up to which the
>>> value's lifetime is guaranteed to live. An example of where this construct
>>> is useful is when one has a let binding to a class instance `c` that
>>> contains a let field `f`. In that case `c`'s lifetime guarantees `f`'s
>>> lifetime.
>>> 
>>> ## store_strong
>>> 
>>> Define a store_strong as follows:
>>> 
>>>     store_strong %new_x to %x_ptr : $*C
>>> 
>>>        =>
>>> 
>>>     %old_x = load %x_ptr : $*C
>>>     store %new_x to %x_ptr : $*C
>>>     release_value %old_x : $C
>>> 
>>> *NOTE* store_strong is defined as a consuming operation. We also provide
>>> `store_strong [init]` in the case where we know statically that there is no
>>> previous value in the memory location:
>>> 
>>>     store_strong %new_x to [init] %x_ptr : $*C
>>> 
>>>        =>
>>> 
>>>     store %new_x to %x_ptr : $*C
>>> 
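
The definitions above can be sketched as an executable model. This is only an illustration under stated assumptions: `Obj`, `Slot`, `retain`, `release` and the four functions are hypothetical names, and the explicit `refCount` field simulates reference counting rather than using Swift's real runtime.

```swift
// Simulated heap object with an explicit reference count (hypothetical model).
final class Obj {
    let name: String
    var refCount = 1        // models the +1 produced by alloc_ref
    var destroyed = false
    init(_ name: String) { self.name = name }
}

func retain(_ o: Obj) { o.refCount += 1 }

func release(_ o: Obj) {
    o.refCount -= 1
    if o.refCount == 0 { o.destroyed = true }
}

// Simulated memory location that may own one reference.
final class Slot { var value: Obj? = nil }

// load_strong: the load and the retain happen as one operation.
func loadStrong(_ slot: Slot) -> Obj {
    let x = slot.value!
    retain(x)
    return x
}

// load_strong [take]: a move; the slot no longer owns the object afterwards.
func loadStrongTake(_ slot: Slot) -> Obj {
    let x = slot.value!
    slot.value = nil        // reloading without reinitialization would trap
    return x
}

// store_strong: consumes the new value and releases the old one.
func storeStrong(_ newValue: Obj, to slot: Slot) {
    let old = slot.value!
    slot.value = newValue
    release(old)
}

// store_strong [init]: no previous value exists, so nothing is released.
func storeStrongInit(_ newValue: Obj, to slot: Slot) {
    precondition(slot.value == nil, "slot must be uninitialized")
    slot.value = newValue
}
```

In this model a `loadStrong` leaves the object with one more reference than before, while `loadStrongTake` leaves the count unchanged and empties the slot.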
>>> # Implementation
>>> 
>>> ## Goals
>>> 
>>> Our implementation strategy goals are:
>>> 
>>> 1. zero impact on other compiler developers until the feature is fully
>>>    developed. This implies all work will be done behind a flag.
>>> 2. separation of feature implementation from updating passes.
>>> 
>>> Goal 2 will be implemented via a pass that blows up
>>> `load_strong`/`store_strong` right after SILGen.
>>> 
>>> ## Plan
>>> 
>>> We begin by adding initial infrastructure for our development. This means:
>>> 
>>> 1. Adding to SILOptions a disabled by default flag called
>>>  "EnableSILOwnershipModel". This flag will be set by a false by default
>>>  frontend option called "-enable-sil-ownership-model".
>>> 
>>> 2. Bots will be brought up to test the compiler with
>>>    "-enable-sil-ownership-model" set to true. The specific bots are:
>>> 
>>>    * RA-OSX+simulators
>>>    * RA-Device
>>>    * RA-Linux.
>>> 
>>>    The bots will run once a day until the feature is close to completion. 
>>> Then a
>>>    polling model will be followed.
>>> 
>>> Now that change isolation is guaranteed, we develop building blocks for the
>>> optimization:
>>> 
>>> 1. load_strong, store_strong will be added to SIL and IRGen, serialization,
>>> printing, SIL parsing support will be implemented. SILGen will not be 
>>> modified
>>> at this stage.
>>> 
>>> 2. A pass called the "OwnershipModelEliminator" will be implemented. It will
>>> (initially) blow up load_strong/store_strong instructions into their 
>>> constituent
>>> operations.
>>> 
>>> 3. An option called "EnforceSILOwnershipModel" will be added to the
>>> verifier. If the option is set, the verifier will assert if unsafe unowned
>>> loads, stores are used to access memory locations of non-trivial type.
>>> 
>>> Finally, we wire up the building blocks:
>>> 
>>> 1. If SILOption.EnableSILOwnershipModel is true, then the SIL verification
>>>    that runs after SILGen will be performed with EnforceSILOwnershipModel
>>>    set to true.
>>> 2. If SILOption.EnableSILOwnershipModel is true, then the pass manager will
>>>    run the OwnershipModelEliminator pass right after SILGen before the
>>>    normal pass pipeline starts.
>>> 3. SILGen will be changed to emit load_strong, store_strong instructions
>>>    when the EnableSILOwnershipModel flag is set. We will rely on verifier
>>>    failures to guarantee that we are not missing any specific cases.
>>> 
>>> Then once all of the bots are green, we change
>>> SILOption.EnableSILOwnershipModel
>>> to be true by default. After a cooling off period, we move all of the code
>>> behind the SILOwnershipModel flag in front of the flag. We do this so we can
>>> reuse that flag for further SILOwnershipModel changes.
>>> 
>>> ## Optimizer Changes
>>> 
>>> Since the SILOwnershipModel eliminator will eliminate the load_strong,
>>> store_strong instructions right after ownership verification, there will be
>>> no immediate effects on the optimizer and thus the optimizer changes can be
>>> done in parallel with the rest of the ARC optimization work.
>>> 
>>> But, in the long run, we need IRGen to eliminate the load_strong,
>>> store_strong instructions, not the SILOwnershipModel eliminator, so that we
>>> can enforce Ownership invariants all through the SIL pipeline. Thus we will
>>> need to update passes to handle these new instructions. The main optimizer
>>> changes can be separated into the following areas: memory forwarding, dead
>>> stores, ARC optimization. In all of these cases, the necessary changes are
>>> relatively trivial to make. We give a quick taste of two of them:
>>> store->load forwarding and ARC Code Motion.
>>> 
>>> ### store->load forwarding
>>> 
>>> Currently we perform store->load forwarding as follows:
>>> 
>>>     store %x to %x_ptr : $*C
>>>     ... NO SIDE EFFECTS THAT TOUCH X_PTR ...
>>>     %y = load %x_ptr : $*C
>>>     use(%y)
>>> 
>>>       =>
>>> 
>>>     store %x to %x_ptr : $*C
>>>     ... NO SIDE EFFECTS THAT TOUCH X_PTR ...
>>>     use(%x)
>>> 
>>> In a world where we are using load_strong, store_strong, we have to also
>>> consider the ownership implications. *NOTE* Since we are not modifying the
>>> store_strong, `store_strong` and `store_strong [init]` are treated the
>>> same. Thus without any loss of generality, let's consider solely
>>> `store_strong`.
>>> 
>>>     store_strong %x to %x_ptr : $*C
>>>     ... NO SIDE EFFECTS THAT TOUCH X_PTR ...
>>>     %y = load_strong %x_ptr : $*C
>>>     use(%y)
>>> 
>>>       =>
>>> 
>>>     store_strong %x to %x_ptr : $*C
>>>     ... NO SIDE EFFECTS THAT TOUCH X_PTR ...
>>>     strong_retain %x : $C
>>>     use(%x)
>>> 
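
A small sketch of why the forwarded form needs the extra `strong_retain`, assuming a simulated reference count. `Obj`, `Slot`, `beforeForwarding` and `afterForwarding` are hypothetical names, and the store is modeled as the `[init]` form since the two are treated the same here.

```swift
// Simulated object and memory location (hypothetical model, not Swift's ARC).
final class Obj { var refCount = 1 }
final class Slot { var value: Obj? }

// Before forwarding: store_strong followed by load_strong of the same slot.
func beforeForwarding(_ x: Obj, _ slot: Slot) -> Obj {
    slot.value = x          // store_strong: the slot consumes x
    let y = slot.value!     // load_strong: the load ...
    y.refCount += 1         // ... plus its implied retain
    return y
}

// After forwarding: the load is gone, but its retain must be kept on %x.
func afterForwarding(_ x: Obj, _ slot: Slot) -> Obj {
    slot.value = x          // store_strong: the slot consumes x
    x.refCount += 1         // strong_retain %x replaces the load's retain
    return x
}
```

Both functions hand the same object to the use with the same reference count, which is the property the transformation has to preserve.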
>>> ### ARC Code Motion
>>> 
>>> If ARC Code Motion wishes to move `load_strong`, `store_strong`
>>> instructions, it must now consider read/write effects. On the other hand, it
>>> will no longer need to consider the side-effects of destructors when moving
>>> retain/release operations.
>>> 
>>> ### Normal Code Motion
>>> 
>>> Normal code motion will lose some effectiveness since many of the load/store
>>> operations that it used to be able to move now must consider ARC 
>>> information. We
>>> may need to consider running ARC code motion earlier in the pipeline where 
>>> we
>>> normally run Normal Code Motion to ensure that we are able to handle these
>>> cases.
>>> 
>>> ### ARC Optimization
>>> 
>>> The main implication for ARC optimization is that instead of eliminating 
>>> just
>>> retains, releases, it must be able to recognize `load_strong`, 
>>> `store_strong`
>>> and set their flags as appropriate.
>>> 
>>> ### Function Signature Optimization
>>> 
>>> Semantic ARC affects function signature optimization in the context of the 
>>> owned
>>> to guaranteed optimization. Specifically:
>>> 
>>> 1. A `store_strong` must be recognized as a release of the old value that is
>>>    being overwritten. In such a case, we can move the `release` of the old
>>>    value into the caller and change the `store_strong` into a `store_strong
>>>    [init]`.
>>> 2. A `load_strong` must be recognized as a retain in the callee. Then
>>>    function signature optimization will transform the `load_strong` into a
>>>    `load_strong [guaranteed]`. This would require the addition of a new
>>>    `@guaranteed` return value convention.
>>> 
>>> # Appendix
>>> 
>>> ## Partial Initialization of Loadable References in SIL
>>> 
>>> In SIL, a value of non-trivial loadable type is loaded from a memory 
>>> location as
>>> follows:
>>> 
>>>     %x = load %x_ptr : $*S
>>>     ...
>>>     retain_value %x : $S
>>> 
>>> At first glance, this looks reasonable, but in truth there is a hidden
>>> drawback: the partially initialized zone in between the load and the retain
>>> operation. This zone creates a period of time when an "evil optimizer" could
>>> insert an instruction that causes x to be deallocated before the copy is
>>> finished being initialized. Similar issues come up when trying to perform a
>>> store of a non-trivial value into a memory location.
>>> 
>>> Since this sort of partial initialization is allowed in SIL, the optimizer
>>> is forced to be overly conservative when attempting to move releases past
>>> retains lest the release triggers a deinit that destroys a value like `%x`.
>>> Let's look at two concrete examples that show how semantically providing
>>> load_strong, store_strong instructions eliminates this problem.
>>> 
>>> **NOTE** Without any loss of generality, we will speak of values with 
>>> reference
>>> semantics instead of non-trivial values.
>>> 
>>> ## Case Study: Partial Initialization and load_strong
>>> 
>>> ### The Problem
>>> 
>>> Consider the following swift program:
>>> 
>>>     func opaque_call()
>>> 
>>>     final class C {
>>>       var int: Int = 0
>>>       deinit {
>>>         opaque_call()
>>>       }
>>>     }
>>> 
>>>     final class D {
>>>       var int: Int = 0
>>>     }
>>> 
>>>     var GLOBAL_C : C? = nil
>>>     var GLOBAL_D : D? = nil
>>> 
>>>     func useC(_ c: C)
>>>     func useD(_ d: D)
>>> 
>>>     func run() {
>>>         let c = C()
>>>         GLOBAL_C = c
>>>         let d = D()
>>>         GLOBAL_D = d
>>>         useC(c)
>>>         useD(d)
>>>     }
>>> 
>>> Notice that both `C` and `D` have fixed layouts and separate class
>>> hierarchies, but `C`'s deinit has a call to the function `opaque_call` which
>>> may write to `GLOBAL_D` or `GLOBAL_C`. Additionally assume that both `useC`
>>> and `useD` are known to the compiler to not have any effects on instances of
>>> type `D`, `C` respectively and `useC` assigns `nil` to `GLOBAL_C`. Now
>>> consider the following valid SIL lowering for `run`:
>>> 
>>>     sil_global @GLOBAL_D : $D
>>>     sil_global @GLOBAL_C : $C
>>> 
>>>     final class C {
>>>       var x: Int
>>>       deinit
>>>     }
>>> 
>>>     final class D {
>>>       var x: Int
>>>     }
>>> 
>>>     sil @useC : $@convention(thin) (@owned C) -> ()
>>>     sil @useD : $@convention(thin) (@owned D) -> ()
>>> 
>>>     sil @run : $@convention(thin) () -> () {
>>>     bb0:
>>>       %c = alloc_ref $C
>>>       %global_c = global_addr @GLOBAL_C : $*C
>>>       strong_retain %c : $C
>>>       store %c to %global_c : $*C                                           
>>>    (1)
>>> 
>>>       %d = alloc_ref $D
>>>       %global_d = global_addr @GLOBAL_D : $*D
>>>       strong_retain %d : $D
>>>       store %d to %global_d : $*D                                           
>>>    (2)
>>> 
>>>       %c2 = load %global_c : $*C                                            
>>>    (3)
>>>       strong_retain %c2 : $C                                                
>>>    (4)
>>>       %d2 = load %global_d : $*D                                            
>>>    (5)
>>>       strong_retain %d2 : $D                                                
>>>    (6)
>>> 
>>>       %useC_func = function_ref @useC : $@convention(thin) (@owned C) -> ()
>>>       apply %useC_func(%c2) : $@convention(thin) (@owned C) -> ()           
>>>    (7)
>>> 
>>>       %useD_func = function_ref @useD : $@convention(thin) (@owned D) -> ()
>>>       apply %useD_func(%d2) : $@convention(thin) (@owned D) -> ()           
>>>    (8)
>>> 
>>>       strong_release %d : $D                                                
>>>    (9)
>>>       strong_release %c : $C                                                
>>>    (10)
>>>     }
>>> 
>>> Let's optimize this function! First we perform the following operations:
>>> 
>>> 1. Since `(2)` is storing to an identified object that cannot be
>>>    `GLOBAL_C`, we can store to load forward `(1)` to `(3)`.
>>> 2. Since a retain does not block store to load forwarding, we can forward
>>>    `(2)` to `(5)`. But let's, for the sake of argument, assume that the
>>>    optimizer keeps such information as an analysis and does not perform the
>>>    actual store->load forwarding.
>>> 3. Even though we do not forward `(2)` to `(5)`, we can still move `(4)` over
>>>    `(6)` so that `(4)` is right before `(7)`.
>>> 
>>> This yields (using the ' marker to designate that a register has had
>>> store->load forwarding applied to it):
>>> 
>>>     sil @run : $@convention(thin) () -> () {
>>>     bb0:
>>>       %c = alloc_ref $C
>>>       %global_c = global_addr @GLOBAL_C : $*C
>>>       strong_retain %c : $C
>>>       store %c to %global_c : $*C                                           
>>>    (1)
>>> 
>>>       %d = alloc_ref $D
>>>       %global_d = global_addr @GLOBAL_D : $*D
>>>       strong_retain %d : $D
>>>       store %d to %global_d : $*D                                           
>>>    (2)
>>> 
>>>       strong_retain %c : $C                                                 
>>>    (4')
>>>       %d2 = load %global_d : $*D                                            
>>>    (5)
>>>       strong_retain %d2 : $D                                                
>>>    (6)
>>> 
>>>       %useC_func = function_ref @useC : $@convention(thin) (@owned C) -> ()
>>>       apply %useC_func(%c) : $@convention(thin) (@owned C) -> ()            
>>>    (7')
>>> 
>>>       %useD_func = function_ref @useD : $@convention(thin) (@owned D) -> ()
>>>       apply %useD_func(%d2) : $@convention(thin) (@owned D) -> ()           
>>>    (8)
>>> 
>>>       strong_release %d : $D                                                
>>>    (9)
>>>       strong_release %c : $C                                                
>>>    (10)
>>>     }
>>> 
>>> Then by assumption, we know that `%useC_func` does not perform any releases
>>> of any instances of class `D`. Thus `(6)` can be moved past `(7')` and we can
>>> then pair and eliminate `(6)` and `(9)` via the rules of ARC optimization
>>> using the analysis information that `%d2` and `%d` are the same due to the
>>> possibility of performing store->load forwarding. After performing such
>>> transformations, `run` looks as follows:
>>> 
>>>     sil @run : $@convention(thin) () -> () {
>>>     bb0:
>>>       %c = alloc_ref $C
>>>       %global_c = global_addr @GLOBAL_C : $*C
>>>       strong_retain %c : $C
>>>       store %c to %global_c : $*C                                           
>>>    (1)
>>> 
>>>       %d = alloc_ref $D
>>>       %global_d = global_addr @GLOBAL_D : $*D
>>>       strong_retain %d : $D
>>>       store %d to %global_d : $*D
>>> 
>>>       %d2 = load %global_d : $*D                                            
>>>    (5)
>>>       strong_retain %c : $C                                                 
>>>    (4')
>>>       %useC_func = function_ref @useC : $@convention(thin) (@owned C) -> ()
>>>       apply %useC_func(%c) : $@convention(thin) (@owned C) -> ()            
>>>    (7')
>>> 
>>>       %useD_func = function_ref @useD : $@convention(thin) (@owned D) -> ()
>>>       apply %useD_func(%d2) : $@convention(thin) (@owned D) -> ()           
>>>    (8)
>>> 
>>>       strong_release %c : $C                                                
>>>    (10)
>>>     }
>>> 
>>> Now by assumption, we know that `%useD_func` does not touch any instances of
>>> class `C` and `%c` does not contain any ivars of type `D` and is final so
>>> none can be added. At first glance, this seems to suggest that we can move
>>> `(10)` before `(8)` and then pair/eliminate `(4')` and `(10)`. But is this a
>>> safe optimization to perform? Absolutely not! Why? Remember that since
>>> `%useC_func` assigns `nil` to `GLOBAL_C`, after `(7')`, `%c` could have a
>>> reference count of 1. Thus `(10)` _may_ invoke the destructor of `C`. Since
>>> this destructor calls an opaque function that _could_ potentially write to
>>> `GLOBAL_D`, we may be passing `%d2`, an already deallocated object, to
>>> `%useD_func`, an illegal optimization!
>>> 
>>> Let's think a bit more about this example and consider it at the
>>> language level. Remember that while Swift's deinits are not asynchronous, we
>>> do not allow for user level code to create dependencies from the body of the
>>> destructor into the normal control flow that has called it. This means that
>>> there are two valid results of this code:
>>> 
>>> - Operation Sequence 1: No optimization is performed and `%d2` is passed to
>>>   `%useD_func`.
>>> - Operation Sequence 2: We shorten the lifetime of `%c` before `%useD_func` 
>>> and
>>>    a different instance of `$D` is passed into `%useD_func`.
>>> 
>>> The fact that 1 occurs without optimization is just a result of an
>>> implementation detail of SILGen. 2 is also a valid sequence of operations.
>>> 
>>> Given that:
>>> 
>>> 1. As a principle, the optimizer does not consider such dependencies to 
>>> avoid
>>>    being overly conservative.
>>> 2. We provide constructs to ensure appropriate lifetimes via the usage of
>>>    constructs such as fix_lifetime.
>>> 
>>> We need to figure out how to express our optimization such that 2
>>> happens. Remember that one of the optimizations that we performed at the
>>> beginning was to move `(6)` over `(7')`, i.e., transform this:
>>> 
>>>       %d = alloc_ref $D
>>>       %global_d_addr = global_addr @GLOBAL_D : $*D
>>>       %d2 = load %global_d_addr : $*D            (5)
>>>       strong_retain %d2 : $D                     (6)
>>> 
>>>       // Call the user functions passing in the instances that we loaded
>>>       // from the globals.
>>>       %useC_func = function_ref @useC : $@convention(thin) (@owned C) -> ()
>>>       apply %useC_func(%c) : $@convention(thin) (@owned C) -> ()            
>>>     (7')
>>> 
>>> into:
>>> 
>>>       %global_d_addr = global_addr @GLOBAL_D : $*D
>>>       %d2 = load %global_d_addr : $*D             (5)
>>> 
>>>       // Call the user functions passing in the instances that we loaded
>>>       // from the globals.
>>>       %useC_func = function_ref @useC : $@convention(thin) (@owned C) -> ()
>>>       apply %useC_func(%c) : $@convention(thin) (@owned C) -> ()            
>>>     (7')
>>>       strong_retain %d2 : $D                      (6)
>>> 
>>> This transformation in Swift corresponds to transforming:
>>> 
>>>       let d = GLOBAL_D
>>>       useC(c)
>>> 
>>> to:
>>> 
>>>       let d_raw = load_d_value(GLOBAL_D)
>>>       useC(c)
>>>       let d = take_ownership_of_d(d_raw)
>>> 
>>> This is clearly an instance where we have moved a side-effect in between the
>>> loading of the data and the taking ownership of such data, that is before 
>>> the
>>> `let` is fully initialized. What if instead of just moving the retain, we 
>>> moved
>>> the entire let statement? This would then result in the following swift 
>>> code:
>>> 
>>>       useC(c)
>>>       let d = GLOBAL_D
>>> 
>>> and would correspond to the following SIL snippet:
>>> 
>>>       %global_d_addr = global_addr @GLOBAL_D : $*D
>>> 
>>>       // Call the user functions passing in the instances that we loaded
>>>       // from the globals.
>>>       %useC_func = function_ref @useC : $@convention(thin) (@owned C) -> ()
>>>       apply %useC_func(%c) : $@convention(thin) (@owned C) -> ()            
>>>     (7')
>>>       %d2 = load %global_d_addr : $*D                                       
>>>   (5)
>>>       strong_retain %d2 : $D                                                
>>>   (6)
>>> 
>>> Moving the load with the strong_retain to ensure that the full 
>>> initialization is
>>> performed even after code motion causes our SIL to look as follows:
>>> 
>>>     sil @run : $@convention(thin) () -> () {
>>>     bb0:
>>>       %c = alloc_ref $C
>>>       %global_c = global_addr @GLOBAL_C : $*C
>>>       strong_retain %c : $C
>>>       store %c to %global_c : $*C                                           
>>>    (1)
>>> 
>>>       %d = alloc_ref $D
>>>       %global_d = global_addr @GLOBAL_D : $*D
>>>       strong_retain %d : $D
>>>       store %d to %global_d : $*D
>>> 
>>>       strong_retain %c : $C                                                 
>>>    (4')
>>>       %useC_func = function_ref @useC : $@convention(thin) (@owned C) -> ()
>>>       apply %useC_func(%c) : $@convention(thin) (@owned C) -> ()            
>>>    (7')
>>> 
>>>       %d2 = load %global_d : $*D                                            
>>>    (5)
>>>       %useD_func = function_ref @useD : $@convention(thin) (@owned D) -> ()
>>>       apply %useD_func(%d2) : $@convention(thin) (@owned D) -> ()           
>>>    (8)
>>> 
>>>       strong_release %c : $C                                                
>>>    (10)
>>>     }
>>> 
>>> Giving us the exact result that we want: Operation Sequence 2!
>>> 
>>> ### Defining load_strong
>>> 
>>> Given that we wish the load and retain to be tightly coupled together, it is
>>> natural to express this operation as a `load_strong` instruction. Let's
>>> define the `load_strong` instruction as follows:
>>> 
>>>     %1 = load_strong %0 : $*C
>>> 
>>>       =>
>>> 
>>>     %1 = load %0 : $*C
>>>     retain_value %1 : $C
>>> 
>>> Now let's transform our initial example to use this instruction. We get the
>>> following SIL:
>>> 
>>>     sil @run : $@convention(thin) () -> () {
>>>     bb0:
>>>       %c = alloc_ref $C
>>>       %global_c = global_addr @GLOBAL_C : $*C
>>>       strong_retain %c : $C
>>>       store %c to %global_c : $*C                                           
>>>    (1)
>>> 
>>>       %d = alloc_ref $D
>>>       %global_d = global_addr @GLOBAL_D : $*D
>>>       strong_retain %d : $D
>>>       store %d to %global_d : $*D                                           
>>>    (2)
>>> 
>>>       %c2 = load_strong %global_c : $*C                                     
>>>    (3)
>>>       %d2 = load_strong %global_d : $*D                                     
>>>    (5)
>>> 
>>>       %useC_func = function_ref @useC : $@convention(thin) (@owned C) -> ()
>>>       apply %useC_func(%c2) : $@convention(thin) (@owned C) -> ()           
>>>    (7)
>>> 
>>>       %useD_func = function_ref @useD : $@convention(thin) (@owned D) -> ()
>>>       apply %useD_func(%d2) : $@convention(thin) (@owned D) -> ()           
>>>    (8)
>>> 
>>>       strong_release %d : $D                                                
>>>    (9)
>>>       strong_release %c : $C                                                
>>>    (10)
>>>     }
>>> 
>>> We then perform the previous code motion:
>>> 
>>>     sil @run : $@convention(thin) () -> () {
>>>     bb0:
>>>       %c = alloc_ref $C
>>>       %global_c = global_addr @GLOBAL_C : $*C
>>>       strong_retain %c : $C
>>>       store %c to %global_c : $*C                                           
>>>    (1)
>>> 
>>>       %d = alloc_ref $D
>>>       %global_d = global_addr @GLOBAL_D : $*D
>>>       strong_retain %d : $D
>>>       store %d to %global_d : $*D                                           
>>>    (2)
>>> 
>>>       %c2 = load_strong %global_c : $*C                                     
>>>    (3)
>>>       %useC_func = function_ref @useC : $@convention(thin) (@owned C) -> ()
>>>       apply %useC_func(%c2) : $@convention(thin) (@owned C) -> ()           
>>>    (7)
>>>       strong_release %d : $D                                                
>>>    (9)
>>> 
>>>       %d2 = load_strong %global_d : $*D                                     
>>>    (5)
>>>       %useD_func = function_ref @useD : $@convention(thin) (@owned D) -> ()
>>>       apply %useD_func(%d2) : $@convention(thin) (@owned D) -> ()           
>>>    (8)
>>>       strong_release %c : $C                                                
>>>    (10)
>>>     }
>>> 
>>> We then would like to eliminate `(9)` and `(10)` by pairing them with `(3)`
>>> and `(5)`. Can we still do so? One way we could do this is by introducing
>>> the `[take]` flag. The `[take]` flag on a load_strong says that one is
>>> semantically loading a value from a memory location and taking ownership of
>>> the value, thus eliding the retain. In terms of SIL this flag is defined as:
>>> 
>>>     %x = load_strong [take] %x_ptr : $*C
>>> 
>>>       =>
>>> 
>>>     %x = load %x_ptr : $*C
>>> 
>>> Why do we care about having such a `load_strong [take]` instruction when we
>>> could just use a `load`? The reason is that a normal `load` has unsafe
>>> unowned ownership (i.e. it has no implications on ownership). We would like
>>> for memory that has non-trivial type to only be loadable via instructions
>>> that maintain said ownership. We will allow for casting to trivial types as
>>> usual to provide such access if it is required.
>>> 
>>> Thus we have achieved the desired result:
>>> 
>>>     sil @run : $@convention(thin) () -> () {
>>>     bb0:
>>>       %c = alloc_ref $C
>>>       %global_c = global_addr @GLOBAL_C : $*C
>>>       strong_retain %c : $C
>>>       store %c to %global_c : $*C                                           
>>>    (1)
>>> 
>>>       %d = alloc_ref $D
>>>       %global_d = global_addr @GLOBAL_D : $*D
>>>       strong_retain %d : $D
>>>       store %d to %global_d : $*D                                           
>>>    (2)
>>> 
>>>       %c2 = load_strong [take] %global_c : $*C                              
>>>    (3)
>>>       %useC_func = function_ref @useC : $@convention(thin) (@owned C) -> ()
>>>       apply %useC_func(%c2) : $@convention(thin) (@owned C) -> ()           
>>>    (7)
>>> 
>>>       %d2 = load_strong [take] %global_d : $*D                              
>>>    (5)
>>>       %useD_func = function_ref @useD : $@convention(thin) (@owned D) -> ()
>>>       apply %useD_func(%d2) : $@convention(thin) (@owned D) -> ()           
>>>    (8)
>>>     }
>>> 
>>> _______________________________________________
>>> swift-dev mailing list
>>> swift-dev@swift.org <mailto:swift-dev@swift.org>
>>> https://lists.swift.org/mailman/listinfo/swift-dev 
>>> <https://lists.swift.org/mailman/listinfo/swift-dev>
