> On Aug 4, 2016, at 11:20 AM, Johannes Neubauer <[email protected]> wrote:
>
>> On Aug 4, 2016, at 5:26 PM, Matthew Johnson via swift-evolution
>> <[email protected]> wrote:
>>
>>> On Aug 4, 2016, at 9:39 AM, Joe Groff <[email protected]> wrote:
>>>
>>>> On Aug 3, 2016, at 8:46 PM, Chris Lattner <[email protected]> wrote:
>>>>
>>>> On Aug 3, 2016, at 7:57 PM, Joe Groff <[email protected]> wrote:
>>>>>>>>
>>>>>>>> a. We indirect automatically based on some heuristic, as an
>>>>>>>> optimization.
>>>>>>
>>>>>> I weakly disagree with this, because it is important that we provide a
>>>>>> predictable model. I’d rather the user get what they write, and tell
>>>>>> people to write ‘indirect’ as a performance tuning option. “Too magic”
>>>>>> is bad.
>>>>>
>>>>> I think 'indirect' structs with a heuristic default are important to the
>>>>> way people are writing Swift in practice. We've seen many users fully
>>>>> invest in value-semantics types because they want the benefits of
>>>>> isolated state without appreciating the code size and performance
>>>>> impacts. Furthermore, implementing 'indirect' by hand is a lot of
>>>>> boilerplate. Putting indirectness entirely in users' hands feels to me a
>>>>> lot like the "value if word sized, const& if struct" heuristics C++ makes
>>>>> you internalize, since there are similar heuristics where 'indirect' is
>>>>> almost always a win in Swift too.
>>>>
>>>> I understand much of your motivation, but I still disagree with your
>>>> conclusion. I see this as exactly analogous to the situation and
>>>> discussion when we added indirect to enums. At the time, some argued for
>>>> a magic model where the compiler figured out what to do in the most
>>>> common “obvious” cases.
>>>>
>>>> We agreed to use our current model though because:
>>>> 1) Better to be explicit about allocations & indirection than implicit.
>>>> 2) The compiler can guide the user in the “obvious” case to add the
>>>> keyword with a fixit, preserving the discoverability / ease of use.
>>>> 3) When indirection is necessary, there are choices to make about where
>>>> the best place to do it is.
>>>> 4) In the most common case, the “boilerplate” is a single “indirect”
>>>> keyword added to the enum decl itself. In the less common case, you want
>>>> the “boilerplate” so that you know where the indirections are happening.
>>>>
>>>> Overall, I think this model has worked well for enums and I’m still very
>>>> happy with it. If you generalize it to structs, you also have to consider
>>>> that this should be part of a larger model that includes better support
>>>> for COW. I think it would be really unfortunate to “magically indirect”
>>>> structs, when the right answer may actually be to COW them instead. I’d
>>>> rather have a model where someone can use:
>>>>
>>>> // simple, predictable, always inline, slow in some cases.
>>>> struct S1 { … }
>>>>
>>>> And then upgrade to one of:
>>>>
>>>> indirect struct S2 { … }
>>>> cow struct S3 { … }
>>>>
>>>> depending on the structure of their data. In any case, to reiterate, this
>>>> really isn’t the time to have this debate, since it is clearly outside of
>>>> stage 1.
>>>
>>> In my mind, indirect *is* cow. An indirect struct without value semantics
>>> is a class, so there would be no reason to implement 'indirect' for
>>> structs without providing copy-on-write behavior.
>>
>> This is my view as well. Chris, what is the distinction in your mind?
>>
>>> I believe that the situation with structs and enums is also different.
>>> Indirecting enums has a bigger impact on interface because they enable
>>> recursive data structures, and while there are places where indirecting a
>>> struct may make new recursion possible, that's a much rarer reason to
>>> introduce indirectness for structs.
>>> Performance and code size are the more
>>> common reasons, and we've described how to build COW boxes manually to
>>> work around performance problems at the last two years' WWDC. There are
>>> pretty good heuristics for when indirection almost always beats inline
>>> storage: once you have more than one refcounted field, passing around a
>>> box and retaining once becomes cheaper than retaining the fields
>>> individually. Once you exceed the fixed-size buffer threshold of three
>>> words, indirecting some or all of your fields becomes necessary to avoid
>>> falling off a cliff in unspecialized generic or protocol-type-based code.
>>> Considering that we hope to explore other layout optimizations, such as
>>> automatically reordering fields to minimize padding, and that, as with
>>> padding, there are simple rules for indirecting that can be mechanically
>>> followed to get good results in the 99% case, it seems perfectly
>>> reasonable to me to automate this.
>>>
>>> -Joe
>>
>> I think everyone is making good points in this discussion. Predictability
>> is an important value, but so is default performance. To some degree there
>> is a natural tension between them, but I think it can be mitigated.
>>
>> Swift relies so heavily on the optimizer for performance that I don’t
>> think the default performance is ever going to be perfectly predictable.
>> But that’s actually a good thing, as this allows the compiler to provide
>> *better* performance for unannotated code than it would otherwise be able
>> to do. We should strive to make the default characteristics, behaviors,
>> heuristics, etc. as predictable as possible without compromising the goal
>> of good performance by default. We’re already pretty far down this path.
>> It’s not clear to me why indirect value types would be treated any
>> differently.
>> I don’t think anyone will complain as long as it is very rare
>> for performance to be *worse* than the 100% predictable choice (always
>> inline in this case).
>>
>> It seems reasonable to me to expect developers who are reasoning about
>> relatively low-level performance details (i.e. not big-O performance) to
>> understand some lower-level details of the language defaults. It is also
>> important to offer tools for developers to take direct, manual control
>> when desired, to make performance and behavior as predictable as possible.
>>
>> For example, if we commit to and document the size of the inline
>> existential buffer, it is possible to reason about whether or not a value
>> type is small enough to fit. If the indirection heuristic is relatively
>> simple, such as exceeding the inline buffer size or having more than one
>> refcounted field (including types implemented with CoW), the default
>> behavior will still be reasonably predictable. These commitments don’t
>> necessarily need to cover *every* case and don’t necessarily need to
>> happen immediately, but hopefully the language will reach a stage of
>> maturity where the core team feels confident in committing to some of the
>> details that are relevant to common use cases.
>>
>> We just need to also support users who want / need complete predictability
>> and optimal performance for their specific use case, by allowing opt-in
>> annotations that offer more precise control.
>
> I agree with this. First: IMHO indirect *should be* CoW, but currently it
> is not. If a value does not fit into the value buffer of an existential
> container, the value will be put onto the heap. If you store the same
> value into a second existential container (via an assignment to a variable
> of protocol type), it will be copied and put *as a second indirectly
> stored value* onto the heap, although no write has happened at all.
> Arnold
> Schwaighofer explained this very well in his talk at WWDC 2016 (if you
> need a link, just ask me).
>
> If there is to be an automatic mechanism for indirect storage *and* CoW
> (which I would love), of course there have to be "tradeoff heuristics" for
> when to store a value directly and when to use indirect storage.
> Furthermore, there should be a *unique value pool* for each value type
> where all (currently used) values of that type are stored (uniquely). I
> would even prefer that the "tradeoff heuristics" are applied up front by
> the compiler per type, not per variable. That means Swift would always use
> a container for value types, but there would be two kinds of containers:
> the value container and the existential container. The existential
> container stays as it is. For small values (at most as big as the value
> buffer), the value container is exactly as big as it needs to be to store
> a value of the given type. If the value is bigger than the value buffer
> (or has more than one field of reference type), the value container for
> this type is only as big as a reference, because values of this type will
> then **always** be stored on the heap with CoW. This way I can always
> assign a value to a variable of protocol type, since the value (or
> reference) will fit into the value buffer of the existential container.
> Additionally, CoW is available automatically for all types for which it
> "makes sense" (of course, annotations should be available to revert to the
> current behavior if someone does not like this automatism). Last but not
> least, using the *unique value pool* for all value types that fall into
> the CoW category, this will be very space efficient.
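For readers who have not seen the WWDC material referenced above: the by-hand pattern that such an automatic CoW mechanism would generate is roughly the following. This is only a sketch; the type and property names are illustrative, not from any proposal, though `isKnownUniquelyReferenced` is the real standard-library uniqueness check used for copy-on-write.

```swift
// Out-of-line storage: the struct's fields live in a single class box.
final class Storage {
    var values: [Int]
    init(values: [Int]) { self.values = values }
}

// A value type backed by shared storage with copy-on-write semantics:
// copies of `BigValue` share one Storage until the first mutation.
struct BigValue {
    private var storage: Storage

    init(values: [Int]) {
        self.storage = Storage(values: values)
    }

    var values: [Int] {
        get { return storage.values }
        set {
            // Clone the storage only if it is shared with another copy.
            if !isKnownUniquelyReferenced(&storage) {
                storage = Storage(values: storage.values)
            }
            storage.values = newValue
        }
    }
}
```

Assigning a `BigValue` to another variable only retains the box; the allocation and copy happen lazily on the first write through a shared reference.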
>
> Of course, if you create a new value of such a CoW type, you first need an
> *atomic lookup-and-set operation* in the value pool to check whether the
> value is already there (therefore a good (default) implementation of
> equality and hashing is a prerequisite), and then either use the available
> value or, in the other case, add the new value to the pool.
>
> Such a value pool could even be used system-wide (some languages do this
> for strings, ints, and other value types). These values have to be evicted
> when their reference count drops to `0`. For some values, permanent
> storage, or cache-like storage for some time even while they are not
> currently referenced, could be implemented in order to reduce heap
> allocations (e.g. Java does this for primitive-type wrapper instances used
> in boxing and unboxing).
>
> I would really love this. It would affect ABI, so it is a (potential)
> candidate for Swift 4 phase 1, right?
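The lookup-and-set Johannes describes is essentially value interning. A deliberately simplified sketch (all names hypothetical; this version is neither atomic nor evicting, so a real implementation would still need the atomic operation and refcount-based eviction described above):

```swift
// A heap box holding one interned value.
final class Box<Value> {
    let value: Value
    init(_ value: Value) { self.value = value }
}

// A per-type pool: equal values share a single box, found by
// hashing and equality, exactly as the prerequisite above requires.
struct ValuePool<Value: Hashable> {
    private var boxes: [Value: Box<Value>] = [:]

    // Return the shared box for `value`, creating it on first use.
    mutating func intern(_ value: Value) -> Box<Value> {
        if let existing = boxes[value] {
            return existing
        }
        let box = Box(value)
        boxes[value] = box
        return box
    }
}
```

Interning the same value twice returns the identical box (`===`), which is what makes the pool space-efficient at the cost of the lookup on every creation.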
I know some Java VM implementations have attempted global uniquing of strings, but from what I've heard, nobody has done it in a way that's worth the performance and complexity tradeoffs.

-Joe
_______________________________________________
swift-evolution mailing list
[email protected]
https://lists.swift.org/mailman/listinfo/swift-evolution
