> On 05.08.2016 at 17:17, Joe Groff <[email protected]> wrote:
>
>> On Aug 4, 2016, at 11:31 AM, Johannes Neubauer <[email protected]> wrote:
>>
>>> On 04.08.2016 at 20:21, Joe Groff <[email protected]> wrote:
>>>
>>>> On Aug 4, 2016, at 11:20 AM, Johannes Neubauer <[email protected]> wrote:
>>>>
>>>>> On 04.08.2016 at 17:26, Matthew Johnson via swift-evolution <[email protected]> wrote:
>>>>>
>>>>>> On Aug 4, 2016, at 9:39 AM, Joe Groff <[email protected]> wrote:
>>>>>>
>>>>>>> On Aug 3, 2016, at 8:46 PM, Chris Lattner <[email protected]> wrote:
>>>>>>>
>>>>>>> On Aug 3, 2016, at 7:57 PM, Joe Groff <[email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>> a. We indirect automatically based on some heuristic, as an optimization.
>>>>>>>>>
>>>>>>>>> I weakly disagree with this, because it is important that we provide a predictable model. I’d rather the user get what they write, and tell people to write ‘indirect’ as a performance-tuning option. “Too magic” is bad.
>>>>>>>>
>>>>>>>> I think 'indirect' structs with a heuristic default are important to the way people are writing Swift in practice. We've seen many users fully invest in value-semantics types, because they want the benefits of isolated state, without appreciating the code-size and performance impacts. Furthermore, implementing 'indirect' by hand is a lot of boilerplate. Putting indirectness entirely in users' hands feels to me a lot like the "value if word-sized, const& if struct" heuristics C++ makes you internalize, since there are similar heuristics where 'indirect' is almost always a win in Swift too.
>>>>>>>
>>>>>>> I understand much of your motivation, but I still disagree with your conclusion. I see this as exactly analogous to the situation and discussion when we added indirect to enums.
>>>>>>> At the time, some argued for a magic model where the compiler figured out what to do in the most common “obvious” cases.
>>>>>>>
>>>>>>> We agreed to use our current model though because:
>>>>>>> 1) Better to be explicit about allocations & indirection than implicit.
>>>>>>> 2) The compiler can guide the user in the “obvious” case to add the keyword with a fixit, preserving the discoverability / ease of use.
>>>>>>> 3) When indirection is necessary, there are choices to make about where the best place to do it is.
>>>>>>> 4) In the most common case, the “boilerplate” is a single “indirect” keyword added to the enum decl itself. In the less common case, you want the “boilerplate” so that you know where the indirections are happening.
>>>>>>>
>>>>>>> Overall, I think this model has worked well for enums and I’m still very happy with it. If you generalize it to structs, you also have to consider that this should be part of a larger model that includes better support for COW. I think it would be really unfortunate to “magically indirect” structs, when the right answer may actually be to COW them instead. I’d rather have a model where someone can use:
>>>>>>>
>>>>>>> // simple, predictable, always inline, slow in some cases.
>>>>>>> struct S1 { … }
>>>>>>>
>>>>>>> And then upgrade to one of:
>>>>>>>
>>>>>>> indirect struct S2 { … }
>>>>>>> cow struct S3 { … }
>>>>>>>
>>>>>>> depending on the structure of their data. In any case, to reiterate, this really isn’t the time to have this debate, since it is clearly outside of stage 1.
>>>>>>
>>>>>> In my mind, indirect *is* cow. An indirect struct without value semantics is a class, so there would be no reason to implement 'indirect' for structs without providing copy-on-write behavior.
>>>>>
>>>>> This is my view as well. Chris, what is the distinction in your mind?
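For context, the enum model being discussed is the existing `indirect` keyword, which the compiler suggests via fixit when a recursive enum cannot have finite inline size. A minimal sketch (`Expr` and `evaluate` are illustrative names, not from the thread):

```swift
// Without `indirect`, this recursive enum is rejected: its inline
// size would be infinite. The compiler offers a fixit to add the
// keyword, which boxes the associated values on the heap.
indirect enum Expr {
    case literal(Int)
    case add(Expr, Expr)
    case multiply(Expr, Expr)
}

func evaluate(_ e: Expr) -> Int {
    switch e {
    case .literal(let n): return n
    case .add(let l, let r): return evaluate(l) + evaluate(r)
    case .multiply(let l, let r): return evaluate(l) * evaluate(r)
    }
}

// 2 + (3 * 4)
let expr = Expr.add(.literal(2), .multiply(.literal(3), .literal(4)))
print(evaluate(expr)) // 14
```

The keyword can also be placed on a single case (`indirect case add(Expr, Expr)`) when only some payloads need boxing, which is the "choices about where to indirect" point above.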
>>>>>> I believe that the situation with structs and enums is also different. Indirecting enums has a bigger impact on interface because they enable recursive data structures, and while there are places where indirecting a struct may make new recursion possible, that's a much rarer reason to introduce indirectness for structs. Performance and code size are the more common reasons, and we've described how to build COW boxes manually to work around performance problems at the last two years' WWDC. There are pretty good heuristics for when indirection almost always beats inline storage: once you have more than one refcounted field, passing around a box and retaining once becomes cheaper than retaining the fields individually. Once you exceed the fixed-size buffer threshold of three words, indirecting some or all of your fields becomes necessary to avoid falling off a cliff in unspecialized generic or protocol-type-based code. Considering that we hope to explore other layout optimizations, such as automatically reordering fields to minimize padding, and that, as with padding, there are simple rules for indirecting that can be mechanically followed to get good results in the 99% case, it seems perfectly reasonable to me to automate this.
>>>>>>
>>>>>> -Joe
>>>>>
>>>>> I think everyone is making good points in this discussion. Predictability is an important value, but so is default performance. To some degree there is a natural tension between them, but I think it can be mitigated.
>>>>>
>>>>> Swift relies so heavily on the optimizer for performance that I don’t think the default performance is ever going to be perfectly predictable. But that’s actually a good thing, because it allows the compiler to provide *better* performance for unannotated code than it would otherwise be able to.
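The manual COW box Joe refers to is the technique from Apple's WWDC performance talks; a minimal sketch (type and member names are illustrative, `isKnownUniquelyReferenced` is the real standard-library check):

```swift
// A hand-rolled copy-on-write box: `Box` is a private reference
// type, and the struct copies its storage only when it is about to
// mutate state that is shared with another value.
final class Box<T> {
    var value: T
    init(_ value: T) { self.value = value }
}

struct Path {
    private var storage = Box<[Double]>([])

    var points: [Double] { return storage.value }

    mutating func append(_ p: Double) {
        // Copy only if someone else also holds this buffer.
        if !isKnownUniquelyReferenced(&storage) {
            storage = Box(storage.value)
        }
        storage.value.append(p)
    }
}

var a = Path()
a.append(1.0)
var b = a          // shares storage; no copy yet
b.append(2.0)      // the copy happens here, on first write
print(a.points)    // [1.0]  -- a is unaffected by b's mutation
print(b.points)    // [1.0, 2.0]
```

This is exactly the boilerplate the proposal would like a struct-level `indirect`/`cow` annotation (or heuristic) to generate automatically.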
>>>>> We should strive to make the default characteristics, behaviors, heuristics, etc. as predictable as possible without compromising the goal of good performance by default. We’re already pretty far down this path. It’s not clear to me why indirect value types would be treated any differently. I don’t think anyone will complain as long as it is very rare for performance to be *worse* than the 100% predictable choice (always inline, in this case).
>>>>>
>>>>> It seems reasonable to me to expect developers who are reasoning about relatively low-level performance details (i.e. not big-O performance) to understand some lower-level details of the language defaults. It is also important to offer tools for developers to take direct, manual control when desired, to make performance and behavior as predictable as possible.
>>>>>
>>>>> For example, if we commit to and document the size of the inline existential buffer, it is possible to reason about whether or not a value type is small enough to fit. If the indirection heuristic is relatively simple - such as exceeding the inline buffer size, having more than one ref-counted field (including types implemented with CoW), etc. - the default behavior will still be reasonably predictable. These commitments don’t necessarily need to cover *every* case and don’t necessarily need to happen immediately, but hopefully the language will reach a stage of maturity where the core team feels confident in committing to some of the details that are relevant to common use cases.
>>>>>
>>>>> We just need to also support users that want / need complete predictability and optimal performance for their specific use case by allowing opt-in annotations that offer more precise control.
>>>>
>>>> I agree with this. First: IMHO indirect *should be* CoW, but currently it is not.
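The kind of reasoning Matthew describes is already possible today with `MemoryLayout`. A sketch, assuming a 64-bit platform (the protocol and struct names are illustrative; the 3-word inline buffer and 40-byte existential layout match the current implementation, not a documented guarantee):

```swift
protocol Drawable {}

struct Small: Drawable {   // 2 words (16 bytes): fits the 3-word buffer
    var x, y: Int
}

struct Large: Drawable {   // 4 words (32 bytes): too big for the buffer,
    var x, y, z, w: Int    // so it is boxed on the heap when stored as
}                          // a Drawable

// The existential container itself is a fixed 5 words on 64-bit:
// a 3-word inline value buffer, plus type metadata and one
// protocol witness table pointer.
print(MemoryLayout<Drawable>.size)  // 40 on 64-bit
print(MemoryLayout<Small>.size)     // 16 -> stored inline
print(MemoryLayout<Large>.size)     // 32 -> stored indirectly
```

If the buffer size were a documented commitment, a heuristic like "indirect any struct whose size exceeds `3 * MemoryLayout<Int>.size`" would be mechanically checkable by developers.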
>>>> If a value does not fit into the value buffer of an existential container, the value will be put onto the heap. If you store the same value into a second existential container (via an assignment to a variable of protocol type), it will be copied and put onto the heap *as a second indirectly stored value*, although no write has happened at all. Arnold Schwaighofer explained this very well in his talk at WWDC 2016 (if you need a link, just ask me).
>>>>
>>>> If there is to be an automatic mechanism for indirect storage *and* CoW (which I would love), of course there have to be “tradeoff heuristics” for when to store a value directly and when to use indirect storage. Furthermore, there should be a *unique value pool* for each value type, where all (currently used) values of that type are stored (uniquely). I would even prefer that the “tradeoff heuristics” be applied upfront by the compiler per type, not per variable. That means Swift would always use a container for value types, but there would be two kinds of containers: the value container and the existential container. The existential container stays as it is. For small values (at most as big as the value buffer), the value container is as big as it needs to be to store a value of the given type. If the value is bigger than the value buffer (or has more than one association to a reference type), the value container for this type is only as big as a reference, because such types would then **always** be stored on the heap with CoW. This way I can always assign a value to a variable typed with a protocol, since the value (or reference) will fit into the value buffer of the existential container. Additionally, CoW is available automatically for all types for which it “makes sense” (of course, annotations should be available to switch back to the current “behavior” if someone does not like this automatism).
>>>> Last but not least, using the *unique value pool* for all value types that fall into the CoW category will be very space-efficient.
>>>>
>>>> Of course, if you create a new value of such a CoW type, you first need an *atomic lookup-and-set operation* on the value pool, checking whether the value is already there (therefore a good (default) implementation of equality and hashing is a prerequisite), and then either using the available value or, in the other case, adding the new value to the pool.
>>>>
>>>> Such a value pool could even be used system-wide (some languages do this for strings, ints, and other value types). These values have to be evicted when their reference count drops to `0`. For some values, permanent storage - or storage for some time even while they are not currently referenced, as in a cache - could be implemented in order to reduce heap allocations (e.g. Java does this for primitive-type wrapper instances used in boxing and unboxing).
>>>>
>>>> I would really love this. It would affect the ABI, so it is a (potential) candidate for Swift 4 phase 1, right?
>>>
>>> I know some Java VM implementations have attempted global uniquing of strings, but from what I've heard, nobody has done it in a way that's worth the performance and complexity tradeoffs.
>>
>> To my knowledge, no other language than Swift offers this form of custom value types that can implement protocols, and the need for CoW for “big values” is apparent. So why do you think that, just because the tradeoff did not pay off in other languages (like Java) - which have only a limited set of fixed value types, a completely different memory model, and a virtual machine instead of LLVM - it would not be worth evaluating in Swift?
>
> Strings are immutable in Java, so they are effectively value types (if not for their identity, which you can't rely on anyway).
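The *unique value pool* being proposed could be approximated in library code roughly as follows. This is purely an illustrative sketch of the lookup-and-set idea: real support would need compiler integration, true atomicity, and weak references to evict entries when the reference count hits zero; here an `NSLock` and a dictionary stand in, and all names are invented:

```swift
import Foundation

// One pool per type, mapping each value to its single pooled copy.
// Hashable supplies the equality/hashing prerequisite mentioned above.
final class ValuePool<T: Hashable> {
    private var pool: [T: T] = [:]
    private let lock = NSLock()

    // "Atomic lookup-and-set": return the pooled copy if the value
    // already exists, otherwise insert it and return it.
    func unique(_ value: T) -> T {
        lock.lock()
        defer { lock.unlock() }
        if let existing = pool[value] {
            return existing
        }
        pool[value] = value
        return value
    }
}

let strings = ValuePool<String>()
let a = strings.unique("hello")
let b = strings.unique("hello")  // hits the pool; no second entry
```

With compiler support, the pooled entries would be heap boxes shared by reference (so `a` and `b` above would point at one allocation), which is where the space savings Johannes describes would come from; a plain dictionary of values, as here, only demonstrates the lookup discipline.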
Yes, I agree. But the argumentation still holds.
_______________________________________________
swift-evolution mailing list
[email protected]
https://lists.swift.org/mailman/listinfo/swift-evolution
