(My idea for the lint was `#[allow_kind(Name)]`, which someone on IRC described as "opt-out opt-in builtin traits".)
On Fri, Feb 28, 2014 at 4:36 PM, Gábor Lehel <glaebho...@gmail.com> wrote:
> I think this is a really great idea.
>
> There's another potential compromise that would preserve most of its
> benefits, and reduce the annotation burden:
>
> There was another proposal earlier, driven by similar motivations, that
> structs with private fields should be non-`Pod`. Combining the two ideas, we
> could say that the built-in traits would be derived automatically for types
> with fully-public interiors, and would have to be declared or derived
> manually if any field is private. This would still accomplish what I think
> is the most important thing, which is to preserve abstraction boundaries:
> clients of a thing (type, module, package) should be insulated against
> changes to its private implementation. Relying on public information is, in
> this respect, however, fair game.
>
> The truly troublesome aspects of the current regime are, I believe, all
> consequences of the violation of abstraction boundaries. Types like `Cell`
> rely on these boundaries to ensure their safety, but under the current
> system, information about their private implementation leaks out. The OP and
> the above modification to it would both steer clear of this problem.
>
> I think the main tradeoffs between the two would be around simpler rules vs.
> fewer annotations, and the principle of least astonishment. This here idea
> is more complicated, because it has different rules for fully-public and
> abstract datatypes, and also (as currently) has different rules for built-in
> and user-defined traits. In exchange you only have to state your intentions
> explicitly if you have something to hide. The PoLA is harder to evaluate.
> Returning to the canonical example, if I write `struct Point { x: int, y:
> int }`, I think I'd be surprised if it weren't copyable. On the other hand,
> perhaps the aforementioned inconsistencies would also be surprising. So I
> dunno.
>
>> This argument is rather weakened by the continued necessity of a
>> `marker::InvariantType` marker. This could be read as an argument
>> towards explicit variance. However, I think that in this particular
>> case, the better solution is to introduce the `Mut<T>` type described
>> in #12577 -- the `Mut<T>` type would give us the invariance.
>
> I don't see the difference here. Why do you think this should be handled
> differently? This is the same sort of abstraction boundary violation as the
> others: information about private fields is leaking out into the public
> interface via variance inference.
>
> Under the above scheme we could say that type parameters default to
> invariant for types with private fields, and are inferred for fully-public
> types. (How you would/could explicitly declare variance is another question,
> but kind of orthogonal to the idea that you /should/.)
>
> On Fri, Feb 28, 2014 at 4:51 PM, Niko Matsakis <n...@alum.mit.edu> wrote:
>>
>> From
>> <http://smallcultfollowing.com/babysteps/blog/2014/02/28/rust-rfc-opt-in-builtin-traits/>:
>>
>> ## Rust RFC: opt-in builtin traits
>>
>> In today's Rust, there are a number of builtin traits (sometimes
>> called "kinds"): `Send`, `Freeze`, `Share`, and `Pod` (in the future,
>> perhaps `Sized`). These are expressed as traits, but they are quite
>> unlike other traits in certain ways. One way is that they do not have
>> any methods; instead, implementing a trait like `Freeze` indicates
>> that the type has certain properties (defined below). The biggest
>> difference, though, is that these traits are not implemented manually
>> by users. Instead, the compiler decides automatically whether or not a
>> type implements them based on the contents of the type.
>>
>> In this proposal, I argue to change this system and instead have users
>> manually implement the builtin traits for new types that they define.
>> Naturally there would be `#[deriving]` options as well for
>> convenience.
>> The compiler's rules (e.g., that a sendable value cannot
>> reach a non-sendable value) would still be enforced, but at the point
>> where a builtin trait is explicitly implemented, rather than being
>> automatically deduced.
>>
>> There are a couple of reasons to make this change:
>>
>> 1. **Consistency.** All other traits are opt-in, including very common
>>    traits like `Eq` and `Clone`. It is somewhat surprising that the
>>    builtin traits act differently.
>> 2. **API Stability.** The builtin traits that are implemented by a
>>    type are really part of its public API, but unlike other similar
>>    things they are not declared. This means that seemingly innocent
>>    changes to the definition of a type can easily break downstream
>>    users. For example, imagine a type that changes from POD to non-POD
>>    -- suddenly, all references to instances of that type go from
>>    copies to moves. Similarly, a type that goes from sendable to
>>    non-sendable can no longer be used as a message. By opting in to
>>    being POD (or sendable, etc), library authors make explicit what
>>    properties they expect to maintain, and which they do not.
>> 3. **Pedagogy.** Many users find the distinction between pod types
>>    (which copy) and linear types (which move) to be surprising. Making
>>    pod-ness opt-in would help to ease this confusion.
>> 4. **Safety and correctness.** In the presence of unsafe code,
>>    compiler inference is unsound, and it is unfortunate that users
>>    must remember to "opt out" from inapplicable kinds. There are also
>>    concerns about future compatibility. Even in safe code, it can also
>>    be useful to impose additional usage constraints beyond those
>>    strictly required for type soundness.
>>
>> I will first cover the existing builtin traits and define what they
>> are used for. I will then explain each of the above reasons in more
>> detail. Finally, I'll give some syntax examples.
>>
>> #### The builtin traits
>>
>> We currently define the following builtin traits:
>>
>> - `Send` -- a type that deeply owns all its contents.
>>   (Examples: `int`, `~int`, not `&int`)
>> - `Freeze` -- a type which is deeply immutable when accessed via an
>>   `&T` reference.
>>   (Examples: `int`, `~int`, `&int`, `&mut int`, not `Cell<int>` or
>>   `Atomic<int>`)
>> - `Pod` -- "plain old data" which can be safely copied via memcpy.
>>   (Examples: `int`, `&int`, not `~int` or `&mut int`)
>>
>> We are in the process of adding an additional trait:
>>
>> - `Share` -- a type which is threadsafe when accessed via an `&T`
>>   reference. (Examples: `int`, `~int`, `&int`, `&mut int`,
>>   `Atomic<int>`, not `Cell<int>`)
>>
>> #### Proposed syntax
>>
>> Under this proposal, for a struct or enum to be considered send,
>> freeze, pod, etc, those traits must be explicitly implemented:
>>
>>     struct Foo { ... }
>>     impl Send for Foo { }
>>     impl Freeze for Foo { }
>>     impl Pod for Foo { }
>>     impl Share for Foo { }
>>
>> For generic types, a conditional impl would be more appropriate:
>>
>>     enum Option<T> { Some(T), None }
>>     impl<T:Send> Send for Option<T> { }
>>     // etc
>>
>> As usual, deriving forms would be available that would expand into
>> impls like the one shown above.
>>
>> Whenever a builtin trait is implemented, the compiler will enforce the
>> same requirements it enforces today. Therefore, code like the
>> following would yield an error:
>>
>>     struct Foo<'a> { x: &'a int }
>>
>>     // ERROR: Cannot implement `Send` because the field `x` has type
>>     // `&'a int` which is not sendable.
>>     impl<'a> Send for Foo<'a> { }
>>
>> These impls would follow the usual coherence requirements. For
>> example, a struct can only be declared as `Share` within the crate
>> where it is defined.
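For comparison, here is a rough sketch of the same opt-in shape in present-day Rust. `Sendable` is a hypothetical user-defined marker trait standing in for the proposed builtin `Send`. It is not the real `Send`, and the compiler does not inspect field types for it the way the proposal describes. `Foo`, `Borrowed` and `is_sendable` are likewise invented for illustration, and the sketch uses `i32` and the standard `Option` where the RFC text uses `int` and its own `enum Option<T>`.

```rust
// Hypothetical marker trait standing in for the proposed opt-in `Send`.
// Like the builtin kinds, it has no methods.
trait Sendable {}

// Primitive types opt in explicitly.
impl Sendable for i32 {}

// Conditional impl for a generic type: Option<T> is Sendable only if T is.
impl<T: Sendable> Sendable for Option<T> {}

// A plain struct opts in with an empty impl, as in `impl Send for Foo { }`.
#[allow(dead_code)]
struct Foo {
    x: i32,
}
impl Sendable for Foo {}

// This type never opts in, so it cannot be used where `Sendable` is required.
#[allow(dead_code)]
struct Borrowed<'a> {
    x: &'a i32,
}

// Compiles only for types that have opted in; the trait bound is the whole check.
fn is_sendable<T: Sendable>(_value: &T) -> bool {
    true
}
```

A call such as `is_sendable(&Borrowed { x: &n })` is rejected at compile time because `Borrowed` has no impl, which is the "do nothing and you get nothing" default the proposal argues for.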
>>
>> For convenience, I also propose a deriving shorthand
>> `#[deriving(Data)]` that would implement a "package" of common traits
>> for types that contain simple data: `Eq`, `Ord`, `Clone`, `Show`,
>> `Send`, `Share`, `Freeze`, and `Pod`.
>>
>> #### Pod and linearity
>>
>> One of the most important aspects of this proposal is that the `Pod`
>> trait would be something that one "opts in" to. This means that
>> structs and enums would *move by default* unless their type is
>> explicitly declared to be `Pod`. So, for example, the following
>> code would be in error:
>>
>>     struct Point { x: int, y: int }
>>     ...
>>     let p = Point { x: 1, y: 2 };
>>     let q = p;  // moves p
>>     print(p.x); // ERROR
>>
>> To allow that example, one would have to impl `Pod` for `Point`:
>>
>>     struct Point { x: int, y: int }
>>     impl Pod for Point { }
>>     ...
>>     let p = Point { x: 1, y: 2 };
>>     let q = p;  // copies p, because Point is Pod
>>     print(p.x); // OK
>>
>> Effectively this change introduces a three-step ladder for types:
>>
>> 1. If you do nothing, your type is *linear*, meaning that it moves
>>    from place to place and can never be copied in any way. (We need a
>>    better name for that.)
>> 2. If you implement `Clone`, your type is *cloneable*, meaning that it
>>    moves from place to place, but it can be explicitly cloned. This is
>>    suitable for cases where copying is expensive.
>> 3. If you implement `Pod`, your type is *plain old data*, meaning that
>>    it is just copied by default without the need for an explicit
>>    clone. This is suitable for small bits of data like ints or
>>    points.
>>
>> What is nice about this change is that when a type is defined, the
>> user makes an *explicit choice* between these three options.
>>
>> #### Consistency
>>
>> This change would bring the builtin traits more in line with other
>> common traits, such as `Eq` and `Clone`.
>> On a historical note, this proposal continues a trend, in that both
>> of those operations used to be natively implemented by the compiler
>> as well.
>>
>> #### API Stability
>>
>> The set of builtin traits implemented by a type must be considered
>> part of its public interface. At present, though, it's quite invisible
>> and not under user control. If a type is changed from `Pod` to
>> non-pod, or `Send` to non-send, no error message will result until
>> client code attempts to use an instance of that type. In general we
>> have tried to avoid this sort of situation, and instead have each
>> declaration contain enough information to check it independently of its
>> uses. Issue #12202 describes this same concern, specifically with
>> respect to stability attributes.
>>
>> Making opt-in explicit effectively solves this problem. It is clearly
>> written out which traits a type is expected to fulfill, and if the
>> type is changed in such a way as to violate one of these traits, an
>> error will be reported at the `impl` site (or `#[deriving]`
>> declaration).
>>
>> #### Pedagogy
>>
>> When users first start with Rust, ownership and ownership transfer are
>> among the first things that they must learn. This is made more
>> confusing by the fact that types are automatically divided into pod
>> and non-pod without any sort of declaration. It is not necessarily
>> obvious why a `T` and `~T` value, which are *semantically equivalent*,
>> behave so differently by default. Making the pod category something you
>> opt into means that types will all be linear by default, which can
>> make teaching and learning easier.
>>
>> #### Safety and correctness: unsafe code
>>
>> For safe code, the compiler's rules for deciding whether or not a type
>> is sendable (and so forth) are perfectly sound. However, when unsafe
>> code is involved, the compiler may draw the wrong conclusion. For such
>> cases, types must *opt out* of the builtin traits.
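For the three-step ladder described under "Pod and linearity", here is a small sketch in present-day Rust, on the assumption that today's opt-in `Copy` trait is a fair stand-in for the proposed opt-in `Pod`. The type names `Token`, `Buffer` and `Point` and the function `ladder_demo` are invented for this example.

```rust
// Step 1: no impls at all. Values of this type only ever move.
struct Token {
    id: u32,
}

// Step 2: Clone only. Moves by default, copies only via an explicit `.clone()`.
#[derive(Clone)]
struct Buffer {
    bytes: Vec<u8>,
}

// Step 3: Copy (assumed analogue of `Pod`). Assignment copies implicitly.
#[derive(Clone, Copy)]
struct Point {
    x: i32,
    y: i32,
}

fn ladder_demo() -> (u32, usize, i32) {
    let t = Token { id: 7 };
    let t2 = t; // moves `t`; reading `t.id` after this line would not compile

    let b = Buffer { bytes: vec![1, 2, 3] };
    let b2 = b.clone(); // explicit copy, `b` is still usable here
    let b3 = b; // moves `b`

    let p = Point { x: 1, y: 2 };
    let q = p; // copies `p`, because Point is Copy
    (t2.id, b2.bytes.len() + b3.bytes.len(), p.x + q.y)
}
```

Each type states its choice at the definition site, so dropping `Copy` from `Point` later would produce an error where `p` is reused rather than silently changing copies into moves.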
>>
>> In general, the *opt out* approach seems to be hard to reason about:
>> many people (including myself) find it easier to think about what
>> properties a type *has* than what properties it *does not* have,
>> though clearly the two are logically equivalent in this binary world
>> we programmers inhabit.
>>
>> More concretely, opt out is dangerous because it means that types with
>> unsafe methods are generally *wrong by default*. As an example,
>> consider the definition of the `Cell` type:
>>
>>     struct Cell<T> {
>>         priv value: T
>>     }
>>
>> This is a perfectly ordinary struct, and hence the compiler would
>> conclude that cells are freezable (if `T` is freezable) and so forth.
>> However, the *methods* attached to `Cell` use unsafe magic to mutate
>> `value`, even when the `Cell` is aliased:
>>
>>     impl<T:Pod> Cell<T> {
>>         pub fn set(&self, value: T) {
>>             unsafe {
>>                 *cast::transmute_mut(&self.value) = value
>>             }
>>         }
>>     }
>>
>> To accommodate this, we currently use *marker types* -- special types
>> known to the compiler which are considered nonpod and so forth. Therefore,
>> the full definition of `Cell` is in fact:
>>
>>     pub struct Cell<T> {
>>         priv value: T,
>>         priv marker1: marker::InvariantType<T>,
>>         priv marker2: marker::NoFreeze,
>>     }
>>
>> Note the two markers. The first, `marker1`, is a hint to the variance
>> engine indicating that the type `Cell` must be
>> [invariant with respect to its type argument][inv]. The second,
>> `marker2`, indicates that `Cell` is non-freeze. This then informs the
>> compiler that the referent of a `&Cell<T>` can't be considered
>> immutable. The problem here is that, if you don't know to opt-out,
>> you'll wind up with a type definition that is unsafe.
>>
>> This argument is rather weakened by the continued necessity of a
>> `marker::InvariantType` marker. This could be read as an argument
>> towards explicit variance.
>> However, I think that in this particular
>> case, the better solution is to introduce the `Mut<T>` type described
>> in #12577 -- the `Mut<T>` type would give us the invariance.
>>
>> Using `Mut<T>` brings us back to a world where any type that uses
>> `Mut<T>` to obtain interior mutability is correct by default, at least
>> with respect to the builtin kinds. Types like `Atomic<T>` and
>> `Volatile<T>`, which guarantee data race freedom, would therefore have
>> to *opt in* to the `Share` kind, and types like `Cell<T>` would simply
>> do nothing.
>>
>> #### Safety and correctness: future compatibility
>>
>> Another concern about having the compiler automatically infer
>> membership into builtin bounds is that we may find cause to add new
>> bounds in the future. In that case, existing Rust code which uses
>> unsafe methods might be inferred incorrectly, because it would not
>> know to opt out of those future bounds. Therefore, any future bounds
>> will *have* to be opt out anyway, so perhaps it is best to be
>> consistent from the start.
>>
>> #### Safety and correctness: semantic constraints
>>
>> Even if type safety is maintained, some types ought not to be copied
>> for semantic reasons. An example from the compiler is the
>> `Datum<Rvalue>` type, which is used in code generation to represent
>> the computed result of an rvalue expression. At present, the type
>> `Rvalue` implements an (empty) destructor -- the sole purpose of this
>> destructor is to ensure that datums are not consumed more than once,
>> because this would likely correspond to a code gen bug, as it would
>> mean that the result of the expression evaluation is consumed more
>> than once. Another example might be a newtype'd integer used for
>> indexing into a thread-local array: such a value ought not to be
>> sendable. And so forth. Using marker types for these kinds of
>> situations, or empty destructors, is very awkward.
>> Under this proposal, users need merely refrain from implementing the
>> relevant traits.
>>
>> #### The `Sized` bound
>>
>> In DST, we plan to add a `Sized` bound. I do not feel like users
>> should manually implement `Sized`. It seems tedious and rather
>> ludicrous.
>>
>> #### Counterarguments
>>
>> The downsides of this proposal are:
>>
>> - There is some annotation burden. I had intended to gather statistics
>>   to try and measure this but have not had the time.
>>
>> - If a library forgets to implement all the relevant traits for a
>>   type, there is little recourse for users of that library beyond pull
>>   requests to the original repository. This is already true with
>>   traits like `Eq` and `Ord`. However, as SiegeLord noted on IRC,
>>   you can often work around the absence of `Eq` with a newtype
>>   wrapper, but this is not true if a type fails to implement `Send` or
>>   `Pod`. This danger (forgetting to implement traits) is essentially
>>   the counterbalance to the "forward compatibility" case made above:
>>   where implementing traits by default means types may implement too
>>   much, forcing explicit opt in means types may implement too little.
>>   One way to mitigate this problem would be to have a lint for when an
>>   impl of some kind (etc) would be legal, but isn't implemented, at
>>   least for publicly exported types in library crates.
>>
>> _______________________________________________
>> Rust-dev mailing list
>> Rust-dev@mozilla.org
>> https://mail.mozilla.org/listinfo/rust-dev