On Wed, Jul 24, 2013 at 10:08:27AM -0700, Graydon Hoare wrote:
> I expect we'll eventually do one other obvious optimization here, to
> define tags as the smallest datum width that can accommodate all the
> tag values and maintain the alignment "requirement" (or absence of
> speed penalty) of the data fields, not just "1 word".

I had this working at one point, but it wasn't completely finished,
and then the type_ and external iterator changes conflicted with it,
and I haven't had time recently to fix it.

It's not quite as simple as it sounds, because there's unsafe code
that's transmuting ints to C-like enums, and passing C-like enums to C
functions that expect an actual C enum (or some other integral type).
So this more or less forces us to solve part of the problem of
specifying a representation.

The work in progress, unmergeable in its current form (and in need of
some squashing):

https://github.com/jld/rust/compare/enum-discrim-size

> That sort of definition would mean option<u8> drops to 2 bytes (not
> 16) and on targets with cheap unaligned access (ivy bridge etc) most
> tags (with <256 variants) drop from 8 bytes to 1 (if you pass the
> appropriate --target-feature flag).

Is that safe?  I know at least some architectures don't like atomic
operations that can cross cache lines.

The other thing that sort of goes here is that, if the
struct-of-fields needs alignment padding anyway, the discriminant can
go there.  This is essentially a special case of reordering fields to
reduce padding.  (But if this makes the offset to the discriminant
unrepresentable in (for example) an ARM load/store instruction, so
that it has to be synthesized, then that's less good.)

> To get much more clever than that involves hunting in the
> union-of-data-fields for overlapping fields into which you can
> either scrounge bits or use sentinel values.

Or creating useful overlapping fields by choosing appropriate offsets.

> This is not totally implausible: the low 3 bits are available in
> most pointers from a malloc, and the entire zero page on almost all
> platforms is unmapped for picking sentinels.

The value might also have unused bits due to alignment padding,
although I'm wary of what happens to them (either intentionally in
rustc or due to LLVM's assumptions) when the value containing them is
copied/moved.

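To make the representation question above concrete, here is a minimal sketch written in present-day Rust syntax, which postdates this message and is not the design on the work-in-progress branch. The names `Color`, `CColor`, and `color_from_u8` are hypothetical. It shows a C-like enum pinned to a one-byte discriminant, the same variants laid out like a C enum for FFI, a checked conversion in place of transmuting an int, and a null-pointer sentinel standing in for a separate tag.

```rust
use std::mem::size_of;

// Hypothetical C-like enum; `repr(u8)` pins the discriminant to one byte.
#[allow(dead_code)]
#[repr(u8)]
#[derive(Clone, Copy, Debug, PartialEq)]
enum Color {
    Red = 0,
    Green = 1,
    Blue = 2,
}

// Same variants, but laid out the way the target's C ABI lays out an enum,
// for passing to C functions that expect an actual C enum.
#[allow(dead_code)]
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq)]
enum CColor {
    Red = 0,
    Green = 1,
    Blue = 2,
}

// Checked conversion: avoids transmuting an arbitrary integer into the enum.
fn color_from_u8(n: u8) -> Option<Color> {
    match n {
        0 => Some(Color::Red),
        1 => Some(Color::Green),
        2 => Some(Color::Blue),
        _ => None,
    }
}

fn main() {
    println!("Color:            {} byte(s)", size_of::<Color>());
    println!("CColor:           {} byte(s)", size_of::<CColor>());
    println!("Option<u8>:       {} byte(s)", size_of::<Option<u8>>());
    // A Box is never null, so None can reuse the null pointer as its tag.
    println!("Box<u8>:          {} byte(s)", size_of::<Box<u8>>());
    println!("Option<Box<u8>>:  {} byte(s)", size_of::<Option<Box<u8>>>());
    println!("color_from_u8(1) = {:?}", color_from_u8(1));
    println!("color_from_u8(7) = {:?}", color_from_u8(7));
}
```

The one-byte size of `Color` and the equal sizes of `Box<u8>` and `Option<Box<u8>>` are documented guarantees. The size of `Option<u8>` is whatever the compiler picks, so treat the printed value as an observation and not a promise.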
But also, an enum that doesn't use the entire range of its
discriminant has sentinel values that another enum containing it can
use.

And then there's the possibility of handling nested enums like this:

    enum E { A, B }
    enum F { C, D }
    enum G { X(E), Y(F) }

by assigning A=0, B=1, C=2, D=3.  Which may or may not be helpful
enough for any real use cases (e.g., ast) to be worth thinking about.

--Jed
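The nested-enum numbering above (A=0, B=1, C=2, D=3) can be sketched by hand. This is only an illustration of the encoding, not something the compiler is claimed to do; `flatten` and `unflatten` are hypothetical helper names, and the syntax is present-day Rust.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum E {
    A,
    B,
}

#[derive(Clone, Copy, Debug, PartialEq)]
enum F {
    C,
    D,
}

#[derive(Clone, Copy, Debug, PartialEq)]
enum G {
    X(E),
    Y(F),
}

// Hypothetical helper: collapse the outer and inner discriminants of G
// into a single small tag, using A=0, B=1, C=2, D=3.
fn flatten(g: G) -> u8 {
    match g {
        G::X(E::A) => 0,
        G::X(E::B) => 1,
        G::Y(F::C) => 2,
        G::Y(F::D) => 3,
    }
}

// Hypothetical inverse: recover the nested value from the flat tag.
// Tags outside 0..=3 do not correspond to any value of G.
fn unflatten(tag: u8) -> Option<G> {
    match tag {
        0 => Some(G::X(E::A)),
        1 => Some(G::X(E::B)),
        2 => Some(G::Y(F::C)),
        3 => Some(G::Y(F::D)),
        _ => None,
    }
}

fn main() {
    for tag in 0..4u8 {
        let g = unflatten(tag).expect("tags 0..=3 are all valid");
        // Round trip: the flat tag fully determines the nested value.
        assert_eq!(flatten(g), tag);
        println!("{} -> {:?}", tag, g);
    }
}
```

With this numbering, testing whether a `G` is an `X` or a `Y` becomes a range check on the tag (below 2 or not) instead of a comparison against a separate outer discriminant.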
