[I’ve seen the WWDC 2016 Keynote and State of the Platforms videos. I haven’t
seen any others so I don’t spoil myself before typing my ideas down. Apologies
if this has already been covered.]
Is there any problem to adding fixed-sized arrays? From what I glanced here,
it’s more like no one has gotten around to it and less like the very idea is
hated.
Obviously, the main advantages are that object storage (or pointer storage for
reference types) is on the stack and that the shape (i.e. the number of extents
and range for each extent) is fixed at compile time. This type of, well, type
can be used when the baggage of length changing isn’t needed Such arrays are
also a classic type since we started system-programming languages. (I was
surprised by their absence in Swift when I first read the books.) They can be
mapped to a vector processing unit’s built-ins.
This:
> struct ArrayOf5<T> {
>
> let count = 5
>
> var first: T
> var second: T
> var third: T
> var fourth: T
> var fifth: T
>
> init(array: [T]) {
> first = array[0]
> second = array[1]
> third = array[2]
> fourth = array[3]
> fifth = array[4]
> }
>
> subscript(index: Int) -> T {
> get {
> var preResult: T?
> switch index {
> case 0:
> preResult = first
> case 1:
> preResult = second
> case 2:
> preResult = third
> case 3:
> preResult = fourth
> case 4:
> preResult = fifth
> default:
> preResult = nil
> }
> return preResult!
> }
>
> set {
> func doAssign(destination: UnsafeMutablePointer<T>) {
> destination.memory = newValue
> }
>
> switch index {
> case 0:
> doAssign(&first)
> case 1:
> doAssign(&second)
> case 2:
> doAssign(&third)
> case 3:
> doAssign(&fourth)
> case 4:
> doAssign(&fifth)
> default:
> doAssign(nil)
> }
> }
> }
>
> }
shouldn’t be better supported than directly having built-in fixed-sized arrays.
(I’ve wondered if we should have a library or built-in arrays. I’ve decided
that even if we add a library, we still need the built-in.) Although I
parametrized the type, altering the shape (right now 1 dimension of 5 elements)
doesn’t scale since it requires lexical change. Internally, a built-in array
type would be something like this, modulo optimizations and not having to come
up with names for each element.
—————
[The following sections are my ideas about implementation; we don’t have to use
them.]
What should array type expressions look like in code?
Since generics (currently) only use type-based parameters, we can’t do
something C++-ish like “array<T, 1, 4, 3>”. I did have a thought of “[MyType;
5]”, but that seemed too close to current arrays. I was shocked to discover
that Rust uses that syntax. However, it’s still to bad to use now because
‘[MyType]” would mean something different that it does currently
(zero-dimensional array of a single unit instead of one-dimensional array of
arbitrary number of elements).
Since arrays are product types (I think), let’s use multiplication syntax:
> var myArray: MyType * [6] // This is an array of six elements, each of
> `MyType`
> var myArray2: MyType * [6] * [5] // This is an array of five elements, each
> a six-element array of `MyType`
> var myArray3: MyType * [3, 4] // This is an array of three-by-four, each
> element of `MyType`
The “[ SHAPE ]” syntax affects the type expression to its left. An empty list
is not allowed since, although array-literal syntax allows empty lists, the
subscript operation does not. Note, to keep the syntax regular, the first
subscript when using “myArray2” is for five elements, i.e. “0..<5”, and the
second (if any) is for six, the reverse lexical order of declaration. This is
because we don’t have C’s cute “declaration equals usage” type syntax. Also
note that “myArray3” requires a single subscript call with two coordinates in
it to dereference. Supplying only one coordinate is nonsensical. (“myArray2”
is the other way when you want to dereference a “MyType” value.)
Of course, not using a shape expression gives you a zero-dimensional array
(i.e. a regular variable) of your base type by default. (So, it would actually
be "0 + extents(MyType.self)" dimensions.)
Or we could just do it the same way C does it (i.e. “MyType[8]”), and preserve
nested arrays having the same lexical order in declaration and dereference. We
could even forbid multi-dimensional arrays and just do nested
single-dimensional arrays like C (and move multi-dimensionality to wrapper
structs), but I feel that would be giving up on improving the abstraction.
——
What about row vs. column order?
If you stick with chaining one-dimensional arrays, you get only row-order, just
like C (which can only do nested one-dimensional arrays). If you use a
multi-extent shape expression, you can override the order:
> var myArray1: MyType * [3, 4] * #rank[0, 1] // Default; row-order
> var myArray2: MyType * [3, 4] * #rank[1, 0] // Reversed; column-order
> var myArray3: MyType * [3, 4, 5] * #rank[0, 2, 1] // Scrambled
The array literal attached to the rank command has to contain every value in
0..<count exactly once. The position of the “0” value is the spacing with the
longest span, “count - 1” is the adjacent elements level.
A rank command can only appear directly after a shape expression or an element
type expression, and not after another rank command. The command’s literal
must have the same length as the number of extents of the corresponding shape
expression. A rank command following an element type expression, whether it’s
a non-static-array type or a static-array type hidden behind a type alias, is
treated as a zero-dimensional array so “#rank[]” is the only legal rank command
for them.
The rank command does not pierce type aliases; “#rank[]” is the only valid
command for them. The alias itself can have a rank command, which affects all
objects declared with that alias, and cannot be overridden. (If the alias
doesn’t have a rank command, the index storage priority is fixed to the
default.) Should a rank-stripping command be added? Should an rank command
that pierces aliases and/or overrides previous ranks be added?
Types that differ in number of shape commands, number of extents within
corresponding commands, size of corresponding extents, index storage
priorities, or element type are different types. We should have a function
that can reshape an array if the total number of the outermost non-static-array
objects are the same. (The function could reshape a non-static-array type to a
static array of one element with any nesting.)
(If we give up on improving the abstraction compared to C, this rank order
would be simulated in a wrapper struct that rearranges the indices.)
——
What about initialization?
I didn’t use initialization expressions before because it’s hard. We’ll have
to work on this.
Zero-dimensional arrays (i.e. a regular non-static-array type) are iniitalized
as usual.
A one-dimensional static-array gets initialized like a current dynamic array.
But the syntax given below will also be accepted (with just one coordinate, of
course).
For higher-dimensional arrays, we need something like:
> [(0, 0, 0): myFirstValue, (2, 4, 1): mySecondValue, …]
> [(0, 0): myFirstValue, (0, 1): mySecondValue, default: myDefaultValue]
and/or:
> var myArray: MyType * [3, 4] = { return 2 * $0 + 7 * $1 }
(We can use “myParam1, myParam2 in” syntax for that last one too.) Can/could
we have an array expression that is part (XX, YY) and part $0/$etc formula?
Maybe with “_” and/or “case” and/or “where” to differentiate which elements get
the $0/$etc and which ones get a “default:” phrase? Maybe the _/case/where
could be used within incomplete (XX, YY)-style indexing?
——
Should there be flexible-ending arrays? Other value types?
Let’s say that at most one extent in a shape expression can be “any” instead of
a compile-time integer expression. Let’s restrict the “any” dimension to be
the rank-0 one, using “#rank” to override if you didn’t put the “any” first.
Let’s call these flexible-ending static-arrays.
Flexible-ending static-arrays could be used like the “MyType[]” expression in
C. Frequently it can be used as a function parameter without having to
specialize for each possible array length. In other words, for a given static
array type “MyType * […, N, …]”, it can be passed into a function parameter
that is declared as “MyType * […, any, …]”. Could there be an automatic way to
pass in the value of “N”, or would we have to resort to using another function
parameter or some other out-of-band information and trusting the user to
synchronize correctly?
It should be possible to pass in different-shaped arrays to said function
parameters with a reshape call, as long as the total number of elements is a
multiple of the span across the “any”-ed extent. Should such a function
parameter also take a "[MyInnerType]” dynamic array objects, where
“MyInnerType” is what you get after stripping the outermost, “any”-ed, extent?
A flexible-ending struct is one with a flexible-ending type as its last stored
property. A flexible-ending enum is a enum with attributed cases and at least
one case ends its tuple with a flexible-ending type. Flexible-ending types
cannot appear anywhere else as a stored property in a struct or enum. Should
we use “final” to mark a struct’s flexible ending property? Somehow adapt this
for enums too? (Right now, “final” is only used for classes.) Flexible-ending
structs and enums can be used in function parameters too, allowing different
lengths for the same parameter. Like static-arrays, a struct or enum that
matches a flexible function parameter except that its would-be flexible part is
fixed can be passed through said parameter. (It’s the same rule since all
flexible struct and enums have to have a nested flexible array at their end.)
Besides as function parameters, a flexible-ending object can be declared inside
a function/method or as a global. I’m not sure how to do heap versions yet.
If the object is immutable (i.e. “let”), the initialization block has to cover
all the elements without using $0/$etc or “default:” so the size can be
determined. I’m not sure how mutable objects are supposed to work here. I’ll
need your ideas the most here, assuming we keep this feature. It was inspired
by seeing code in Ada (discriminated record) and C (“MyType[1]” as last field
in struct, then allocate with malloc) doing this.
Oh yeah, a flexible-ending static-array, including those that are parts of
flexible-ending structs and enums, must have a inflexible base type.
(Otherwise lies madness.)
(If we keep this feature, maybe it should be limited to one-dimensional arrays.)
—
Daryle Walker
Mac, Internet, and Video Game Junkie
darylew AT mac DOT com
_______________________________________________
swift-evolution mailing list
[email protected]
https://lists.swift.org/mailman/listinfo/swift-evolution