On Fri, Sep 8, 2023, at 1:12 PM, Lanre Waju wrote:
> Dear PHP Internals,
>
> I am writing to propose a new feature for PHP that introduces the
> concept of structs. This feature aims to provide a more concise and
> expressive way to define and work with immutable data structures. Below
> is a detailed description of the proposed syntax, usage, and behavior.
>
> Syntax
>
> struct Data
> {
>     string $title;
>     Status $status;
>     ?DateTimeImmutable $publishedAt = null;
> }
> The Data struct is essentially represented as a readonly class with a
> constructor as follows:
>
>
> readonly class Data
> {
>     public function __construct(
>         public string $title,
>         public Status $status,
>         public ?DateTimeImmutable $publishedAt = null,
>     ) {}
> }
> Assertions
> The Data struct will always be readonly.
> It has no methods besides the constructor.
> Constructors
> The Data struct can be constructed in three different ways, each of
> which allows for named or positional arguments, which can be mixed:
>
> 1.1 Class like
> $data = new Data('title', Status::PUBLISHED, new DateTimeImmutable());
>
> 1.2 Class like (Named Syntax)
> $data = new Data(title: 'title', status: Status::PUBLISHED, publishedAt:
> new DateTimeImmutable());
>
> 2.1 Proposed struct initialization syntax (Positional Arguments)
> $data = Data{'title', Status::PUBLISHED, new DateTimeImmutable()};
>
> 2.2 Proposed struct initialization syntax (Named Syntax)
> $data = Data{title: 'title', status: Status::PUBLISHED, publishedAt: new
> DateTimeImmutable()};
>
> 3.1 Anonymous Struct (Named Arguments)
>
> $data = struct {
>     string $title;
>     Status $status;
>     ?DateTimeImmutable $publishedAt = null;
> }('title', Status::PUBLISHED, new DateTimeImmutable());
> 3.2 Anonymous Struct (Named Arguments - Named Syntax)
>
> $data = struct {
>     string $title;
>     Status $status;
>     ?DateTimeImmutable $publishedAt = null;
> }(title: 'title', status: Status::PUBLISHED, publishedAt: new
> DateTimeImmutable());
> Nesting
> The proposed feature also supports nesting of structs. For example:
>
>
> final class HasNestedStruct
> {
>     NestedStruct {
>         string $title;
>         Status $status;
>         ?DateTimeImmutable $publishedAt = null;
>     };
>
>     public function __construct(
>         public string $string,
>         public Data $normalStruct,
>         public NestedStruct $nestedStruct = NestedStruct{'title',
> Status::PUBLISHED, new DateTimeImmutable()},
>         public struct InlineNamed { int $x} $inlineNamed = {x: 1},
>         public { int $x, int $y} $inlineAnonymous = {x: 1, y: 2},
>     ) {}
> }
> This proposal aims to enhance the readability and maintainability of
> code by providing a more concise and expressive way to work with
> immutable data structures in PHP.
> I believe this feature will be a valuable addition to the language as it
> not only opens the door for future enhancements (eg. typed json
> deserialization, etc.), but should also help reduce reliance on arrays
> by providing a more expressive alternative.
>
> Your feedback and suggestions are highly appreciated, and we look
> forward to discussing this proposal further within the PHP internals
> community.
>
> Sincerely
> Lanre

As I have stated in the past, I am firmly opposed to anemic structs. They offer no benefit, much confusion, and more work for future RFCs.

The core concept -- that service objects and data objects are separate creatures that should not be commingled -- I fully agree with and advocate for. If I were writing PHP from scratch today, I would probably design it with separate constructs, or take a cue from Go/Rust and eliminate classes per se altogether, as they just confuse matters. However, we are dealing with PHP as it exists today, and an entirely separate limited construct just doesn't make sense.

I also want to make clear that I am 1000% in favor of structured, typed data. The use of associative arrays as a pseudo data structure is the weakest part of PHP, and the more we can move people away from that towards more formally typed data, the better. For that reason, making it trivial to cast between an associative array and a structured object (as a few others in the thread have suggested) is a *bad* feature, because it further reinforces the idea that an associative array is "just as good" as making a defined type. This is simply flat out false, and we should avoid language features that pretend that it is true.

That said, as of PHP 8, we already have a perfectly good struct-ish data structure: classes with promoted properties and named arguments. As of PHP 8.2, the entire class can be declared readonly with a single keyword. For 95% of use cases, this is completely adequate as a struct-like structure:

    readonly class Person
    {
        public function __construct(
            public string $first,
            public string $last,
            public ?DateTimeImmutable $birthday = null,
        ) {}
    }

    $p = new Person(first: 'Larry', last: 'Garfield');

So where does it fall short?

1. The proposal above suggests that the shortcoming is that classes allow methods. Why is that an issue? Why are methods a problem on a data-centric object? This is never explained, and I don't believe it to be true.
While methods that call out to other service objects would definitely be bad juju, a fullname() method on the above class poses no problems whatsoever. There is no theoretical purity being protected by disallowing methods. By the same logic, would structs also forbid property hooks, assuming those pass? I would hope not, as data objects are where those are most useful. In fact, I'd go a step further and note that a readonly struct that disallows methods *precludes* the "with-er" style of evolving an object, so if you want "the same thing but with this one change", you have to recreate a new struct from scratch. This is a worse experience in every way.

2. The proposal above suggests that it's because the `new` keyword is needed, and proposes both positional and named function-esque syntax, making it look more like Kotlin or Rust, where there is no `new` keyword and the class name is itself the constructor. I will agree that `new` is clumsy in many cases, particularly in compound expressions, but that's not an issue unique to data-centric objects. If we were to come up with some alternate constructor invocation to make it easier for data-centric objects, it would be equally useful on non-data objects as well. There's no reason to make it specific to just data structs. (I am also not certain the parser could even handle that, since functions and classes are currently in different keyspaces, so if both a class foo and a function foo are defined, `foo()` is ambiguous.)

3. The proposal suggests the ability to have nested struct definitions. I can see where this is useful, certainly. However... I can also see where it's useful on service objects, too. Many languages have such a feature, often called "inner classes," and it works just fine on service objects as well as data objects.
Inner classes would be an interesting feature in itself that would be worth its own RFC (I won't guarantee that I'd support it, depending on the details, but I am quite open to considering it), but there's no good reason to limit that functionality to just data classes.

4. The proposal implies that structs should always be readonly. As noted above, a readonly class is trivial to define now. Moreover, while I am an outspoken proponent of immutable data structures, they are not appropriate in all cases. PSR-14 events, for example, are deliberately mutable because, given PHP's design, making them immutable would have required a lot of extra work from anyone writing a listener, for very little benefit. Entities are another example of a data object that logically needs to be mutable. Mutable data-centric product types have their place, and this approach would preclude that.

5. MWOP suggested in a reply that allowing a struct to conform to a struct definition structurally, by the properties it has, would be useful. Potentially yes. However, interface properties, part of the hooks RFC, would get us to almost the same place without the weird world-splitting between structs and classes.

6. When dealing with a mutable data value, objects pass by handle (which feels like by reference even if it's not), but data feels like it should pass by value. Valid! This is a long-standing gripe, and the growth of with-er style value objects (PSR-7 et al) is in large part to avoid that risk of "spooky action at a distance." However... the above proposal does not address this at all!

I would argue that point 6 is the only valid argument for a separate construct from classes as they already exist. But there is no need to create a whole other construct (a very significant implementation lift) to achieve that. All that would be needed is a flag/marker on the class to indicate that it should use data-like passing semantics.
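
To make point 6 concrete, here is a minimal sketch of the pass-by-handle behavior in PHP as it exists today, and of how a manual clone at the call site approximates by-value passing. The `Point` class and `bump()` function are hypothetical names made up for this illustration, and the claim that a class-level flag would behave like the second call is an assumption, not a description of any existing or proposed implementation.

```php
<?php
declare(strict_types=1);

// Hypothetical mutable data class; the name and property are illustrative only.
final class Point
{
    public function __construct(public int $x = 0) {}
}

// Mutates whatever object it is handed.
function bump(Point $p): void
{
    $p->x++;
}

// Objects pass by handle, so the callee's change is visible to the caller.
$a = new Point(1);
bump($a);
// $a->x is now 2: "spooky action at a distance".

// Cloning at the call boundary hands the callee its own copy.
$b = new Point(1);
bump(clone $b);
// $b->x is still 1: roughly what by-value passing would give automatically.
```

The explicit `clone` is easy to forget, which is the gap that an opt-in by-value marker on the class would presumably close.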
Kotlin has a good example here, where you can declare a class a "data" class by just adding the `data` keyword. That has a number of implications in Kotlin (many of which are not relevant for us), but in our case it would mean either passing the object by value or automatically cloning it every time it is passed. (The two would be almost the same to the end user, but likely have different implementation challenges. I cannot speak to what those would be personally, but it's an implementation detail not relevant for now.)

That very small change, when combined with all of the other improvements to the language in recent years, gives us all the benefits of data-centric structures without any of the downsides of a completely new construct. It would also allow the developer to opt in to the class being readonly or not, as the situation requires.

The downsides of a new construct include:

1. It would either have to be a new zval type, which is a ton of work, or be built on classes, in which case you're fighting against all of the stuff classes do.

2. Which of the things classes do should be supported by the very similar syntax? Methods? Attributes? Can you clone it? Do you get a __clone() override if you do? Are traits supported? How does equality work? I can see an argument that product types (which is what we're talking about) would benefit from all of the above. So either we cripple structs by withholding useful features, or we do a lot of work to end up with "objects that pass funny." We can get "objects that pass funny" with a lot less effort with just a `data` keyword flag.

3. If structs are entirely separate from objects, then any time we add a new feature to objects we'll have to debate, again, whether that feature should be added to structs as well. And if so, we're looking at more work for the RFC implementer for very little gain.
Or we'll add a feature to structs (like inner classes) and then ask for it on classes, too, and again have double the work and double the debate.

4. The Reflection API is complicated enough as is, without having to deal with a whole other type of type. As someone who maintains a serializer, that would be a lot of work for me to support, with no actual benefit.

In short, there are only two versions of structs that could realistically end up existing: crippled in some way, and "objects that pass funny." So if what we really want are objects that pass by value, let's just implement by-value opt-in objects and be done with it. It's much less work, much more powerful, and avoids many more debates in the future.

--Larry Garfield

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php