On 04/30/2016 04:12 PM, Stanislav Malyshev wrote:
Hi!

That you for bringing real usage examples. This is an interesting point
to consider.

Reading through it, however, I can't help but wonder why these things
are not just data? I mean, why they are kept in the comments? If there
are reasons like performance, there could be solutions developed, and
"keeping things together" doesn't exactly work when the things are
2-screen-long data array and I presume equally complex class - they are
two separate and independent entities anyway, and their synchronization
has to be explicitly managed anyway, nobody can ensure they are in sync
just by looking at it - in fact, given its size, nobody can *anything*
just by looking at it. It's not really human-readable anymore than
database dump is human-readable - the data is there, true, but it's not
exactly how you'd prefer to interact with it.

That said, of course I imagine there are reasons why Drupal developers
made these choices, and I don't think it makes much sense to argue about
them now. I just outlined my first impression, but if it sounds wrong
and uninformed, just ignore it :) It does, however, makes sense arguing
about whether it is something we have to support directly in the
language, and to which lengths we should go to do so.

At the risk of turning this into a Drupal tutorial (I will try not to), let me try to explain in brief:

In previous Drupal versions, such data would be provided in an "info hook". That is, a magically named function that returns a deeply nested array of data that registers "stuff" with the system. That "stuff" was pretty much completely and utterly undefined, and as it was all anonymous structs (aka associative arrays) extremely hard to document. There's also often a corresponding "alter hook" (we still have those), which gets passed the full set of such data as a massive nested array by reference to manipulate. In addition, it was common to have the definition hook in one file and any code it referred to in another. In a few cases it was a class, but in other cases it was a template file, or a series of other magically named functions that could technically live anywhere. All kinds messy. :-)

Moving that registration metadata into the same file as the thing it's describing was done for DX reasons: If you want to add a new plugin to Drupal (and there are several dozen kinds of plugin available in Drupal 8, each with a corresponding interface), you have one single class file to edit, usually a base class to extend, and you're done. It makes adding new plugins and registering them really easy. Where the plugin then gets used is up to user configuration "elsewhere".

Entities are implemented as just very complex plugins, sort of. (Disclaimer: I disagreed with this decision for a variety of reasons, but lost that battle.)

In the complex entity case I showed, the registration information includes:

1) Base definition information common to any plugin (unique ID, human-friendly label, etc.)

2) database information (tables, keys, etc.). Honestly I think a next version of this system shouldn't allow for user control in that area and they should be fully automated.

3) Various handlers. Many of these are essentially services that are coupled to a specific entity type. Trying to register them all via the container, though, would again spread out the work for a particular developer across a lot more places in a lot more contexts. A few are form definitions that get instantiated as ordinary forms in a certain context.

4) Hooks into various other automation tools. In the ideal case, defining an entity class and its annotation, if you include the appropriate data, will result in about a dozen routes getting created with stock, auto-generated UIs and permission controls already baked in. It will integrate with the Views system (Drupal's content assembly / GUI query builder), access control, database-level revisioning, translation, serialization, and so forth. Basically, all of the cool user-facing automation that is why one uses Drupal is "just there", and because the UI is by default auto-generated it's very consistent for end users.

Again, all of that could be done elsewhere (just like Doctrine entity annotations could be done via YAML or PHP arrays, too), but having it all together in one place instead of 15 that need manual synchronization of service names and such is a big DX win.

Also, I mentioned we still have alter hooks to allow other modules to manipulate that definition, but they now get a keyed array of ContentEntityType objects rather than anonymous arrays. This is a good thing.

(I guess I failed at not making this a Drupal tutorial.)

Most of that data would not make sense on the object itself, because it's used in a completely different context. We do not have a node, or user, or comment object on hand when we need that data.

Amusingly, we don't even use reflection to get that data. Because Drupal over-does everything, there are enough annotated classes that parsing all into memory at once just to reflect on the the docblock which we'd then have to parse was deemed memory-prohibitive. Instead, we submitted a patch upstream to Doctrine to let us parse out the comment from the file stream, which could then be discarded to save that memory. That decision was made long before we adopted PHP 5.5 as a minimum version, though, so I don't know if it would still be relevant today. Either way, for core PHP to only make it available via reflection is a reasonable decision, IMO, and I am not asking for native support otherwise.

That is to say, I would be perfectly happy if PHP had annotations that
do not support such use case, at least not directly. I know that sounds
somewhat insensitive to Drupal developers, and that's not my intent do
dismiss their concerns, but I think we should not take it as a primary
requirement, sine qua non.

My opinion is good design should figure out the best use cases for the
feature and serve them as much as possible, without trying to overload
the system to serve every need and every use case there could be. Doing
80% of cases well is much better than doing 99.999% cases poorly. My
opinion is hundred-element data dumps are well outside of the 80% for
annotations. If whatever design we end up with supports it - great, if
it doesn't - that's fine too.

As a general concept I agree, although then the question becomes which 80% to target. :-) I do not expect PHP to cater to Drupal's current annotation usage specifically; I would be satisfied if it provided a syntax that allowed us to represent the same or similar concepts without total hackery, even if we had to restructure the data to do so. (As I said, some of it I'd prefer wasn't even there.) That basically boils down to the need to support more than "dumb key value with an AST loophole" as a native syntax.


  * @Block(
  *   id = "system_branding_block",
  *   admin_label = @Translation("Site branding")
  * )
Why not do just:

@Block([ "id" => "system_branding_block", "admin_label" =>
Translation("Site branding")])

Or, in current RFC syntax:

<<Block([ "id" => "system_branding_block", "admin_label" =>
Translation("Site branding")])>>

The first is essentially what we're doing, in Doctrine syntax. In Doctrine, those properties map to properties on the "Block" object that gets created. Wrapping them in an array in Doctrine wouldn't really buy us anything.

Here it's more complicated, because some data items clearly should be
different annotations, like "handlers" - these are probably something like:
<<StorageHandler(Drupal\node\NodeStorage)>>
<<FormHandler(Drupal\node\NodeForm)>>

Possibly, if those were well-namespaced keys. Although clustering them all up into a full set of handlers is something we would need to do anyway, either here or in an aggregate object we stick those into.

Without really knowing your semantics I don't see how else the system would know route_provder is not a correct parameter. The only thing that really can validate such things right now in PHP is interface, and if we really stretch it, maybe a class - but I'm not sure how that would really fit. And if we go that route, then we probably couldn't have the freedom AST allows us.

Essentially what I'm asking for is a way for us to tell PHP what those semantics are. Vis, if we define an annotation that is expected to have keys A and B, and a developer puts in keys A and C, then a static analyzer (including my IDE) can spot that and say "yo, what's this C thing?". It's the same concept as mistyping the name of a property on an object that's been defined. $this->route_provder would make my IDE upset because there's no such property defined, but there is $this->route_provider.


1) ::class constants should resolve at compile time, relative to use
statements, anywhere in the annotation.
That makes sense, but then it's unclear why should that be unique for
::class? Then all constants should work this way I think.

Concur. (Although I've become less and less supportive of constants in general, as they are just another global. That's off topic for now, though.)

2) Namespacing for attribute names.  This is necessary for both
collision-avoidance and to minimize typing of obscenely long attribute
names.  The easiest way to do that would be:
That makes sense too. I think Dmitry said he plans to add that.

Including "use" statement sensitivity? That would be necessary to avoid the crazy-long names. If so, +1.


3) Some way to provide a data definition for the annotation that can be
checked at compile time.  This could be classes, a la Doctrine. It could
be a new Annotation type as others have suggested.  It could be
I am somewhat reluctant to define a new conceptual thing just for
annotations. Class seems natural at the first glance but implications of
it are unclear - what does it mean to instantiate such class? How
annotation data is now related to the object of this class?

My initial thought would be something akin to what Doctrine does:

class MyAnnotation {
  protected $foo;
  protected $bar;
}

<<MyAnnotation(foo = 'a', bar = 'b')>>
class Me{}

$def = $r->getAttribute(Me::class);

$def instanceof MyAnnotation; //TRUE
$def->foo == 'a'; // TRUE
$def->bar == 'b'; // TRUE

MyAnnotation can then have whatever other methods I feel like to access/mutate those values. If that requires limiting MyAnnotation in some way (no constructor, the properties have to be public, etc.), I'm open to that.

I suppose this would, technically, allow a baz key to be defined and added to the object dynamically, just like any other object property; and an IDE can soft-complain like it does for an undefined property, but the runtime still accepts it.

Another option could be to (ab)use interfaces in a following fashion: if
we define

<<__Attribute>>
interface BlockPlugin() {
        function id();
        function admin_label();
}

Then when we say:

<<BlockPlugin(["id" => "system_branding_block", "admin_label" =>
Translation("Site branding") ])>>

we get an object that implements this interface and id() and
admin_label() returns something relevant. There are problems with this
approach:

1. Using array as parameter is ugly. Named arguments is what we need but
d'oh. Also enshrining array-as-parameters in syntax which is worse.

2. It is not clear what exactly admin_label() should return and how
Translation() should work - is it AST? Is it some kind of object? Should
we just give up on Translation and do it in some other way like this?

<<__Attribute>>
interface BlockPlugin() {
        function id();
        <<Translatable>>
        function admin_label();
}

Note that this is a way to enable nesting of attributes without getting
the data part too complex. But of course then all admin_labels must be
translatable or not-translatable, not half-way.

So the idea is half-baked but maybe it can be used, think about it.

That's essentially the same concept as I described above with the class. Putting annotations on the object being created seems like a reasonable way to handle nested annotations at first glance, although it still doesn't resolve the main issue of Translatable which is that we scan the source code statically to find translatable strings. (We look for t('some string'), @Translation('some string'), and various other known patterns.) This is definitely the part I understand how to solve the least.

--Larry Garfield

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php

Reply via email to