On 19-7-2024 1:09, Bob Weinand wrote:
Hey Christoph,
Am 19.07.2024 um 00:51 schrieb Christoph M. Becker <cmbecke...@gmx.de>:
Hi Bob!
On 18.07.2024 at 15:41, Bob Weinand wrote:
Moreover, it can - at least - be worked around in tooling by special
casing the T_YIELD_FROM token and extracting the comment from the
raw parsed string:
var_dump(token_get_all('<?php yield /* comment */ from $foo;'));
will contain:
[1]=> array(3) { [0]=> int(270) [1]=> string(24) "yield /* comment
*/ from" [2]=> int(1) }
It's not optimal, but probably the least bad solution to leave it
unchanged in PHP 8.3, have tooling special case it and properly fix
it in PHP 8.4.
And what about "code" like <https://3v4l.org/4CLhM>? Is Codesniffer
supposed to scan the result of <https://3v4l.org/dKDcs> for possible CS
violations?
Cheers,
Christoph
I suppose you mean https://3v4l.org/IMi8Y, (you missed the <?php tag).
If you want to scan that, it's quite easy to strip the leading yield
and trailing from, and tokenize that again to extract all comments.
Sure, it's a hack, but it'll work: https://3v4l.org/8eAiV.
Bob
Hi Bob,
Of course, everything can be hacked around, but that still leaves the
question what should be the "proper tokenization". Having this change in
PHP 8.3 and then - as you suggest - yet another in PHP 8.4, makes it
mighty hard to have a consistent token stream in tooling, especially as
it is unclear what the "proper tokenization" should/would be.
More than anything, I find it concerning that this change sets a
precedent for tokens to include comments.
Just as an example: what does this mean for the PHP 8.0 nullsafe object
operator ? Should we now suddenly allow that to be written as `?
/*comment*/ ->` ?
Or what about a cast token ? Should that be allowed to be `(string /*for
reasons*/)` ?
Allowing this change to stay in, without having the discussion about
what the "proper tokenization" should be, feels off and random to me and
opens the door for more random changes.
As for the impact on tooling: a change in the tokenization of any token
has an impact not only on tooling like PHPCS itself, but also on every
single external standard build on top of it and is a breaking change.
To give you some perspective - for PHPCS we even went as far as to
"undo" the PHP 8.0 tokenization of namespaced names for the time being
(in the PHPCS 3.x releases) and we'll only change the PHPCS tokenizer to
use the PHP 8.0 tokenization in the PHPCS 4.0 release as it would
otherwise break too many existing sniffs. [1]
Smile,
Juliette
1: https://github.com/squizlabs/PHP_CodeSniffer/issues/3041