Hi all,
Per Seifeddine's suggestion to keep this out of the karma-request
thread, I'm opening a pre-RFC discussion for scalar object methods --
calling a small, curated set of methods directly on scalar values, e.g.
$str->trim(), (3)->pow(2). There's a complete, tested implementation and
a full write-up (links below); I'd like to surface the strongest
objections before I write the formal RFC.
Disclosure: I built this with an AI assistant (Claude) as a tool. The
design and the decisions are mine, and I've independently verified the
engine behaviour, performance, JIT correctness, leak-freedom and the BC
scan. Flagging it up front for transparency.
I know "methods on primitives" was proposed and declined before
(Nikita's 2014 "Methods on primitive types in PHP"). The reason it
stalled was loose typing: $x->trim() would need a runtime type check and
would behave differently depending on what $x held. This proposal
sidesteps that entirely, by generalizing the resolution Nikita himself
suggested in that thread -- requiring an explicit cast where the type
isn't already clear.
The idea: dispatch only on receivers the compiler already knows are
scalar. The method call is rewritten at compile time to an ordinary call
into an internal backing class -- no runtime type dispatch, no new
opcode, the object method-call path is untouched. A receiver qualifies
only if its type is guaranteed syntactically: a literal, a
(string)/(int) cast, a concatenation/interpolation, a non-nullable
scalar-typed property, or a call with a declared non-nullable scalar
return type. An untyped $x->trim() is left exactly as today (Error).
Crucially, dispatch never depends on optimizer-inferred types, so
behaviour is identical with and without opcache.
echo " Hello World "->trim()->upper(); // "HELLO WORLD"
echo (3)->pow(2); // 9
echo "hello"->length()->pow(2); // 25 -- length(): int chains into the int methods
So the cast Nikita proposed, ((string) $num)->chunk(), is only needed
where the type isn't already guaranteed; everywhere else the dispatch is
sound by construction, with no runtime check.
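
For comparison, here is roughly what those three examples compute, written in PHP that runs today. This is an illustration only: the mapping onto trim(), strtoupper(), strlen() and ** is an assumed stand-in, not the internal backing class the desugar targets, and the proposed methods may differ in detail.

```php
<?php
// Illustration only: plain-function stand-ins for the examples above.
// These mappings are assumptions, not the actual backing-class internals.

$a = strtoupper(trim(" Hello World "));  // " Hello World "->trim()->upper()
$b = 3 ** 2;                             // (3)->pow(2)
$c = strlen("hello") ** 2;               // "hello"->length()->pow(2)

echo $a, "\n"; // HELLO WORLD
echo $b, "\n"; // 9
echo $c, "\n"; // 25

// An untyped receiver is not rewritten, so it fails as it does today.
$x = "abc";
$msg = null;
try {
    $x->trim();
} catch (Error $e) {
    $msg = $e->getMessage();
    echo $msg, "\n"; // Call to a member function trim() on string
}
```

The last case is the one the proposal leaves alone: nothing about $x is guaranteed syntactically, so it stays an Error.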
It's intended as one proposal with two independent votes:
1. Scalar methods on guaranteed free receivers (the above). A pure
capability -- it adds a way to call scalar operations and changes
nothing about untyped code. Proposed initial sets: a small curated Str
(trim/upper/lower/length + contains/startsWith/endsWith), Int
(abs/pow/clamp), and Float (round/ceil/floor/abs); bool deliberately
gets none (its operations are operators, not methods). The sets are
governed by explicit criteria and are the easiest thing to tune in
discussion.
2. Scalar-typed local variables (e.g. int $x = ...; scalar types only),
which additionally make a typed local a guaranteed receiver (string $s =
...; $s->trim()). This is the more contested half, since it also carries
the "local type discipline" argument, so it's a separate vote: a "no" here
still ships the capability without typed locals.
What I'm deliberately NOT doing, up front so it's not a surprise:
- No method-call-result receivers ($this->getName()->trim()) -- that
would rest on return-type covariance under inheritance; not worth the
surface.
- Int::abs/pow return int|float (they can overflow, as the global
functions do), so they're honest terminals -- they don't chain.
(Int::clamp is the one initial int method provably :int for all inputs,
so it does chain.)
- No int|false typed locals -- that's a sentinel state, not a committed
type; ?T is supported, sentinel-unions are not.
- The backing classes are internal-only (NUL-prefixed name, like
anonymous classes): class_exists('Str') is false, no Reflection,
userland "class Str {}" can't collide.
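
To illustrate the int|float point with the existing global functions: the sketch below shows abs() and pow() falling back to float on overflow, next to a clamp that can only ever return one of its int arguments. clamp_int() is a hypothetical stand-in written for this example; its name and signature are assumptions, not the proposed Int::clamp.

```php
<?php
// The existing global functions return float when the int result overflows.
$absMin = abs(PHP_INT_MIN);     // magnitude does not fit in int
$bigPow = pow(PHP_INT_MAX, 2);  // overflows int

var_dump(is_float($absMin));    // bool(true)
var_dump(is_float($bigPow));    // bool(true)
var_dump(is_int(pow(3, 2)));    // bool(true) -- no overflow, stays int

// Hypothetical stand-in for a clamp (name and signature assumed):
// the result is always one of the three int arguments, so it is always int.
function clamp_int(int $value, int $min, int $max): int
{
    return max($min, min($max, $value));
}

var_dump(clamp_int(15, 0, 10));          // int(10)
var_dump(clamp_int(PHP_INT_MIN, 0, 10)); // int(0)
```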
Implementation status -- this is built and tested, not a sketch:
- Scalar methods add zero new opcodes -- the desugar emits an ordinary
static call, and the object method-call path is byte-for-byte unchanged.
(Typed locals add dedicated *_TYPED assignment opcodes, but the untyped
hot path stays byte-identical.)
- Performance (deterministic callgrind, release build): the untyped hot
path is byte-identical; the standard bench.php suite is +0.145%
instructions, entirely from predicted-not-taken branches in reference
opcodes only, with zero added cache misses or branch mispredictions. A
typed-local write costs ~0.79x a typed-property write, and the type check
involved is one the language has run on every typed-property write since
PHP 7.4.
- References (the objection that sank prior typed-locals attempts) are
enforced through every path -- =&, by-ref params,
array/object/static-prop refs, yield, closure capture, $$name, extract,
$GLOBALS, global -- via the existing typed-property reference machinery.
Leak-checked under stress.
- Correct across all three execution modes (interpreter, function JIT,
tracing JIT): differential runs produce byte-identical output. opcache
SHM + file_cache round-trip verified.
- BC impact, measured: an AST scan of the 1,000 most-downloaded
Packagist packages (173k+ files) found zero method-call sites with a
guaranteed-scalar receiver -- i.e. zero call sites whose behaviour would
change (any such site would be a fatal error today). Userland Str classes (incl.
Laravel's Illuminate\Support\Str) coexist with the backing class, verified.
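
For anyone less familiar with the typed-property reference machinery mentioned above, this is the existing behaviour it provides in released PHP. The sketch uses a typed property because typed locals exist only on the branch; Box is a hypothetical class made up for the example.

```php
<?php
// Existing behaviour (PHP 7.4+): a typed property checks every write,
// including writes made through a reference to it.
class Box
{
    public int $value = 0;
}

$box = new Box();
$errors = [];

try {
    $box->value = "abc";   // direct write of a non-numeric string
} catch (TypeError $e) {
    $errors[] = "direct";
}

$ref = &$box->value;       // the reference now carries the int constraint
try {
    $ref = "abc";          // write through the reference
} catch (TypeError $e) {
    $errors[] = "by-ref";
}

var_dump($errors);         // both writes were rejected
var_dump($box->value);     // int(0) -- the property never changed
```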
Full write-up (RFC draft, plus the method-set, performance, and
BC-impact analyses):
https://github.com/kralmichal/php-src/tree/rfc/docs/scalar-object-methods-rfc
Implementation branches (PHP 8.6-dev base):
- Primary (scalar methods):
https://github.com/kralmichal/php-src/tree/rfc/scalar-methods
- Secondary (typed locals, stacked):
https://github.com/kralmichal/php-src/tree/rfc/typed-locals
What I'd value discussing before I write the formal RFC:
1. Does the "compile-time-guaranteed receivers only" framing actually
resolve the loose-typing objection, or is there a hole I'm not seeing?
2. The method set and its naming are the most open part -- is a small,
curated, clean-slate set (distinct from the procedural names) the right
direction, or a non-starter? How should it relate to the existing
userland efforts in this space (e.g. Psl)?
3. Anything that would sink this before I invest in the full RFC.
Thanks,
Michal Kral