Re: [PHP-DEV] Re: [RFC] [Discussion] OPcache Static Cache

Nicolas Grekas Mon, 01 Jun 2026 09:33:49 -0700

On Mon, Jun 1, 2026 at 15:22, Go Kudo <[email protected]> wrote:


>
>
> On Mon, Jun 1, 2026 at 20:36, Nicolas Grekas <[email protected]> wrote:
>
>> Hi, thanks for the reminder and for the RFC.
>>
>> On Mon, Jun 1, 2026 at 09:32, Jakub Zelenka <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> On Mon, Jun 1, 2026 at 8:06 AM Go Kudo <[email protected]> wrote:
>>>
>>>>
>>>>
>>>> On Sun, May 17, 2026 at 0:19, Go Kudo <[email protected]> wrote:
>>>>
>>>>> Hi internals,
>>>>>
>>>>> I'd like to start the discussion for a new RFC, OPcache Static Cache.
>>>>>
>>>>> RFC: https://wiki.php.net/rfc/opcache_static_cache
>>>>> Implementation: https://github.com/php/php-src/pull/22052
>>>>>
>>>>
>>>>
>>> The FPM shared hosting part is a problem and I don't think this can be
>>> default and probably cannot even be optional. The reason is that we
>>> consider data leaks between pools as security issues so I don't think we
>>> can have some feature that is actually causing a security issue. It will be
>>> a bit tricky to decide what to do if this passes in the current form
>>> because we would probably need to apply security fix and disable it. If you
>>> really want to have it enabled, we would need to explicitly state in the
>>> policy and docs that pool boundary is no longer considered as a security
>>> boundary which would be quite problematic for some shared hosting that rely
>>> on it. Maybe the solution would be to allow it only if there is one pool
>>> enabled.
>>>
>>
>> I agree with Jakub on this one, this should be safe by default, which
>> means at the minimum setting the default to 0. But that'd mean we couldn't
>> reliably build on the expectation that people have this feature enabled,
>> which would be a shame to me as a lib author :) I'd rather suggest we find
>> a way to scope per-pool (and also ini-configure per-pool). APCu doesn't
>> have this scope isolation, but APCu is opt-in so not really a concern
>> there. Can't we have per-pool SHM segments?
>>
>> I also have concerns about other parts:
>>
>> **Attributes**
>>
>> I was wondering why some keys have to be reserved (FQCN and two
>> prefixes). IIUC, this is for attributes to work. This looks like an
>> abstraction leak to me. Then I dug into the implementation a bit and it looks
>> like a significant chunk of the complexity is for making attributes work
>> (e.g. JIT stuff, new VM hooks, CacheStrategy::Tracking machinery). I feel
>> like this belongs to a follow up RFC. The rest is significant enough to be
>> discussed on its own.
>>
>> **Serialization / data representation**
>>
>> Part of why APCu is slow is that it serializes all values and puts the
>> resulting strings in SHM, which defeats a lot of possible optimizations
>> (interned string pointers, immutable arrays, etc). It's nice that you're
>> proposing a new way to address this.
>>
>> After digging in though, my main suggestion is to restrict the storage to
>> scalars and arrays of scalars only (enums being the one exception maybe),
>> and to leave the data representation as a separate concern: no references,
>> no objects, no resources. If anyone wants to put a more complex PHP value
>> in there, it becomes their responsibility to serialize() it first, or to
>> use something like the deepclone extension I introduced a few weeks ago
>> [1], which provides the exact same semantics as serialize but returns pure
>> arrays of scalars. This decouples the "data representation / serialization"
>> topic from the storage itself (opcache here, something else in my use case).
>>
>> I'm proposing this because every issue I found in the object handling
>> points back to it being a lot of surface for not much gain:
>>
>> - the fast path doesn't handle references (neither soft = two variables
>> pointing at the same object, nor hard = `&`). It doesn't corrupt them, but
>> it silently falls back to full serialization for the whole value as soon as
>> one is present. So a single `&` or one shared object instance anywhere in a
>> large value gets zero benefit and pays APCu-level unserialize cost on every
>> fetch, invisibly. I'd rather reject hard refs explicitly (like resources)
>> and represent shared object identity properly, but honestly scalars-only
>> sidesteps the whole thing.
>>
>> - the engine already provides serialization hooks for internal objects.
>> You add a new mechanism to clone them faster, with a fallback on the
>> existing serialization infra. That's interesting, but it's yet another
>> mechanism to maintain, while the serialization hooks themselves took many
>> versions to get right on php-src (not a good signal with the state of the
>> extensions ecosystem...). __serialize already returns a plain array that's
>> easy to traverse, so it could fit properly without a parallel protocol.
>> Scalars-only removes the need for any of this (and with it the SPL
>> coupling).
>>
>> - I also don't like (in APCu too) that a call to store() can throw any
>> kind of exception, since serialization methods can throw anything and the
>> function just rethrows them as-is. It feels like an abstraction leak. With
>> scalars-only there's no serialization in the storage layer, so this goes
>> away by construction.
>>
>> So: would you consider restricting to scalars / arrays-of-scalars (not
>> the deepclone part, just the type restriction)? It makes the storage do
>> what it does well and keeps representation as a separate concern. It'd be
>> best IMHO, and it deletes a large chunk of the complexity above in one move.
>>
>> [1] https://github.com/symfony/php-ext-deepclone
>>
>> **API**
>>
>> 27 functions is a lot, with many of them being variants of the same base
>> API. Also, the $throw_on_error part is something we'd rather not have IMHO.
>> What about an OOP API instead?
>>
>> Here is a quick draft:
>>
>> ```php
>> namespace OPcache;
>>
>> // Values are scalars or arrays of scalars; callers serialize anything
>> // richer themselves.
>> //
>> // Error model for every method:
>> //   misses and lock contention are normal and never throw
>> //     get() miss -> $default ; has() miss -> false ; lock() contended -> false
>> //   real errors always throw CacheException (no per-call flag)
>> //     unstorable value, backend disabled/unavailable, pinned exhausted
>> interface CacheInterface {
>>     public function get(string $key, mixed $default = null): mixed;
>>     public function getMultiple(iterable $keys, mixed $default = null): array;
>>     public function set(string $key, mixed $value): bool;
>>     public function setMultiple(iterable $values): bool;
>>     public function has(string $key): bool;
>>     public function delete(string $key): bool;
>>     public function deleteMultiple(iterable $keys): bool;
>>     public function clear(): bool;
>>     public function lock(string $key, int $lease = 0): bool;   // lease 0 = until rshutdown
>>     public function unlock(string $key): bool;
>>     public function info(): CacheInfo;
>> }
>>
>> // TTL only where it is meaningful
>> class VolatileCache implements CacheInterface {
>>     public function set(string $key, mixed $value, int $ttl = 0): bool;
>>     public function setMultiple(iterable $values, int $ttl = 0): bool;
>> }
>>
>> // atomics only where entries never expire
>> class PinnedCache implements CacheInterface {
>>     public function increment(string $key, int $step = 1): int;
>>     public function decrement(string $key, int $step = 1): int;
>> }
>>
>> final readonly class CacheInfo { /* [...] */ }
>> class CacheException extends \Exception {}
>>
>> function volatile_cache(): VolatileCache {}   // process-wide singleton per backend
>> function pinned_cache(): PinnedCache {}
>> ```
>>
>> **API still**
>>
>> I read your arguments for the non-volatile API, yet I'm wondering if that
>> makes sense at all. I understand the motivation, but is this really worth
>> all the challenges it brings (see above: serialization, SHM management,
>> pool scoping, ini settings, etc), when the alternative already exists and
>> doesn't have any of these? By alternative I mean what we do today: generate
>> PHP code that contains the pinned values, and rely on opcache to cache them.
>>
>> What we miss in the engine is the volatile API: a better APCu. But we
>> might not need the pinned API at all. The only thing it adds is the
>> increment/decrement part, and I'm not sure that's enough reason to keep
>> it. Maybe another approach could provide this in a simpler way?
>>
>> Worth noting too: the per-pool / SHM / ini concerns from my first point
>> apply entirely to pinned and only partly to volatile, so dropping pinned
>> also shrinks the blocking security surface, not just the API.
>>
>> Overall, I'd really like a better APCu to be provided by default, so
>> thanks for pushing for this!
>>
>> Cheers,
>> Nicolas
>>
>>
> Hi Nicolas.
>
> Thanks again for the detailed read. A fair amount of this is now addressed
> in 1.4.0, and you asked for a concrete OOP shape, so let me start with
> those and then come back to the scalars/references discussion.
>
> **Per-pool scoping (your first point)**
>
> FPM is solved in 1.4.0. There's now one volatile and one pinned partition
> per worker pool, created before any worker forks
> (fpm_static_cache_init_main() walks fpm_worker_all_pools and calls
> partition_create(wp->config->name) for each pool), and each child activates
> its own pool's partition in fpm_child_init() before user code runs. A value
> stored in one pool isn't visible from another pool under the same master,
> which is the per-pool SHM segment you asked for.
>

That's great thanks.


> Where I'd like your view is the rest. FPM is the only SAPI where PHP has a
> tenant boundary it can pick before request handling, so it's the only one
> that gets real per-pool segments. apache2handler, LSAPI, cgi-fcgi and
> friends have no equivalent pre-request identity to key a partition on, so
> rather than invent one, 1.4.0 leaves the feature off there unless
> opcache.static_cache.allow_unsafe_runtime=1.
>

IMHO "unsafe" wording is too strong: these are safe SAPIs, they just don't
have a scoping concept built in. And disabling it by default for them
brings back the very concern I raised, that this won't be a
generally-available primitive authors can rely on. My take: enable it by
default with a single default scope for those SAPIs, plus a clear internal
API so a SAPI can define its own scoped segments. I know FrankenPHP
would leverage it, and maybe others will find a way (e.g. apache2handler)
to expose similar boundaries.



> **API shape (static classes, no $throw_on_error)**
>
> I went a slightly different way from your sketch: two classes with static
> methods, no instances and no shared interface.
>

Historically we've been against static methods in PHP when plain functions
provide the same thing. To me it's either instances xor functions. OOP brings
abstraction, which makes IoC possible; that's the benefit. I'm fine
with functions too, but then the duplication bloats the list of
functions. Dunno if that's an issue for others. I proposed just dropping
the pinned variants, which kills most of that :) And note: if we also drop
object support and pinned (below), the whole thing collapses to a single
volatile cache, at which point instances-vs-functions is a small call and
either is fine by me.



> $throw_on_error is gone in this shape, which I agree is better. Misses and
> contention never throw (get returns the default, getMultiple fills per-key
> defaults, lock returns false); real backend failures return false /
> int|false; argument errors (empty key, reserved key, top-level
> Closure/resource, negative ttl) still raise TypeError/ValueError.
> StaticCacheException is then only the strict #[PinnedStatic] publication
> failure. The one thing the flag covered, treating a disabled backend as a
> hard config error, is a one-line StaticCacheInfo::available check at
> bootstrap, which the RFC already recommends.
>

Thanks.


> I could also add VolatileCache::remember($key, $compute, $ttl = 0)
> wrapping the safe lock -> build-outside-the-lock -> store sequence, since
> that's the pattern people reach for; happy to include or drop it. If you'd
> still rather have instances, tell me what they'd buy and I'll reconsider, I
> just couldn't find a concrete thing here.
>

Personally I like this kind of transactional API.
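
A minimal sketch of that lock -> build -> store sequence, for illustration
only. SketchCache is a hypothetical in-memory stand-in, not the RFC's API, and
it assumes lock() is a per-key lease, so the build runs without any
storage-wide lock being held:

```php
<?php
declare(strict_types=1);

// Hypothetical in-memory stand-in for a volatile cache (illustrative names).
final class SketchCache
{
    /** @var array<string, mixed> */
    private array $values = [];
    /** @var array<string, true> */
    private array $leases = [];

    public function get(string $key, mixed $default = null): mixed
    {
        return array_key_exists($key, $this->values) ? $this->values[$key] : $default;
    }

    public function set(string $key, mixed $value): bool
    {
        $this->values[$key] = $value;
        return true;
    }

    public function lock(string $key): bool
    {
        if (isset($this->leases[$key])) {
            return false; // contended: a normal outcome, never throws
        }
        $this->leases[$key] = true;
        return true;
    }

    public function unlock(string $key): bool
    {
        unset($this->leases[$key]);
        return true;
    }

    public function remember(string $key, callable $compute): mixed
    {
        $miss = new stdClass(); // sentinel, so a stored null still counts as a hit
        $hit = $this->get($key, $miss);
        if ($hit !== $miss) {
            return $hit;
        }
        if (!$this->lock($key)) {
            return $compute(); // someone else is building: compute, don't store
        }
        try {
            $value = $compute(); // only the per-key lease is held here
            $this->set($key, $value);
            return $value;
        } finally {
            $this->unlock($key);
        }
    }
}

$cache = new SketchCache();
$calls = 0;
$build = function () use (&$calls): array {
    $calls++;
    return ['routes' => 3];
};
$first = $cache->remember('app.routes', $build);
$second = $cache->remember('app.routes', $build);
```

The second call is served from the cache and never runs the builder. A real
shared-memory version would probably re-check the key after taking the lease;
that step is omitted here.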


> **Scalars + arrays-of-scalars only**
>
> This is the one place I'd push back, because the measurements point the
> other way. The whole point of the design is to avoid the serialize-on-store
> / unserialize-on-fetch round trip. If storage only takes scalars and arrays
> of scalars, anything richer has to be serialize()'d by the caller and
> unserialize()'d on every read, which is the APCu cost model. So
> scalars-only doesn't remove that cost, it moves it into userland and makes
> it mandatory for every object, including the ones the engine can already
> restore cheaply.
>
> Carbon is the clearest case because it defines __serialize/__unserialize,
> so under a scalars-only rule it's a forced round trip every time. From the
> 1.4.0 numbers on NTS php-fpm:
>
>   APCu (serialize + unserialize per fetch):                ~189 us
>   VolatileCache::get via the Date/Time safe-direct handler: ~45 us
>   #[VolatileStatic] property (restored once into the slot): ~1.5 us
>
> The ~45 us is the relevant number. Carbon keeps its own __serialize, but
> the Date/Time handler is registered with allows_custom_serializers = true,
> so a Carbon instance still takes the safe-direct copy path rather than
> php_var_serialize. Under scalars-only, that ~45 us goes back to the ~189 us
> round trip and the ~1.5 us attribute path disappears. The other object rows
> are the same shape: metadata object ~166 us vs ~35 us, SPL collections ~20
> us vs ~5.6 us, small DateTime ~2.6 us vs ~1.1 us.
>
> So "a lot of surface for not much gain" is the reading I'd disagree with:
> the gain is the 4x to ~130x, and it exists specifically because the value
> isn't scalarised.
>

I went and measured it. I built php-src from your branch (1.4.0) with
ext/apcu and my ext/deepclone both compiled in, and timed a warm-cache fetch
of the same value. A = APCu (serialize + unserialize). B = your native
OPcache\volatile_fetch (warm, so the request-local prototype is built and
it just clones). C = the array representation kept as a resident immutable
value (what an opcache literal already gives you) + deepclone hydrate.
Warm, NTS, us/op:

  fixture                          bytes |   APCu | B native | C immut-array+hydrate
  plain object graph (5-deep)       1.7K |   6.66 |     2.52 |   1.79
  big object graph (400 objs)       476K |   2358 |      618 |    382
  big config array (4k entries)     480K |   1590 |      331 |   0.045

Three things this settles for me:

- the "Nx faster than APCu" headline is size-dependent. APCu is 2-7 us for
small objects and only reaches the hundreds-of-us range at ~half-a-MB
payloads, so the big multiplier is a large-object effect, not the common
case.
- C (objects-as-arrays + userland hydrate) ties or beats your native path
in every warm case I tried, which is the static cache's best case. The
in-engine object machinery isn't buying speed over a plain array
representation, it's slightly slower than it.
- for array data, the dominant config/metadata case, an immutable array is
essentially free (0.045 us): a zero-copy read with nothing to hydrate.
That's ~7000x faster than the static cache's own array fetch, which pays an
O(n) walk per read and so doesn't even deliver the immutable-array win that
opcache literals already give. The preload/generated-code path wins this
one decisively, without any of the new machinery.

The other direction is telling too: in a fetch-once pattern (each key read
a single time) the native path is *slower* than APCu, e.g. 38 us vs 7 us on
the shared-identity object, because it builds a request-local prototype it
never reuses. The prototype only pays off under repeated same-key fetches,
which is exactly the in-request registry case I describe below.

JIT was off, but the timed work is all C-side so it barely moves the
numbers.


> **References and shared identity**
>
> You're right about the mechanics and I won't gloss over it. There are
> three store paths:
>
>   1. shared graph: built straight into SHM, fetched with no userland code
>   2. the OPcache serializer: SHM-safe binary encode, bytes copied under
> the lock and rebuilt after it's released, still no userland code
>   3. php_var_serialize fallback
>
> A circular array (enter_array sees the same HashTable twice) or a shared
> object identity (mark_object sees the same zend_object twice) makes paths 1
> and 2 bail, and the value lands on path 3. A hard reference inside object
> state does the same. So a value carrying one of those shapes pays path-3
> cost, and today that's silent.
>
> Two things on that. First, path 3 is APCu parity, not worse than APCu, so
> it's a floor rather than a regression. Second, the values that hit it are
> exactly the values scalars-only would push through
> serialize()/unserialize() unconditionally, so the current worst case is the
> normal case under your proposal.
>
> The "invisible" part is fair, though. I'd rather make it visible (surface
> the chosen path in info(), or in a debug build) than ban objects, since
> banning them gives up the common no-ref case that's the reason for the
> feature. And I have no real objection to rejecting top-level hard refs up
> front the way resources are rejected, if people think a silent cliff is
> worse than an explicit error. That's a small change.
>

"top-level hard ref" confuses me: it sounds like store($var) where $var is
a reference, but the parameter isn't by-ref, so the engine doesn't pass the
reference through anyway. A problematic hard ref is always nested,
self-referencing a sub-part of the passed graph, which is exactly the shape
you can't cheaply reject up front.
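
To make the distinction concrete, here is a small sketch. The helper
containsHardReference() is hypothetical (it is not in the RFC or the patch) and
only walks arrays, not object properties. It shows that a reference at the top
level is gone once the value is passed by value, while a nested one survives
and can only be found by walking the whole value:

```php
<?php
declare(strict_types=1);

// Hypothetical helper: true if any array element, at any depth, is a hard
// reference. Objects are not traversed in this sketch.
function containsHardReference(array $value): bool
{
    foreach ($value as $key => $item) {
        if (ReflectionReference::fromArrayElement($value, $key) !== null) {
            return true;
        }
        if (is_array($item) && containsHardReference($item)) {
            return true;
        }
    }
    return false;
}

$top = ['a' => 1];
$alias = &$top;                              // reference to the whole array
$topLevel = containsHardReference($alias);   // by-value parameter: the reference is not passed through

$graph = ['a' => 1, 'b' => ['c' => 2]];
$inner = &$graph['b']['c'];                  // reference nested inside the graph
$nested = containsHardReference($graph);     // found, but only after an O(n) walk
```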

But step back on what the fallback means: it triggers in cases that are
hard to anticipate, so in practice this is APCu-level perf much of the
time. The same object reachable from two places in a graph is not an
exceptional shape.
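
For illustration, this is the kind of everyday shape I mean, using only
stdClass and the standard serialize()/unserialize() functions (the property
and key names are made up):

```php
<?php
declare(strict_types=1);

$author = new stdClass();
$author->name = 'Ada';

// The same instance is reachable from two places in the graph.
$post = ['author' => $author, 'lastEditor' => $author];

// serialize() records the second occurrence as a back-reference, so the
// shared identity survives the round trip.
$copy = unserialize(serialize($post));

$sharedBefore = $post['author'] === $post['lastEditor'];
$sharedAfter = $copy['author'] === $copy['lastEditor'];
$isIndependentCopy = $copy['author'] !== $author;
```

Nothing about $post looks unusual to its author, yet by the description above
it is a value that would land on the serialize fallback.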


> On "yet another mechanism vs __serialize": the serializer hooks are still
> the fallback, not a replacement. The safe-direct tables only cover a fixed,
> engine-vetted set (Date/Time and four SPL collections), they're registered
> in C by the owning extension, and nothing is exposed to userland, so it
> isn't a protocol the ecosystem has to implement or get right. __serialize
> keeps handling everything else.
>
> **Dropping pinned**
>
> The preload + generated-array pattern is good, and the RFC treats it as
> the existing workaround it's trying to formalise, not something to replace.
> But it has one hard limit: it only works for data you can express as a PHP
> literal that opcache can intern, i.e. scalars and arrays. As soon as the
> thing you want to keep across requests is an object graph, that route is
> back to a serialized string plus a per-request unserialize, which is the
> cost we started from.
>
> That's the gap pinned covers. PinnedStatic on the Carbon shape is ~1.5 us
> (restored once into the static slot) against ~189 us for the round trip,
> and there's no preload trick that reaches that number, because preload
> can't bake a live object graph into an opcode literal. I agree the
> increment/decrement part isn't enough to justify pinned on its own; the
> object case is the reason it's there. And with the per-pool partitions now
> in, pinned lives in the same isolated per-pool segment as volatile, so
> dropping it no longer shrinks the security surface the way it would have
> before 1.4.0.
>

All these items above are variants of the same need for a solution that'd
allow passing objects through the API.

I think this should be dropped. I get that it can feel convenient to bring
this, but not at all costs. Everything discussed above comes down to
addressing this need, at the cost of a significant abstraction leak
(exceptions thrown by userland serialization hooks), duplicate functions for
pinned/non-pinned, magic behavior that breaks the advertised perf benefits
compared to existing solutions (the serialize fallback), etc. I've worked a
lot on this topic in previous years; the symfony/var-exporter component was
built for this need: conveying objects using arrays. This proves that yes,
current immutable arrays are perfectly able to describe objects, provided
one uses a conversion layer on top of them. (BTW you tie this to
preloading, but that's not accurate: you don't need preload to get
immutable arrays, opcache interns array literals from any cached file;
preload only saves the recompile.)
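
A rough sketch of the basic idea, not of how symfony/var-exporter works
internally: the Dsn class and its toArray()/fromArray() methods are
hypothetical, and the file path is arbitrary. The object is reduced to an
array of scalars, dumped as a PHP file returning an array literal, and
hydrated again after a plain require:

```php
<?php
declare(strict_types=1);

// Illustrative value object; toArray()/fromArray() are hypothetical names.
final class Dsn
{
    public function __construct(public string $host, public int $port) {}

    public function toArray(): array
    {
        return ['host' => $this->host, 'port' => $this->port];
    }

    public static function fromArray(array $data): self
    {
        return new self($data['host'], $data['port']);
    }
}

$dsn = new Dsn('db.internal', 5432);

// Dump the scalar representation as a PHP file that returns an array literal.
$file = sys_get_temp_dir() . '/dsn_' . getmypid() . '.php';
file_put_contents($file, '<?php return ' . var_export($dsn->toArray(), true) . ';');

// With opcache enabled for the SAPI, the compiled file and its array literal
// are cached; no preload is involved in this sketch.
$data = require $file;
$restored = Dsn::fromArray($data);

unlink($file);
```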

The mechanism described in the RFC brings something new to me: the
per-request unserialize-once, copy-many mechanism. It's an optimization
that prevents unserializing many times in the same request. This is nice,
but I'm doubtful it justifies the added complexity on its own: in a single
request, it's quite easy for libraries to wrap the cache backend and keep a
live registry of unserialized objects for the duration of the
request. That's already what most libs do, since that saves round
trips to the cache backend in the same request. To me this is an already
solved problem, with a better existing solution: a request registry returns
the same instance with zero copy, where the engine hands back N independent
clones. And for the read-only config/metadata that's the actual workload
here, you want that shared instance, not isolated copies, so the isolation
the copy buys you solves a problem this use case doesn't have.
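
A minimal sketch of such a request registry, under stated assumptions:
RequestRegistry is a hypothetical name, and the backend here is just an array
of serialized strings standing in for APCu-style storage:

```php
<?php
declare(strict_types=1);

// Hypothetical wrapper: fetches and unserializes each key at most once per
// request, then hands back the same live instance.
final class RequestRegistry
{
    /** @var array<string, mixed> */
    private array $live = [];

    /** @param Closure(string): mixed $fetch loads and hydrates a value from the backend */
    public function __construct(private Closure $fetch) {}

    public function get(string $key): mixed
    {
        if (!array_key_exists($key, $this->live)) {
            $this->live[$key] = ($this->fetch)($key);
        }
        return $this->live[$key];
    }
}

// Stand-in backend holding serialized strings.
$meta = new stdClass();
$meta->table = 'users';
$backend = ['meta.user' => serialize($meta)];
$fetches = 0;

$registry = new RequestRegistry(function (string $key) use ($backend, &$fetches): mixed {
    $fetches++;
    return unserialize($backend[$key]);
});

$a = $registry->get('meta.user');
$b = $registry->get('meta.user');
```

Both calls return the very same object and the backend is hit once, which is
the zero-copy shared instance described above.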

I remain unconvinced about this object-transmission machinery. Dunno what
others think about it.

So where I land, concretely: I'd happily vote yes on a focused "better
APCu": a volatile backend, scalars and arrays of scalars, per-pool segments
(with a default scope + an internal API for the other SAPIs), and the
functions-or-instances API with a remember() helper. Objects left to a
userland hydration layer; attributes and pinned to a later RFC if someone
still wants them once this ships. That's a primitive I could build on, and
it sidesteps every hard problem in this thread.

Nicolas
