Re: [PHP-DEV] Re: [RFC] [Discussion] OPcache Static Cache

Go Kudo Mon, 01 Jun 2026 06:25:03 -0700

2026年6月1日(月) 20:36 Nicolas Grekas <[email protected]>:

> Hi, thanks for the reminder and for the RFC.
>
> Le lun. 1 juin 2026 à 09:32, Jakub Zelenka <[email protected]> a écrit :
>
>> Hi,
>>
>> On Mon, Jun 1, 2026 at 8:06 AM Go Kudo <[email protected]> wrote:
>>
>>>
>>>
>>> 2026年5月17日(日) 0:19 Go Kudo <[email protected]>:
>>>
>>>> Hi internals,
>>>>
>>>> I'd like to start the discussion for a new RFC, OPcache Static Cache.
>>>>
>>>> RFC: https://wiki.php.net/rfc/opcache_static_cache
>>>> Implementation: https://github.com/php/php-src/pull/22052
>>>>
>>>
>>>
>> The FPM shared hosting part is a problem and I don't think this can be
>> default and probably cannot even be optional. The reason is that we
>> consider data leaks between pools as security issues so I don't think we
>> can have some feature that is actually causing a security issue. It will be
>> a bit tricky to decide what to do if this passes in the current form
>> because we would probably need to apply security fix and disable it. If you
>> really want to have it enabled, we would need to explicitly state in the
>> policy and docs that pool boundary is no longer considered as a security
>> boundary which would be quite problematic for some shared hosting that rely
>> on it. Maybe the solution would be to allow it only if there is one pool
>> enabled.
>>
>
> I agree with Jakub on this one, this should be safe by default, which
> means at the minimum setting the default to 0. But that'd mean we couldn't
> reliably build on the expectation that people have this feature enabled,
> which would be a shame to me as a lib author :) I'd rather suggest we find
> a way to scope per-pool (and also ini-configure per-pool). APCu doesn't
> have this scope isolation, but APCu is opt-in so not really a concern
> there. Can't we have per-pool SHM segments?
>
> I also have concerns about other parts:
>
> **Attributes**
>
> I was wondering why some keys have to be reserved (FQCN and two prefixes).
> IIUC, this is for attributes to work. This looks like an abstraction leak
> to me. Then I dug the implementation a bit and it looks like a significant
> chunk of the complexity is for making attributes work (e.g. JIT stuff, new
> VM hooks, CacheStrategy::Tracking machinery). I feel like this belongs to a
> follow up RFC. The rest is significant enough to be discussed on its own.
>
> **Serialization / data representation**
>
> Part of why APCu is slow is that it serializes all values and puts the
> resulting strings in SHM, which defeats a lot of possible optimizations
> (interned string pointers, immutable arrays, etc). It's nice that you're
> proposing a new way to address this.
>
> After digging in though, my main suggestion is to restrict the storage to
> scalars and arrays of scalars only (enums being the one exception maybe),
> and to leave the data representation as a separate concern: no references,
> no objects, no resources. If anyone wants to put a more complex PHP value
> in there, it becomes their responsibility to serialize() it first, or to
> use something like the deepclone extension I introduced a few weeks ago
> [1], which provides the exact same semantics as serialize but returns pure
> arrays of scalars. This decouples the "data representation / serialization"
> topic from the storage itself (opcache here, something else in my use case).
>
> I'm proposing this because every issue I found in the object handling
> points back to it being a lot of surface for not much gain:
>
> - the fast path doesn't handle references (neither soft = two variables
> pointing at the same object, nor hard = `&`). It doesn't corrupt them, but
> it silently falls back to full serialization for the whole value as soon as
> one is present. So a single `&` or one shared object instance anywhere in a
> large value gets zero benefit and pays APCu-level unserialize cost on every
> fetch, invisibly. I'd rather reject hard refs explicitly (like resources)
> and represent shared object identity properly, but honestly scalars-only
> sidesteps the whole thing.
>
> - the engine already provides serialization hooks for internal objects.
> You add a new mechanism to clone them faster, with a fallback on the
> existing serialization infra. That's interesting, but it's yet another
> mechanism to maintain, while the serialization hooks themselves took many
> versions to get right on php-src (not a good signal with the state of the
> extensions ecosystem...). __serialize already returns a plain array that's
> easy to traverse, so it could fit properly without a parallel protocol.
> Scalars-only removes the need for any of this (and with it the SPL
> coupling).
>
> - I also don't like (in APCu too) that a call to store() can throw any
> kind of exception, since serialization methods can throw anything and the
> function just rethrows them as-is. It feels like an abstraction leak. With
> scalars-only there's no serialization in the storage layer, so this goes
> away by construction.
>
> So: would you consider restricting to scalars / arrays-of-scalars (not the
> deepclone part, just the type restriction)? It makes the storage do what it
> does well and keeps representation as a separate concern. It'd be best
> IMHO, and it deletes a large chunk of the complexity above in one move.
>
> [1] https://github.com/symfony/php-ext-deepclone
>
> **API**
>
> 27 functions is a lot, with many of them being variants of the same base
> API. Also, the $throw_on_error part is something we'd rather not have IMHO.
> What about an OOP API instead?
>
> Here is a quick draft:
>
> ```php
> namespace OPcache;
>
> // Values are scalars or arrays of scalars; callers serialize anything
> richer themselves.
> //
> // Error model for every method:
> //   misses and lock contention are normal and never throw
> //     get() miss -> $default ; has() miss -> false ; lock() contended ->
> false
> //   real errors always throw CacheException (no per-call flag)
> //     unstorable value, backend disabled/unavailable, pinned exhausted
> interface CacheInterface {
>     public function get(string $key, mixed $default = null): mixed;
>     public function getMultiple(iterable $keys, mixed $default = null):
> array;
>     public function set(string $key, mixed $value): bool;
>     public function setMultiple(iterable $values): bool;
>     public function has(string $key): bool;
>     public function delete(string $key): bool;
>     public function deleteMultiple(iterable $keys): bool;
>     public function clear(): bool;
>     public function lock(string $key, int $lease = 0): bool;   // lease 0
> = until rshutdown
>     public function unlock(string $key): bool;
>     public function info(): CacheInfo;
> }
>
> // TTL only where it is meaningful
> class VolatileCache implements CacheInterface {
>     public function set(string $key, mixed $value, int $ttl = 0): bool;
>     public function setMultiple(iterable $values, int $ttl = 0): bool;
> }
>
> // atomics only where entries never expire
> class PinnedCache implements CacheInterface {
>     public function increment(string $key, int $step = 1): int;
>     public function decrement(string $key, int $step = 1): int;
> }
>
> final readonly class CacheInfo { /* [...] */ }
> class CacheException extends \Exception {}
>
> function volatile_cache(): VolatileCache {}   // process-wide singleton
> per backend
> function pinned_cache(): PinnedCache {}
> ```
>
> **API still**
>
> I read your arguments for the non-volatile API, yet I'm wondering if that
> makes sense at all. I understand the motivation, but is this really worth
> all the challenges it brings (see above: serialization, SHM management,
> pool scoping, ini settings, etc), when the alternative already exists and
> doesn't have any of these? By alternative I mean what we do today: generate
> PHP code that contains the pinned values, and rely on opcache to cache them.
>
> What we miss in the engine is the volatile API. A better APCu. But the
> pinned API we might not need one. The only thing is the increment/decrement
> part, and I'm not sure it's enough reason to keep it. Maybe another
> approach could provide this in a simpler way?
>
> Worth noting too: the per-pool / SHM / ini concerns from my first point
> apply entirely to pinned and only partly to volatile, so dropping pinned
> also shrinks the blocking security surface, not just the API.
>
> Overall, I'd really like a better APCu to be provided by default, so
> thanks for pushing for this!
>
> Cheers,
> Nicolas
>
>
Hi Nicolas.

Thanks again for the detailed read. A fair amount of this is now addressed
in 1.4.0, and you asked for a concrete OOP shape, so let me start with
those and then come back to the scalars/references discussion.

**Per-pool scoping (your first point)**

FPM is solved in 1.4.0. There's now one volatile and one pinned partition
per worker pool, created before any worker forks
(fpm_static_cache_init_main() walks fpm_worker_all_pools and calls
partition_create(wp->config->name) for each pool), and each child activates
its own pool's partition in fpm_child_init() before user code runs. A value
stored in one pool isn't visible from another pool under the same master,
which is the per-pool SHM segment you asked for.

Where I'd like your view is the rest. FPM is the only SAPI where PHP has a
tenant boundary it can pick before request handling, so it's the only one
that gets real per-pool segments. apache2handler, LSAPI, cgi-fcgi and
friends have no equivalent pre-request identity to key a partition on, so
rather than invent one, 1.4.0 leaves the feature off there unless
opcache.static_cache.allow_unsafe_runtime=1.

My honest read is that we don't need to chase per-pool isolation for those
SAPIs, for two reasons. The shared multi-tenant case under a non-FPM web
runtime is off by default, so nobody is silently exposed. And with the
1.4.0 error model, an unavailable backend isn't a hazard for callers: a
disabled backend returns false / the default instead of throwing, so
libraries that call opportunistically just degrade rather than break. So
the cost of *not* having universal per-pool is an admin choice (leave it
off, or knowingly accept runtime-wide sharing), not a correctness or safety
problem.

Do you see a non-FPM deployment where default-off-plus-opt-in isn't enough
and a real per-SAPI tenant boundary would actually be needed? If so I'd
rather scope it as future work for that specific SAPI than block on it, but
I want to know if you think it's load-bearing.

**API shape (static classes, no $throw_on_error)**

I went a slightly different way from your sketch: two classes with static
methods, no instances and no shared interface.

  namespace OPcache;

  final class VolatileCache
  {
      public static function get(string $key, mixed $default = null): mixed
{}
      public static function getMultiple(iterable $keys, mixed $default =
null): array {}
      public static function set(string $key, mixed $value, int $ttl = 0):
bool {}
      public static function setMultiple(iterable $values, int $ttl = 0):
bool {}
      public static function has(string $key): bool {}
      public static function delete(string $key): bool {}          // exact
key, or a loaded class name to drop its attribute-backed state
      public static function deleteMultiple(iterable $keys): bool {}
      public static function clear(): bool {}
      public static function lock(string $key, int $lease = 0): bool {}
// single-builder primitive; miss/contention -> false
      public static function unlock(string $key): bool {}
      public static function info(): StaticCacheInfo {}
  }

  final class PinnedCache
  {
      public static function get(string $key, mixed $default = null): mixed
{}
      public static function getMultiple(iterable $keys, mixed $default =
null): array {}
      public static function set(string $key, mixed $value): bool {}     //
no TTL
      public static function setMultiple(iterable $values): bool {}
      public static function has(string $key): bool {}
      public static function delete(string $key): bool {}
      public static function deleteMultiple(iterable $keys): bool {}
      public static function clear(): bool {}
      public static function lock(string $key, int $lease = 0): bool {}
      public static function unlock(string $key): bool {}
      public static function increment(string $key, int $step = 1):
int|false {}
      public static function decrement(string $key, int $step = 1):
int|false {}
      public static function info(): StaticCacheInfo {}
  }

  // StaticCacheInfo and StaticCacheException are the existing RFC types,
reused as-is.

Why static rather than instances: there's exactly one backend per partition
and a handle would carry no per-instance state, volatile and pinned aren't
interchangeable (different eviction semantics, TTL vs atomics), so there's
nothing to gain from passing one around, and the differing set() arity a
shared interface would impose (volatile has $ttl, pinned doesn't) is the
exact awkwardness the interface would create. Static methods sidestep all
of it. Grouping them as VolatileCache:: / PinnedCache:: also answers the
"many variants of the same base API" part of your 27-functions point
directly, while dropping the volatile_/pinned_ prefix noise.

$throw_on_error is gone in this shape, which I agree is better. Misses and
contention never throw (get returns the default, getMultiple fills per-key
defaults, lock returns false); real backend failures return false /
int|false; argument errors (empty key, reserved key, top-level
Closure/resource, negative ttl) still raise TypeError/ValueError.
StaticCacheException is then only the strict #[PinnedStatic] publication
failure. The one thing the flag covered, treating a disabled backend as a
hard config error, is a one-line StaticCacheInfo::available check at
bootstrap, which the RFC already recommends.

This replaces the volatile_*/pinned_* functions rather than adding to them.
I could also add VolatileCache::remember($key, $compute, $ttl = 0) wrapping
the safe lock -> build-outside-the-lock -> store sequence, since that's the
pattern people reach for; happy to include or drop it. If you'd still
rather have instances, tell me what they'd buy and I'll reconsider, I just
couldn't find a concrete thing here.

**Scalars + arrays-of-scalars only**

This is the one place I'd push back, because the measurements point the
other way. The whole point of the design is to avoid the serialize-on-store
/ unserialize-on-fetch round trip. If storage only takes scalars and arrays
of scalars, anything richer has to be serialize()'d by the caller and
unserialize()'d on every read, which is the APCu cost model. So
scalars-only doesn't remove that cost, it moves it into userland and makes
it mandatory for every object, including the ones the engine can already
restore cheaply.

Carbon is the clearest case because it defines __serialize/__unserialize,
so under a scalars-only rule it's a forced round trip every time. From the
1.4.0 numbers on NTS php-fpm:

  APCu (serialize + unserialize per fetch):                ~189 us
  VolatileCache::get via the Date/Time safe-direct handler: ~45 us
  #[VolatileStatic] property (restored once into the slot): ~1.5 us

The ~45 us is the relevant number. Carbon keeps its own __serialize, but
the Date/Time handler is registered with allows_custom_serializers = true,
so a Carbon instance still takes the safe-direct copy path rather than
php_var_serialize. Under scalars-only, that ~45 us goes back to the ~189 us
round trip and the ~1.5 us attribute path disappears. The other object rows
are the same shape: metadata object ~166 us vs ~35 us, SPL collections ~20
us vs ~5.6 us, small DateTime ~2.6 us vs ~1.1 us.

So "a lot of surface for not much gain" is the reading I'd disagree with:
the gain is the 4x to ~130x, and it exists specifically because the value
isn't scalarised.

**References and shared identity**

You're right about the mechanics and I won't gloss over it. There are three
store paths:

  1. shared graph: built straight into SHM, fetched with no userland code
  2. the OPcache serializer: SHM-safe binary encode, bytes copied under the
lock and rebuilt after it's released, still no userland code
  3. php_var_serialize fallback

A circular array (enter_array sees the same HashTable twice) or a shared
object identity (mark_object sees the same zend_object twice) makes paths 1
and 2 bail, and the value lands on path 3. A hard reference inside object
state does the same. So a value carrying one of those shapes pays path-3
cost, and today that's silent.

Two things on that. First, path 3 is APCu parity, not worse than APCu, so
it's a floor rather than a regression. Second, the values that hit it are
exactly the values scalars-only would push through
serialize()/unserialize() unconditionally, so the current worst case is the
normal case under your proposal.

The "invisible" part is fair, though. I'd rather make it visible (surface
the chosen path in info(), or in a debug build) than ban objects, since
banning them gives up the common no-ref case that's the reason for the
feature. And I have no real objection to rejecting top-level hard refs up
front the way resources are rejected, if people think a silent cliff is
worse than an explicit error. That's a small change.

On "yet another mechanism vs __serialize": the serializer hooks are still
the fallback, not a replacement. The safe-direct tables only cover a fixed,
engine-vetted set (Date/Time and four SPL collections), they're registered
in C by the owning extension, and nothing is exposed to userland, so it
isn't a protocol the ecosystem has to implement or get right. __serialize
keeps handling everything else.

**Dropping pinned**

The preload + generated-array pattern is good, and the RFC treats it as the
existing workaround it's trying to formalise, not something to replace. But
it has one hard limit: it only works for data you can express as a PHP
literal that opcache can intern, i.e. scalars and arrays. As soon as the
thing you want to keep across requests is an object graph, that route is
back to a serialized string plus a per-request unserialize, which is the
cost we started from.

That's the gap pinned covers. PinnedStatic on the Carbon shape is ~1.5 us
(restored once into the static slot) against ~189 us for the round trip,
and there's no preload trick that reaches that number, because preload
can't bake a live object graph into an opcode literal. I agree the
increment/decrement part isn't enough to justify pinned on its own; the
object case is the reason it's there. And with the per-pool partitions now
in, pinned lives in the same isolated per-pool segment as volatile, so
dropping it no longer shrinks the security surface the way it would have
before 1.4.0.

Thanks again for the detailed read.

Best regards,
Go Kudo

Re: [PHP-DEV] Re: [RFC] [Discussion] OPcache Static Cache

Reply via email to