Hi, thanks for the reminder and for the RFC.

On Mon, Jun 1, 2026 at 09:32, Jakub Zelenka <[email protected]> wrote:

> Hi,
>
> On Mon, Jun 1, 2026 at 8:06 AM Go Kudo <[email protected]> wrote:
>
>> On Sun, May 17, 2026 at 0:19, Go Kudo <[email protected]> wrote:
>>
>>> Hi internals,
>>>
>>> I'd like to start the discussion for a new RFC, OPcache Static Cache.
>>>
>>> RFC: https://wiki.php.net/rfc/opcache_static_cache
>>> Implementation: https://github.com/php/php-src/pull/22052
>>>
>
> The FPM shared hosting part is a problem and I don't think this can be
> default and probably cannot even be optional. The reason is that we
> consider data leaks between pools as security issues so I don't think we
> can have some feature that is actually causing a security issue. It will be
> a bit tricky to decide what to do if this passes in the current form
> because we would probably need to apply security fix and disable it. If you
> really want to have it enabled, we would need to explicitly state in the
> policy and docs that pool boundary is no longer considered as a security
> boundary which would be quite problematic for some shared hosting that rely
> on it. Maybe the solution would be to allow it only if there is one pool
> enabled.
>

I agree with Jakub on this one: this should be safe by default, which means at minimum setting the default to 0. But that'd mean we couldn't reliably build on the expectation that people have this feature enabled, which would be a shame to me as a lib author :)

I'd rather suggest we find a way to scope per-pool (and also ini-configure per-pool). APCu doesn't have this scope isolation, but APCu is opt-in, so it's not really a concern there. Can't we have per-pool SHM segments?

I also have concerns about other parts:

**Attributes**

I was wondering why some keys have to be reserved (FQCN and two prefixes). IIUC, this is for attributes to work. This looks like an abstraction leak to me. Then I dug into the implementation a bit, and it looks like a significant chunk of the complexity is for making attributes work (e.g. JIT stuff, new VM hooks, CacheStrategy::Tracking machinery). I feel like this belongs in a follow-up RFC. The rest is significant enough to be discussed on its own.

**Serialization / data representation**

Part of why APCu is slow is that it serializes all values and puts the resulting strings in SHM, which defeats a lot of possible optimizations (interned string pointers, immutable arrays, etc). It's nice that you're proposing a new way to address this.

After digging in though, my main suggestion is to restrict the storage to scalars and arrays of scalars only (enums being the one exception maybe), and to leave the data representation as a separate concern: no references, no objects, no resources. If anyone wants to put a more complex PHP value in there, it becomes their responsibility to serialize() it first, or to use something like the deepclone extension I introduced a few weeks ago [1], which provides the exact same semantics as serialize but returns pure arrays of scalars. This decouples the "data representation / serialization" topic from the storage itself (opcache here, something else in my use case).

I'm proposing this because every issue I found in the object handling points back to it being a lot of surface for not much gain:

- the fast path doesn't handle references (neither soft = two variables pointing at the same object, nor hard = `&`). It doesn't corrupt them, but it silently falls back to full serialization for the whole value as soon as one is present. So a single `&` or one shared object instance anywhere in a large value gets zero benefit and pays APCu-level unserialize cost on every fetch, invisibly. I'd rather reject hard refs explicitly (like resources) and represent shared object identity properly, but honestly scalars-only sidesteps the whole thing.
- the engine already provides serialization hooks for internal objects. You add a new mechanism to clone them faster, with a fallback on the existing serialization infra.
  That's interesting, but it's yet another mechanism to maintain, while the serialization hooks themselves took many versions to get right on php-src (not a good signal with the state of the extensions ecosystem...). __serialize already returns a plain array that's easy to traverse, so it could fit properly without a parallel protocol. Scalars-only removes the need for any of this (and with it the SPL coupling).
- I also don't like (in APCu too) that a call to store() can throw any kind of exception, since serialization methods can throw anything and the function just rethrows them as-is. It feels like an abstraction leak. With scalars-only there's no serialization in the storage layer, so this goes away by construction.

So: would you consider restricting to scalars / arrays-of-scalars (not the deepclone part, just the type restriction)? It makes the storage do what it does well and keeps representation as a separate concern. It'd be best IMHO, and it deletes a large chunk of the complexity above in one move.

[1] https://github.com/symfony/php-ext-deepclone

**API**

27 functions is a lot, with many of them being variants of the same base API. Also, the $throw_on_error part is something we'd rather not have IMHO. What about an OOP API instead? Here is a quick draft:

```php
namespace OPcache;

// Values are scalars or arrays of scalars; callers serialize anything richer themselves.
//
// Error model for every method:
//   misses and lock contention are normal and never throw
//     get() miss -> $default ; has() miss -> false ; lock() contended -> false
//   real errors always throw CacheException (no per-call flag)
//     unstorable value, backend disabled/unavailable, pinned exhausted

interface CacheInterface
{
    public function get(string $key, mixed $default = null): mixed;
    public function getMultiple(iterable $keys, mixed $default = null): array;
    public function set(string $key, mixed $value): bool;
    public function setMultiple(iterable $values): bool;
    public function has(string $key): bool;
    public function delete(string $key): bool;
    public function deleteMultiple(iterable $keys): bool;
    public function clear(): bool;
    public function lock(string $key, int $lease = 0): bool; // lease 0 = until rshutdown
    public function unlock(string $key): bool;
    public function info(): CacheInfo;
}

// TTL only where it is meaningful
class VolatileCache implements CacheInterface
{
    public function set(string $key, mixed $value, int $ttl = 0): bool;
    public function setMultiple(iterable $values, int $ttl = 0): bool;
}

// atomics only where entries never expire
class PinnedCache implements CacheInterface
{
    public function increment(string $key, int $step = 1): int;
    public function decrement(string $key, int $step = 1): int;
}

final readonly class CacheInfo { /* [...] */ }

class CacheException extends \Exception {}

function volatile_cache(): VolatileCache {} // process-wide singleton per backend
function pinned_cache(): PinnedCache {}
```

**API still**

I read your arguments for the non-volatile API, yet I'm wondering if that makes sense at all. I understand the motivation, but is this really worth all the challenges it brings (see above: serialization, SHM management, pool scoping, ini settings, etc), when the alternative already exists and doesn't have any of these?
By alternative I mean what we do today: generate PHP code that contains the pinned values, and rely on opcache to cache them.

What we miss in the engine is the volatile API: a better APCu. The pinned API, though, we might not need at all. The only thing it adds is the increment/decrement part, and I'm not sure that's enough reason to keep it. Maybe another approach could provide this in a simpler way?

Worth noting too: the per-pool / SHM / ini concerns from my first point apply entirely to pinned and only partly to volatile, so dropping pinned also shrinks the blocking security surface, not just the API.

Overall, I'd really like a better APCu to be provided by default, so thanks for pushing for this!

Cheers,
Nicolas
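
P.S. To make the "generate PHP code" alternative concrete, here is a minimal sketch of what I have in mind. The helper name `dump_pinned_values()`, the file name and the sample values are made up for illustration; none of this comes from the RFC. It assumes the values are scalars or arrays of scalars, so `var_export()` output can be included back as-is.

```php
<?php
declare(strict_types=1);

// Hypothetical helper (not part of the RFC): dump an array of scalars
// into a PHP file that simply returns it.
function dump_pinned_values(string $file, array $values): void
{
    $code = "<?php\n\nreturn " . var_export($values, true) . ";\n";

    // Write to a temporary file first, then rename it into place,
    // so concurrent readers never include a half-written file.
    $tmp = $file . '.' . bin2hex(random_bytes(6)) . '.tmp';
    if (false === file_put_contents($tmp, $code) || !rename($tmp, $file)) {
        throw new \RuntimeException("Unable to write $file");
    }

    // If OPcache is loaded, drop any previously compiled copy of the file.
    if (function_exists('opcache_invalidate')) {
        opcache_invalidate($file, true);
    }
}

// Illustrative values and path only.
$file = sys_get_temp_dir() . '/pinned_values_example.php';
dump_pinned_values($file, [
    'app.debug' => false,
    'routes' => ['home' => '/', 'blog' => '/blog'],
]);

// When OPcache is enabled, the compiled file (including the constant
// array) is expected to be served from shared memory on later includes.
$pinned = require $file;
```

This only covers read-mostly values, of course; it gives nothing like increment/decrement, which is the one part I'm unsure how to replace.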
