Hi
Am 2025-09-08 23:14, schrieb Jakub Zelenka:
I understand your concern about the complexity but this can apply to
many
other parts in php-src. As I'm sure you know, this still wouldn't
likely to
be considered as a security issue.
I've intentionally said “security sensitive”. I believe the risk of
security issues for the unserializer is significantly higher than in
other parts of the language and I also believe that by keeping
complexity down it is both easier to verify correctness and also to fix
any bugs that crop up. From what I see __wakeup() and __unserialize()
work quite differently internally. When there's __unserialize(), the
deserialization process only creates an empty object shell (similarly to
newInstanceWithoutConstructor) and then calls __unserialize() at the
very end. For __wakeup() all the properties are filled directly and then
__wakeup() is called at the end. In both cases the destructor is skipped
for all following objects once one of the deserialization hooks fails.
For __wakeup() this has the interesting effect that in case of circular
structures, some objects may appear to already be properly initialized
(all the properties are there), but __wakeup() has not executed yet,
potentially making them unsafe to touch. In case of __unserialize() the
objects are clearly in a partially initialized state (e.g. properties
being uninit), which I'd claim is safer:
<?php
class A {
public $a;
public function __construct(public string $name)
{
}
public function __unserialize(array $data): void
{
$this->a = $data['a'];
$this->name = $data['name'];
echo "Waking up ", $this->name, PHP_EOL;
var_dump($this->a->name);
}
public function __wakeup(): void
{
echo "Waking up ", $this->name, PHP_EOL;
var_dump($this->a->name);
}
public function __destruct()
{
echo __METHOD__, PHP_EOL;
}
}
$a = new A('A');
$a->a = new A('B');
$a->a->a = new A('C');
$a->a->a->a = $a;
echo "Before", PHP_EOL;
var_dump(unserialize(serialize($a)));
echo "After", PHP_EOL;
In case of `__unserialize()` the unsafe access to `$this->a->name` will
throw, whereas in `__wakeup()` it will return `A`, despite `A` not being
fully available yet. Similarly skipping the destructor for an empty
object shell is safer than skipping the destructor for an object that
may appear usable.
Saying that “unserialize is not security-relevant because you must
only
feed it safe inputs” as an excuse to avoid making unserialize safer is
not helping anyone and is downplaying the risks involved in
unserialization.
I don't have problem with making unserialize safer as there might be
users
that use it improperly. But we shouldn't be saying that this is a
security
sensitive part when we don't consider those sort of issues as security
issues because none of those issues will get fixed in our security
support
only releases. In that sense I don't think we can consider it as a
security
sensitive part.
The unserializer already contains quite a bit of logic to make it as
safe as possible, e.g. running the unserialization hooks only at the
very end and skipping destructors for objects where the unserialization
hook didn't successfully run. The forefathers definitely considered
possible edge cases that might lead to security issues.
Best regards
Tim Düsterhus