Hi
On 8/26/22 11:14, Hans Henrik Bergan wrote:
>> you can't efficiently validate JSON in userland
> Has anyone actually put that claim to the test? Has anyone actually made a
> userland json validator (not just wrap json_decode()/json_last_error()) for
> performance comparison?
> (if not, https://www.json.org/JSON_checker/JSON_checker.c would probably
> be a good start)
Worded as "you can't efficiently", the claim is false. Of course you
can validate the input memory-efficiently by traversing the string byte
by byte and keeping track of the nesting.
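To make "traversing the string and keeping track of the nesting" concrete, here is a minimal sketch. It is written in Python rather than PHP purely for illustration, it works on already-decoded text rather than raw bytes, and the names is_valid_json, _scan_scalar and max_depth are hypothetical. It follows my reading of the RFC 8259 grammar and is not the algorithm used by json_decode().

```python
import re

# Scalar token patterns, written from the RFC 8259 grammar.
_NUMBER = re.compile(r"-?(?:0|[1-9][0-9]*)(?:\.[0-9]+)?(?:[eE][+-]?[0-9]+)?")
_STRING = re.compile(r'"(?:[^"\\\x00-\x1f]|\\(?:["\\/bfnrt]|u[0-9a-fA-F]{4}))*"')
_LITERALS = ("true", "false", "null")
_WS = " \t\n\r"


def _scan_scalar(text, i):
    """Return the index just past the scalar starting at i, or -1 if none."""
    for lit in _LITERALS:
        if text.startswith(lit, i):
            return i + len(lit)
    m = _STRING.match(text, i) or _NUMBER.match(text, i)
    return m.end() if m else -1


def is_valid_json(text, max_depth=512):
    """Return True if text is exactly one well-formed JSON value.

    Only the stack of currently open containers is kept in memory;
    no decoded value is ever built. max_depth is an assumed safety cap.
    """
    stack = []       # one "[" or "{" per open container
    state = "value"  # what the grammar allows next
    i, n = 0, len(text)
    while True:
        while i < n and text[i] in _WS:
            i += 1
        if state == "after" and not stack:
            return i == n  # top-level value complete: nothing may follow
        if i == n:
            return False   # input ended too early
        c = text[i]
        if state in ("value", "value_or_close"):
            if c == "]" and state == "value_or_close":
                stack.pop(); i += 1; state = "after"
            elif c in "[{":
                if len(stack) >= max_depth:
                    return False
                stack.append(c); i += 1
                state = "value_or_close" if c == "[" else "key_or_close"
            else:
                i = _scan_scalar(text, i)
                if i < 0:
                    return False
                state = "after"
        elif state in ("key", "key_or_close"):
            if c == "}" and state == "key_or_close":
                stack.pop(); i += 1; state = "after"
            else:
                m = _STRING.match(text, i)
                if not m:
                    return False
                i = m.end(); state = "colon"
        elif state == "colon":
            if c != ":":
                return False
            i += 1; state = "value"
        else:  # state == "after" inside a container
            if c == ",":
                i += 1
                state = "value" if stack[-1] == "[" else "key"
            elif (c == "]" and stack[-1] == "[") or (c == "}" and stack[-1] == "{"):
                stack.pop(); i += 1
            else:
                return False
```

Even a sketch this small has to get trailing commas, leading zeros, control characters inside strings and the nesting limit right.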
However, the points that make a userland implementation infeasible are:
1. Writing a JSON parser is non-trivial as evidenced by:
https://github.com/nst/JSONTestSuite. I expect userland implementations
to be subtly buggy in edge cases. The JSON parser in PHP 7.0+ is
certainly more battle-tested and in fact it appears to pass all of the
tests in the linked test suite.
2. Even if the userland implementation is written very carefully, it
might behave differently than the native implementation used by
json_decode() (e.g. because the latter is buggy for some reason or
because the correct behavior is undefined). This would imply that an
input string that was successfully validated by your userland parser
might ultimately fail to parse when passed to json_decode(). This is
exactly what you don't want to happen.
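
The divergence described in point 2 can be shown with a small differential check. This is only a sketch in Python: the lenient parser stands in for a hypothetical userland validator and the strict one for the decoder that is ultimately used. It relies on the documented behavior that Python's json.loads() accepts NaN and Infinity by default; the helper names strict_loads, accepts and disagreements are made up for this example.

```python
import json


def strict_loads(text):
    """Stand-in for the decoder: rejects the non-standard constants."""
    def reject(name):
        raise ValueError(f"non-standard constant: {name}")
    return json.loads(text, parse_constant=reject)


def accepts(parser, text):
    """Return True if parser(text) succeeds, False if it raises ValueError."""
    try:
        parser(text)
        return True
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False


def disagreements(validator, decoder, samples):
    """Return the samples on which the two implementations disagree."""
    return [s for s in samples
            if accepts(validator, s) != accepts(decoder, s)]


samples = ['[1, 2]', 'NaN', '[Infinity]', '{"a": 1,}']
# json.loads plays the lenient validator, strict_loads the real decoder.
diverging = disagreements(json.loads, strict_loads, samples)
print(diverging)  # ['NaN', '[Infinity]']
```

Every input in the resulting list passed "validation" but would fail in the decoder.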
Best regards
Tim Düsterhus
--
PHP Internals - PHP Runtime Development Mailing List