Re: [PHP-DEV] RFC json_validate() - status: Under Discussion
The RFC got accepted and merge. Thanks for contributing :)
Re: [PHP-DEV] RFC json_validate() - status: Under Discussion
Don't think it makes a difference to the discussion at this point but just thought I'd point out there's another example missing from the RFC list that almost everyone probably has installed with their projects. https://github.com/sebastianbergmann/phpunit/blob/main/src/Framework/Constraint/String/IsJson.php On Wed, Sep 21, 2022 at 11:39 PM juan carlos morales < dev.juan.mora...@gmail.com> wrote: > The RFC is in voting phase now. > > I want to thank everyone involved in the discussions related to this > RFC, the ones who reviewed the code in advance, and the ones that with > some emojis in github showed their interest and support. > > -- > PHP Internals - PHP Runtime Development Mailing List > To unsubscribe, visit: https://www.php.net/unsub.php > >
Re: [PHP-DEV] RFC json_validate() - status: Under Discussion
The RFC is in voting phase now. I want to thank everyone involved in the discussions related to this RFC, the ones who reviewed the code in advance, and the ones that with some emojis in github showed their interest and support. -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] RFC json_validate() - status: Under Discussion
Hi On 9/20/22 00:15, juan carlos morales wrote: The RFC still contains a non-empty "Open Issues" section. This needs to be resolved before the vote starts. I would also recommend inserting a closed voting widget (or multiple, if you want to have additional votes for the details), so that it's clear for everyone how exactly the vote will look like. Tim, how should I proceed with the Open Issues/Questions? is only about the return value of the function. Should I create a poll for that? Should I leave it like this and decide during Voting phase? This question raised during the mailing discussion and never moved forward, can you advice please? Or I should delete the Open Issues/question and move forward with the current approach I choosed? It's your RFC, you do whatever you are comfortable with. If you feel that a vote is the best option, then add a vote. If you as the author want to decide what's "best", then that's also fine. It just needs to be clear what the function will look and feel like before the vote starts - so leaving it as it is and deciding yourself *during* voting is not acceptable. Best regards Tim Düsterhus -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] RFC json_validate() - status: Under Discussion
> The RFC still contains a non-empty "Open Issues" section. This needs to > be resolved before the vote starts. > > I would also recommend inserting a closed voting widget (or multiple, if > you want to have additional votes for the details), so that it's clear > for everyone how exactly the vote will look like. Tim, how should I proceed with the Open Issues/Questions? is only about the return value of the function. Should I create a poll for that? Should I leave it like this and decide during Voting phase? This question raised during the mailing discussion and never moved forward, can you advice please? Or I should delete the Open Issues/question and move forward with the current approach I choosed? A lot of days passed by already, and I would like to move on with the RFC process. Thanks in advance. -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] RFC json_validate() - status: Under Discussion
Hi On 9/19/22 23:26, juan carlos morales wrote: It appears the discussion came to a halt. How do you plan to proceed with your RFC? Thanks for the heads up email Tim. People, following the procedure, this is a "heads up" email to all of you regarding this RFC. If no significant issue/complain raises, then The RFC still contains a non-empty "Open Issues" section. This needs to be resolved before the vote starts. I would also recommend inserting a closed voting widget (or multiple, if you want to have additional votes for the details), so that it's clear for everyone how exactly the vote will look like. Best regards Tim Düsterhus -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] RFC json_validate() - status: Under Discussion
> It appears the discussion came to a halt. How do you plan to proceed > with your RFC? Thanks for the heads up email Tim. People, following the procedure, this is a "heads up" email to all of you regarding this RFC. If no significant issue/complain raises, then I will move the RFC for voting by Wednesday 21 of September. I also want to thank your support in the github pull request, the hearts and thums up emojis were awesome (https://github.com/php/php-src/pull/9399) Regards Juan. -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] RFC json_validate() - status: Under Discussion
Hi On 8/25/22 18:02, juan carlos morales wrote: I am glad to present to you the RFC for json_validate() function. It appears the discussion came to a halt. How do you plan to proceed with your RFC? Best regards Tim Düsterhus -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] RFC json_validate() - status: Under Discussion
El lun, 29 ago 2022 a las 11:26, Deleu () escribió: > > > > On Mon, Aug 29, 2022 at 11:19 AM juan carlos morales > wrote: >> >> El lun, 29 ago 2022 a las 11:06, Deleu () escribió: >> > >> > Has the option of returning a Result object been discussed/considered? Can >> > it be an option? I imagine that if `json_validate(): JsonValidationResult` >> > always returns a `JsonValidationResult` which contains a `public readonly >> > bool $valid` and a `public readonly ?string $error` it would be better >> > than both options on the table right now. The option of returning CLI-like >> > results means that we will need a `if (! json_validate())` to treat a >> > valid JSON (really awkward) and the option of using `json_last_error()` >> > relies on an internal state instead of an immutable structure. >> > Effectively, what we need is to return a complex structure which can >> > contain a boolean and a string and that is a class/object. >> > >> Interesting, it was not considered. >> >> I still think BOOLEAN is my preferred choice, but can you provide an >> example about how the code would look like with your approach? Imagine >> you have the feature like in your suggestion ... how a developer would >> write the code? > > > ``` > $json = '{}'; > > $result = json_validate($json); > > if (! $result->valid) { > throw new \Exception('Invalid JSON provided: ' . $result->error); > } > > $something->saveUserProvidedJson($json); > ``` > > -- > Marco Deleu Thanks for participating on the discussion and I will take note about this also, but ... Quick question: Is in PHP some rule, norma, estandard about "result objects" already? (like, "should have these methods, and implement this interface, etc.) FYI: Just want to clarify that there is nothing bad/wrong about relying on "internal state" by calling json_last_error(), as this pattern is used not only by json_decode() but also mysqli extension for example, among others. Nothing wrong honeslty IMO. RFC: https://wiki.php.net/rfc/json_validate Implementation: https://github.com/php/php-src/pull/9399 -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] RFC json_validate() - status: Under Discussion
El lun, 29 ago 2022 a las 11:06, Deleu () escribió: > > Has the option of returning a Result object been discussed/considered? Can it > be an option? I imagine that if `json_validate(): JsonValidationResult` > always returns a `JsonValidationResult` which contains a `public readonly > bool $valid` and a `public readonly ?string $error` it would be better than > both options on the table right now. The option of returning CLI-like results > means that we will need a `if (! json_validate())` to treat a valid JSON > (really awkward) and the option of using `json_last_error()` relies on an > internal state instead of an immutable structure. Effectively, what we need > is to return a complex structure which can contain a boolean and a string and > that is a class/object. > Interesting, it was not considered. I still think BOOLEAN is my preferred choice, but can you provide an example about how the code would look like with your approach? Imagine you have the feature like in your suggestion ... how a developer would write the code? -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] RFC json_validate() - status: Under Discussion
On Mon, Aug 29, 2022 at 10:50 AM juan carlos morales < dev.juan.mora...@gmail.com> wrote: > There is still 1 open question on the RFC, and is about the return value. > > https://wiki.php.net/rfc/json_validate#open_issuesquestions > > I would appreciate your feedback on this. Even though I was told the > RFC can go with 2 votings, I would like to know your thoughts about > that open question, in short, the return value. > Has the option of returning a Result object been discussed/considered? Can it be an option? I imagine that if `json_validate(): JsonValidationResult` always returns a `JsonValidationResult` which contains a `public readonly bool $valid` and a `public readonly ?string $error` it would be better than both options on the table right now. The option of returning CLI-like results means that we will need a `if (! json_validate())` to treat a valid JSON (really awkward) and the option of using `json_last_error()` relies on an internal state instead of an immutable structure. Effectively, what we need is to return a complex structure which can contain a boolean and a string and that is a class/object. Thoughts? -- Marco Deleu
Re: [PHP-DEV] RFC json_validate() - status: Under Discussion
There is still 1 open question on the RFC, and is about the return value. https://wiki.php.net/rfc/json_validate#open_issuesquestions I would appreciate your feedback on this. Even though I was told the RFC can go with 2 votings, I would like to know your thoughts about that open question, in short, the return value. The same thing was asked during code review of the pull request. I already expressed my opinion in the RFC, I prefer to return boolean and check errors using json_last_error() and I wrote down the details about this; instead of returning INT ... with 0 or JSON_ERROR_NONE representing successful validation. Thanks in advance. RFC: https://wiki.php.net/rfc/json_validate Implementation: https://github.com/php/php-src/pull/9399 -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] RFC json_validate() - status: Under Discussion
El sáb, 27 ago 2022 a las 2:50, David Gebler () escribió: > > You can always offer two votes on the RFC - one for the function itself, then > one for should it return boolean or return an int representing the > json_last_error constants and let it be decided that way. Thanks for the advice I did not know that. At the moment this is the only open question for this RFC, and I would like to know what the people thinks about it, otherwise, will go with 2 votings. > I think on the whole, I agree with the sentiment that returning boolean and > checking json_last_error() on false is probably the best / least worst option. > So if I could vote, I would vote yes and for the boolean option, with a > secondary preference for returning int if boolean option is rejected. Thanks for the feedback. > And I was unconvinced about the whole idea originally, so a good example of > where positive, robust discussion can change someone's mind. That is why we have these discussions :) thanks a lot for taking your time on this. > Good luck with progressing the RFC, I don't think I have anything else to add. Thanks a lot! -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] RFC json_validate() - status: Under Discussion
Juan, You can always offer two votes on the RFC - one for the function itself, then one for should it return boolean or return an int representing the json_last_error constants and let it be decided that way. I think on the whole, I agree with the sentiment that returning boolean and checking json_last_error() on false is probably the best / least worst option. So if I could vote, I would vote yes and for the boolean option, with a secondary preference for returning int if boolean option is rejected. And I was unconvinced about the whole idea originally, so a good example of where positive, robust discussion can change someone's mind. Good luck with progressing the RFC, I don't think I have anything else to add. On Sat, Aug 27, 2022 at 12:46 AM juan carlos morales < dev.juan.mora...@gmail.com> wrote: > OMG, I need to take a rest, sorry for this, here it goes again; the > about JSON_INVALID_UTF8_IGNORE opinion is the same, but previous code > was wrong > > Code: > > > var_dump(json_decode("{ \"a\xb0b\" : \"dummy\" }", true, 512), > json_last_error_msg()); > var_dump(""); > var_dump(json_decode("{ \"a\xb0b\" : \"dummy\" }", true, 512, > JSON_INVALID_UTF8_IGNORE), json_last_error_msg()); > > Result: > > NULL > string(56) "Malformed UTF-8 characters, possibly incorrectly encoded" > string(12) "" > array(1) { ["ab"]=> string(5) "dummy" } > string(8) "No error" > > Saying so, now ... yes I support and think is NEEDED the usage of the > JSON_INVALID_UTF8_IGNORE , as json_validate() result goes in the same > direction with json_decode(). I think we need to have this flag. > > RFC: https://wiki.php.net/rfc/json_validate > Implementation: https://github.com/php/php-src/pull/9399 > > -- > PHP Internals - PHP Runtime Development Mailing List > To unsubscribe, visit: https://www.php.net/unsub.php > >
Re: [PHP-DEV] RFC json_validate() - status: Under Discussion
OMG, I need to take a rest, sorry for this, here it goes again; the about JSON_INVALID_UTF8_IGNORE opinion is the same, but previous code was wrong Code: string(5) "dummy" } string(8) "No error" Saying so, now ... yes I support and think is NEEDED the usage of the JSON_INVALID_UTF8_IGNORE , as json_validate() result goes in the same direction with json_decode(). I think we need to have this flag. RFC: https://wiki.php.net/rfc/json_validate Implementation: https://github.com/php/php-src/pull/9399 -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] RFC json_validate() - status: Under Discussion
El sáb, 27 ago 2022 a las 1:31, juan carlos morales () escribió: > > === Open issues/concerns === > > > @@@ Usage of JSON_INVALID_UTF8_IGNORE @@@ > > - I have my doubts now, because of this codes: > > ``` > > var_dump(json_validate("\"a\xb0b\""), json_last_error_msg()); > var_dump(""); > var_dump(json_validate("\"a\xb0b\"", 512, JSON_INVALID_UTF8_IGNORE), > json_last_error_msg()); > ``` > > Results: > > bool(false) > string(56) "Malformed UTF-8 characters, possibly incorrectly encoded" > string(12) "" > bool(true) > string(8) "No error" > > > Gives different results, but ... > > ``` > > var_dump(json_decode("{ \"a\xb0b\" : \"dummy\" }"), json_last_error_msg()); > var_dump(""); > var_dump(json_decode("{ \"a\xb0b\" : \"dummy\" }", 512, > JSON_INVALID_UTF8_IGNORE), json_last_error_msg()); > ``` > > Results in: > NULL > string(56) "Malformed UTF-8 characters, possibly incorrectly encoded" > string(12) "" > NULL > string(56) "Malformed UTF-8 characters, possibly incorrectly encoded" > > > > So at a first look, seems the flag should also be remove, as with > json_validate() we say its valid, but then on json_decode() we get > NULL, even with the JSON_INVALID_UTF8_IGNORE provided; does not make > sense I believe. > > What do you think? Is there a use case I am missing here? Is this flag > worth to have still? > I made a huge mistake sorry, long day here, 1:30 am , the code for json_decode should be as: var_dump(json_decode("{ \"a\xb0b\" : \"dummy\" }"), true, 512, json_last_error_msg()); var_dump(""); var_dump(json_decode("{ \"a\xb0b\" : \"dummy\" }", true, 512, JSON_INVALID_UTF8_IGNORE), json_last_error_msg()); Resulting in NULL bool(true) int(512) string(56) "Malformed UTF-8 characters, possibly incorrectly encoded" string(12)"" array(1) { ["ab"]=> string(5) "dummy" } string(8) "No error" Saying so, now ... yes I support the usage of the JSON_INVALID_UTF8_IGNORE , as json_validate() result goes in the same direction with json_decode(). I think we need to have this flag. RFC: https://wiki.php.net/rfc/json_validate Implementation: https://github.com/php/php-src/pull/9399 -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] RFC json_validate() - status: Under Discussion
I now provide an update of the discussion. The good, the bads, the open questions, etc. All of this will go into the RFC also, as requested by the procedure in https://wiki.php.net/rfc/howto "Listen to the feedback, and try to answer/resolve all questions. Update your RFC to document all the issues and discussions. Cover both the positive and negative arguments. Put the RFC URL into all your replies." === UPDATES === - Different users have tested the functionality and obtained the promissed results. Also their feedback about it was positive. - Most part of the community in the mailing list showed a positive opinion about this RFC, and looks forward for its integration into PHP. - The ones that checked the code also agree that is small implementation, easy to mantain, and at the same time provides a big benefit for such small implementation. - The community got involve very actively in the discussion of the RFC and provided all kind of useful feedback, and also took the time to test json_validate() by themselves. === Bad reasons for json_validate() provided by the community === - One member of the mailing list expressed that: 1) Incorporating such a small implementation that can be achieve with userland code is not a good idea. Quote: "If we keep the tendency to pollute already bloated standard library with an army of small functions that could have not exists and be replaced with normal PHP counterparts IMHO we'll end with frustration from developers as I believe DX slowly falls down here." 2) json_validate() would only be useful for edge cases. Quote: "A `json_decode()` is a substitute that IMO solves 99% of use cases. If I'd follow your logic and accept every small addition that handles 1% of use cases, somebody will raise another RFC for simplexml_validate_string or yaml_validate and the next PhpToken::validate. All above can be valid if we trust that people normally validate 300MB payloads to do nothing if they DON'T fail and there is nothing strange about that." 3) The user also provided an implementation of a JSON parser written in pure PHP. https://gist.github.com/brzuchal/37e888d9b13937891c3e05fead5042bc === Good reasons for json_validate() provided by the community === @@@ Use cases provided by some members, I quote: - "Yes well-formed JSON from a trusted source tends to be small-ish. But a validation function also needs to deal with non-well-formed JSON, otherwise you would not need to validate it." - "If with a new function (json_validate()) it becomes much easier to defend against a Denial-of-Service attack for some parts of a JSON API, then this can be a good addition just for security reasons." - "fast / efficient validation of a common communication format reduces the attack surface for Denial-of-Service attacks." @@@ Memory usage - During the test of json_validate() from some users, they were happy about the memory usage that was zero in most cases (which is the main benefit out this feature). Someone also did a test with a very large string (75 MB) and only a few bytes were needed as reported by him; also the same user reported an execution speed improvement by a 20-25% over using json_decode(). @@@ Reasons not to depend on userland JSON parsers Even possible to write an excellent JSON parser in PHP like one of the members in the mailing list provided us, there are good reasons for dont relying on userland solutions. # 1 - User Tim Düsterhus provided nice thoughts about this, in favor to json_validate(), ... I quote him: - "Writing a JSON parser is non-trivial as evidenced by: https://github.com/nst/JSONTestSuite. I expect userland implementations to be subtly buggy in edge cases. The JSON parser in PHP 7.0+ is certainly more battle-tested and in fact it appears to pass all of the tests in the linked test suite." - "Even if the userland implementation is written very carefully, it might behave differently than the native implementation used by json_decode() (e.g. because the latter is buggy for some reason or because the correct behavior is undefined). This would imply that an input string that was successfully validated by your userland parser might ultimately fail to parse when passed to json_decode(). This is exactly what you don't want to happen." (Some other members including me, also share this opinion.) # 2 - The JSON parser in PHP follows an special convention, marked in the PHP documentation. # 3 - We already have a JSON parser in PHP, that is used by json_decode(); reusing the existing JSON Parser provides 100% compatibility between the validation of a json-string, and the decoding of it. # 4 - The user Larry Gafield also provided good reason to integrate this implementation into PHP. I quote him: "The heuristic I use is that an API should be "reasonably complete" in one location. Having a half-assed API in C and the rest left to inconsistent and redundant user-space implementations is a terrible API; the same would apply for a user-space
Re: [PHP-DEV] RFC json_validate() - status: Under Discussion
On Fri, Aug 26, 2022, at 1:41 PM, Ayesh Karunaratne wrote: >> whether returning boolean is the right thing to do at all. It seems obviously >> intuitive it should, returning true for valid and false for invalid JSON >> but then if you consider you're still going to be in the situation of >> calling json_last_error() if you want to know why invalid JSON was invalid >> and in particular you might not expect the "last error" to have changed >> just from an attempt to check a string. How can there be an error when by >> definition you weren't trying to do anything except check the validity of >> some unknown data? Not sure what the answer is there...curious what other >> people's views are on that. I don't think throwing an exception on invalid >> JSON is the right answer in any case. > > One of the reasons why I like `json_validate` over `is_json` is that > the former does not imply a boolean return value as strongly as the > latter. > > I think we should not modify any state with the `json_validate` > function, and I'd be happy with the function returning 0 for a valid > JSON, or the error code as an integer (`JSON_ERROR_STATE_MISMATCH`, > `JSON_ERROR_SYNTAX`, etc.). `JSON_ERROR_NONE` constant is already > assigned 0, so they align quite well too. The problem here is that you're using a falsy return (0) to indicate "it validated", which is logically true. PHP's type juggling makes that a very bad idea. I somehow doubt it will get much traction, but this is the sort of case where a "naked either" approach could work well: https://peakd.com/hive-168588/@crell/much-ado-about-null Though there's also the risk then of all objects being "truthy", since there seems to be no appetite for a __toBool() method as that's been shot down before. At some point, we need to realize that our current design philosophy boxes us in on too many sides and something is going to have to change. I see the argument for json_last_error() not being a great fit, but given the limitations we've chosen to have in the language it seems like "return bool and use json_last_error()" is the least-bad option. --Larry Garfield -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] RFC json_validate() - status: Under Discussion
On 26 August 2022 18:28:59 BST, Kamil Tekiela wrote: >What is the reasoning behind the name? I can't find it explained in the >RFC. What about other alternatives like is_json or validate_json? It's part of the json extension so it should start with json_ following naming guidelines for functions in extensions. cheers Derick -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] RFC json_validate() - status: Under Discussion
> whether returning boolean is the right thing to do at all. It seems obviously > intuitive it should, returning true for valid and false for invalid JSON > but then if you consider you're still going to be in the situation of > calling json_last_error() if you want to know why invalid JSON was invalid > and in particular you might not expect the "last error" to have changed > just from an attempt to check a string. How can there be an error when by > definition you weren't trying to do anything except check the validity of > some unknown data? Not sure what the answer is there...curious what other > people's views are on that. I don't think throwing an exception on invalid > JSON is the right answer in any case. One of the reasons why I like `json_validate` over `is_json` is that the former does not imply a boolean return value as strongly as the latter. I think we should not modify any state with the `json_validate` function, and I'd be happy with the function returning 0 for a valid JSON, or the error code as an integer (`JSON_ERROR_STATE_MISMATCH`, `JSON_ERROR_SYNTAX`, etc.). `JSON_ERROR_NONE` constant is already assigned 0, so they align quite well too. As for exceptions, I too think throwing an exception here is an anti-pattern. It makes perfect sense to throw on `json_decode` if the program doesn't want to deal with invalid JSON gracefully. But calling `json_validate` is an intended call, and returning an error is excepted, and not an exception. -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] RFC json_validate() - status: Under Discussion
On Fri, Aug 26, 2022 at 6:29 PM Kamil Tekiela wrote: > What is the reasoning behind the name? I can't find it explained in the > RFC. What about other alternatives like is_json or validate_json? > The name json_validate makes most sense to me; it groups itself together nicely with the other json_* API functions. I think is_json is what was originally proposed but besides being then inconsistent with the rest of the JSON API function naming, consider most of the is_* functions are for checking types (and some for file properties), not validation. But on the function, the other question which remains for me is whether returning boolean is the right thing to do at all. It seems obviously intuitive it should, returning true for valid and false for invalid JSON but then if you consider you're still going to be in the situation of calling json_last_error() if you want to know why invalid JSON was invalid and in particular you might not expect the "last error" to have changed just from an attempt to check a string. How can there be an error when by definition you weren't trying to do anything except check the validity of some unknown data? Not sure what the answer is there...curious what other people's views are on that. I don't think throwing an exception on invalid JSON is the right answer in any case.
Re: [PHP-DEV] RFC json_validate() - status: Under Discussion
What is the reasoning behind the name? I can't find it explained in the RFC. What about other alternatives like is_json or validate_json?
Re: [PHP-DEV] RFC json_validate() - status: Under Discussion
On Fri, Aug 26, 2022, at 4:00 AM, Michał Marcin Brzuchalski wrote: > A `json_decode()` is a substitute that IMO solves 99% of use cases. > If I'd follow your logic and accept every small addition that handles 1% of > use cases, somebody will raise another RFC > for simplexml_validate_string or yaml_validate and the next > PhpToken::validate. The heuristic I use is that an API should be "reasonably complete" in one location. Having a half-assed API in C and the rest left to inconsistent and redundant user-space implementations is a terrible API; the same would apply for a user-space library that is half-assed and leaves the rest to "someone else to write." Naturally "reasonably complete" is a somewhat squishy term, which is why it's a heuristic. By that metric, yes, str_starts_with() and friends absolutely belonged in core, because we already have a bunch of string functions and str_starts_with() is by a wide margin the most common usage of strpos(). By the same token, yes, json_validate() makes sense to include in the main API, which means in C. If there's a performance benefit to doing so as well, that makes it an easy sell for me. simplexml_validate_string: I could see the argument there as well, quite frankly. Unless there is already an equivalently capable option in core, which in this case it sounds like there is. yaml_validate: There is no YAML parser that ships with PHP, so no, this wouldn't make sense. If at some point in the future PHP added a C-based YAML extension to the standard distribution, then including a validate function in that would make sense. (Whether we should include a YAML extension in the standard distribution is an entirely separate question.) PHPToken::validate: Again, PHPToken is in core, so why wouldn't we also include a validate for it, especially if it's easy to do and logically belongs in the same "surface area"? Really, the whole "functions don't belong in C" argument is tired. That ship sailed decades ago. PHP has a mixed-implementation life, whether we like it or not. Some APIs and standard lib components are in C, deal. For those that are or make sense to be, they should be robust and self-contained and complete. For those that aren't, they should be kept out as a set, or added "as a set." That's the granularity where we should be talking, not the function level. I'm in favor of this RFC. However, I do agree that it should never throw, period. Only return a boolean. I'm not sure if JSON_INVALID_UTF8_IGNORE makes sense either, frankly. That should get corrected before it goes to a vote. --Larry Garfield -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
RE: [PHP-DEV] RFC json_validate() - status: Under Discussion
> -Original Message- > From: juan carlos morales > Sent: Thursday, August 25, 2022 12:02 PM > To: PHP Internals List > Subject: [PHP-DEV] RFC json_validate() - status: Under Discussion > > Hello. > > I am glad to present to you the RFC for json_validate() function. > > The code/implementation still needs some workout, but seems to be fine > enough to be presented to you all. > > I look forward for feedback > > Thanks in advance. > > Juan. > > RFC: https://wiki.php.net/rfc/json_validate > > Implementation: https://github.com/php/php-src/pull/9399 Not a voting member, but I'd like to voice support for this. JSON is widely used, core already supports this functionality, the scope of change is minimal, and the feature is useful. -Jeff -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] RFC json_validate() - status: Under Discussion
El vie., 26 de agosto de 2022 13:08, juan carlos morales < dev.juan.mora...@gmail.com> escribió: > > Is there a better way to do validation other than json_decode? > Better way than json_decode... Always from the core perspective >
Re: [PHP-DEV] RFC json_validate() - status: Under Discussion
Just Want to remind that this discussion is not about "a json parser can be written in PHP or not?". We Have a JSON parser already in the core, ready to be use for validation. Does it make sense to have another parser in User land to do validation if We already have one? Is there a better way to do validation other than json_decode? El vie., 26 de agosto de 2022 12:48, Michał Marcin Brzuchalski < michal.brzuchal...@gmail.com> escribió: > Hi Tim, > > pt., 26 sie 2022 o 12:15 Tim Düsterhus napisał(a): > >> Hi >> >> On 8/26/22 11:14, Hans Henrik Bergan wrote: >> >> you can't efficiently validate JSON in userland >> > >> > Has anyone actually put that claim to the test? Has anyone actually >> made a >> > userland json validator (not just wrap json_decode()/json_last_error()) >> for >> > performance comparison? >> > ( if not, https://www.json.org/JSON_checker/JSON_checker.c would >> probably >> > be a good start) >> > >> >> Worded like "you can't efficiently" the claim is false. Of course you >> can memory-efficiently validate the input by traversing the string byte >> by byte and keeping track of the nesting. >> >> However the points that make a userland implementation infeasible are: >> >> 1. Writing a JSON parser is non-trivial as evidenced by: >> https://github.com/nst/JSONTestSuite. I expect userland implementations >> to be subtly buggy in edge cases. The JSON parser in PHP 7.0+ is >> certainly more battle-tested and in fact it appears to pass all of the >> tests in the linked test suite. >> >> 2. Even if the userland implementation is written very carefully, it >> might behave differently than the native implementation used by >> json_decode() (e.g. because the latter is buggy for some reason or >> because the correct behavior is undefined). This would imply that an >> input string that was successfully validated by your userland parser >> might ultimately fail to parse when passed to json_decode(). This is >> exactly what you don't want to happen. >> > > Now this is an argument I could think of. > But that one is not even mentioned in RFC. > > The JSON_checker.c example delivered by json.org is probably not > something impossible > as it required around 1h of work to port it see working implementation > here https://gist.github.com/brzuchal/37e888d9b13937891c3e05fead5042bc > > Cheers, > Michał Marcin Brzuchalski >
Re: [PHP-DEV] RFC json_validate() - status: Under Discussion
Hi Tim, pt., 26 sie 2022 o 12:15 Tim Düsterhus napisał(a): > Hi > > On 8/26/22 11:14, Hans Henrik Bergan wrote: > >> you can't efficiently validate JSON in userland > > > > Has anyone actually put that claim to the test? Has anyone actually made > a > > userland json validator (not just wrap json_decode()/json_last_error()) > for > > performance comparison? > > ( if not, https://www.json.org/JSON_checker/JSON_checker.c would > probably > > be a good start) > > > > Worded like "you can't efficiently" the claim is false. Of course you > can memory-efficiently validate the input by traversing the string byte > by byte and keeping track of the nesting. > > However the points that make a userland implementation infeasible are: > > 1. Writing a JSON parser is non-trivial as evidenced by: > https://github.com/nst/JSONTestSuite. I expect userland implementations > to be subtly buggy in edge cases. The JSON parser in PHP 7.0+ is > certainly more battle-tested and in fact it appears to pass all of the > tests in the linked test suite. > > 2. Even if the userland implementation is written very carefully, it > might behave differently than the native implementation used by > json_decode() (e.g. because the latter is buggy for some reason or > because the correct behavior is undefined). This would imply that an > input string that was successfully validated by your userland parser > might ultimately fail to parse when passed to json_decode(). This is > exactly what you don't want to happen. > Now this is an argument I could think of. But that one is not even mentioned in RFC. The JSON_checker.c example delivered by json.org is probably not something impossible as it required around 1h of work to port it see working implementation here https://gist.github.com/brzuchal/37e888d9b13937891c3e05fead5042bc Cheers, Michał Marcin Brzuchalski
Re: [PHP-DEV] RFC json_validate() - status: Under Discussion
Hi On 8/26/22 11:00, Michał Marcin Brzuchalski wrote: A `json_decode()` is a substitute that IMO solves 99% of use cases. If I'd follow your logic and accept every small addition that handles 1% of use cases, somebody will raise another RFC for simplexml_validate_string or yaml_validate and the next PhpToken::validate. All above can be valid if we trust that people normally validate 300MB payloads to do nothing if they DON'T fail and there is nothing strange about that. You phrase this as if this was a bad thing. But yes, if PHP would include a YAML parser in core (there appears to be one in PECL) then having a yaml_validate() function with a value-add compared to actually constructing the object representation then it would certainly be an appropriate inclusion. I don't see why JSON should be special here. For XML my understanding is that the XMLReader class can already be used to efficiently validate an XML input in userland [1], because that class is defined to be a streaming parser. These types of parsers are not uncommon in XML land. Best regards Tim Düsterhus [1] https://www.php.net/manual/en/xmlreader.isvalid.php -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] RFC json_validate() - status: Under Discussion
Hi On 8/26/22 11:14, Hans Henrik Bergan wrote: you can't efficiently validate JSON in userland Has anyone actually put that claim to the test? Has anyone actually made a userland json validator (not just wrap json_decode()/json_last_error()) for performance comparison? ( if not, https://www.json.org/JSON_checker/JSON_checker.c would probably be a good start) Worded like "you can't efficiently" the claim is false. Of course you can memory-efficiently validate the input by traversing the string byte by byte and keeping track of the nesting. However the points that make a userland implementation infeasible are: 1. Writing a JSON parser is non-trivial as evidenced by: https://github.com/nst/JSONTestSuite. I expect userland implementations to be subtly buggy in edge cases. The JSON parser in PHP 7.0+ is certainly more battle-tested and in fact it appears to pass all of the tests in the linked test suite. 2. Even if the userland implementation is written very carefully, it might behave differently than the native implementation used by json_decode() (e.g. because the latter is buggy for some reason or because the correct behavior is undefined). This would imply that an input string that was successfully validated by your userland parser might ultimately fail to parse when passed to json_decode(). This is exactly what you don't want to happen. Best regards Tim Düsterhus -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] RFC json_validate() - status: Under Discussion
Hi On 8/26/22 10:56, Paul Crovella wrote: I'm for this as well. Though something more than a boolean is useful to indicate when the max depth has been exceeded as that's not strictly a problem with the json's validity. Yes, it's not strictly invalid JSON, but I expect json_validate() to be used as a check to mean "will json_decode() be able to decode the JSON with these parameters". Then `false` is the logical result when exceeding the depth. Also json_last_error() will do what you expect it to do: https://github.com/php/php-src/pull/9399/files#diff-1cdb6c7eb68aa60b323b056a06d6f7427b38e9d8d967ce97992129ff65c7951eR55. So you can find out the reason why the JSON is invalid. Best regards Tim Düsterhus -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] RFC json_validate() - status: Under Discussion
El vie, 26 ago 2022 a las 11:55, Peter Bowyer () escribió: > > That's a good point which I had overlooked. Does an exception make most > sense when this happens? > > Peter I just got a code review from Derick Rethans explaining me that a validation routine should not throw an exception. I think is a good idea to take it out, after all , the depth parameter is part of the validation too. RFC: https://wiki.php.net/rfc/json_validate Implementation: https://github.com/php/php-src/pull/9399 -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] RFC json_validate() - status: Under Discussion
On Fri, 26 Aug 2022 at 09:56, Paul Crovella wrote: > I'm for this as well. Though something more than a boolean is useful to > indicate when the max depth has been exceeded as that's not strictly a > problem with the json's validity. > That's a good point which I had overlooked. Does an exception make most sense when this happens? Peter
Re: [PHP-DEV] RFC json_validate() - status: Under Discussion
El vie, 26 ago 2022 a las 11:43, Andreas Leathley () escribió: > > On 26.08.22 11:00, Michał Marcin Brzuchalski wrote: > > There is already a way to validate XML in PHP, and Yaml or PHP is > something within the control of a PHP programmer, while JSON is mostly > used as a format for communication in APIs, so you never know what you > get. If with a new function it becomes much easier to defend against a > Denial-of-Service attack for some parts of a JSON API, then this can be > a good addition just for security reasons. > > But this reason, which most resonates with me, is currently missing in > the RFC, so I would suggest to add that fast / efficient validation of a > common communication format reduces the attack surface for > Denial-of-Service attacks. > > -- > PHP Internals - PHP Runtime Development Mailing List > To unsubscribe, visit: https://www.php.net/unsub.php > For sure I will add this. Thanks a lot !! That is exactly why we are having this discussion. Once again, Thanks! RFC: https://wiki.php.net/rfc/json_validate Implementation: https://github.com/php/php-src/pull/9399 -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] RFC json_validate() - status: Under Discussion
El vie, 26 ago 2022 a las 11:35, Michał Marcin Brzuchalski () escribió: > Your examples are a couple of functions. > Assuming that they're heavily used is as true as my assumptions. > > Cheers, Is good you clarify that your numbers were assumptions. By the way, I never said anything about how my examples were use in real life, I just provided them as examples of a "need"/requirement to do something. Thanks for participating. Cheers. RFC: https://wiki.php.net/rfc/json_validate Implementation: https://github.com/php/php-src/pull/9399 -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] RFC json_validate() - status: Under Discussion
On 26.08.22 11:00, Michał Marcin Brzuchalski wrote: A `json_decode()` is a substitute that IMO solves 99% of use cases. If I'd follow your logic and accept every small addition that handles 1% of use cases, somebody will raise another RFC for simplexml_validate_string or yaml_validate and the next PhpToken::validate. All above can be valid if we trust that people normally validate 300MB payloads to do nothing if they DON'T fail and there is nothing strange about that. There is already a way to validate XML in PHP, and Yaml or PHP is something within the control of a PHP programmer, while JSON is mostly used as a format for communication in APIs, so you never know what you get. If with a new function it becomes much easier to defend against a Denial-of-Service attack for some parts of a JSON API, then this can be a good addition just for security reasons. But this reason, which most resonates with me, is currently missing in the RFC, so I would suggest to add that fast / efficient validation of a common communication format reduces the attack surface for Denial-of-Service attacks. -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] RFC json_validate() - status: Under Discussion
Hi Juan, pt., 26 sie 2022 o 11:26 juan carlos morales napisał(a): > El vie, 26 ago 2022 a las 11:00, Michał Marcin Brzuchalski > () escribió: > > > > A `json_decode()` is a substitute that IMO solves 99% of use cases. > > If I'd follow your logic and accept every small addition that handles 1% > of use cases, somebody will raise another RFC > > for simplexml_validate_string or yaml_validate and the next > PhpToken::validate. > > All above can be valid if we trust that people normally validate 300MB > payloads to do nothing if they DON'T fail and there is nothing strange > about that. > > > > Cheers, > > > > How can you make such an assertion in those numbers (99% of use cases > and son on, that you mention) ? Can you give us more information about > this assertions? > > I have provide real examples where the need to validate-only a > json-string is actually needed, also the need from our developers > community asking for this. > Your examples are a couple of functions. Assuming that they're heavily used is as true as my assumptions. Cheers,
Re: [PHP-DEV] RFC json_validate() - status: Under Discussion
El vie, 26 ago 2022 a las 11:26, juan carlos morales () escribió: > > How can you make such an assertion in those numbers (99% of use cases > and son on, that you mention) ? Can you give us more information about > this assertions? > > I have provide real examples where the need to validate-only a > json-string is actually needed, also the need from our developers > community asking for this. > > RFC: https://wiki.php.net/rfc/json_validate > > Implementation: https://github.com/php/php-src/pull/9399 By the way you dont need to frame this function to handle 300MB of json-string, as I already mentioned a couple of times, that a 2 MB json-sring validation with json_decode() needs a large amount of memory to be validated, possibly hitting the allowed memory limit, which usually is not that high at all. RFC: https://wiki.php.net/rfc/json_validate Implementation: https://github.com/php/php-src/pull/9399 -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] RFC json_validate() - status: Under Discussion
El vie, 26 ago 2022 a las 11:00, Michał Marcin Brzuchalski () escribió: > > A `json_decode()` is a substitute that IMO solves 99% of use cases. > If I'd follow your logic and accept every small addition that handles 1% of > use cases, somebody will raise another RFC > for simplexml_validate_string or yaml_validate and the next > PhpToken::validate. > All above can be valid if we trust that people normally validate 300MB > payloads to do nothing if they DON'T fail and there is nothing strange about > that. > > Cheers, > How can you make such an assertion in those numbers (99% of use cases and son on, that you mention) ? Can you give us more information about this assertions? I have provide real examples where the need to validate-only a json-string is actually needed, also the need from our developers community asking for this. RFC: https://wiki.php.net/rfc/json_validate Implementation: https://github.com/php/php-src/pull/9399 -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] RFC json_validate() - status: Under Discussion
El vie, 26 ago 2022 a las 10:56, Paul Crovella () escribió: > > I'm for this as well. Though something more than a boolean is useful to > indicate when the max depth has been exceeded as that's not strictly a > problem with the json's validity. Thanks, seems to be good idea to take it out, and I will. Nice observation. RFC: https://wiki.php.net/rfc/json_validate Implementation: https://github.com/php/php-src/pull/9399 -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] RFC json_validate() - status: Under Discussion
El vie, 26 ago 2022 a las 10:28, Peter Bowyer () escribió: > > Hi Juan, > > On Thu, 25 Aug 2022 at 17:02, juan carlos morales > wrote: >> >> RFC: https://wiki.php.net/rfc/json_validate > > > Thanks for bringing forward this RFC. I am in favour of this change, as you > can't efficiently validate JSON in userland. > > Like Rowan I'm not convinced JSON_THROW_ON_ERROR belongs here. I can't think > of a case where more than a boolean response is needed. > > Peter Seems that will be a good idea to take out the usage of the flag JSON_THROW_ON_ERROR after all. That was a nice and objective critic and I appreciate it. Thanks! I will take it out of the implementation. Regards RFC: https://wiki.php.net/rfc/json_validate Implementation: https://github.com/php/php-src/pull/9399 -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] RFC json_validate() - status: Under Discussion
quote > you can't efficiently validate JSON in userland Has anyone actually put that claim to the test? Has anyone actually made a userland json validator (not just wrap json_decode()/json_last_error()) for performance comparison? ( if not, https://www.json.org/JSON_checker/JSON_checker.c would probably be a good start) On Fri, 26 Aug 2022 at 11:00, Michał Marcin Brzuchalski < michal.brzuchal...@gmail.com> wrote: > Hi Dusk, > > pt., 26 sie 2022 o 08:17 Dusk napisał(a): > > > On Aug 25, 2022, at 21:47, Michał Marcin Brzuchalski < > > michal.brzuchal...@gmail.com> wrote: > > > The same goes here and I'm not convinced we should introduce next small > > function that can be simply implemented in user land. > > > > What "simple implementation in userland" do you have in mind? Can you > > provide an example? > > > > json_decode() is not an acceptable substitute here -- as David Gebler has > > observed, decoding a large JSON structure can have a significant impact > on > > memory usage, even if the data is immediately discarded. Any > implementation > > based on string processing, on the other hand, is likely to be > dramatically > > slower, and may have subtle differences in behavior from PHP's JSON > parser. > > > A `json_decode()` is a substitute that IMO solves 99% of use cases. > If I'd follow your logic and accept every small addition that handles 1% of > use cases, somebody will raise another RFC > for simplexml_validate_string or yaml_validate and the next > PhpToken::validate. > All above can be valid if we trust that people normally validate 300MB > payloads to do nothing if they DON'T fail and there is nothing strange > about that. > > Cheers, >
Re: [PHP-DEV] RFC json_validate() - status: Under Discussion
Hi Dusk, pt., 26 sie 2022 o 08:17 Dusk napisał(a): > On Aug 25, 2022, at 21:47, Michał Marcin Brzuchalski < > michal.brzuchal...@gmail.com> wrote: > > The same goes here and I'm not convinced we should introduce next small > function that can be simply implemented in user land. > > What "simple implementation in userland" do you have in mind? Can you > provide an example? > > json_decode() is not an acceptable substitute here -- as David Gebler has > observed, decoding a large JSON structure can have a significant impact on > memory usage, even if the data is immediately discarded. Any implementation > based on string processing, on the other hand, is likely to be dramatically > slower, and may have subtle differences in behavior from PHP's JSON parser. A `json_decode()` is a substitute that IMO solves 99% of use cases. If I'd follow your logic and accept every small addition that handles 1% of use cases, somebody will raise another RFC for simplexml_validate_string or yaml_validate and the next PhpToken::validate. All above can be valid if we trust that people normally validate 300MB payloads to do nothing if they DON'T fail and there is nothing strange about that. Cheers,
Re: [PHP-DEV] RFC json_validate() - status: Under Discussion
On 8/26/2022 1:28 AM, Peter Bowyer wrote: Hi Juan, On Thu, 25 Aug 2022 at 17:02, juan carlos morales < dev.juan.mora...@gmail.com> wrote: RFC: https://wiki.php.net/rfc/json_validate Thanks for bringing forward this RFC. I am in favour of this change, as you can't efficiently validate JSON in userland. Like Rowan I'm not convinced JSON_THROW_ON_ERROR belongs here. I can't think of a case where more than a boolean response is needed. Peter I'm for this as well. Though something more than a boolean is useful to indicate when the max depth has been exceeded as that's not strictly a problem with the json's validity. -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] RFC json_validate() - status: Under Discussion
Hi Juan, On Thu, 25 Aug 2022 at 17:02, juan carlos morales < dev.juan.mora...@gmail.com> wrote: > RFC: https://wiki.php.net/rfc/json_validate Thanks for bringing forward this RFC. I am in favour of this change, as you can't efficiently validate JSON in userland. Like Rowan I'm not convinced JSON_THROW_ON_ERROR belongs here. I can't think of a case where more than a boolean response is needed. Peter
Re: [PHP-DEV] RFC json_validate() - status: Under Discussion
El vie, 26 ago 2022 a las 6:47, Michał Marcin Brzuchalski () escribió: > > > I share the same opinion you expressed here even though you admit in recent > email that you changed your mind. > > In recent versions we tend to accept more and more small standard library > functions with IMO questionable argumentation. The same goes here and I'm not > convinced we should introduce next small function that can be simply > implemented in user land. Sorry but I dont think that a JSON parser with memory usage zero (or maybe a few bytes) can be simply done in the userland. This function (json_validate) is small by itself, but gives you access to the JSON parser. Can you please provide an example of what you commented? > Any example testing > 3MB JSON string are for me edge cases that normally > don't happen often to deserve special treatment. I don't agree with your definition of "edge case" here, as edge cases depend/belong on/to the system under analysis. By the way, the test case provided in the PR , "test 005" uses a json-string of about 3 MB (maybe 3.1) , and in order to decode it json_decode() needs something around 109 MB of memory. For me, validation can be done in a better and efficient way, like it has been probe with this proposal. > If we keep the tendency to pollute already bloated standard library with an > army of small functions that could have not exists and be replaced with > normal PHP counterparts IMHO we'll end with frustration from developers as I > believe DX slowly falls down here. Last but not least, I want to say that the function json_validate() is small, easy to maintain and extend if needed, and at the same gives us access to something that is not trivial to write in userland the existing JSON parser. RFC: https://wiki.php.net/rfc/json_validate Implementation: https://github.com/php/php-src/pull/9399 -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] RFC json_validate() - status: Under Discussion
On Aug 25, 2022, at 21:47, Michał Marcin Brzuchalski wrote: > The same goes here and I'm not convinced we should introduce next small > function that can be simply implemented in user land. What "simple implementation in userland" do you have in mind? Can you provide an example? json_decode() is not an acceptable substitute here -- as David Gebler has observed, decoding a large JSON structure can have a significant impact on memory usage, even if the data is immediately discarded. Any implementation based on string processing, on the other hand, is likely to be dramatically slower, and may have subtle differences in behavior from PHP's JSON parser. -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] RFC json_validate() - status: Under Discussion
czw., 25 sie 2022, 21:12 użytkownik David Gebler napisał: > I'm not a voter on RFCs so my input may be largely irrelevant here but for > discussion purposes: > > I remain unconvinced regarding the justification for this proposal. I'm not > saying there's a strong reason to NOT implement it, but I'm not convinced > it's really going to be a significant benefit to many people at all. > > I agree that the number of userland implementations for a "is_valid_json" > type function including in some widely used frameworks and systems > indicates there's some degree of demand in the ecosystem for validating a > JSON string. > > But the more salient question is whether there is a significant demand for > whatever memory and speed benefit the implementation of a new core ext_json > function delivers; that is, has it been established that the use of > json_decode or common userland solutions are in practice not good enough? > > There are many examples of userland code which could be faster and more > memory efficient if they were written in C and compiled in, so the mere > fact this proposal may introduce a somewhat faster way of validating a JSON > string over decoding it is not necessarily a sufficient reason to include > it. > > Are there are examples of raising issues for frameworks or systems saying > they need to validate some JSON but the only existing solutions available > to them are causing memory limit errors, or taking too long? The Stack > Overflow question linked on the RFC says "I need a really, really fast > method of checking if a string is JSON or not." > > "Really, really fast" is subjective. No context or further information is > given about what that person would regard as an acceptable time to validate > what size blob of valid or invalid JSON, or why. Indeed that same page > offers a userland solution based around only going to json_decode if some > other much simpler checks on the input are indeterminate for validation > purposes. Haven't tested it personally but no doubt in the vast majority of > cases it is sufficiently performant. > > In most real world use cases [that I've encountered over the years] JSON > blobs tend to be quite small. I have dealt with much, much larger JSON > blobs, up to a few hundred MB, and in those cases I've used a streaming > parser. If you're talking about JSON that size, a streaming parser is the > only realistic answer - you probably don't want to drop a 300MB string in > to this RFC's new function either, if performance and memory efficiency is > your concern. > > So I'm curious as to whether a real world example can be given where the > efficiency difference between json_decode and a new json_validate function > would be important to the system, whether anyone's encountered a scenario > where this would have made a real difference to them. > I share the same opinion you expressed here even though you admit in recent email that you changed your mind. In recent versions we tend to accept more and more small standard library functions with IMO questionable argumentation. The same goes here and I'm not convinced we should introduce next small function that can be simply implemented in user land. Any example testing > 3MB JSON string are for me edge cases that normally don't happen often to deserve special treatment. If we keep the tendency to pollute already bloated standard library with an army of small functions that could have not exists and be replaced with normal PHP counterparts IMHO we'll end with frustration from developers as I believe DX slowly falls down here. Cheers, Michał Marcin Brzuchalski >
Re: [PHP-DEV] RFC json_validate() - status: Under Discussion
> Having actually compiled the branch and tried it out, I have to say > regardless of whether validating arbitrarily large blocks of JSON without > being interested in the contents is a common or more niche use case, the > memory savings ARE highly impressive. I had thought that because the function > was built on top of the existing parser and is still parsing the entire > string (or up until invalid JSON is encountered), the performance saving for > a very large input would be smaller than it is. > > I tested using a 75MB valid JSON input - a string large enough that it's not > going to be very common. The processing time isn't hugely different, the > saving appears to be around maybe 20-25% (and it's not a significant amount > of time using either json_decode or json_validate, even on an input of this > size, about half a second on my machine for both). But the memory saving is > enormous, almost total. Gone from needing ~5x the size of the input to almost > literally just a few extra bytes. > > I'm persuaded now on both that benchmarking and having had a closer look at > the implementation PR, which is clearly a minimal and easily maintainable > change. > > As I've said, my feelings are irrelevant to the extent I'm not a voter, but I > am in principle a +1 thumbs up for including this now. > > David, thanks very much for sharing your results and opinion. RFC: https://wiki.php.net/rfc/json_validate Implementation: https://github.com/php/php-src/pull/9399 -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] RFC json_validate() - status: Under Discussion
Having actually compiled the branch and tried it out, I have to say regardless of whether validating arbitrarily large blocks of JSON without being interested in the contents is a common or more niche use case, the memory savings ARE highly impressive. I had thought that because the function was built on top of the existing parser and is still parsing the entire string (or up until invalid JSON is encountered), the performance saving for a very large input would be smaller than it is. I tested using a 75MB valid JSON input - a string large enough that it's not going to be very common. The processing time isn't hugely different, the saving appears to be around maybe 20-25% (and it's not a significant amount of time using either json_decode or json_validate, even on an input of this size, about half a second on my machine for both). But the memory saving is enormous, almost total. Gone from needing ~5x the size of the input to almost literally just a few extra bytes. I'm persuaded now on both that benchmarking and having had a closer look at the implementation PR, which is clearly a minimal and easily maintainable change. As I've said, my feelings are irrelevant to the extent I'm not a voter, but I am in principle a +1 thumbs up for including this now.
Re: [PHP-DEV] RFC json_validate() - status: Under Discussion
I also added in the RFC your concern about JSON_THROW_ON_ERROR usage. RFC: https://wiki.php.net/rfc/json_validate Implementation: https://github.com/php/php-src/pull/9399 -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] RFC json_validate() - status: Under Discussion
Ok then ... I take it out of the RFC just because we have doubts about it. RFC: https://wiki.php.net/rfc/json_validate Implementation: https://github.com/php/php-src/pull/9399 -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] RFC json_validate() - status: Under Discussion
[Replying to list - make sure to hit "Reply List" or "Reply All", not "Reply" in your mail client] On 25/08/2022 21:59, juan carlos morales wrote: Hello Rowan thanks for participating. My answer for your comments would be: === Regarding JSON_THROW_ON_ERROR === I think is a valid point from your side, and I am open to change the implementation if the maintainers believe so. I leave it there as it existed in json_decode() and wanted to give the developer the option to use it or not. But as I said, you hae a valid point, that I also like. Just to clarify, "the maintainers" of PHP are basically everyone on this list, trying to reach a consensus; there isn't really any core group that makes the final decision. There is a binding vote at the end of the RFC process, but it's in principle just a way to measure that consensus. === Regarding the examples you mentioned === The one from humhub and Symfony http-kernel, yes you are right, I will take them out from the RFC right now. Nice catch. But the Magento one, I took a look again, and I still believe is right example, because the only reason is using json_decode() for, is to validate only. Am I missing something here? It's not entirely clear *what* that Magento code is doing, but it's definitely not just validation: the output of json_decode is passed to $this->_jsonEncoder->encode(); if it was just validating, it would return $input, or true. My guess is that like the Symfony example it is "pretty-printing" existing JSON strings. Regards, -- Rowan Tommins [IMSoP] -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] RFC json_validate() - status: Under Discussion
Hey Tim, thanks a lot for participating and taking your time to use the code and making your own tests, I highly appreciate it. Regards. RFC: https://wiki.php.net/rfc/json_validate Implementation: https://github.com/php/php-src/pull/9399 -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] RFC json_validate() - status: Under Discussion
Hello Rowan thanks for participating. My answer for your comments would be: === Regarding JSON_THROW_ON_ERROR === I think is a valid point from your side, and I am open to change the implementation if the maintainers believe so. I leave it there as it existed in json_decode() and wanted to give the developer the option to use it or not. But as I said, you hae a valid point, that I also like. === Regarding the examples you mentioned === The one from humhub and Symfony http-kernel, yes you are right, I will take them out from the RFC right now. Nice catch. But the Magento one, I took a look again, and I still believe is right example, because the only reason is using json_decode() for, is to validate only. Am I missing something here? Regards. RFC: https://wiki.php.net/rfc/json_validate Implementation: https://github.com/php/php-src/pull/9399
Re: [PHP-DEV] RFC json_validate() - status: Under Discussion
On 25/08/2022 17:02, juan carlos morales wrote: I am glad to present to you the RFC for json_validate() function. RFC: https://wiki.php.net/rfc/json_validate Hi, Leaving aside the overall question of inclusion and looking at details, I'm not convinced JSON_THROW_ON_ERROR belongs here. With json_decode, throwing an exception is a useful feature because a) there is no natural value to signal an error, false and null both being valid results; and b) most code is written expecting the input to be valid, so failure is an "exceptional state" that needs to break out of normal flow. With json_validate, neither reason applies: there is no ambiguity in what false means, and no reason to write code that calls json_validate but doesn't check that return value. On a different note, I notice that several of your examples are not actually checking the validity of JSON at all, they are parsing it in order to re-serialize it - the Magento getJSONString, the Symfony getPrettyJson, and the humhub actionIndex functions are all directly using the output of json_decode, not just reporting whether it succeeded. There's still plenty of other examples that make your point, but those three should probably be removed. Regards, -- Rowan Tommins [IMSoP] -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] RFC json_validate() - status: Under Discussion
Hi On 8/25/22 21:11, David Gebler wrote: There are many examples of userland code which could be faster and more memory efficient if they were written in C and compiled in, so the mere fact this proposal may introduce a somewhat faster way of validating a JSON string over decoding it is not necessarily a sufficient reason to include it. Are there are examples of raising issues for frameworks or systems saying they need to validate some JSON but the only existing solutions available to them are causing memory limit errors, or taking too long? The Stack Overflow question linked on the RFC says "I need a really, really fast method of checking if a string is JSON or not." The proposed function is meant to be used for validation. Validation processes by definition need to deal with untrusted data. So the input data might even be actively malicious in order to tie up resources on the server (DoS attack - single D there). In most real world use cases [that I've encountered over the years] JSON blobs tend to be quite small. I have dealt with much, much larger JSON Yes well-formed JSON from a trusted source tends to be small-ish. But a validation function also needs to deal with non-well-formed JSON, otherwise you would not need to validate it. I was able to use up an extra 100 MB of RAM with a 3 MB input that is invalid JSON when using json_decode(), just for it to reject the input. For json_validate() the extra memory (as per memory_get_peak_usage()) required for the same operation effectively zero. It was able to deal with 60 MB of input just fine. I've attached the script I used for the test. I left out the actual JSON string to not give script kiddies a loaded weapon, but you likely should be able to craft some input yourself. blobs, up to a few hundred MB, and in those cases I've used a streaming parser. If you're talking about JSON that size, a streaming parser is the only realistic answer - you probably don't want to drop a 300MB string in to this RFC's new function either, if performance and memory efficiency is your concern. So I'm curious as to whether a real world example can be given where the efficiency difference between json_decode and a new json_validate function would be important to the system, whether anyone's encountered a scenario where this would have made a real difference to them. While my example is not a real world example, I don't believe it's a stretch to say it can be applied as-is to the real world. So IMO: - The proposed function does exactly what it promises to do, not more, not less. - If it's introduced, then it is going to be the obvious choice for JSON validation and at the same time it is going to be the best choice for JSON validation. I strongly believe it is a good thing if users are steered to make the correct choice by default without needing to invest brain cycles. - The patch is pretty small, because the hard work of JSON parsing is already implemented. - Userland implementations are non-obvious and non-trivial as evidenced by the examples in the RFC: They are all slightly different and one of them even mishandles a plain `false` input, because it does not check json_last_error(). - Userland implementations are also either less efficient (relying on json_decode()) or potentially inconsistent (hand-rolling a validating parser). Best regards Tim Düsterhus<> -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] RFC json_validate() - status: Under Discussion
Thanks for participating. Cheers. RFC: https://wiki.php.net/rfc/json_validate Implementation: https://github.com/php/php-src/pull/9399
Re: [PHP-DEV] RFC json_validate() - status: Under Discussion
Hello David, thanks for your feedback. I believe that the answer to most of your concerns/questions related to other projects that would benefit out of something like this is already answered by me in the RFC. I mentioned major PHP projects that will directly benefit out of something like this. About the demand, well, that is also being written in the RFC. Regarding the question in StackOverflow, was just mentioned to expose the "demand" of validating a json string, and is one of the most ranked questions, is in the top 10 of questions with php && json tags. I believe that such a thing is talking by iself. Providing real use cases wont make a difference, because the number of examples are infinite honestly. Yes there are userland packages that can do this for us, but in big products on big companies, relying on 3rd party packages is not always cool, I have being experiencing that in my company actually. Regarding examples of raising issues ... there are a LOT in github, and the only possible solution is always to increase the memory limit or use a 3rd party package (which as I said, is not always suitable). But ... lets be objective, or try to be at least ... PHP has a parser already, this functionality is just allowing us to use it without the need to consume memory. I see it useful, actually, this RFC is the result of a real need we have in my current company at the moment, and after digging around the internet I found hat we were not the only ones. Thanks for taking our time on this, I appreciate it.
Re: [PHP-DEV] RFC json_validate() - status: Under Discussion
On Thu, Aug 25, 2022, 9:12 PM David Gebler wrote: > > I agree that the number of userland implementations for a "is_valid_json" > type function including in some widely used frameworks and systems > indicates there's some degree of demand in the ecosystem for validating a > JSON string. > I don't have an option about the rest of your email, but I wanted to point out that the vast userland need for validating JSON right now is being "hacked" for lack of a better functionality. Whether it's justifiable to bring this to core is up for debate, but I at least take it as a valid point that "asking a computer to do a task just to check whether it failed" is odd and limiting to say the least >
Re: [PHP-DEV] RFC json_validate() - status: Under Discussion
I'm not a voter on RFCs so my input may be largely irrelevant here but for discussion purposes: I remain unconvinced regarding the justification for this proposal. I'm not saying there's a strong reason to NOT implement it, but I'm not convinced it's really going to be a significant benefit to many people at all. I agree that the number of userland implementations for a "is_valid_json" type function including in some widely used frameworks and systems indicates there's some degree of demand in the ecosystem for validating a JSON string. But the more salient question is whether there is a significant demand for whatever memory and speed benefit the implementation of a new core ext_json function delivers; that is, has it been established that the use of json_decode or common userland solutions are in practice not good enough? There are many examples of userland code which could be faster and more memory efficient if they were written in C and compiled in, so the mere fact this proposal may introduce a somewhat faster way of validating a JSON string over decoding it is not necessarily a sufficient reason to include it. Are there are examples of raising issues for frameworks or systems saying they need to validate some JSON but the only existing solutions available to them are causing memory limit errors, or taking too long? The Stack Overflow question linked on the RFC says "I need a really, really fast method of checking if a string is JSON or not." "Really, really fast" is subjective. No context or further information is given about what that person would regard as an acceptable time to validate what size blob of valid or invalid JSON, or why. Indeed that same page offers a userland solution based around only going to json_decode if some other much simpler checks on the input are indeterminate for validation purposes. Haven't tested it personally but no doubt in the vast majority of cases it is sufficiently performant. In most real world use cases [that I've encountered over the years] JSON blobs tend to be quite small. I have dealt with much, much larger JSON blobs, up to a few hundred MB, and in those cases I've used a streaming parser. If you're talking about JSON that size, a streaming parser is the only realistic answer - you probably don't want to drop a 300MB string in to this RFC's new function either, if performance and memory efficiency is your concern. So I'm curious as to whether a real world example can be given where the efficiency difference between json_decode and a new json_validate function would be important to the system, whether anyone's encountered a scenario where this would have made a real difference to them.