Re: [PHP-DEV] RFC json_validate() - status: Under Discussion

2022-10-08 Thread juan carlos morales
The RFC got accepted and merge. Thanks for contributing :)


Re: [PHP-DEV] RFC json_validate() - status: Under Discussion

2022-10-07 Thread James Gilliland
Don't think it makes a difference to the discussion at this point but just
thought I'd point out there's another example missing from the RFC list
that almost everyone probably has installed with their projects.

https://github.com/sebastianbergmann/phpunit/blob/main/src/Framework/Constraint/String/IsJson.php

On Wed, Sep 21, 2022 at 11:39 PM juan carlos morales <
dev.juan.mora...@gmail.com> wrote:

> The RFC is in voting phase now.
>
> I want to thank everyone involved in the discussions related to this
> RFC, the ones who reviewed the code in advance, and the ones that with
> some emojis in github showed their interest and support.
>
> --
> PHP Internals - PHP Runtime Development Mailing List
> To unsubscribe, visit: https://www.php.net/unsub.php
>
>


Re: [PHP-DEV] RFC json_validate() - status: Under Discussion

2022-09-21 Thread juan carlos morales
The RFC is in voting phase now.

I want to thank everyone involved in the discussions related to this
RFC, the ones who reviewed the code in advance, and the ones that with
some emojis in github showed their interest and support.

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php



Re: [PHP-DEV] RFC json_validate() - status: Under Discussion

2022-09-20 Thread Tim Düsterhus

Hi

On 9/20/22 00:15, juan carlos morales wrote:

The RFC still contains a non-empty "Open Issues" section. This needs to
be resolved before the vote starts.

I would also recommend inserting a closed voting widget (or multiple, if
you want to have additional votes for the details), so that it's clear
for everyone how exactly the vote will look like.


Tim, how should I proceed with the Open Issues/Questions? is only
about the return value of the function.

Should I create a poll for that? Should I leave it like this and
decide during Voting phase? This question raised during the mailing
discussion and never moved forward, can you advice please? Or I should
delete the Open Issues/question and move forward with the current
approach I choosed?


It's your RFC, you do whatever you are comfortable with. If you feel 
that a vote is the best option, then add a vote. If you as the author 
want to decide what's "best", then that's also fine.


It just needs to be clear what the function will look and feel like 
before the vote starts - so leaving it as it is and deciding yourself 
*during* voting is not acceptable.


Best regards
Tim Düsterhus

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php



Re: [PHP-DEV] RFC json_validate() - status: Under Discussion

2022-09-19 Thread juan carlos morales
> The RFC still contains a non-empty "Open Issues" section. This needs to
> be resolved before the vote starts.
>
> I would also recommend inserting a closed voting widget (or multiple, if
> you want to have additional votes for the details), so that it's clear
> for everyone how exactly the vote will look like.

Tim, how should I proceed with the Open Issues/Questions? is only
about the return value of the function.

Should I create a poll for that? Should I leave it like this and
decide during Voting phase? This question raised during the mailing
discussion and never moved forward, can you advice please? Or I should
delete the Open Issues/question and move forward with the current
approach I choosed?

A lot of days passed by already, and I would like to move on with the
RFC process.

Thanks in advance.

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php



Re: [PHP-DEV] RFC json_validate() - status: Under Discussion

2022-09-19 Thread Tim Düsterhus

Hi

On 9/19/22 23:26, juan carlos morales wrote:

It appears the discussion came to a halt. How do you plan to proceed
with your RFC?


Thanks for the heads up email Tim.

People, following the procedure, this is a "heads up" email to all of
you regarding this RFC. If no significant issue/complain raises, then


The RFC still contains a non-empty "Open Issues" section. This needs to 
be resolved before the vote starts.


I would also recommend inserting a closed voting widget (or multiple, if 
you want to have additional votes for the details), so that it's clear 
for everyone how exactly the vote will look like.


Best regards
Tim Düsterhus

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php



Re: [PHP-DEV] RFC json_validate() - status: Under Discussion

2022-09-19 Thread juan carlos morales
> It appears the discussion came to a halt. How do you plan to proceed
> with your RFC?

Thanks for the heads up email Tim.

People, following the procedure, this is a "heads up" email to all of
you regarding this RFC. If no significant issue/complain raises, then
I will move the RFC for voting by Wednesday 21 of September.

I also want to thank your support in the github pull request, the
hearts and thums up emojis were awesome
(https://github.com/php/php-src/pull/9399)

Regards

Juan.

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php



Re: [PHP-DEV] RFC json_validate() - status: Under Discussion

2022-09-19 Thread Tim Düsterhus

Hi

On 8/25/22 18:02, juan carlos morales wrote:

I am glad to present to you the RFC for json_validate() function.


It appears the discussion came to a halt. How do you plan to proceed 
with your RFC?


Best regards
Tim Düsterhus

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php



Re: [PHP-DEV] RFC json_validate() - status: Under Discussion

2022-08-29 Thread juan carlos morales
El lun, 29 ago 2022 a las 11:26, Deleu () escribió:
>
>
>
> On Mon, Aug 29, 2022 at 11:19 AM juan carlos morales 
>  wrote:
>>
>> El lun, 29 ago 2022 a las 11:06, Deleu () escribió:
>> >
>> > Has the option of returning a Result object been discussed/considered? Can 
>> > it be an option? I imagine that if `json_validate(): JsonValidationResult` 
>> > always returns a `JsonValidationResult` which contains a `public readonly 
>> > bool $valid` and a `public readonly ?string $error` it would be better 
>> > than both options on the table right now. The option of returning CLI-like 
>> > results means that we will need a `if (! json_validate())` to treat a 
>> > valid JSON (really awkward) and the option of using `json_last_error()` 
>> > relies on an internal state instead of an immutable structure. 
>> > Effectively, what we need is to return a complex structure which can 
>> > contain a boolean and a string and that is a class/object.
>> >
>> Interesting, it was not considered.
>>
>> I still think BOOLEAN is my preferred choice, but can you provide an
>> example about how the code would look like with your approach? Imagine
>> you have the feature like in your suggestion ... how a developer would
>> write the code?
>
>
> ```
> $json = '{}';
>
> $result = json_validate($json);
>
> if (! $result->valid) {
> throw new \Exception('Invalid JSON provided: ' . $result->error);
> }
>
> $something->saveUserProvidedJson($json);
> ```
>
> --
> Marco Deleu

Thanks for participating on the discussion and I will take note about
this also, but ...
Quick question: Is in PHP some rule, norma, estandard about "result
objects" already? (like, "should have these methods, and implement
this interface, etc.)

FYI: Just want to clarify that there is nothing bad/wrong about
relying on "internal state" by calling json_last_error(), as this
pattern is used not only by json_decode() but also mysqli extension
for example, among others. Nothing wrong honeslty IMO.

RFC: https://wiki.php.net/rfc/json_validate
Implementation: https://github.com/php/php-src/pull/9399

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php



Re: [PHP-DEV] RFC json_validate() - status: Under Discussion

2022-08-29 Thread juan carlos morales
El lun, 29 ago 2022 a las 11:06, Deleu () escribió:
>
> Has the option of returning a Result object been discussed/considered? Can it 
> be an option? I imagine that if `json_validate(): JsonValidationResult` 
> always returns a `JsonValidationResult` which contains a `public readonly 
> bool $valid` and a `public readonly ?string $error` it would be better than 
> both options on the table right now. The option of returning CLI-like results 
> means that we will need a `if (! json_validate())` to treat a valid JSON 
> (really awkward) and the option of using `json_last_error()` relies on an 
> internal state instead of an immutable structure. Effectively, what we need 
> is to return a complex structure which can contain a boolean and a string and 
> that is a class/object.
>
Interesting, it was not considered.

I still think BOOLEAN is my preferred choice, but can you provide an
example about how the code would look like with your approach? Imagine
you have the feature like in your suggestion ... how a developer would
write the code?

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php



Re: [PHP-DEV] RFC json_validate() - status: Under Discussion

2022-08-29 Thread Deleu
On Mon, Aug 29, 2022 at 10:50 AM juan carlos morales <
dev.juan.mora...@gmail.com> wrote:

> There is still 1 open question on the RFC, and is about the return value.
>
> https://wiki.php.net/rfc/json_validate#open_issuesquestions
>
> I would appreciate your feedback on this. Even though I was told the
> RFC can go with 2 votings, I would like to know your thoughts about
> that open question, in short, the return value.
>

Has the option of returning a Result object been discussed/considered? Can
it be an option? I imagine that if `json_validate(): JsonValidationResult`
always returns a `JsonValidationResult` which contains a `public readonly
bool $valid` and a `public readonly ?string $error` it would be better than
both options on the table right now. The option of returning CLI-like
results means that we will need a `if (! json_validate())` to treat a valid
JSON (really awkward) and the option of using `json_last_error()` relies on
an internal state instead of an immutable structure. Effectively, what we
need is to return a complex structure which can contain a boolean and a
string and that is a class/object.

Thoughts?

-- 
Marco Deleu


Re: [PHP-DEV] RFC json_validate() - status: Under Discussion

2022-08-29 Thread juan carlos morales
There is still 1 open question on the RFC, and is about the return value.

https://wiki.php.net/rfc/json_validate#open_issuesquestions

I would appreciate your feedback on this. Even though I was told the
RFC can go with 2 votings, I would like to know your thoughts about
that open question, in short, the return value.

The same thing was asked during code review of the pull request.

I already expressed my opinion in the RFC, I prefer to return boolean
and check errors using json_last_error() and I wrote down the details
about this; instead of returning INT ... with 0 or JSON_ERROR_NONE
representing successful validation.

Thanks in advance.

RFC: https://wiki.php.net/rfc/json_validate

Implementation: https://github.com/php/php-src/pull/9399

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php



Re: [PHP-DEV] RFC json_validate() - status: Under Discussion

2022-08-29 Thread juan carlos morales
El sáb, 27 ago 2022 a las 2:50, David Gebler () escribió:
>
> You can always offer two votes on the RFC - one for the function itself, then 
> one for should it return boolean or return an int representing the 
> json_last_error constants and let it be decided that way.


Thanks for the advice I did not know that. At the moment this is the
only open question for this RFC, and I would like to know what the
people thinks about it, otherwise, will go with 2 votings.


> I think on the whole, I agree with the sentiment that returning boolean and 
> checking json_last_error() on false is probably the best / least worst option.

> So if I could vote, I would vote yes and for the boolean option, with a 
> secondary preference for returning int if boolean option is rejected.

Thanks for the feedback.


> And I was unconvinced about the whole idea originally, so a good example of 
> where positive, robust discussion can change someone's mind.


That is why we have these discussions :)  thanks a lot for taking
your time on this.


> Good luck with progressing the RFC, I don't think I have anything else to add.

Thanks a lot!

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php



Re: [PHP-DEV] RFC json_validate() - status: Under Discussion

2022-08-26 Thread David Gebler
Juan,
You can always offer two votes on the RFC - one for the function itself,
then one for should it return boolean or return an int representing the
json_last_error constants and let it be decided that way.

I think on the whole, I agree with the sentiment that returning boolean and
checking json_last_error() on false is probably the best / least worst
option.

So if I could vote, I would vote yes and for the boolean option, with a
secondary preference for returning int if boolean option is rejected.

And I was unconvinced about the whole idea originally, so a good example of
where positive, robust discussion can change someone's mind.

Good luck with progressing the RFC, I don't think I have anything else to
add.


On Sat, Aug 27, 2022 at 12:46 AM juan carlos morales <
dev.juan.mora...@gmail.com> wrote:

> OMG, I need to take a rest, sorry for this, here it goes again; the
> about JSON_INVALID_UTF8_IGNORE opinion is the same, but previous code
> was wrong
>
> Code:
>
> 
> var_dump(json_decode("{ \"a\xb0b\" : \"dummy\" }", true, 512),
> json_last_error_msg());
> var_dump("");
> var_dump(json_decode("{ \"a\xb0b\" : \"dummy\" }", true, 512,
> JSON_INVALID_UTF8_IGNORE), json_last_error_msg());
>
> Result:
>
> NULL
> string(56) "Malformed UTF-8 characters, possibly incorrectly encoded"
> string(12) ""
> array(1) { ["ab"]=> string(5) "dummy" }
> string(8) "No error"
>
> Saying so, now ... yes I support and think is NEEDED the usage of the
> JSON_INVALID_UTF8_IGNORE , as json_validate() result goes in the same
> direction with json_decode(). I think we need to have this flag.
>
> RFC: https://wiki.php.net/rfc/json_validate
> Implementation: https://github.com/php/php-src/pull/9399
>
> --
> PHP Internals - PHP Runtime Development Mailing List
> To unsubscribe, visit: https://www.php.net/unsub.php
>
>


Re: [PHP-DEV] RFC json_validate() - status: Under Discussion

2022-08-26 Thread juan carlos morales
OMG, I need to take a rest, sorry for this, here it goes again; the
about JSON_INVALID_UTF8_IGNORE opinion is the same, but previous code
was wrong

Code:

 string(5) "dummy" }
string(8) "No error"

Saying so, now ... yes I support and think is NEEDED the usage of the
JSON_INVALID_UTF8_IGNORE , as json_validate() result goes in the same
direction with json_decode(). I think we need to have this flag.

RFC: https://wiki.php.net/rfc/json_validate
Implementation: https://github.com/php/php-src/pull/9399

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php



Re: [PHP-DEV] RFC json_validate() - status: Under Discussion

2022-08-26 Thread juan carlos morales
El sáb, 27 ago 2022 a las 1:31, juan carlos morales
() escribió:
>
> === Open issues/concerns ===
>
>
> @@@ Usage of JSON_INVALID_UTF8_IGNORE @@@
>
> - I have my doubts now, because of this codes:
>
> ```
> 
> var_dump(json_validate("\"a\xb0b\""), json_last_error_msg());
> var_dump("");
> var_dump(json_validate("\"a\xb0b\"", 512, JSON_INVALID_UTF8_IGNORE),
> json_last_error_msg());
> ```
>
> Results:
>
> bool(false)
> string(56) "Malformed UTF-8 characters, possibly incorrectly encoded"
> string(12) ""
> bool(true)
> string(8) "No error"
>
>
> Gives different results, but ...
>
> ```
> 
> var_dump(json_decode("{ \"a\xb0b\" : \"dummy\" }"), json_last_error_msg());
> var_dump("");
> var_dump(json_decode("{ \"a\xb0b\" : \"dummy\" }", 512,
> JSON_INVALID_UTF8_IGNORE), json_last_error_msg());
> ```
>
> Results in:
> NULL
> string(56) "Malformed UTF-8 characters, possibly incorrectly encoded"
> string(12) ""
> NULL
> string(56) "Malformed UTF-8 characters, possibly incorrectly encoded"
>
>
>
> So at a first look, seems the flag should also be remove, as with
> json_validate() we say its valid, but then on json_decode() we get
> NULL, even with the JSON_INVALID_UTF8_IGNORE provided; does not make
> sense I believe.
>
> What do you think? Is there a use case I am missing here? Is this flag
> worth to have still?
>

I made a huge mistake sorry, long day here, 1:30 am , the code for
json_decode should be as:

var_dump(json_decode("{ \"a\xb0b\" : \"dummy\" }"), true, 512,
json_last_error_msg());
var_dump("");
var_dump(json_decode("{ \"a\xb0b\" : \"dummy\" }", true, 512,
JSON_INVALID_UTF8_IGNORE), json_last_error_msg());

Resulting in

NULL
bool(true)
int(512)
string(56) "Malformed UTF-8 characters, possibly incorrectly encoded"
string(12)""
array(1) { ["ab"]=> string(5) "dummy" }
string(8) "No error"


Saying so, now ... yes I support the usage of the
JSON_INVALID_UTF8_IGNORE , as json_validate() result goes in the same
direction with json_decode(). I think we need to have this flag.

RFC: https://wiki.php.net/rfc/json_validate
Implementation: https://github.com/php/php-src/pull/9399

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php



Re: [PHP-DEV] RFC json_validate() - status: Under Discussion

2022-08-26 Thread juan carlos morales
I now provide an update of the discussion.

The good, the bads, the open questions, etc.

All of this will go into the RFC also, as requested by the procedure
in https://wiki.php.net/rfc/howto

"Listen to the feedback, and try to answer/resolve all questions.
Update your RFC to document all the issues and discussions. Cover both
the positive and negative arguments. Put the RFC URL into all your
replies."

=== UPDATES ===

- Different users have tested the functionality and obtained the
promissed results. Also their feedback about it was positive.

- Most part of the community in the mailing list showed a positive
opinion about this RFC, and looks forward for its integration into
PHP.

- The ones that checked the code also agree that is small
implementation, easy to mantain, and at the same time provides a big
benefit for such small implementation.

- The community got involve very actively in the discussion of the RFC
and provided all kind of useful feedback, and also took the time to
test json_validate() by themselves.


=== Bad reasons for json_validate() provided by the community ===


- One member of the mailing list expressed that:

1) Incorporating such a small implementation that can be achieve with
userland code is not a good idea.

Quote: "If we keep the tendency to pollute already bloated standard
library with an army of small functions that could have not exists and
be replaced with normal PHP counterparts IMHO we'll end with
frustration from developers as I believe DX slowly falls down here."

2) json_validate() would only be useful for edge cases.

Quote: "A `json_decode()` is a substitute that IMO solves 99% of use cases.
If I'd follow your logic and accept every small addition that handles
1% of use cases, somebody will raise another RFC
for simplexml_validate_string or yaml_validate and the next PhpToken::validate.
All above can be valid if we trust that people normally validate 300MB
payloads to do nothing if they DON'T fail and there is nothing strange
about that."

3) The user also provided an implementation of a JSON parser written
in pure PHP. https://gist.github.com/brzuchal/37e888d9b13937891c3e05fead5042bc


=== Good reasons for json_validate() provided by the community ===


@@@ Use cases provided by some members, I quote:

- "Yes well-formed JSON from a trusted source tends to be small-ish.
But a validation function also needs to deal with non-well-formed
JSON, otherwise you would not need to validate it."

- "If with a new function (json_validate()) it becomes much easier to
defend against a Denial-of-Service attack for some parts of a JSON
API, then this can be a good addition just for security reasons."

- "fast / efficient validation of a common communication format
reduces the attack surface for Denial-of-Service attacks."

@@@ Memory usage

- During the test of json_validate() from some users, they were happy
about the memory usage that was zero in most cases (which is the main
benefit out this feature).
Someone also did a test with a very large string (75 MB) and only a
few bytes were needed as reported by him; also the same user reported
an execution speed improvement by a 20-25% over using json_decode().

@@@ Reasons not to depend on userland JSON parsers

Even possible to write an excellent JSON parser in PHP like one of the
members in the mailing list provided us, there are good reasons for
dont relying on userland solutions.

# 1 - User Tim Düsterhus provided nice thoughts about this, in favor
to json_validate(), ... I quote him:

- "Writing a JSON parser is non-trivial as evidenced by:
https://github.com/nst/JSONTestSuite. I expect userland
implementations to be subtly buggy in edge cases. The JSON parser in
PHP 7.0+ is
certainly more battle-tested and in fact it appears to pass all of the
tests in the linked test suite."

- "Even if the userland implementation is written very carefully, it
might behave differently than the native implementation used by
json_decode() (e.g. because the latter is buggy for some reason or
because the correct behavior is undefined). This would imply that an
input string that was successfully validated by your userland parser
might ultimately fail to parse when passed to json_decode(). This is
exactly what you don't want to happen."


(Some other members including me, also share this opinion.)


# 2 - The JSON parser in PHP follows an special convention, marked in
the PHP documentation.

# 3 - We already have a JSON parser in PHP, that is used by
json_decode(); reusing the existing JSON Parser provides 100%
compatibility between the validation of a json-string, and the
decoding of it.

# 4 - The user Larry Gafield also provided good reason to integrate
this implementation into PHP. I quote him:

"The heuristic I use is that an API should be "reasonably complete" in
one location.  Having a half-assed API in C and the rest left to
inconsistent and redundant user-space implementations is a terrible
API; the same would apply for a user-space 

Re: [PHP-DEV] RFC json_validate() - status: Under Discussion

2022-08-26 Thread Larry Garfield
On Fri, Aug 26, 2022, at 1:41 PM, Ayesh Karunaratne wrote:
>> whether returning boolean is the right thing to do at all. It seems obviously
>> intuitive it should, returning true for valid and false for invalid JSON
>> but then if you consider you're still going to be in the situation of
>> calling json_last_error() if you want to know why invalid JSON was invalid
>> and in particular you might not expect the "last error" to have changed
>> just from an attempt to check a string. How can there be an error when by
>> definition you weren't trying to do anything except check the validity of
>> some unknown data? Not sure what the answer is there...curious what other
>> people's views are on that. I don't think throwing an exception on invalid
>> JSON is the right answer in any case.
>
> One of the reasons why I like `json_validate` over `is_json` is that
> the former does not imply a boolean return value as strongly as the
> latter.
>
> I think we should not modify any state with the `json_validate`
> function, and I'd be happy with the function returning 0 for a valid
> JSON, or the error code as an integer (`JSON_ERROR_STATE_MISMATCH`,
> `JSON_ERROR_SYNTAX`, etc.). `JSON_ERROR_NONE` constant is already
> assigned 0, so they align quite well too.

The problem here is that you're using a falsy return (0) to indicate "it 
validated", which is logically true.  PHP's type juggling makes that a very bad 
idea.

I somehow doubt it will get much traction, but this is the sort of case where a 
"naked either" approach could work well: 
https://peakd.com/hive-168588/@crell/much-ado-about-null

Though there's also the risk then of all objects being "truthy", since there 
seems to be no appetite for a __toBool() method as that's been shot down 
before.  At some point, we need to realize that our current design philosophy 
boxes us in on too many sides and something is going to have to change.

I see the argument for json_last_error() not being a great fit, but given the 
limitations we've chosen to have in the language it seems like "return bool and 
use json_last_error()" is the least-bad option.

--Larry Garfield

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php



Re: [PHP-DEV] RFC json_validate() - status: Under Discussion

2022-08-26 Thread Derick Rethans
On 26 August 2022 18:28:59 BST, Kamil Tekiela  wrote:
>What is the reasoning behind the name? I can't find it explained in the
>RFC. What about other alternatives like is_json or validate_json?

It's part of the json extension so it should start with json_ following naming 
guidelines for functions in extensions.

cheers
Derick

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php



Re: [PHP-DEV] RFC json_validate() - status: Under Discussion

2022-08-26 Thread Ayesh Karunaratne
> whether returning boolean is the right thing to do at all. It seems obviously
> intuitive it should, returning true for valid and false for invalid JSON
> but then if you consider you're still going to be in the situation of
> calling json_last_error() if you want to know why invalid JSON was invalid
> and in particular you might not expect the "last error" to have changed
> just from an attempt to check a string. How can there be an error when by
> definition you weren't trying to do anything except check the validity of
> some unknown data? Not sure what the answer is there...curious what other
> people's views are on that. I don't think throwing an exception on invalid
> JSON is the right answer in any case.

One of the reasons why I like `json_validate` over `is_json` is that
the former does not imply a boolean return value as strongly as the
latter.

I think we should not modify any state with the `json_validate`
function, and I'd be happy with the function returning 0 for a valid
JSON, or the error code as an integer (`JSON_ERROR_STATE_MISMATCH`,
`JSON_ERROR_SYNTAX`, etc.). `JSON_ERROR_NONE` constant is already
assigned 0, so they align quite well too.

As for exceptions, I too think throwing an exception here is an
anti-pattern. It makes perfect sense to throw on `json_decode` if the
program doesn't want to deal with invalid JSON gracefully. But calling
`json_validate` is an intended call, and returning an error is
excepted, and not an exception.

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php



Re: [PHP-DEV] RFC json_validate() - status: Under Discussion

2022-08-26 Thread David Gebler
On Fri, Aug 26, 2022 at 6:29 PM Kamil Tekiela  wrote:

> What is the reasoning behind the name? I can't find it explained in the
> RFC. What about other alternatives like is_json or validate_json?
>

The name json_validate makes most sense to me; it groups itself together
nicely with the other json_* API functions. I think is_json is what was
originally proposed but besides being then inconsistent with the rest of
the JSON API function naming, consider most of the is_* functions are for
checking types (and some for file properties), not validation.

But on the function, the other question which remains for me is whether
returning boolean is the right thing to do at all. It seems obviously
intuitive it should, returning true for valid and false for invalid JSON
but then if you consider you're still going to be in the situation of
calling json_last_error() if you want to know why invalid JSON was invalid
and in particular you might not expect the "last error" to have changed
just from an attempt to check a string. How can there be an error when by
definition you weren't trying to do anything except check the validity of
some unknown data? Not sure what the answer is there...curious what other
people's views are on that. I don't think throwing an exception on invalid
JSON is the right answer in any case.


Re: [PHP-DEV] RFC json_validate() - status: Under Discussion

2022-08-26 Thread Kamil Tekiela
What is the reasoning behind the name? I can't find it explained in the
RFC. What about other alternatives like is_json or validate_json?


Re: [PHP-DEV] RFC json_validate() - status: Under Discussion

2022-08-26 Thread Larry Garfield
On Fri, Aug 26, 2022, at 4:00 AM, Michał Marcin Brzuchalski wrote:

> A `json_decode()` is a substitute that IMO solves 99% of use cases.
> If I'd follow your logic and accept every small addition that handles 1% of
> use cases, somebody will raise another RFC
> for simplexml_validate_string or yaml_validate and the next
> PhpToken::validate.

The heuristic I use is that an API should be "reasonably complete" in one 
location.  Having a half-assed API in C and the rest left to inconsistent and 
redundant user-space implementations is a terrible API; the same would apply 
for a user-space library that is half-assed and leaves the rest to "someone 
else to write."  Naturally "reasonably complete" is a somewhat squishy term, 
which is why it's a heuristic.

By that metric, yes, str_starts_with() and friends absolutely belonged in core, 
because we already have a bunch of string functions and str_starts_with() is by 
a wide margin the most common usage of strpos().

By the same token, yes, json_validate() makes sense to include in the main API, 
which means in C.  If there's a performance benefit to doing so as well, that 
makes it an easy sell for me.

simplexml_validate_string: I could see the argument there as well, quite 
frankly.  Unless there is already an equivalently capable option in core, which 
in this case it sounds like there is.

yaml_validate: There is no YAML parser that ships with PHP, so no, this 
wouldn't make sense.  If at some point in the future PHP added a C-based YAML 
extension to the standard distribution, then including a validate function in 
that would make sense.  (Whether we should include a YAML extension in the 
standard distribution is an entirely separate question.)

PHPToken::validate: Again, PHPToken is in core, so why wouldn't we also include 
a validate for it, especially if it's easy to do and logically belongs in the 
same "surface area"?

Really, the whole "functions don't belong in C" argument is tired.  That ship 
sailed decades ago.  PHP has a mixed-implementation life, whether we like it or 
not.  Some APIs and standard lib components are in C, deal.  For those that are 
or make sense to be, they should be robust and self-contained and complete.  
For those that aren't, they should be kept out as a set, or added "as a set."  
That's the granularity where we should be talking, not the function level.

I'm in favor of this RFC.  However, I do agree that it should never throw, 
period.  Only return a boolean.  I'm not sure if JSON_INVALID_UTF8_IGNORE makes 
sense either, frankly.  That should get corrected before it goes to a vote.

--Larry Garfield

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php



RE: [PHP-DEV] RFC json_validate() - status: Under Discussion

2022-08-26 Thread Jeffrey Dafoe
> -Original Message-
> From: juan carlos morales 
> Sent: Thursday, August 25, 2022 12:02 PM
> To: PHP Internals List 
> Subject: [PHP-DEV] RFC json_validate() - status: Under Discussion
> 
> Hello.
> 
> I am glad to present to you the RFC for json_validate() function.
> 
> The code/implementation still needs some workout, but seems to be fine
> enough to be presented to you all.
> 
> I look forward for feedback
> 
> Thanks in advance.
> 
> Juan.
> 
> RFC: https://wiki.php.net/rfc/json_validate
> 
> Implementation: https://github.com/php/php-src/pull/9399

Not a voting member, but I'd like to voice support for this. JSON is widely 
used, core already supports this functionality, the scope of change is minimal, 
and the feature is useful.

-Jeff

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php



Re: [PHP-DEV] RFC json_validate() - status: Under Discussion

2022-08-26 Thread juan carlos morales
El vie., 26 de agosto de 2022 13:08, juan carlos morales <
dev.juan.mora...@gmail.com> escribió:

>
> Is there a better way to do validation other than json_decode?
>


Better way than json_decode... Always from the core perspective

>


Re: [PHP-DEV] RFC json_validate() - status: Under Discussion

2022-08-26 Thread juan carlos morales
Just Want to remind that this discussion is not about "a json parser can be
written in PHP or not?".

We Have a JSON parser already in the core, ready to be use for validation.

Does it make sense to have another parser in User land to do validation if
We already have one?

Is there a better way to do validation other than json_decode?



El vie., 26 de agosto de 2022 12:48, Michał Marcin Brzuchalski <
michal.brzuchal...@gmail.com> escribió:

> Hi Tim,
>
> pt., 26 sie 2022 o 12:15 Tim Düsterhus  napisał(a):
>
>> Hi
>>
>> On 8/26/22 11:14, Hans Henrik Bergan wrote:
>> >> you can't efficiently validate JSON in userland
>> >
>> > Has anyone actually put that claim to the test? Has anyone actually
>> made a
>> > userland json validator (not just wrap json_decode()/json_last_error())
>> for
>> > performance comparison?
>> > ( if not, https://www.json.org/JSON_checker/JSON_checker.c  would
>> probably
>> > be a good start)
>> >
>>
>> Worded like "you can't efficiently" the claim is false. Of course you
>> can memory-efficiently validate the input by traversing the string byte
>> by byte and keeping track of the nesting.
>>
>> However the points that make a userland implementation infeasible are:
>>
>> 1. Writing a JSON parser is non-trivial as evidenced by:
>> https://github.com/nst/JSONTestSuite. I expect userland implementations
>> to be subtly buggy in edge cases. The JSON parser in PHP 7.0+ is
>> certainly more battle-tested and in fact it appears to pass all of the
>> tests in the linked test suite.
>>
>> 2. Even if the userland implementation is written very carefully, it
>> might behave differently than the native implementation used by
>> json_decode() (e.g. because the latter is buggy for some reason or
>> because the correct behavior is undefined). This would imply that an
>> input string that was successfully validated by your userland parser
>> might ultimately fail to parse when passed to json_decode(). This is
>> exactly what you don't want to happen.
>>
>
> Now this is an argument I could think of.
> But that one is not even mentioned in RFC.
>
> The JSON_checker.c example delivered by json.org is probably not
> something impossible
> as it required around 1h of work to port it see working implementation
> here https://gist.github.com/brzuchal/37e888d9b13937891c3e05fead5042bc
>
> Cheers,
> Michał Marcin Brzuchalski
>


Re: [PHP-DEV] RFC json_validate() - status: Under Discussion

2022-08-26 Thread Michał Marcin Brzuchalski
Hi Tim,

pt., 26 sie 2022 o 12:15 Tim Düsterhus  napisał(a):

> Hi
>
> On 8/26/22 11:14, Hans Henrik Bergan wrote:
> >> you can't efficiently validate JSON in userland
> >
> > Has anyone actually put that claim to the test? Has anyone actually made
> a
> > userland json validator (not just wrap json_decode()/json_last_error())
> for
> > performance comparison?
> > ( if not, https://www.json.org/JSON_checker/JSON_checker.c  would
> probably
> > be a good start)
> >
>
> Worded like "you can't efficiently" the claim is false. Of course you
> can memory-efficiently validate the input by traversing the string byte
> by byte and keeping track of the nesting.
>
> However the points that make a userland implementation infeasible are:
>
> 1. Writing a JSON parser is non-trivial as evidenced by:
> https://github.com/nst/JSONTestSuite. I expect userland implementations
> to be subtly buggy in edge cases. The JSON parser in PHP 7.0+ is
> certainly more battle-tested and in fact it appears to pass all of the
> tests in the linked test suite.
>
> 2. Even if the userland implementation is written very carefully, it
> might behave differently than the native implementation used by
> json_decode() (e.g. because the latter is buggy for some reason or
> because the correct behavior is undefined). This would imply that an
> input string that was successfully validated by your userland parser
> might ultimately fail to parse when passed to json_decode(). This is
> exactly what you don't want to happen.
>

Now this is an argument I could think of.
But that one is not even mentioned in RFC.

The JSON_checker.c example delivered by json.org is probably not something
impossible
as it required around 1h of work to port it see working implementation
here https://gist.github.com/brzuchal/37e888d9b13937891c3e05fead5042bc

Cheers,
Michał Marcin Brzuchalski


Re: [PHP-DEV] RFC json_validate() - status: Under Discussion

2022-08-26 Thread Tim Düsterhus

Hi

On 8/26/22 11:00, Michał Marcin Brzuchalski wrote:

A `json_decode()` is a substitute that IMO solves 99% of use cases.
If I'd follow your logic and accept every small addition that handles 1% of
use cases, somebody will raise another RFC
for simplexml_validate_string or yaml_validate and the next
PhpToken::validate.
All above can be valid if we trust that people normally validate 300MB
payloads to do nothing if they DON'T fail and there is nothing strange
about that.


You phrase this as if this was a bad thing.

But yes, if PHP would include a YAML parser in core (there appears to be 
one in PECL) then having a yaml_validate() function with a value-add 
compared to actually constructing the object representation then it 
would certainly be an appropriate inclusion. I don't see why JSON should 
be special here.


For XML my understanding is that the XMLReader class can already be used 
to efficiently validate an XML input in userland [1], because that class 
is defined to be a streaming parser. These types of parsers are not 
uncommon in XML land.


Best regards
Tim Düsterhus

[1] https://www.php.net/manual/en/xmlreader.isvalid.php

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php



Re: [PHP-DEV] RFC json_validate() - status: Under Discussion

2022-08-26 Thread Tim Düsterhus

Hi

On 8/26/22 11:14, Hans Henrik Bergan wrote:

you can't efficiently validate JSON in userland


Has anyone actually put that claim to the test? Has anyone actually made a
userland json validator (not just wrap json_decode()/json_last_error()) for
performance comparison?
( if not, https://www.json.org/JSON_checker/JSON_checker.c  would probably
be a good start)



Worded like "you can't efficiently" the claim is false. Of course you 
can memory-efficiently validate the input by traversing the string byte 
by byte and keeping track of the nesting.


However the points that make a userland implementation infeasible are:

1. Writing a JSON parser is non-trivial as evidenced by: 
https://github.com/nst/JSONTestSuite. I expect userland implementations 
to be subtly buggy in edge cases. The JSON parser in PHP 7.0+ is 
certainly more battle-tested and in fact it appears to pass all of the 
tests in the linked test suite.


2. Even if the userland implementation is written very carefully, it 
might behave differently than the native implementation used by 
json_decode() (e.g. because the latter is buggy for some reason or 
because the correct behavior is undefined). This would imply that an 
input string that was successfully validated by your userland parser 
might ultimately fail to parse when passed to json_decode(). This is 
exactly what you don't want to happen.


Best regards
Tim Düsterhus

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php



Re: [PHP-DEV] RFC json_validate() - status: Under Discussion

2022-08-26 Thread Tim Düsterhus

Hi

On 8/26/22 10:56, Paul Crovella wrote:

I'm for this as well. Though something more than a boolean is useful to
indicate when the max depth has been exceeded as that's not strictly a
problem with the json's validity.



Yes, it's not strictly invalid JSON, but I expect json_validate() to be 
used as a check to mean "will json_decode() be able to decode the JSON 
with these parameters". Then `false` is the logical result when 
exceeding the depth.


Also json_last_error() will do what you expect it to do: 
https://github.com/php/php-src/pull/9399/files#diff-1cdb6c7eb68aa60b323b056a06d6f7427b38e9d8d967ce97992129ff65c7951eR55. 
So you can find out the reason why the JSON is invalid.


Best regards
Tim Düsterhus

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php



Re: [PHP-DEV] RFC json_validate() - status: Under Discussion

2022-08-26 Thread juan carlos morales
El vie, 26 ago 2022 a las 11:55, Peter Bowyer
() escribió:
>
> That's a good point which I had overlooked. Does an exception make most
> sense when this happens?
>
> Peter

I just got a code review from Derick Rethans explaining me that a
validation routine should not throw an exception.

I think is a good idea to take it out, after all , the depth parameter
is part of the validation too.

RFC: https://wiki.php.net/rfc/json_validate
Implementation: https://github.com/php/php-src/pull/9399

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php



Re: [PHP-DEV] RFC json_validate() - status: Under Discussion

2022-08-26 Thread Peter Bowyer
On Fri, 26 Aug 2022 at 09:56, Paul Crovella  wrote:

> I'm for this as well. Though something more than a boolean is useful to
> indicate when the max depth has been exceeded as that's not strictly a
> problem with the json's validity.
>

That's a good point which I had overlooked. Does an exception make most
sense when this happens?

Peter


Re: [PHP-DEV] RFC json_validate() - status: Under Discussion

2022-08-26 Thread juan carlos morales
El vie, 26 ago 2022 a las 11:43, Andreas Leathley
() escribió:
>
> On 26.08.22 11:00, Michał Marcin Brzuchalski wrote:
>
> There is already a way to validate XML in PHP, and Yaml or PHP is
> something within the control of a PHP programmer, while JSON is mostly
> used as a format for communication in APIs, so you never know what you
> get. If with a new function it becomes much easier to defend against a
> Denial-of-Service attack for some parts of a JSON API, then this can be
> a good addition just for security reasons.
>
> But this reason, which most resonates with me, is currently missing in
> the RFC, so I would suggest to add that fast / efficient validation of a
> common communication format reduces the attack surface for
> Denial-of-Service attacks.
>
> --
> PHP Internals - PHP Runtime Development Mailing List
> To unsubscribe, visit: https://www.php.net/unsub.php
>

For sure I will add this. Thanks a lot !! That is exactly why we
are having this discussion. Once again, Thanks!

RFC: https://wiki.php.net/rfc/json_validate

Implementation: https://github.com/php/php-src/pull/9399

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php



Re: [PHP-DEV] RFC json_validate() - status: Under Discussion

2022-08-26 Thread juan carlos morales
El vie, 26 ago 2022 a las 11:35, Michał Marcin Brzuchalski
() escribió:
> Your examples are a couple of functions.
> Assuming that they're heavily used is as true as my assumptions.
>
> Cheers,

Is good you clarify that your numbers were assumptions.

By the way, I never said anything about how my examples were use in
real life, I just provided them as examples of a "need"/requirement to
do something.

Thanks for participating.

Cheers.

RFC: https://wiki.php.net/rfc/json_validate
Implementation: https://github.com/php/php-src/pull/9399

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php



Re: [PHP-DEV] RFC json_validate() - status: Under Discussion

2022-08-26 Thread Andreas Leathley

On 26.08.22 11:00, Michał Marcin Brzuchalski wrote:

A `json_decode()` is a substitute that IMO solves 99% of use cases.
If I'd follow your logic and accept every small addition that handles 1% of
use cases, somebody will raise another RFC
for simplexml_validate_string or yaml_validate and the next
PhpToken::validate.
All above can be valid if we trust that people normally validate 300MB
payloads to do nothing if they DON'T fail and there is nothing strange
about that.


There is already a way to validate XML in PHP, and Yaml or PHP is
something within the control of a PHP programmer, while JSON is mostly
used as a format for communication in APIs, so you never know what you
get. If with a new function it becomes much easier to defend against a
Denial-of-Service attack for some parts of a JSON API, then this can be
a good addition just for security reasons.

But this reason, which most resonates with me, is currently missing in
the RFC, so I would suggest to add that fast / efficient validation of a
common communication format reduces the attack surface for
Denial-of-Service attacks.

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php



Re: [PHP-DEV] RFC json_validate() - status: Under Discussion

2022-08-26 Thread Michał Marcin Brzuchalski
Hi Juan,

pt., 26 sie 2022 o 11:26 juan carlos morales 
napisał(a):

> El vie, 26 ago 2022 a las 11:00, Michał Marcin Brzuchalski
> () escribió:
> >
> > A `json_decode()` is a substitute that IMO solves 99% of use cases.
> > If I'd follow your logic and accept every small addition that handles 1%
> of use cases, somebody will raise another RFC
> > for simplexml_validate_string or yaml_validate and the next
> PhpToken::validate.
> > All above can be valid if we trust that people normally validate 300MB
> payloads to do nothing if they DON'T fail and there is nothing strange
> about that.
> >
> > Cheers,
> >
>
> How can you make such an assertion in those numbers (99% of use cases
> and son on, that you mention) ? Can you give us more information about
> this assertions?
>
> I have provide real examples where the need to validate-only a
> json-string is actually needed, also the need from our developers
> community asking for this.
>

Your examples are a couple of functions.
Assuming that they're heavily used is as true as my assumptions.

Cheers,


Re: [PHP-DEV] RFC json_validate() - status: Under Discussion

2022-08-26 Thread juan carlos morales
El vie, 26 ago 2022 a las 11:26, juan carlos morales
() escribió:
>
> How can you make such an assertion in those numbers (99% of use cases
> and son on, that you mention) ? Can you give us more information about
> this assertions?
>
> I have provide real examples where the need to validate-only a
> json-string is actually needed, also the need from our developers
> community asking for this.
>
> RFC: https://wiki.php.net/rfc/json_validate
>
> Implementation: https://github.com/php/php-src/pull/9399

By the way you dont need to frame this function to handle 300MB of
json-string, as I already mentioned a couple of times, that a 2 MB
json-sring validation with json_decode() needs a large amount of
memory to be validated, possibly hitting the allowed memory limit,
which usually is not that high at all.

RFC: https://wiki.php.net/rfc/json_validate
Implementation: https://github.com/php/php-src/pull/9399

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php



Re: [PHP-DEV] RFC json_validate() - status: Under Discussion

2022-08-26 Thread juan carlos morales
El vie, 26 ago 2022 a las 11:00, Michał Marcin Brzuchalski
() escribió:
>
> A `json_decode()` is a substitute that IMO solves 99% of use cases.
> If I'd follow your logic and accept every small addition that handles 1% of 
> use cases, somebody will raise another RFC
> for simplexml_validate_string or yaml_validate and the next 
> PhpToken::validate.
> All above can be valid if we trust that people normally validate 300MB 
> payloads to do nothing if they DON'T fail and there is nothing strange about 
> that.
>
> Cheers,
>

How can you make such an assertion in those numbers (99% of use cases
and son on, that you mention) ? Can you give us more information about
this assertions?

I have provide real examples where the need to validate-only a
json-string is actually needed, also the need from our developers
community asking for this.

RFC: https://wiki.php.net/rfc/json_validate

Implementation: https://github.com/php/php-src/pull/9399

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php



Re: [PHP-DEV] RFC json_validate() - status: Under Discussion

2022-08-26 Thread juan carlos morales
El vie, 26 ago 2022 a las 10:56, Paul Crovella
() escribió:
>
> I'm for this as well. Though something more than a boolean is useful to
> indicate when the max depth has been exceeded as that's not strictly a
> problem with the json's validity.

Thanks, seems to be good idea to take it out, and I will. Nice observation.

RFC: https://wiki.php.net/rfc/json_validate

Implementation: https://github.com/php/php-src/pull/9399

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php



Re: [PHP-DEV] RFC json_validate() - status: Under Discussion

2022-08-26 Thread juan carlos morales
El vie, 26 ago 2022 a las 10:28, Peter Bowyer
() escribió:
>
> Hi Juan,
>
> On Thu, 25 Aug 2022 at 17:02, juan carlos morales 
>  wrote:
>>
>> RFC: https://wiki.php.net/rfc/json_validate
>
>
> Thanks for bringing forward this RFC. I am in favour of this change, as you 
> can't efficiently validate JSON in userland.
>
> Like Rowan I'm not convinced JSON_THROW_ON_ERROR belongs here. I can't think 
> of a case where more than a boolean response is needed.
>
> Peter

Seems that will be a good idea to take out the usage of the flag
JSON_THROW_ON_ERROR after all. That was a nice and objective critic
and I appreciate it. Thanks! I will take it out of the implementation.

Regards

RFC: https://wiki.php.net/rfc/json_validate

Implementation: https://github.com/php/php-src/pull/9399

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php



Re: [PHP-DEV] RFC json_validate() - status: Under Discussion

2022-08-26 Thread Hans Henrik Bergan
quote
> you can't efficiently validate JSON in userland

Has anyone actually put that claim to the test? Has anyone actually made a
userland json validator (not just wrap json_decode()/json_last_error()) for
performance comparison?
( if not, https://www.json.org/JSON_checker/JSON_checker.c  would probably
be a good start)

On Fri, 26 Aug 2022 at 11:00, Michał Marcin Brzuchalski <
michal.brzuchal...@gmail.com> wrote:

> Hi Dusk,
>
> pt., 26 sie 2022 o 08:17 Dusk  napisał(a):
>
> > On Aug 25, 2022, at 21:47, Michał Marcin Brzuchalski <
> > michal.brzuchal...@gmail.com> wrote:
> > > The same goes here and I'm not convinced we should introduce next small
> > function that can be simply implemented in user land.
> >
> > What "simple implementation in userland" do you have in mind? Can you
> > provide an example?
> >
> > json_decode() is not an acceptable substitute here -- as David Gebler has
> > observed, decoding a large JSON structure can have a significant impact
> on
> > memory usage, even if the data is immediately discarded. Any
> implementation
> > based on string processing, on the other hand, is likely to be
> dramatically
> > slower, and may have subtle differences in behavior from PHP's JSON
> parser.
>
>
> A `json_decode()` is a substitute that IMO solves 99% of use cases.
> If I'd follow your logic and accept every small addition that handles 1% of
> use cases, somebody will raise another RFC
> for simplexml_validate_string or yaml_validate and the next
> PhpToken::validate.
> All above can be valid if we trust that people normally validate 300MB
> payloads to do nothing if they DON'T fail and there is nothing strange
> about that.
>
> Cheers,
>


Re: [PHP-DEV] RFC json_validate() - status: Under Discussion

2022-08-26 Thread Michał Marcin Brzuchalski
Hi Dusk,

pt., 26 sie 2022 o 08:17 Dusk  napisał(a):

> On Aug 25, 2022, at 21:47, Michał Marcin Brzuchalski <
> michal.brzuchal...@gmail.com> wrote:
> > The same goes here and I'm not convinced we should introduce next small
> function that can be simply implemented in user land.
>
> What "simple implementation in userland" do you have in mind? Can you
> provide an example?
>
> json_decode() is not an acceptable substitute here -- as David Gebler has
> observed, decoding a large JSON structure can have a significant impact on
> memory usage, even if the data is immediately discarded. Any implementation
> based on string processing, on the other hand, is likely to be dramatically
> slower, and may have subtle differences in behavior from PHP's JSON parser.


A `json_decode()` is a substitute that IMO solves 99% of use cases.
If I'd follow your logic and accept every small addition that handles 1% of
use cases, somebody will raise another RFC
for simplexml_validate_string or yaml_validate and the next
PhpToken::validate.
All above can be valid if we trust that people normally validate 300MB
payloads to do nothing if they DON'T fail and there is nothing strange
about that.

Cheers,


Re: [PHP-DEV] RFC json_validate() - status: Under Discussion

2022-08-26 Thread Paul Crovella

On 8/26/2022 1:28 AM, Peter Bowyer wrote:

Hi Juan,

On Thu, 25 Aug 2022 at 17:02, juan carlos morales <
dev.juan.mora...@gmail.com> wrote:


RFC: https://wiki.php.net/rfc/json_validate



Thanks for bringing forward this RFC. I am in favour of this change, as you
can't efficiently validate JSON in userland.

Like Rowan I'm not convinced JSON_THROW_ON_ERROR belongs here. I can't
think of a case where more than a boolean response is needed.

Peter




I'm for this as well. Though something more than a boolean is useful to 
indicate when the max depth has been exceeded as that's not strictly a 
problem with the json's validity.



--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php



Re: [PHP-DEV] RFC json_validate() - status: Under Discussion

2022-08-26 Thread Peter Bowyer
Hi Juan,

On Thu, 25 Aug 2022 at 17:02, juan carlos morales <
dev.juan.mora...@gmail.com> wrote:

> RFC: https://wiki.php.net/rfc/json_validate


Thanks for bringing forward this RFC. I am in favour of this change, as you
can't efficiently validate JSON in userland.

Like Rowan I'm not convinced JSON_THROW_ON_ERROR belongs here. I can't
think of a case where more than a boolean response is needed.

Peter


Re: [PHP-DEV] RFC json_validate() - status: Under Discussion

2022-08-26 Thread juan carlos morales
El vie, 26 ago 2022 a las 6:47, Michał Marcin Brzuchalski
() escribió:
>
>
> I share the same opinion you expressed here even though you admit in recent 
> email that you changed your mind.
>
> In recent versions we tend to accept more and more small standard library 
> functions with IMO questionable argumentation. The same goes here and I'm not 
> convinced we should introduce next small function that can be simply 
> implemented in user land.

Sorry but I dont think that a JSON parser with memory usage zero (or
maybe a few bytes) can be simply done in the userland. This function
(json_validate) is small by itself, but gives you access to the JSON
parser. Can you please provide an example of what you commented?

> Any example testing > 3MB JSON string are for me edge cases that normally 
> don't happen often to deserve special treatment.

I don't agree with your definition of "edge case" here, as edge cases
depend/belong on/to the system under analysis.

By the way, the test case provided in the PR , "test 005" uses a
json-string of about 3 MB (maybe 3.1) , and in order to decode it
json_decode() needs something around 109 MB of memory. For me,
validation can be done in a better and efficient way, like it has been
probe with this proposal.

> If we keep the tendency to pollute already bloated standard library with an 
> army of small functions that could have not exists and be replaced with 
> normal PHP counterparts IMHO we'll end with frustration from developers as I 
> believe DX slowly falls down here.

Last but not least, I want to say that the function json_validate() is
small, easy to maintain and extend if needed, and at the same gives us
access to something that is not trivial to write in userland  the
existing JSON parser.


RFC: https://wiki.php.net/rfc/json_validate

Implementation: https://github.com/php/php-src/pull/9399

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php



Re: [PHP-DEV] RFC json_validate() - status: Under Discussion

2022-08-26 Thread Dusk
On Aug 25, 2022, at 21:47, Michał Marcin Brzuchalski 
 wrote:
> The same goes here and I'm not convinced we should introduce next small 
> function that can be simply implemented in user land.

What "simple implementation in userland" do you have in mind? Can you provide 
an example?

json_decode() is not an acceptable substitute here -- as David Gebler has 
observed, decoding a large JSON structure can have a significant impact on 
memory usage, even if the data is immediately discarded. Any implementation 
based on string processing, on the other hand, is likely to be dramatically 
slower, and may have subtle differences in behavior from PHP's JSON parser.
--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php



Re: [PHP-DEV] RFC json_validate() - status: Under Discussion

2022-08-25 Thread Michał Marcin Brzuchalski
czw., 25 sie 2022, 21:12 użytkownik David Gebler 
napisał:

> I'm not a voter on RFCs so my input may be largely irrelevant here but for
> discussion purposes:
>
> I remain unconvinced regarding the justification for this proposal. I'm not
> saying there's a strong reason to NOT implement it, but I'm not convinced
> it's really going to be a significant benefit to many people at all.
>
> I agree that the number of userland implementations for a "is_valid_json"
> type function including in some widely used frameworks and systems
> indicates there's some degree of demand in the ecosystem for validating a
> JSON string.
>
> But the more salient question is whether there is a significant demand for
> whatever memory and speed benefit the implementation of a new core ext_json
> function delivers; that is, has it been established that the use of
> json_decode or common userland solutions are in practice not good enough?
>
> There are many examples of userland code which could be faster and more
> memory efficient if they were written in C and compiled in, so the mere
> fact this proposal may introduce a somewhat faster way of validating a JSON
> string over decoding it is not necessarily a sufficient reason to include
> it.
>
> Are there are examples of raising issues for frameworks or systems saying
> they need to validate some JSON but the only existing solutions available
> to them are causing memory limit errors, or taking too long? The Stack
> Overflow question linked on the RFC says "I need a really, really fast
> method of checking if a string is JSON or not."
>
> "Really, really fast" is subjective. No context or further information is
> given about what that person would regard as an acceptable time to validate
> what size blob of valid or invalid JSON, or why. Indeed that same page
> offers a userland solution based around only going to json_decode if some
> other much simpler checks on the input are indeterminate for validation
> purposes. Haven't tested it personally but no doubt in the vast majority of
> cases it is sufficiently performant.
>
> In most real world use cases [that I've encountered over the years] JSON
> blobs tend to be quite small. I have dealt with much, much larger JSON
> blobs, up to a few hundred MB, and in those cases I've used a streaming
> parser. If you're talking about JSON that size, a streaming parser is the
> only realistic answer - you probably don't want to drop a 300MB string in
> to this RFC's new function either, if performance and memory efficiency is
> your concern.
>
> So I'm curious as to whether a real world example can be given where the
> efficiency difference between json_decode and a new json_validate function
> would be important to the system, whether anyone's encountered a scenario
> where this would have made a real difference to them.
>

I share the same opinion you expressed here even though you admit in recent
email that you changed your mind.

In recent versions we tend to accept more and more small standard library
functions with IMO questionable argumentation. The same goes here and I'm
not convinced we should introduce next small function that can be simply
implemented in user land.

Any example testing > 3MB JSON string are for me edge cases that normally
don't happen often to deserve special treatment.

If we keep the tendency to pollute already bloated standard library with an
army of small functions that could have not exists and be replaced with
normal PHP counterparts IMHO we'll end with frustration from developers as
I believe DX slowly falls down here.

Cheers,
Michał Marcin Brzuchalski

>


Re: [PHP-DEV] RFC json_validate() - status: Under Discussion

2022-08-25 Thread juan carlos morales
> Having actually compiled the branch and tried it out, I have to say 
> regardless of whether validating arbitrarily large blocks of JSON without 
> being interested in the contents is a common or more niche use case, the 
> memory savings ARE highly impressive. I had thought that because the function 
> was built on top of the existing parser and is still parsing the entire 
> string (or up until invalid JSON is encountered), the performance saving for 
> a very large input would be smaller than it is.
>
> I tested using a 75MB valid JSON input - a string large enough that it's not 
> going to be very common. The processing time isn't hugely different, the 
> saving appears to be around maybe 20-25% (and it's not a significant amount 
> of time using either json_decode or json_validate, even on an input of this 
> size, about half a second on my machine for both). But the memory saving is 
> enormous, almost total. Gone from needing ~5x the size of the input to almost 
> literally just a few extra bytes.
>
> I'm persuaded now on both that benchmarking and having had a closer look at 
> the implementation PR, which is clearly a minimal and easily maintainable 
> change.
>
> As I've said, my feelings are irrelevant to the extent I'm not a voter, but I 
> am in principle a +1 thumbs up for including this now.
>
>

David, thanks very much for sharing your results and opinion.

RFC: https://wiki.php.net/rfc/json_validate

Implementation: https://github.com/php/php-src/pull/9399

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php



Re: [PHP-DEV] RFC json_validate() - status: Under Discussion

2022-08-25 Thread David Gebler
Having actually compiled the branch and tried it out, I have to say
regardless of whether validating arbitrarily large blocks of JSON without
being interested in the contents is a common or more niche use case, the
memory savings ARE highly impressive. I had thought that because the
function was built on top of the existing parser and is still parsing the
entire string (or up until invalid JSON is encountered), the performance
saving for a very large input would be smaller than it is.

I tested using a 75MB valid JSON input - a string large enough that it's
not going to be very common. The processing time isn't hugely different,
the saving appears to be around maybe 20-25% (and it's not a significant
amount of time using either json_decode or json_validate, even on an input
of this size, about half a second on my machine for both). But the memory
saving is enormous, almost total. Gone from needing ~5x the size of the
input to almost literally just a few extra bytes.

I'm persuaded now on both that benchmarking and having had a closer look at
the implementation PR, which is clearly a minimal and easily maintainable
change.

As I've said, my feelings are irrelevant to the extent I'm not a voter, but
I am in principle a +1 thumbs up for including this now.


Re: [PHP-DEV] RFC json_validate() - status: Under Discussion

2022-08-25 Thread juan carlos morales
I also added in the RFC your concern about JSON_THROW_ON_ERROR usage.

RFC: https://wiki.php.net/rfc/json_validate

Implementation: https://github.com/php/php-src/pull/9399

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php



Re: [PHP-DEV] RFC json_validate() - status: Under Discussion

2022-08-25 Thread juan carlos morales
Ok then ... I take it out of the RFC just because we have doubts about it.

RFC: https://wiki.php.net/rfc/json_validate

Implementation: https://github.com/php/php-src/pull/9399

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php



Re: [PHP-DEV] RFC json_validate() - status: Under Discussion

2022-08-25 Thread Rowan Tommins
[Replying to list - make sure to hit "Reply List" or "Reply All", not 
"Reply" in your mail client]


On 25/08/2022 21:59, juan carlos morales wrote:

Hello Rowan thanks for participating.

My answer for your comments would be:

=== Regarding JSON_THROW_ON_ERROR ===
I think is a valid point from your side, and I am open to change the 
implementation if the maintainers believe so. I leave it there as it 
existed in json_decode() and wanted to give the developer the option 
to use it or not. But as I said, you hae a valid point, that I also like.



Just to clarify, "the maintainers" of PHP are basically everyone on this 
list, trying to reach a consensus; there isn't really any core group 
that makes the final decision. There is a binding vote at the end of the 
RFC process, but it's in principle just a way to measure that consensus.





=== Regarding the examples you mentioned ===
The one from humhub and Symfony http-kernel, yes you are right, I will 
take them out from the RFC right now. Nice catch.


But the Magento one, I took a look again, and I still believe is right 
example, because the only reason is using json_decode() for, is to 
validate only. Am I missing something here?



It's not entirely clear *what* that Magento code is doing, but it's 
definitely not just validation: the output of json_decode is passed to 
$this->_jsonEncoder->encode(); if it was just validating, it would 
return $input, or true. My guess is that like the Symfony example it is 
"pretty-printing" existing JSON strings.



Regards,

--
Rowan Tommins
[IMSoP]

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php



Re: [PHP-DEV] RFC json_validate() - status: Under Discussion

2022-08-25 Thread juan carlos morales
Hey Tim, thanks a lot for participating and taking your time to use
the code and making your own tests, I highly appreciate it.

Regards.

RFC: https://wiki.php.net/rfc/json_validate

Implementation: https://github.com/php/php-src/pull/9399

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php



Re: [PHP-DEV] RFC json_validate() - status: Under Discussion

2022-08-25 Thread juan carlos morales
Hello Rowan thanks for participating.

My answer for your comments would be:

=== Regarding JSON_THROW_ON_ERROR ===
I think is a valid point from your side, and I am open to change the
implementation if the maintainers believe so. I leave it there as it
existed in json_decode() and wanted to give the developer the option to use
it or not. But as I said, you hae a valid point, that I also like.

=== Regarding the examples you mentioned ===
The one from humhub and Symfony http-kernel, yes you are right, I will take
them out from the RFC right now. Nice catch.

But the Magento one, I took a look again, and I still believe is right
example, because the only reason is using json_decode() for, is to validate
only. Am I missing something here?

Regards.

RFC: https://wiki.php.net/rfc/json_validate

Implementation: https://github.com/php/php-src/pull/9399


Re: [PHP-DEV] RFC json_validate() - status: Under Discussion

2022-08-25 Thread Rowan Tommins

On 25/08/2022 17:02, juan carlos morales wrote:

I am glad to present to you the RFC for json_validate() function.

RFC: https://wiki.php.net/rfc/json_validate



Hi,

Leaving aside the overall question of inclusion and looking at details, 
I'm not convinced JSON_THROW_ON_ERROR belongs here.


With json_decode, throwing an exception is a useful feature because a) 
there is no natural value to signal an error, false and null both being 
valid results; and b) most code is written expecting the input to be 
valid, so failure is an "exceptional state" that needs to break out of 
normal flow.


With json_validate, neither reason applies: there is no ambiguity in 
what false means, and no reason to write code that calls json_validate 
but doesn't check that return value.



On a different note, I notice that several of your examples are not 
actually checking the validity of JSON at all, they are parsing it in 
order to re-serialize it - the Magento getJSONString, the Symfony 
getPrettyJson, and the humhub actionIndex functions are all directly 
using the output of json_decode, not just reporting whether it succeeded.


There's still plenty of other examples that make your point, but those 
three should probably be removed.



Regards,

--
Rowan Tommins
[IMSoP]

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php



Re: [PHP-DEV] RFC json_validate() - status: Under Discussion

2022-08-25 Thread Tim Düsterhus

Hi

On 8/25/22 21:11, David Gebler wrote:

There are many examples of userland code which could be faster and more
memory efficient if they were written in C and compiled in, so the mere
fact this proposal may introduce a somewhat faster way of validating a JSON
string over decoding it is not necessarily a sufficient reason to include
it.

Are there are examples of raising issues for frameworks or systems saying
they need to validate some JSON but the only existing solutions available
to them are causing memory limit errors, or taking too long? The Stack
Overflow question linked on the RFC says "I need a really, really fast
method of checking if a string is JSON or not."


The proposed function is meant to be used for validation. Validation 
processes by definition need to deal with untrusted data. So the input 
data might even be actively malicious in order to tie up resources on 
the server (DoS attack - single D there).



In most real world use cases [that I've encountered over the years] JSON
blobs tend to be quite small. I have dealt with much, much larger JSON


Yes well-formed JSON from a trusted source tends to be small-ish. But a 
validation function also needs to deal with non-well-formed JSON, 
otherwise you would not need to validate it.


I was able to use up an extra 100 MB of RAM with a 3 MB input that is 
invalid JSON when using json_decode(), just for it to reject the input.


For json_validate() the extra memory (as per memory_get_peak_usage()) 
required for the same operation effectively zero. It was able to deal 
with 60 MB of input just fine.


I've attached the script I used for the test. I left out the actual JSON 
string to not give script kiddies a loaded weapon, but you likely should 
be able to craft some input yourself.



blobs, up to a few hundred MB, and in those cases I've used a streaming
parser. If you're talking about JSON that size, a streaming parser is the
only realistic answer - you probably don't want to drop a 300MB string in
to this RFC's new function either, if performance and memory efficiency is
your concern.

So I'm curious as to whether a real world example can be given where the
efficiency difference between json_decode and a new json_validate function
would be important to the system, whether anyone's encountered a scenario
where this would have made a real difference to them.



While my example is not a real world example, I don't believe it's a 
stretch to say it can be applied as-is to the real world.


So IMO:

- The proposed function does exactly what it promises to do, not more, 
not less.
- If it's introduced, then it is going to be the obvious choice for JSON 
validation and at the same time it is going to be the best choice for 
JSON validation. I strongly believe it is a good thing if users are 
steered to make the correct choice by default without needing to invest 
brain cycles.
- The patch is pretty small, because the hard work of JSON parsing is 
already implemented.
- Userland implementations are non-obvious and non-trivial as evidenced 
by the examples in the RFC: They are all slightly different and one of 
them even mishandles a plain `false` input, because it does not check 
json_last_error().
- Userland implementations are also either less efficient (relying on 
json_decode()) or potentially inconsistent (hand-rolling a validating 
parser).


Best regards
Tim Düsterhus<>

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php

Re: [PHP-DEV] RFC json_validate() - status: Under Discussion

2022-08-25 Thread juan carlos morales
Thanks for participating. Cheers.

RFC: https://wiki.php.net/rfc/json_validate

Implementation: https://github.com/php/php-src/pull/9399


Re: [PHP-DEV] RFC json_validate() - status: Under Discussion

2022-08-25 Thread juan carlos morales
Hello David, thanks for your feedback.

I believe that the answer to most of your concerns/questions related to
other projects that would benefit out of something like this is already
answered by me in the RFC. I mentioned major PHP projects that will
directly benefit out of something like this.

About the demand, well, that is also being written in the RFC.

Regarding the question in StackOverflow, was just mentioned to expose the
"demand" of validating a json string, and is one of the most ranked
questions, is in the top 10 of questions with php && json tags. I believe
that such a thing is talking by iself.

Providing real use cases wont make a difference, because the number of
examples are infinite honestly.

Yes there are userland packages that can do this for us, but in big
products on big companies, relying on 3rd party packages is not always
cool, I have being experiencing that in my company actually.

Regarding examples of raising issues ... there are a LOT in github, and the
only possible solution is always to increase the memory limit or use a 3rd
party package (which as I said, is not always suitable).

But ... lets be objective, or try to be at least ... PHP has a parser
already, this functionality is just allowing us to use it without the need
to consume memory.

I see it useful, actually, this RFC is the result of a real need we have in
my current company at the moment, and after digging around the internet I
found hat we were not the only ones.

Thanks for taking our time on this, I appreciate it.


Re: [PHP-DEV] RFC json_validate() - status: Under Discussion

2022-08-25 Thread Deleu
On Thu, Aug 25, 2022, 9:12 PM David Gebler  wrote:

>
> I agree that the number of userland implementations for a "is_valid_json"
> type function including in some widely used frameworks and systems
> indicates there's some degree of demand in the ecosystem for validating a
> JSON string.
>

I don't have an option about the rest of your email, but I wanted to point
out that the vast userland need for validating JSON right now is being
"hacked" for lack of a better functionality. Whether it's justifiable to
bring this to core is up for debate, but I at least take it as a
valid point that "asking a computer to do a task just to check whether it
failed" is odd and limiting to say the least

>


Re: [PHP-DEV] RFC json_validate() - status: Under Discussion

2022-08-25 Thread David Gebler
I'm not a voter on RFCs so my input may be largely irrelevant here but for
discussion purposes:

I remain unconvinced regarding the justification for this proposal. I'm not
saying there's a strong reason to NOT implement it, but I'm not convinced
it's really going to be a significant benefit to many people at all.

I agree that the number of userland implementations for a "is_valid_json"
type function including in some widely used frameworks and systems
indicates there's some degree of demand in the ecosystem for validating a
JSON string.

But the more salient question is whether there is a significant demand for
whatever memory and speed benefit the implementation of a new core ext_json
function delivers; that is, has it been established that the use of
json_decode or common userland solutions are in practice not good enough?

There are many examples of userland code which could be faster and more
memory efficient if they were written in C and compiled in, so the mere
fact this proposal may introduce a somewhat faster way of validating a JSON
string over decoding it is not necessarily a sufficient reason to include
it.

Are there are examples of raising issues for frameworks or systems saying
they need to validate some JSON but the only existing solutions available
to them are causing memory limit errors, or taking too long? The Stack
Overflow question linked on the RFC says "I need a really, really fast
method of checking if a string is JSON or not."

"Really, really fast" is subjective. No context or further information is
given about what that person would regard as an acceptable time to validate
what size blob of valid or invalid JSON, or why. Indeed that same page
offers a userland solution based around only going to json_decode if some
other much simpler checks on the input are indeterminate for validation
purposes. Haven't tested it personally but no doubt in the vast majority of
cases it is sufficiently performant.

In most real world use cases [that I've encountered over the years] JSON
blobs tend to be quite small. I have dealt with much, much larger JSON
blobs, up to a few hundred MB, and in those cases I've used a streaming
parser. If you're talking about JSON that size, a streaming parser is the
only realistic answer - you probably don't want to drop a 300MB string in
to this RFC's new function either, if performance and memory efficiency is
your concern.

So I'm curious as to whether a real world example can be given where the
efficiency difference between json_decode and a new json_validate function
would be important to the system, whether anyone's encountered a scenario
where this would have made a real difference to them.