Hi Yasuo,

> On 9 Sep 2016, at 06:12, Yasuo Ohgaki <yohg...@ohgaki.net> wrote:
> 
> On Thu, Sep 8, 2016 at 9:13 PM, Stephen Reay <php-li...@koalephant.com> wrote:
>>> On 8 Sep 2016, at 17:49, Yasuo Ohgaki <yohg...@ohgaki.net> wrote:
>>> 
>>> Hi Stephen,
>>> 
>>> On Thu, Sep 8, 2016 at 7:34 PM, Stephen Reay <php-li...@koalephant.com> 
>>> wrote:
>>>> Adding a bunch of new functions is IMO the wrong approach to this type of 
>>>> thing.
>>>> The existing filter_var/filter_input infrastructure works well, if you 
>>>> want to define more rules I would definitely encourage building 
>>>> on/improving that system not adding a bunch of extra functions.
>>> 
>>> Do you really think filter module works well as optimal validator?
>> 
>> It’s not perfect, but nothing is. As I said, I believe the issues can 
>> largely be resolved by building on the existing functionality.
>> 
>>> It cannot enforce even whitelisting well…
>> 
>> VALIDATE_INT already accepts $max and $min options. Those options could be 
>> applied to VALIDATE_FLOAT, and $charset, $accepted_chars, $max_len, $min_len 
>> could be implemented on a new VALIDATE_STRING filter.
> 
> But it trims input because it is filter based. For example, int input like
> '                                                 1234                      '
> must be invalid with whitelisting approach, but it's allowed with
> current implementation.
> 
>> 
>> I understand the use-case for multiple validation per input, and for 
>> validating multiple inputs, but frankly the way this implements that is both 
>> confusing to use, and has a less than ideal error-mode.
>> 
> 
> I agree.
> As filter, it works well mostly. As validator, it's unuseable due to
> filter like behavior/non whitelisting behaviours.

As you mentioned, there isn’t *much* space left for new flags (about 8 slots 
left by my count?) but surely a FILTER_FLAG_ALLOW_WHITESPACE could be utilised 
by multiple filters to achieve the same thing, while still allowing for legacy 
behaviour.

>> The “filter spec” input is an array of arrays of arrays, most of which will 
>> also contain an array for ‘options’. To me that’s getting dangerously close 
>> to JavaScript’s callback hell for impossible to read code.
>> 
> I can understand your concern. Issue would be callback validator, but
> callback nesting would not be needed unlike JS callback hell. Since
> it's simple array, content/rule can be viewed easily also.
> 

Sorry, maybe I wasn’t clear. My issue wasn’t with the callback filter 
specifically. Callback hell in Javascript is the issue where you have a lot of 
nested callbacks, which can make the code quite difficult to read, even with 
indenting, matching brace highlighting etc.

My concern here is that you’d  have a very deeply nested data structure 
essentially to avoid a user-space loop.

>> The error mode is also not ideal in a real world use case in my opinion. If 
>> I am validating a dozen input fields, I do *not* want to know just the first 
>> one that failed. Can you imagine using a web form that made you submit 12 
>> times to tell you each time you got a field wrong, rather than trying to 
>> validate them ALL and telling you ALL the errors at once?
> 
> You can omit validation where you would like. So your concern wouldn't
> be problem. (I strongly suggest  to validate all inputs for all entry
> points, though)

Again, I apologise, maybe I wasn’t clear.

Let’s say I expect to get 5 posted parameters (from a form, a direct http api 
call, etc). 3 are required, of those 1 must be an email, and the other two must 
match (e.g. password/confirm) and have a custom callback to match (e.g. to 
prevent ‘aaaaaa’). the other two are optional, but have a maximum length of 45 
characters each.

Now, I want to validate all those fields, and give meaningful error messages 
back to the user.

With your proposal, if they have made provided somehow invalid data in all five 
fields, they will have to make at least 6 http requests (assuming each time 
they get an error response they manage to fix that field on their first 
re-try). 

I can definitely see benefit in allowing exceptions when validation fails. I 
use a validation system that does something similar internally, but it includes 
catches so effectively, validating an request with 5 inputs will potentially 
give you a single exception, which then gives access to the error encountered 
for each input (which is also expressed as an exception)



> 
>> 
>> Personally I think a better approach is:
>> 1. improve/adding to the filters available, and if desired, add extra 
>> flags/options e.g, to throw an exception on failure (which, btw was 
>> requested via bugs.php.net 6 years ago), to set min/max values for 
>> FILTER_VALIDATE_FLOAT, etc.
>> 
>> 2a. Leave the multiple rules per input to userland (e.g. dev uses foreach, 
>> array_walk, etc on a rules array or what have you)
>> 2b. *maybe* add an alternative to filter_(input/var)_array where it’s 1 
>> input and multiple rules, e.g. filter_(input|var)_multiple
>> 
>> If you wanted to follow 2b, I’d suggest perhaps tackling it as a separate 
>> RFC - improving *what* can be validated isn’t necessarily tied to *how* you 
>> define what you want validated.
> 
> Thank you for the suggestion. There are issues this approach.
> 
> 1. Existing validator does not perform strict validations. We need new
> validator rules. e.g.
> 
> FILTER_VALIDATE_STRICT_FLOAT
>  - Do not allow white space prefix/postfix, raise exception.
> FILTER_VALIDATE_STRICT_INT
>  - Do not allow white space prefix/postfix, raise exception.
> 
> Having
> FILTER_VALIDATE_STRICT_FLOAT
> and
> FILTER_VALIDATE_FLOAT
> would be problematic.


As I mentioned above, I would imagine a single new flag could be used to make 
“all” filters have the strict behaviour. Presumably FILTER_VALIDATE_BOOLEAN 
should treat ‘    true    ‘ as an error too?

> 
> Current filter functions fallback to FILTER_UNSAFE_RAW when something
> goes wrong which is unacceptable for validator.

I’m not sure I understand this. What do you mean by “when something goes 
wrong”. Can you elaborate on that?

> 
> 2a. Letting user to use foreach will results in the same, at least
> similar, rule definition array. Please take a look at the last reply
> to Rowan's post.
> 
> 2b. This sounds good. I should think about implementation. 2a issue
> that result in the same or similar array definition remains though.

Not necessarily. While they *could* be in the format you specified, they could 
also be defined individually or defined inline.


I really believe that combining multiple fields *and* multiple rule definitions 
complicates things, a *lot*.
Making improvements to the filter system but only for this new “complicated” 
function feels like it goes against your intent of trying to make it easier for 
developers to make safe(r) applications.


Cheers

Stephen

> 
> Thank you for suggestions!
> 
> Regards,
> 
> --
> Yasuo Ohgaki
> yohg...@ohgaki.net
> 
> --
> PHP Internals - PHP Runtime Development Mailing List
> To unsubscribe, visit: http://www.php.net/unsub.php
> 


--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php

Reply via email to