Hello Vasilii,

It’s okay to have a different opinion, I hope.

You are missing an important point here - besides my comments, the way this
is currently developed causes confusion.

It would be great if you shared your experience on this matter.

Regards,
Dimitar

On Sun, 2 Oct 2022 at 9:31, Vasilii Shpilchin <vasilii.b.shpilc...@gmail.com>
wrote:

> All right, if you have been writing PHP for 25 years, you have noticed that
> PHP was always about high-level, web-focused functionality out of the box.
> This is one of the basic benefits of PHP over other general-purpose
> languages, where you can write everything you want but also have to write
> it, since the language itself is very basic. I'm for PHP keeping built-in
> solutions for the most common problems in the context of the web. I say
> this having passed the ZCE exam and having written PHP for just 15 years.
>
> On Sun, Oct 2, 2022, 2:19 AM Lokrain <lokr...@gmail.com> wrote:
>
>> Hello Kamil,
>>
>> I believe that PHP should not try to act as a “framework” that provides you
>> with ready solutions for such cases.
>>
>> Being able to actually modify the default behaviour of some functions
>> through the ini .. is even scarier.
>>
>> In 25 years of writing PHP, I have never relied on this “magic” for security :)
>>
>> Regards,
>> Dimitar
>>
>> On Sat, 1 Oct 2022 at 18:39, Kamil Tekiela <tekiela...@gmail.com> wrote:
>>
>> > Hi Internals,
>> >
>> > For quite some time now, PHP's sanitize filters have "Rustled My Jimmies".
>> > These filters bother me because I can't really justify their existence. I
>> > can understand that a few of them are sensible and may come in handy, but I
>> > would like to talk about some of these in particular.
>> >
>> > In PHP 8.1, we have deprecated FILTER_SANITIZE_STRING which I deemed to be
>> > a priority due to its confusing name and behaviour. The rest is slightly
>> > less dangerous, but as was pointed out to me in a recent conversation with
>> > a PHP developer, these filters are all very confusing.
>> >
>> > I would like to have some opinions on the following filters. What do you
>> > think we should do with them? Deprecate? Fix? Provide better documentation?
>> >
>> > ---
>> >
>> > *FILTER_SANITIZE_ENCODED *- "URL-encode string, optionally strip or encode
>> > special characters."
>> > Now, what does that mean? PHP has two functions for URL encoding: urlencode
>> > used for encoding query-string parts, and rawurlencode used for encoding
>> > any other URL part (two different RFCs are followed by these functions).
>> > Which of these RFCs is applied in this filter? Furthermore, the description
>> > says that "special characters" can be stripped or encoded. Is one of these
>> > actions the default and the other can be selected by a flag or are both
>> > optional? What are these special characters? Are they special in the
>> > context of URL? If so, why did we encode them first? If these are HTML
>> > special characters (there's no single definition of special HTML chars),
>> > then why does this filter encode them if the filter is for URL
>> > sanitization? What does backtick have to do with any of this
>> > (FILTER_FLAG_STRIP_BACKTICK)?
>> >
>> > *FILTER_SANITIZE_ADD_SLASHES *- "Apply addslashes(). (Available as of PHP
>> > 7.3.0)"
>> > This filter was added as a replacement for the magic_quotes filter.
>> > According to the PHP documentation, addslashes is supposed to be used when
>> > injecting PHP variables into an eval'd string. Real life showed that this
>> > function is used in a lot of places that have nothing to do with PHP's
>> > eval. I am not sure if the sanitize filter is misused in a similar fashion,
>> > but judging from the fact that it was meant as a replacement for
>> > magic_quotes, my guess is that it's very likely still abused.
>> >
>> > *FILTER_SANITIZE_EMAIL *- "Remove all characters except letters, digits and
>> > !#$%&'*+-=?^_`{|}~@.[]."
>> > Which RFC does this adhere to? It strips slashes and quoted parts, doesn't
>> > allow IPv6 addresses and doesn't accept RFC 6530 email addresses. This
>> > filter is ok for simple usage, but it isn't true to any known specification
>> > AFAIK.
>> >
>> > *FILTER_SANITIZE_SPECIAL_CHARS *- "HTML-encode '"<>& and characters with
>> > ASCII value less than 32, optionally strip or encode other special
>> > characters."
>> > What's the intended purpose of this filter? "Special characters" are still
>> > not clearly defined, but at least it's more clear than
>> > the FILTER_SANITIZE_ENCODED description. Same question about backticks
>> > though: why? Why encode ASCII <32 chars?
>> >
>> > *FILTER_SANITIZE_FULL_SPECIAL_CHARS *- "Equivalent to calling
>> > htmlspecialchars() with ENT_QUOTES set. Encoding quotes can be disabled by
>> > setting FILTER_FLAG_NO_ENCODE_QUOTES. Like htmlspecialchars(), this filter
>> > is aware of the default_charset and if a sequence of bytes is detected that
>> > makes up an invalid character in the current character set then the entire
>> > string is rejected resulting in a 0-length string. When using this filter
>> > as a default filter, see the warning below about setting the default flags
>> > to 0."
>> > Not to be mistaken with FILTER_SANITIZE_SPECIAL_CHARS. As long as it's not
>> > used with filter_input(), it's the least problematic. We
>> > have htmlspecialchars() though, so how useful is this filter?
>> >
>> > *FILTER_UNSAFE_RAW *- What makes it unsafe? Why isn't this just
>> > called FILTER_RAW_STRING? If the value being filtered is something other
>> > than a string, what will this filter return? Integers, floats, booleans and
>> > nulls are converted to a string; arrays and objects make the filter fail.
>> >
>> > ---
>> >
>> > Let's quickly mention the filter flags.
>> >
>> > The FILTER_FLAG_STRIP_LOW flag will also remove tabs, carriage returns and
>> > newlines, as these all have ASCII codes below 32. When is this useful and
>> > expected?
>> >
>> > The FILTER_FLAG_ENCODE_LOW flag "encodes" ASCII <32 codes presumably into
>> > HTML entities, although that's not specified anywhere in the PHP manual.
>> > The word HTML does not appear on the
>> > https://www.php.net/manual/en/filter.filters.flags.php page. What do these
>> > characters look like when presented by HTML? When is it ever useful to use
>> > this flag?
>> >
>> > FILTER_FLAG_ENCODE_AMP & FILTER_FLAG_STRIP_BACKTICK - why is this even a
>> > thing?
>> >
>> > Due to flags, FILTER_VALIDATE_EMAIL will happily validate email addresses
>> > that would be otherwise mangled by FILTER_SANITIZE_EMAIL.
>> >
>> > These are just the things I found confusing and strange about the sanitize
>> > filters. Let's try to put ourselves in the shoes of an average PHP
>> > developer trying to comprehend these filters. It's quite easy to shoot
>> > yourself in the foot if you try to use them. The PHP manual doesn't do a
>> > good job of explaining them, but that's probably because they are not easy
>> > to explain. I can't come up with good examples of when they should be used.
>> >
>> > Regards,
>> > Kamil
>> >
>>
>
