Re: [PHP-DEV] Sanitize filters

2022-10-02 Thread David Gebler
On Sun, Oct 2, 2022 at 4:10 PM Larry Garfield wrote:

> The filter extension has always been a stillborn mess.  Its API is an
> absolute disaster and, as you note, its functionality is unclear at best,
> misleading at worst.  Frankly it's worse than SPL.
>
> I'd be entirely on board with jettisoning the entire thing, but barring
> that, ripping out large swaths of it that are misleading suits me fine.
>
>
The whole thing is seriously grim. Take the documentation for filter_var,
for example, and look at what it says for the third parameter, $options:

> Associative array of options or bitwise disjunction of flags. If filter
> accepts options, flags can be provided in "flags" field of array. For the
> "callback" filter, callable type should be passed.

At a glance, I think all the examples mentioned in this thread have better
existing alternatives already in core and could just be deprecated then
removed. But it's worth asking, is that what we're talking about here, or
is there a suggestion of replacing the filter API with a more modern,
object API?
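
For readers who have not run into it, the overloading of $options described in the quoted manual text can be sketched roughly as follows. This is only an illustration of the three call shapes, not a recommendation; the input values and variable names are made up for the example, and as far as I know the callable for FILTER_CALLBACK still has to be wrapped under the "options" key rather than passed bare.

```php
<?php
// Form 1: $options as an associative array. Filter-specific options go
// under "options"; flags go under "flags" in the same array.
$inRange = filter_var('0xff', FILTER_VALIDATE_INT, [
    'options' => ['min_range' => 0, 'max_range' => 255],
    'flags'   => FILTER_FLAG_ALLOW_HEX,
]);

// Form 2: $options as a bare integer, interpreted as flags only.
$flagsOnly = filter_var('0xff', FILTER_VALIDATE_INT, FILTER_FLAG_ALLOW_HEX);

// Form 3: for FILTER_CALLBACK, the callable is passed under "options".
$shouted = filter_var('hello', FILTER_CALLBACK, ['options' => 'strtoupper']);

var_dump($inRange, $flagsOnly, $shouted);
```

One parameter therefore carries an array of options, an integer bitmask, or a wrapped callable depending on the filter, which is the part that makes the signature hard to read.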


Re: [PHP-DEV] RFC: StreamWrapper Support for glob()

2022-10-02 Thread Timmy Almroth
>
> I had a quick look at that PoC and it's basically just a quick wrapper
> that depends on GLOB_ALTDIRFUNC. Unfortunately that's a non-standard
> extension that might be missing on some platforms (e.g. Alpine probably
> won't work because, from a quick look, the musl libc doesn't seem to
> implement it - https://git.musl-libc.org/cgit/musl/tree/include/glob.h ).
> The code obviously needs more work and we will need a bunch of tests for
> this. Well, obviously it's just a PoC, but what I want to say is that the
> implementation is really the main thing here and should also help to provide
> more details in the RFC. For example, it's not currently clear that you would
> change the underlying glob implementation on some platforms - quite an
> important thing to mention though. To be honest, if there is a good
> implementation, I think it's quite unlikely that the RFC will fail.
>

Hi Jakub. Indeed, it looks like Alpine users would not benefit from this. I
made notes of this in the RFC, thank you.

I understand you would have wanted to see the final PR. I do too. I will
not be involved in producing the final PR and these were the conditions I
agreed on. It's hard to find someone who will invest the time not knowing
if all their work will carry through. I wish there were two steps in the
RFC process. One voting for qualifying the idea/concept, and a second part
to vote for the final PR. The first, qualifying part could filter out the
vast majority of undesired ideas, and the second could attract coders and
maybe allow for several PR proposals. Does the PHP team have any plans to
implement an RFC management system that could improve the creation,
user experience, and voting process?


Re: [PHP-DEV] Sanitize filters

2022-10-02 Thread Hans Henrik Bergan
FILTER_SANITIZE_EMAIL should burn. If you have a bad email address, I can't
imagine the correct solution is to remove characters until it becomes
valid, short of a trim().
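
To make the objection concrete, here is a minimal sketch contrasting the two approaches. The helper name normalizeEmail and the sample addresses are hypothetical, invented for this example; only trim(), filter_var(), FILTER_SANITIZE_EMAIL and FILTER_VALIDATE_EMAIL are real PHP.

```php
<?php
// Hypothetical helper, not part of PHP: trim surrounding whitespace,
// then validate. The input is never rewritten beyond the trim.
function normalizeEmail(string $raw): ?string
{
    $candidate = trim($raw);
    $valid = filter_var($candidate, FILTER_VALIDATE_EMAIL);
    return $valid === false ? null : $valid;
}

$typo = 'john doe@example.com';

// The sanitize filter drops the space and yields a different address.
$sanitized = filter_var($typo, FILTER_SANITIZE_EMAIL);

// Trim-then-validate rejects the same input instead of guessing.
$checked = normalizeEmail($typo);

// Leading and trailing whitespace is the one thing that gets removed.
$padded = normalizeEmail("  jane@example.com\n");

var_dump($sanitized, $checked, $padded);
```

The sanitized value passes validation afterwards, but nothing says it is the address the user meant, which is the argument for rejecting bad input rather than repairing it.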


Re: [PHP-DEV] Sanitize filters

2022-10-02 Thread Larry Garfield
On Sat, Oct 1, 2022, at 10:39 AM, Kamil Tekiela wrote:
> Hi Internals,
>
> For quite some time now, PHP's sanitize filters have "Rustled My Jimmies".
> These filters bother me because I can't really justify their existence. I
> can understand that a few of them are sensible and may come in handy, but I
> would like to talk about some of these in particular.
>
> In PHP 8.1, we have deprecated FILTER_SANITIZE_STRING which I deemed to be
> a priority due to its confusing name and behaviour. The rest is slightly
> less dangerous, but as was pointed out to me in a recent conversation with
> a PHP developer, these filters are all very confusing.
>
> I would like to have some opinions on the following filters. What do you
> think we should do with them? Deprecate? Fix? Provide better documentation?
>
> ---
>
> *FILTER_SANITIZE_ENCODED *- "URL-encode string, optionally strip or encode
> special characters."
> Now, what does that mean? PHP has two functions for URL encoding: urlencode
> used for encoding query-string parts, and rawurlencode used for encoding
> any other URL part (two different RFCs are followed by these functions).
> Which of these RFCs is applied in this filter? Furthermore, the description
> says that "special characters" can be stripped or encoded. Is one of these
> actions the default and the other can be selected by a flag or are both
> optional? What are these special characters? Are they special in the
> context of URL? If so, why did we encode them first? If these are HTML
> special characters (there's no single definition of special HTML chars),
> then why does this filter encode them if the filter is for URL
> sanitization? What does backtick have to do with any of this
> (FILTER_FLAG_STRIP_BACKTICK)?
>
> *FILTER_SANITIZE_ADD_SLASHES *- "Apply addslashes(). (Available as of PHP
> 7.3.0)"
> This filter was added as a replacement for the magic_quotes filter. According
> to the PHP documentation, addslashes is supposed to be used when injecting PHP
> variables into an eval'd string. Real-world usage showed that this function is used
> in a lot of places that have nothing to do with PHP's eval. I am not sure
> if the sanitize filter is misused in a similar fashion, but judging from
> the fact that it was meant as a replacement for magic_quotes, my guess is
> that it's very likely still abused.
>
> *FILTER_SANITIZE_EMAIL *- "Remove all characters except letters, digits and
> !#$%&'*+-=?^_`{|}~@.[]."
> Which RFC does this adhere to? It strips slashes and quoted parts, doesn't
> allow IPv6 addresses and doesn't accept RFC 6530 email addresses. This
> filter is ok for simple usage, but it isn't true to any known specification
> AFAIK.
>
> *FILTER_SANITIZE_SPECIAL_CHARS *- "HTML-encode '"<>& and characters with
> ASCII value less than 32, optionally strip or encode other special
> characters."
> What's the intended purpose of this filter? "Special characters" are still
> not clearly defined, but at least it's more clear than
> the FILTER_SANITIZE_ENCODED description. Same question about backticks
> though: why? Why encode ASCII <32 chars?
>
> *FILTER_SANITIZE_FULL_SPECIAL_CHARS *- "Equivalent to calling
> htmlspecialchars() with ENT_QUOTES set. Encoding quotes can be disabled by
> setting FILTER_FLAG_NO_ENCODE_QUOTES. Like htmlspecialchars(), this filter
> is aware of the default_charset and if a sequence of bytes is detected that
> makes up an invalid character in the current character set then the entire
> string is rejected resulting in a 0-length string. When using this filter
> as a default filter, see the warning below about setting the default flags
> to 0."
> Not to be mistaken with FILTER_SANITIZE_SPECIAL_CHARS. As long as it's not
> used with filter_input(), it's the least problematic. We
> have htmlspecialchars() though, so how useful is this filter?
>
> *FILTER_UNSAFE_RAW *- What makes it unsafe? Why isn't this just
> called FILTER_RAW_STRING? If the value being filtered is something other
> than a string, what will this filter return? Integers, floats, booleans and
> nulls are converted to a string; arrays and objects make the filter fail.
>
> ---
>
> Let's quickly mention the filter flags.
>
> The FILTER_FLAG_STRIP_LOW flag will also remove tabs, carriage returns and
> newlines, as these all have ASCII codes below 32. When is this useful and
> expected?
>
> The FILTER_FLAG_ENCODE_LOW flag "encodes" ASCII <32 codes presumably into
> HTML entities, although that's not specified anywhere in the PHP manual.
> The word HTML does not appear on the
> https://www.php.net/manual/en/filter.filters.flags.php page. What do these
> characters look like when presented by HTML? When is it ever useful to use
> this flag?
>
> FILTER_FLAG_ENCODE_AMP & FILTER_FLAG_STRIP_BACKTICK - why is this even a
> thing?
>
> Due to flags, FILTER_VALIDATE_EMAIL will happily validate email addresses
> that would be otherwise mangled by FILTER_SANITIZE_EMAIL.
>
> These are just the things I found confusing and strange about the 

Re: [PHP-DEV] Sanitize filters

2022-10-02 Thread Lokrain
Hello Vasilii,

It’s okay to have a different opinion, I hope.

You are missing an important point here - besides my comments, the current
way this is developed causes confusion.

It would be great if you could share your experience on this matter.

Regards,
Dimitar


Re: [PHP-DEV] Sanitize filters

2022-10-02 Thread Vasilii Shpilchin
All right, if you have been writing PHP for 25 years, you will have noticed
that PHP was always about high-level, web-focused functionality out of the
box. This is one of the basic benefits of PHP over other general-purpose
languages, where you can write everything you want but also have to write it
yourself, since the language itself is very basic. I'm for PHP keeping
built-in solutions for the most common problems in the context of the web.
This comes from someone who has passed the ZCE exam and has been writing PHP
for just 15 years.


Re: [PHP-DEV] Sanitize filters

2022-10-02 Thread Lokrain
Hello Kamil,

I believe that PHP should not try to act as a “framework” that provides you
with ready solutions for such cases.

Being able to actually modify the default behaviour of some functions
through the ini .. is even scarier.

In 25 years of writing PHP I have never relied on this “magic” for security :)

Regards,
Dimitar
