-=| p...@cpan.org, 14.11.2019 09:51:20 +0100 |=-
> On Wednesday 13 November 2019 20:37:06 Damyan Ivanov wrote:
> >            my($buffer, $string) = ("", "");
> >            while (read($fh, $buffer, 256, length($buffer))) {
> >                $string .= decode($encoding, $buffer, Encode::FB_QUIET);
> >                # $buffer now contains the unprocessed partial character
> >            }
> 
> This code is dangerous. It can enter an endless loop. Once you read
> an invalid UTF-8 sequence, the loop above never finishes. So if the
> buffer input is under user/attacker control, you introduce DoS issues.

Sure. A check to prevent that would be in order. I must admit that 
I was very happy to find a solution to the problem that was even in 
the official documentation.
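
A minimal sketch of such a check, keeping the documented FB_QUIET loop.
It assumes strict UTF-8 input, where no character is longer than 4
bytes, so a longer leftover in $buffer cannot be just an incomplete
character. The names decode_or_die and MAX_CHAR_BYTES are made up for
illustration and are not part of Encode.

```perl
use strict;
use warnings;
use Encode ();

# Assumption: strict UTF-8, where no character is longer than 4 bytes.
use constant MAX_CHAR_BYTES => 4;

# Hypothetical helper: the FB_QUIET loop from the documentation plus a
# check that stops as soon as decoding has clearly stalled.
sub decode_or_die {
    my ($fh, $chunk_size) = @_;
    my ($buffer, $string) = ("", "");
    while (read($fh, $buffer, $chunk_size, length($buffer))) {
        $string .= Encode::decode('UTF-8', $buffer, Encode::FB_QUIET);
        # $buffer now holds whatever could not be decoded. A leftover
        # this long is not a partial character, so the input is invalid.
        die "invalid UTF-8 in input\n"
            if length($buffer) >= MAX_CHAR_BYTES;
    }
    # Bytes still pending at EOF were never completed.
    die "undecodable bytes left at end of input\n" if length $buffer;
    return $string;
}
```

Dying is only one possible policy here; skipping the offending byte and
carrying on would work with the same check.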

> Instead of FB_QUIET, you should use the Encode::STOP_AT_PARTIAL flag.
> This is the flag you want here. Encode::decode stops decoding when a
> valid UTF-8 sequence is incomplete and more bytes are needed. And by
> default, invalid UTF-8 sequences are mapped to the Unicode replacement
> character.
> 
> Btw, PerlIO::encoding also uses the Encode::STOP_AT_PARTIAL flag to
> handle this situation.
> 
> PS: I know that Encode::STOP_AT_PARTIAL is undocumented, but that is
> only because nobody has found time to write documentation for it. It
> is part of the Encode API and ready to use...

That would be https://rt.cpan.org/Public/Bug/Display.html?id=67065 
(filed 8 years ago, still open). 
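
For reference, the same loop with STOP_AT_PARTIAL might look roughly
like the sketch below. Since the flag is undocumented, this relies on
the behaviour described in the quoted mail: an incomplete trailing
sequence is left in $buffer, and invalid bytes become U+FFFD. The
helper name decode_stream, the in-memory filehandle and the chunk size
of 2 are only for illustration. Appending U+FFFD for a character cut
off at end of input is my own choice, not something Encode does here.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper, not part of Encode: decode a byte stream that is
# read in fixed-size chunks which may split a multi-byte character.
sub decode_stream {
    my ($fh, $encoding, $chunk_size) = @_;
    my ($buffer, $string) = ("", "");
    while (read($fh, $buffer, $chunk_size, length($buffer))) {
        # A non-zero CHECK without LEAVE_SRC makes decode() remove the
        # processed bytes from $buffer, leaving only a partial character.
        $string .= Encode::decode($encoding, $buffer,
                                  Encode::STOP_AT_PARTIAL);
    }
    # Assumption: a character truncated at EOF is reported as U+FFFD.
    $string .= "\x{FFFD}" if length $buffer;
    return $string;
}

# "é" (C3 A9) is split across two reads; 0x80 is an invalid lone byte.
my $bytes = "h\xC3\xA9llo \x80!";
open(my $fh, '<', \$bytes) or die "cannot open in-memory handle: $!";
my $text = decode_stream($fh, 'UTF-8', 2);
close $fh;
```

With this flag the invalid byte is consumed instead of staying in the
buffer, so the loop keeps making progress on bad input.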
