Pali, thank you very much for your answer. I am using the
Encode::decode('UTF-8', ...) function now instead of touching the flag.
Though I am not sure whether a routine becomes better (more robust) if
it accepts the lax utf8 instead of the stricter UTF-8, or whether it is
better if it only accepts strict UTF-8.
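
To make the difference concrete, here is a minimal sketch of how I
understand the two names. It assumes that Encode's lax 'utf8' accepts an
encoded code point above U+10FFFF while the strict 'UTF-8' rejects it.
The byte string and the variable names are only illustrative.

```perl
use strict;
use warnings;
use Encode ();

# F4 90 80 80 would encode code point 0x110000, one past the Unicode
# maximum of 0x10FFFF (illustrative input, not from my real data).
my $octets = "\xF4\x90\x80\x80";

# FB_CROAK makes decode() die on malformed input instead of substituting;
# LEAVE_SRC keeps decode() from modifying $octets in place.
my $check = Encode::FB_CROAK | Encode::LEAVE_SRC;

# Lax 'utf8': assumed to accept the out-of-range code point.
my $lax    = eval { Encode::decode('utf8', $octets, $check) };
my $lax_ok = defined $lax ? 1 : 0;

# Strict 'UTF-8': assumed to reject it.
my $strict    = eval { Encode::decode('UTF-8', $octets, $check) };
my $strict_ok = defined $strict ? 1 : 0;

printf "utf8 accepted: %d, UTF-8 accepted: %d\n", $lax_ok, $strict_ok;
```

If that assumption holds, the strict variant looks like the safer choice
for data coming from outside, and the lax one only makes sense when I
deliberately want Perl's extended range.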

On 09.11.2016 16:20, p...@cpan.org wrote:

> Fix is really simple. Either decode utf8 octets in $html back to wide
> characters (via utf8::decode($html)) or tell STDOUT that it does not
> expect wide strings, but raw octets (= remove binmode STDOUT, ":utf8";)
> line.
> 
> Again... think about it, why both my proposed fixes are working.

I am close to understanding it. But I wonder why I have to think about
UTF-8 in this case at all. I expected perl to do it right automagically:

I open the filehandle to write into the variable using :encoding(UTF-8).
So perl should know what it is storing inside the variable. If I print
this to STDOUT (binmoded to utf-8) it should automatically print the
content of the variable the right way.

So why does it not know that the content is UTF-8? If I am using
"use utf8" and define a variable containing UTF-8 data in the source
code, perl knows how to handle this correctly, too, without the need
to decode anything manually.

Probably perl does not know about the content of the variable. Only the
filehandle is set to write utf-8 data. The content of the variable is
only bytes, similar to a file that I am writing bytes into. If I read
the file again, I have to open it as UTF-8. Alternatively, I guess I can
open it as raw bytes and decode the data from UTF-8 afterwards? The
latter would be the same as decoding the content of the variable?
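
To check this reasoning, here is a small sketch of what I think is
happening. The sample string and the variable names are made up for
illustration. It assumes that an in-memory filehandle stores plain
octets in the scalar, exactly like a file on disk would.

```perl
use strict;
use warnings;
use Encode ();

# A character (wide) string: "Gruesse" with umlaut and sharp s, plus U+263A.
my $text = "Gr\x{FC}\x{DF}e \x{263A}";

# Write through an :encoding(UTF-8) layer into an in-memory "file".
my $buffer = '';
open(my $out, '>:encoding(UTF-8)', \$buffer) or die "open: $!";
print {$out} $text;
close($out) or die "close: $!";

# The layer belongs to the handle, not to the variable: $buffer now
# holds encoded octets, so it is longer than the character string.
my $char_len  = length $text;
my $octet_len = length $buffer;

# Way 1: decode the octets by hand.
my $decoded = Encode::decode('UTF-8', $buffer);

# Way 2: read the octets back through a decoding layer.
open(my $in, '<:encoding(UTF-8)', \$buffer) or die "open: $!";
my $read = do { local $/; <$in> };
close($in);

print "characters: $char_len, octets: $octet_len\n";
print "both ways agree\n" if $decoded eq $text && $read eq $text;
```

If this is right, both ways give back the original character string,
and only such a decoded string should go to a STDOUT that has an
encoding layer of its own.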

Ciao
Gert


