Let's say I live in a completely ISO 8859/etc.-free world, that I don't care about the existence of any character representation other than UTF-8, and that I am therefore not interested in any form of character-encoding conversion function.

How can I then switch between a "byte string" and a "character string" in Perl without ever actually touching the stored bytes of the string? All I want to change is the UTF-8 flag associated with a string, which tells the regular expression engine, for example, whether /./ matches just a single byte or an entire UTF-8 character.

It seems the low-level Perl functions utf8::upgrade(), utf8::downgrade(), utf8::encode(), and utf8::decode() (see "man 3 utf8") are not usable, because they interpret and convert any binary string as if it were an ISO 8859-1 string.

I don't want to load any huge encoding packages such as "use encoding 'utf8';" or "use Encode;", because I neither need nor want any character-encoding conversion functions. All I want to change is a simple flag.

Unfortunately, the documentation is far from clear on how to do this, and my experiments lead to strange results that look like strings going through several ISO 8859-1 to UTF-8 conversion steps (whereas I want zero of these).

Any help?

Markus

-- 
Markus Kuhn, Computer Laboratory, University of Cambridge
http://www.cl.cam.ac.uk/~mgk25/ || CB3 0FD, Great Britain
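
To make the byte-versus-character distinction in the question concrete, here is a minimal sketch, assuming Perl 5.8 or later. It only observes the flag with utf8::is_utf8() and counts /./ matches before and after an in-place utf8::decode(). The variable names are illustrative, and whether this behaviour meets the "never touch the stored bytes" requirement above is left open rather than claimed.

```perl
use strict;
use warnings;

# "e with acute accent" (U+00E9) in UTF-8 is the two bytes 0xC3 0xA9.
# A literal built from \xNN escapes below 0x100 starts out as a byte string.
my $bytes = "\xC3\xA9";

# With the UTF-8 flag off, /./ matches one byte at a time.
my $byte_count = () = $bytes =~ /./gs;

# utf8::decode() works in place on a copy and returns true
# if the bytes formed valid UTF-8.
my $chars = $bytes;
my $ok    = utf8::decode($chars);

# With the UTF-8 flag on, /./ matches one whole character.
my $char_count = () = $chars =~ /./gs;

printf "flag before: %d, /./ matches: %d\n",
    utf8::is_utf8($bytes) ? 1 : 0, $byte_count;
printf "flag after:  %d, /./ matches: %d\n",
    utf8::is_utf8($chars) ? 1 : 0, $char_count;
```

On this input the match count goes from 2 to 1 and the flag from off to on. The utf8:: functions used here are built into the interpreter, so no "use utf8;" or encoding module is loaded.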