On Tuesday 26 September 2017 14:20:33 Night Light wrote:
> That's a nifty function. Good to know that it can be reversed.

UTF-8 encode is a function that assigns to every number in the range
0..1114111 a unique sequence of numbers in the range 0..255.

Different numbers always get different sequences, so this function has
a well-defined inverse: the UTF-8 decode function.
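
As a small illustration of the encode/decode pair, here is a minimal
sketch using the Encode module. The sample character U+20AC and the
variable names are my own choice for the example, nothing more:

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# U+20AC (EURO SIGN) is one number from the range 0..1114111.
my $char  = chr(0x20AC);
my $bytes = encode('UTF-8', $char);

# The encoded form is a sequence of numbers from the range 0..255.
my @octets = unpack('C*', $bytes);
print join(' ', map { sprintf '%02X', $_ } @octets), "\n";   # E2 82 AC

# Decoding gives back the original number.
my $back = decode('UTF-8', $bytes);
printf "U+%04X\n", ord($back);                               # U+20AC
```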

The UTF-8 encode function turns a sequence of numbers from the range
0..1114111 into a sequence of numbers in the range 0..255 (usually a
longer one). Since 0..255 lies within 0..1114111, that output can again
be used as input for the UTF-8 encode function.

And because each UTF-8 encode call has a well-defined inverse, you can
just as easily reconstruct the inverse of a composition of several
UTF-8 encode calls.

Take a string $str; the following then holds:

decode('UTF-8', decode('UTF-8', encode('UTF-8', encode('UTF-8', $str))))
    eq $str;

To get exactly the right result, you just need to know how many times
the UTF-8 encode function was applied.
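
The same idea for an arbitrary number of passes could be sketched like
this. Note that encode_n and decode_n are hypothetical helper names made
up for this example (they are not part of Encode), and the count of 3 is
arbitrary:

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical helper: apply UTF-8 encode $n times in a row.
sub encode_n {
    my ($str, $n) = @_;
    $str = encode('UTF-8', $str) for 1 .. $n;
    return $str;
}

# Hypothetical helper: apply UTF-8 decode $n times in a row.
sub decode_n {
    my ($str, $n) = @_;
    $str = decode('UTF-8', $str) for 1 .. $n;
    return $str;
}

my $str = "\x{20AC}";            # one character, U+20AC
my $enc = encode_n($str, 3);     # encoded three times

# Each pass doubles every octet >= 0x80: 1 char -> 3 -> 6 -> 12 octets.
print length($enc), "\n";

# Same number of decodes as encodes: the original string comes back.
print "three decodes: ", (decode_n($enc, 3) eq $str ? "equal" : "different"), "\n";

# One decode too few: the result is still an encoded form.
print "two decodes:   ", (decode_n($enc, 2) eq $str ? "equal" : "different"), "\n";
```

With a matching count the comparison is equal; with one decode too few
it is not, which is why the number of passes has to be known.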
