On Tuesday 26 September 2017 14:20:33 Night Light wrote:
> That's a nifty function. Good to know that it can be reversed.

UTF-8 encode is a function that assigns to every number in the range 0..1114111 a unique sequence of numbers in the range 0..255. It therefore has a well-defined inverse: the UTF-8 decode function.

Applied to a sequence of numbers from the range 0..1114111, UTF-8 encode produces a (possibly longer) sequence of numbers in the range 0..255. Since 0..255 lies within 0..1114111, that output can again be used as input to UTF-8 encode. And because each application has a well-defined inverse, you can easily construct the inverse of a composition of several UTF-8 encode calls: apply UTF-8 decode the same number of times.

Take a string $str; the following holds:

    decode('UTF-8', decode('UTF-8', encode('UTF-8', encode('UTF-8', $str)))) eq $str;

To get exactly the correct result, you just need to know how many times the UTF-8 encode function was applied.
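
To make the round trip concrete, here is a minimal sketch using the core Encode module. The helpers encode_n and decode_n and the sample string are hypothetical, introduced only for illustration; they are not part of Encode.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical helper (not part of Encode): apply UTF-8 encode $n times.
sub encode_n {
    my ($str, $n) = @_;
    $str = encode('UTF-8', $str) for 1 .. $n;
    return $str;
}

# Hypothetical helper (not part of Encode): apply UTF-8 decode $n times.
sub decode_n {
    my ($bytes, $n) = @_;
    $bytes = decode('UTF-8', $bytes) for 1 .. $n;
    return $bytes;
}

# Sample input: U+010D (c with caron) followed by "aj", 3 characters.
my $str   = "\x{010D}aj";
my $once  = encode_n($str, 1);
my $twice = encode_n($str, 2);

# Each pass treats the previous output bytes as code points 0..255,
# so every byte above 0x7F grows to two bytes: 3 -> 4 -> 6.
printf "%d %d %d\n", length($str), length($once), length($twice);

# Decoding as many times as we encoded restores the original string.
print decode_n($twice, 2) eq $str ? "round trip ok\n" : "round trip failed\n";

# Decoding fewer times leaves one layer of encoding in place.
print decode_n($twice, 1) eq $str ? "unexpected\n" : "one decode is not enough\n";
```

Run as written, this should print "3 4 6", then "round trip ok", then "one decode is not enough". The last line shows why the number of encode calls has to be known: a single decode of the doubly encoded string returns the singly encoded one, not $str.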