On 12/02/20 05:37, Pavel Stehule wrote:
> 2. there can be optional parameter "prefix" with default "\". But with "\u"
> it can be compatible with Java or Python.

Java's Unicode escape form is one of those early ones that lack a
six-digit form, and where any character outside of the Basic
Multilingual Plane has to be represented by two four-digit escapes in a
row, encoding the two surrogates that would make up the character's
representation in UTF-16.

Obviously that's an existing form that's out there, so it's not a bad
thing to have some kind of support for it, but it's not a great
representation to encourage people to use.

Python, by contrast, has both \uxxxx and \Uxxxxxxxx, where you would use
the latter to represent a non-BMP character directly. So the Java and
Python schemes should be considered distinct.

In Perl, there is a useful extension to regexp substitution where you
specify the replacement not as a string, or even a string with & and
\1 \2 ... magic, but as essentially a lambda that is passed the match
and returns a computed replacement. That makes conversions of the sort
discussed here generally trivial to implement.

Would it be worth considering adding something of general utility like
that? Then there could be a small library of pure SQL functions (or a
wiki page or GitHub gist) covering a bunch of the two dozen
representations on that page linked above.

Regards,
-Chap
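
P.S. To make the two points above concrete, here is a minimal sketch of
the "replacement as a function" idea. It is written in Python rather
than Perl or SQL, only because Python's re.sub also accepts a callable
as the replacement. The names ESCAPE, _replace and unescape are made up
for illustration and are not part of any proposal; the sketch also
ignores escaped backslashes, which a real implementation would need to
handle.

```python
import re

# Illustrative pattern (an assumption, not an agreed syntax). It tries,
# in order: a Java-style surrogate pair written as two \uXXXX escapes,
# a single \uXXXX escape, and a Python-style \UXXXXXXXX escape.
ESCAPE = re.compile(
    r"\\u([dD][89abAB][0-9a-fA-F]{2})\\u([dD][c-fC-F][0-9a-fA-F]{2})"
    r"|\\u([0-9a-fA-F]{4})"
    r"|\\U([0-9a-fA-F]{8})"
)


def _replace(match):
    """Compute the replacement text for one matched escape."""
    high, low, bmp, full = match.groups()
    if high is not None:
        # Combine the two UTF-16 surrogates into a single code point.
        code = (
            0x10000
            + ((int(high, 16) - 0xD800) << 10)
            + (int(low, 16) - 0xDC00)
        )
    elif bmp is not None:
        code = int(bmp, 16)
    else:
        code = int(full, 16)
    # Leave lone surrogates and out-of-range values untouched.
    if code > 0x10FFFF or 0xD800 <= code <= 0xDFFF:
        return match.group(0)
    return chr(code)


def unescape(text):
    """Replace every recognized escape in text with its character."""
    return ESCAPE.sub(_replace, text)
```

With this, the Java-style input \uD83D\uDE00 and the Python-style input
\U0001F600 both come out as the same single character, U+1F600. All of
the per-scheme logic lives in the small replacement function, which is
the appeal of having such a facility available.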