Now it's entirely possible that the underlying support is a few bricks shy of a load; for instance, I see that pg_utf_mblen thinks there are no UTF8 codes longer than 3 bytes, whereas your code goes to 4. I'm not an expert on this stuff, so I don't know what the UTF8 spec actually says. But I do think you are fixing the code at the wrong level.
Surely there are UTF-8 codes that are at least 3 bytes long. I have a _vague_ recollection that you have to keep escaping and escaping to get up to something like 4 bytes for some Asian code points?
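For what it's worth, here is a rough sketch of how the sequence length falls out of the first byte, assuming the RFC 3629 definition of UTF-8 (1 to 4 bytes, with 4-byte sequences covering code points above U+FFFF). The name utf8_seq_len is made up for illustration; this is not what pg_utf_mblen does and not a proposed patch.

```c
/*
 * Hypothetical helper, for illustration only (not PostgreSQL code).
 * Returns the number of bytes in a UTF-8 sequence whose first byte is
 * "lead", or 0 if that byte cannot start a sequence.
 *
 * Assumes RFC 3629 UTF-8. It only inspects the bit pattern of the lead
 * byte; it does not reject overlong leads (0xC0, 0xC1) or leads past
 * U+10FFFF (0xF5..0xF7), and it does not check continuation bytes.
 */
static int
utf8_seq_len(unsigned char lead)
{
    if ((lead & 0x80) == 0x00)
        return 1;               /* 0xxxxxxx: ASCII */
    if ((lead & 0xE0) == 0xC0)
        return 2;               /* 110xxxxx */
    if ((lead & 0xF0) == 0xE0)
        return 3;               /* 1110xxxx */
    if ((lead & 0xF8) == 0xF0)
        return 4;               /* 11110xxx: code points above U+FFFF */
    return 0;                   /* continuation byte or invalid lead */
}
```

So a lead byte of 0xE4 would give 3 and a lead byte of 0xF0 would give 4; there is no repeated escaping involved, the first byte alone says how many bytes follow.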
Chris