Re: [Wikitech-l] iconv/mb_convert expert needed to advise on a patch for T39665

2015-05-06 Thread Stas Malyshev
Hi! Don't have a good solution, but some ideas: 1. There's http://php.net/manual/en/class.uconverter.php which uses the ICU converter. It can recognize tons of charsets/encodings (http://site.icu-project.org/charts/charset) and can filter out bad characters, though the way to achieve it may be a bit

[Wikitech-l] iconv/mb_convert expert needed to advise on a patch for T39665

2015-05-05 Thread Bryan Davis
I made a patch [0] for T39665 [1] about 6 months ago. It has been rotting in Gerrit since. The core bug is related to glibc's iconv implementation and PHP (and HHVM as well, I think). To work around the iconv bug I wrote a little helper function that will use mb_convert_encoding() instead if it is