On Mon, Dec 19, 2016 at 3:04 PM, Karl Williamson <[email protected]> wrote:
> It seems counterintuitive to me that the two byte sequence C0 80 should be
> replaced by 2 replacement characters under best practices, or that E0 80 80
> should also be replaced by 2. Each sequence was legal in early Unicode
> versions, and it seems that it would be best to treat them as each a single
> sequence, replacing by a single replacement character.
>

Looks like the ICU converters and string-iteration macros do what you expect
(if I understand your expectations).

markus
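
For illustration only, here is a minimal Python sketch of the "one sequence, one replacement character" expectation described in the quoted message. It is not ICU code and makes no claim about how ICU is implemented. The names `expected_length` and `decode_structural` are hypothetical. The assumption is that a lead byte plus as many continuation bytes (80..BF) as its bit pattern calls for is treated as one unit, and an ill-formed unit becomes a single U+FFFD.

```python
REPLACEMENT = "\ufffd"


def expected_length(lead: int) -> int:
    """Length implied by the lead byte's bit pattern, ignoring well-formedness."""
    if lead < 0x80:
        return 1
    if 0xC0 <= lead <= 0xDF:
        return 2  # includes C0 and C1, which only start overlong forms
    if 0xE0 <= lead <= 0xEF:
        return 3
    if 0xF0 <= lead <= 0xF7:
        return 4
    return 0  # stray continuation byte, or a byte that can never lead


def decode_structural(data: bytes) -> str:
    """Decode UTF-8, emitting one U+FFFD per structurally delimited bad unit."""
    out = []
    i = 0
    while i < len(data):
        n = expected_length(data[i])
        # Take the lead byte plus up to n - 1 following continuation bytes.
        j = i + 1
        while j < len(data) and j < i + max(n, 1) and 0x80 <= data[j] <= 0xBF:
            j += 1
        chunk = data[i:j]
        try:
            out.append(chunk.decode("utf-8"))  # strict check of this one unit
        except UnicodeDecodeError:
            out.append(REPLACEMENT)
        i = j
    return "".join(out)


if __name__ == "__main__":
    for raw in (b"\xc0\x80", b"\xe0\x80\x80"):
        print(raw.hex(" "), "->", len(decode_structural(raw)), "replacement character(s)")
```

Under this policy both C0 80 and E0 80 80 come out as a single U+FFFD. For comparison, Python's built-in `b"\xc0\x80".decode("utf-8", errors="replace")` returns two replacement characters for C0 80.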

