bug#36718: uniq treats distinct Korean characters equal
uniq just calls strcoll, and if strcoll (A, B) returns 0 then uniq assumes the lines are equal. So my guess is that your problem has something to do with strcoll, not with coreutils per se.
bug#36718: uniq treats distinct Korean characters equal
Dear all, I found that uniq, when run on some Korean characters, treats them as equal (counts them as duplicates) although the characters are distinct. To be precise, it happened to me with the characters 프 (U+D504) and 틀 (U+D2C0). An example (input, expected output, actual output) can be