Re: Utility to report and repair broken surrogate pairs in UTF-16 text

Martin J. Dürst Thu, 04 Nov 2010 04:30:13 -0700

There is charlint (http://www.w3.org/International/charlint/), which is based on UTF-8. It may be possible to adapt it to UTF-16/32.
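
For illustration only, the core check that a UTF-16 tool would need can be sketched in C as below. This is not code from charlint. The function name repair_utf16 is hypothetical, and the sketch assumes the code units have already been read into native byte order, so BOM detection and UTF-16BE/UTF-16LE byte swapping are left out. It reports and repairs unpaired surrogates only, not other illegal characters such as noncharacters.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of charlint): scan a buffer of UTF-16
 * code units in native byte order, replace every unpaired surrogate
 * with U+FFFD in place, and return the number of replacements made.
 */
size_t repair_utf16(uint16_t *buf, size_t len)
{
    size_t repaired = 0;
    size_t i = 0;

    while (i < len) {
        uint16_t u = buf[i];

        if (u >= 0xD800 && u <= 0xDBFF) {
            /* High surrogate: valid only if a low surrogate follows. */
            if (i + 1 < len && buf[i + 1] >= 0xDC00 && buf[i + 1] <= 0xDFFF) {
                i += 2;             /* well-formed pair, skip both units */
                continue;
            }
            buf[i] = 0xFFFD;        /* unpaired high surrogate */
            repaired++;
        } else if (u >= 0xDC00 && u <= 0xDFFF) {
            buf[i] = 0xFFFD;        /* low surrogate with no preceding high */
            repaired++;
        }
        i++;
    }
    return repaired;
}
```

A return value of zero means the buffer contained no broken surrogate pairs. Because each bad code unit is replaced by exactly one U+FFFD code unit, the buffer length does not change.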


Regards,   Martin.


On 2010/11/04 4:37, Jim Monty wrote:

Is there a utility, preferably open source and written in C, that inspects
UTF-16/UTF-16BE/UTF-16LE text and identifies broken surrogate pairs and illegal
characters? Ideally, the utility can both report illegal code units and "repair"
them by replacing them with U+FFFD.

Jim Monty


--
#-# Martin J. Dürst, Professor, Aoyama Gakuin University
#-# http://www.sw.it.aoyama.ac.jp   mailto:[email protected]
