Thanks for the reply.  I downloaded charlint.pl along with the Unicode
data file, and tried charlint with several options (-U, -u, -e and -E),
but each time it fails with:

Line 13: Non-Existing codepoints.
Giving up!


OK, I just realized that charlint.pl's UTF-8 repair options aren't actually
implemented! I'm sure there are other programs that can do this.
I've attached an example one. Remember, though, that it's only going to drop
corrupted sequences; it doesn't try to guess what the original, undamaged
output looked like.
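
For illustration, here is a minimal Python sketch of the same drop-only
approach. It is not the code in the attachment, and the function name
scrub_utf8 is made up for this example. It assumes the input should be
UTF-8 and simply discards any bytes that do not form a valid sequence.

```python
def scrub_utf8(data: bytes) -> bytes:
    """Return data with every malformed UTF-8 sequence removed.

    Hypothetical helper for illustration only. Nothing is repaired or
    guessed: bytes that cannot be decoded are silently dropped.
    """
    # errors="ignore" tells the decoder to skip undecodable bytes
    # instead of raising UnicodeDecodeError.
    text = data.decode("utf-8", errors="ignore")
    # Re-encoding the surviving characters yields well-formed UTF-8.
    return text.encode("utf-8")
```

For example, scrub_utf8(b"abc\xffdef") returns b"abcdef": the stray 0xFF
byte is dropped and everything else passes through unchanged.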


Attachment: utf8scrub.tgz
Description: application/compressed
