OK, I just realized that charlint.pl's UTF-8 repair options aren't actually implemented!
Thanks for the reply. I downloaded charlint.pl along with the Unicode data file and tried it with several options, including -U, -u, -e and -E, but each time it fails with:
Line 13: Non-Existing codepoints.
Giving up!
I'm sure there are other programs that can do this.
I've attached an example one. Remember, though, that it only drops
corrupted sequences; it doesn't try to guess what the original, undamaged
output looked like.
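For anyone who can't open the attachment, the same idea can be sketched in a few lines of Python. This is not the code in utf8scrub.tgz, just a minimal illustration of the "drop, don't guess" approach; the function name scrub_utf8 is made up for this example. It relies on Python's strict UTF-8 decoder with errors="ignore", which skips bytes that don't form a valid sequence.

```python
def scrub_utf8(data: bytes) -> bytes:
    """Return `data` with bytes that are not part of valid UTF-8 removed.

    Hypothetical helper for illustration only. It discards damaged
    sequences and makes no attempt to reconstruct the original text.
    """
    # errors="ignore" makes the decoder skip invalid bytes instead of
    # raising UnicodeDecodeError; re-encoding yields clean UTF-8.
    return data.decode("utf-8", errors="ignore").encode("utf-8")
```

To clean a file, read it in binary mode, pass the bytes through scrub_utf8, and write the result back out in binary mode. Well-formed input comes back unchanged.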
Attachment: utf8scrub.tgz (application/compressed)
