I'm editing UTF-8 files on Linux and got bitten by an unexpected BOM character being inserted at the front of the file. As others [1][2] have said, "UTF-8 with BOM" and "UTF-8 without BOM" would be less confusing. (As SciTE doesn't actually write the cookie, there seems no need to mention it.)
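For anyone wanting to check a file outside the editor, here is a minimal Python sketch of detecting and removing a UTF-8 BOM (the three bytes EF BB BF at the start of the data). This is not what SciTE does internally; the function names has_utf8_bom and strip_utf8_bom are just ones I made up for illustration.

```python
import codecs


def has_utf8_bom(data: bytes) -> bool:
    """Return True if the data starts with the UTF-8 BOM (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)


def strip_utf8_bom(data: bytes) -> bytes:
    """Return the data without a leading UTF-8 BOM, unchanged if none is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data
```

To use it on a real file, read the file in binary mode, pass the bytes through strip_utf8_bom, and write the result back. Python's "utf-8-sig" codec also skips a leading BOM when decoding, if you only need the text.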
Now that I've stripped the BOM character out, the file opens as 8-bit, not UTF-8, when I reopen it (though SciTE seems to be interpreting it as UTF-8, and nothing is corrupted if I then re-save). So it seems 8-bit means UTF-8 as well, at least with my settings [3]. I think this is what someone meant in the Dec 2006 thread when they suggested "8-bit" should be called "default"?

Darren

[1]: http://www.lyra.org/pipermail/scite-interest/2006-December/008325.html
[2]: http://www.mail-archive.com/[email protected]/msg02649.html
[3]:
    # Internationalisation
    # Japanese input code page 932 and ShiftJIS character set 128
    #code.page=932
    #character.set=128
    # Unicode
    code.page=65001
    #code.page=0
    #character.set=204
    # Required for Unicode to work on GTK+:
    LC_CTYPE=en_US.UTF-8
    output.code.page=65001

_______________________________________________
Scite-interest mailing list
[email protected]
http://mailman.lyra.org/mailman/listinfo/scite-interest
