I'm editing UTF-8 files on Linux and got bitten by an unexpected BOM
being inserted at the front of the file. As others [1][2] have said,
"UTF-8 with BOM" and "UTF-8 without BOM" would be less confusing names
for the two encoding options. (As SciTE doesn't actually write the
cookie, there seems no need to mention it.)

Now that I've stripped the BOM out, the file reopens as 8-bit, not
UTF-8 (though SciTE seems to be interpreting it as UTF-8, and nothing
gets corrupted if I then re-save). So it seems 8-bit means UTF-8 as
well, at least with my settings [3]. I think this is what someone meant
in the Dec 2006 thread when they suggested "8-bit" should be called
"default"?
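
For anyone hitting the same thing, here is a minimal sketch of one way
to strip the BOM outside the editor. It is plain Python, not anything
SciTE provides; the names strip_utf8_bom and strip_bom_in_place are
made up for illustration, and it assumes the file is small enough to
read into memory in one go.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (bytes EF BB BF), if any."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def strip_bom_in_place(path: str) -> bool:
    """Rewrite the file at path without its BOM.

    Returns True if a BOM was found and removed, False if the file
    was left untouched.
    """
    with open(path, "rb") as f:
        data = f.read()
    stripped = strip_utf8_bom(data)
    if stripped == data:
        return False  # no BOM present, nothing to rewrite
    with open(path, "wb") as f:
        f.write(stripped)
    return True
```

Working on raw bytes rather than decoded text means the rest of the
file is written back exactly as it was, whatever it contains.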

Darren

[1]:
http://www.lyra.org/pipermail/scite-interest/2006-December/008325.html

[2]:
http://www.mail-archive.com/[email protected]/msg02649.html

[3]:

# Internationalisation
# Japanese input code page 932 and ShiftJIS character set 128
#code.page=932
#character.set=128
# Unicode
code.page=65001
#code.page=0
#character.set=204
# Required for Unicode to work on GTK+:
LC_CTYPE=en_US.UTF-8
output.code.page=65001
