I first try to load the file as UTF-8 (or just the first 8 KB of
it) with encoding exceptions turned on. If I catch an exception,
I reload it as ANSI; otherwise I assume it's valid UTF-8.
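A minimal sketch of this try-UTF-8-first approach in D, assuming Phobos' `std.utf.validate` stands in for "encoding exceptions turned on" (it throws `UTFException` on malformed input). The names `looksLikeUtf8` and `fileLooksLikeUtf8` are made up for illustration. The sketch checks the whole file rather than an 8 KB prefix, because a fixed-size prefix can end in the middle of a multi-byte character and would then fail validation even though the file is fine.

```d
import std.file : read;
import std.utf : UTFException, validate;

/// Returns true if `bytes` is well-formed UTF-8.
/// Note: pure ASCII also passes, since ASCII is a subset of UTF-8.
bool looksLikeUtf8(const(ubyte)[] bytes)
{
    try
    {
        // validate() throws UTFException on the first malformed sequence.
        validate(cast(const(char)[]) bytes);
        return true;
    }
    catch (UTFException)
    {
        return false;
    }
}

/// Hypothetical helper: reads the whole file and applies the check.
/// A false result means "fall back to ANSI", i.e. decode the bytes
/// with the system code page instead.
bool fileLooksLikeUtf8(string fileName)
{
    return looksLikeUtf8(cast(const(ubyte)[]) read(fileName));
}
```

This is only a heuristic: some ANSI text happens to be valid UTF-8 too, so a `true` result means "decodes cleanly as UTF-8", not proof of what the author intended.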
On Tuesday, 22 July 2014 at 09:50:00 UTC, Sam Hu wrote:
Greetings!
As the subject says, how can I know whether a file is in UTF-8
encoding or ANSI?
Thanks for the help in advance.
Regards,
Sam
Sorry, I mean by code, for example when I try to read a file's
content and print it to a text control in
Read the BOM?
module main;

import std.stdio;

enum Encoding
{
    UTF7,
    UTF8,
    UTF32,
    Unicode,
    BigEndianUnicode,
    ASCII
}

Encoding GetFileEncoding(string fileName)
{
    import std.file;
    auto bom = cast(ubyte[]) read(fileName, 4);
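The snippet above is cut off after reading the first four bytes. What follows is not a reconstruction of the rest of that post, just a minimal sketch of how a BOM check over those bytes could look. The `Bom` enum and the `hasPrefix` and `detectBom` names are hypothetical, and UTF-7 is left out to keep it short.

```d
/// Hypothetical result type for this sketch (not the enum from the post).
enum Bom { none, utf8, utf16le, utf16be, utf32le, utf32be }

/// True if `data` begins with the bytes in `prefix`.
bool hasPrefix(const(ubyte)[] data, const(ubyte)[] prefix)
{
    return data.length >= prefix.length && data[0 .. prefix.length] == prefix;
}

/// Classifies the leading bytes of a file by their byte order mark, if any.
Bom detectBom(const(ubyte)[] head)
{
    // UTF-32 is tested before UTF-16 because the UTF-32 LE mark
    // FF FE 00 00 begins with the UTF-16 LE mark FF FE.
    if (hasPrefix(head, [0xFF, 0xFE, 0x00, 0x00])) return Bom.utf32le;
    if (hasPrefix(head, [0x00, 0x00, 0xFE, 0xFF])) return Bom.utf32be;
    if (hasPrefix(head, [0xEF, 0xBB, 0xBF])) return Bom.utf8;
    if (hasPrefix(head, [0xFF, 0xFE])) return Bom.utf16le;
    if (hasPrefix(head, [0xFE, 0xFF])) return Bom.utf16be;
    return Bom.none;
}
```

With `import std.file : read;` it could be called as `detectBom(cast(const(ubyte)[]) read(fileName, 4))`. A result of `Bom.none` only says that no mark was found, so the caller still has to decide how to treat the file.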
On Tuesday, 22 July 2014 at 11:59:34 UTC, Alexandre wrote:
Read the BOM?
module main;

import std.stdio;

enum Encoding
{
    UTF7,
    UTF8,
    UTF32,
    Unicode,
    BigEndianUnicode,
    ASCII
}

Encoding GetFileEncoding(string fileName)
{
    import std.file;
Note that BOMs are optional and may not be present in a Unicode
file. Also, the presence of leading bytes that look like a BOM does
not necessarily mean that the file is encoded in some kind of Unicode.
http://www.architectshack.com/TextFileEncodingDetector.ashx
On Tuesday, 22 July 2014 at 15:53:23 UTC, FreeSlave wrote:
Note that BOMs are optional and may not be present in a Unicode
file. Also, the presence of leading bytes that look like a BOM does
not necessarily mean that the file is encoded in some kind