Re: How to know whether a file's encoding is ansi or utf8?

2014-07-24 Thread Kagamin via Digitalmars-d-learn
I first try to load the file (or the first 8 KB of it) as UTF-8 with encoding exceptions turned on. If I catch an exception, I reload it as ANSI; otherwise I assume it's valid UTF-8.
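A minimal sketch of that approach in D, assuming std.utf.validate as the "encoding exceptions" check. The names looksLikeUtf8 and dropPartialTail and the 8192-byte default are made up for this illustration. Cutting the file at a fixed size can split a multi-byte character, so the sketch trims a possibly incomplete sequence from the end before validating.

```d
import std.file : read;
import std.utf : validate, UTFException;

/// Hypothetical helper: drops a possibly incomplete multi-byte sequence
/// from the end of `data`, so a fixed-size cut does not cause a false failure.
const(ubyte)[] dropPartialTail(const(ubyte)[] data)
{
    size_t end = data.length;
    size_t steps = 0;
    // Step back over at most three continuation bytes (10xxxxxx).
    while (end > 0 && steps < 3 && (data[end - 1] & 0xC0) == 0x80)
    {
        --end;
        ++steps;
    }
    // If a lead byte (11xxxxxx) sits there, drop the whole trailing sequence.
    // This may also drop one complete character, which is harmless here.
    if (end > 0 && (data[end - 1] & 0xC0) == 0xC0)
        return data[0 .. end - 1];
    // No lead byte found: keep everything and let validation decide.
    return data;
}

/// Hypothetical helper: true if the first `limit` bytes decode as UTF-8.
bool looksLikeUtf8(string fileName, size_t limit = 8192)
{
    auto data = cast(const(ubyte)[]) read(fileName, limit);
    if (data.length == limit)       // the file was probably cut off
        data = dropPartialTail(data);
    try
    {
        validate(cast(const(char)[]) data);
        return true;
    }
    catch (UTFException)
    {
        return false;               // caller falls back to ANSI
    }
}
```

A pure ASCII file passes this check too, which is fine because ASCII is a subset of UTF-8. When the check fails, which code page "ANSI" means is left to the caller.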

Re: How to know whether a file's encoding is ansi or utf8?

2014-07-22 Thread Sam Hu via Digitalmars-d-learn
On Tuesday, 22 July 2014 at 09:50:00 UTC, Sam Hu wrote: Greetings! As per the subject, how can I know whether a file is in UTF-8 encoding or ANSI? Thanks for the help in advance. Regards, Sam
Sorry, I mean by code, for example, when I try to read a file's content and print it to a text control in

Re: How to know whether a file's encoding is ansi or utf8?

2014-07-22 Thread Alexandre via Digitalmars-d-learn
Read the BOM?

module main;
import std.stdio;
enum Encoding { UTF7, UTF8, UTF32, Unicode, BigEndianUnicode, ASCII };
Encoding GetFileEncoding(string fileName) {
    import std.file;
    auto bom = cast(ubyte[]) read(fileName, 4);
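The snippet above is cut off in the archive, so here is a separate, self-contained sketch of BOM sniffing rather than a reconstruction of it. The names Bom, hasPrefix, detectBom and detectFileBom are this sketch's own; the byte sequences are the standard Unicode byte order marks.

```d
import std.file : read;

/// Hypothetical result type for this sketch.
enum Bom { none, utf8, utf16le, utf16be, utf32le, utf32be }

// Standard byte order marks.
immutable ubyte[] markUtf8    = [0xEF, 0xBB, 0xBF];
immutable ubyte[] markUtf16le = [0xFF, 0xFE];
immutable ubyte[] markUtf16be = [0xFE, 0xFF];
immutable ubyte[] markUtf32le = [0xFF, 0xFE, 0x00, 0x00];
immutable ubyte[] markUtf32be = [0x00, 0x00, 0xFE, 0xFF];

/// True if `data` begins with the bytes in `prefix`.
bool hasPrefix(const(ubyte)[] data, const(ubyte)[] prefix)
{
    return data.length >= prefix.length && data[0 .. prefix.length] == prefix;
}

/// Classifies the leading bytes of a file by their byte order mark.
Bom detectBom(const(ubyte)[] head)
{
    // UTF-32 LE must be tested before UTF-16 LE: FF FE 00 00 starts with FF FE.
    if (hasPrefix(head, markUtf32le)) return Bom.utf32le;
    if (hasPrefix(head, markUtf32be)) return Bom.utf32be;
    if (hasPrefix(head, markUtf8))    return Bom.utf8;
    if (hasPrefix(head, markUtf16le)) return Bom.utf16le;
    if (hasPrefix(head, markUtf16be)) return Bom.utf16be;
    return Bom.none;
}

/// Reads at most four bytes from the file and classifies them.
Bom detectFileBom(string fileName)
{
    return detectBom(cast(const(ubyte)[]) read(fileName, 4));
}
```

Bom.none only means no mark was found; it says nothing about whether the file is UTF-8 or ANSI. A UTF-16 LE file whose first character is U+0000 would be reported as UTF-32 LE, which is an inherent ambiguity of the marks themselves.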

Re: How to know whether a file's encoding is ansi or utf8?

2014-07-22 Thread Sam Hu via Digitalmars-d-learn
On Tuesday, 22 July 2014 at 11:59:34 UTC, Alexandre wrote: Read the BOM? module main; import std.stdio; enum Encoding { UTF7, UTF8, UTF32, Unicode, BigEndianUnicode, ASCII }; Encoding GetFileEncoding(string fileName) { import std.file;

Re: How to know whether a file's encoding is ansi or utf8?

2014-07-22 Thread FreeSlave via Digitalmars-d-learn
Note that BOMs are optional and may not be present in a Unicode file. Also, the presence of leading bytes that look like a BOM does not necessarily mean that the file is encoded in some kind of Unicode.
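To illustrate both caveats, here is a small sketch that treats a BOM only as a hint and lets content validation make the final call. The names Guess, isValidUtf8 and guessUtf8 are hypothetical. For example, the bytes EF BB BF are also legal in a Windows-1252 file, where they display as "ï»¿".

```d
import std.utf : validate, UTFException;

/// Hypothetical three-way result for this sketch.
enum Guess { utf8WithBom, utf8, notUtf8 }

/// True if the whole buffer is well-formed UTF-8.
bool isValidUtf8(const(ubyte)[] data)
{
    try
    {
        validate(cast(const(char)[]) data);
        return true;
    }
    catch (UTFException)
    {
        return false;
    }
}

/// Uses the BOM only as a hint; the content check decides.
Guess guessUtf8(const(ubyte)[] data)
{
    immutable bool hasBom = data.length >= 3
        && data[0] == 0xEF && data[1] == 0xBB && data[2] == 0xBF;
    // Bytes that look like a BOM do not help if the rest is not UTF-8.
    if (!isValidUtf8(data))
        return Guess.notUtf8;
    // A missing BOM does not rule out UTF-8.
    return hasBom ? Guess.utf8WithBom : Guess.utf8;
}
```

A buffer that starts with EF BB BF but continues with a lone 0xE9 comes back as notUtf8, while valid UTF-8 without any mark still comes back as utf8.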

Re: How to know whether a file's encoding is ansi or utf8?

2014-07-22 Thread Alexandre via Digitalmars-d-learn
http://www.architectshack.com/TextFileEncodingDetector.ashx
On Tuesday, 22 July 2014 at 15:53:23 UTC, FreeSlave wrote: Note that BOMs are optional and may not be present in a Unicode file. Also, the presence of leading bytes that look like a BOM does not necessarily mean that the file is encoded in some kind