On Tuesday, 17 June 2014 at 06:44:40 UTC, Jacob Carlborg wrote:
On 17/06/14 04:27, jicman wrote:

Greetings!

I have a bunch of files: plain ASCII, UTF-8, and UTF-16, with and without a BOM (Byte Order Mark). I had, "I thought", a nice way of figuring out what encoding a file used (ASCII, UTF-8, or UTF-16) when the BOM was missing: read the content and apply the std.utf.validate function to the char[] or wchar[] string. The problem is that lately I keep hitting a wall with the "array cast misalignment" error when casting to wchar[], i.e.

auto text = cast(string) file.read();
wchar[] temp = cast(wchar[]) text;
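For reference, that runtime error fires whenever the byte count of the source array is not a multiple of wchar.sizeof (2 bytes), which any odd-length ASCII or UTF-8 file will trigger. A minimal sketch of a guard (the file name is made up):

```d
import std.file : read;
import std.stdio : writeln;

void main()
{
    // read() returns void[]; view it as raw bytes first.
    auto raw = cast(ubyte[]) read("input.txt"); // hypothetical file

    if (raw.length % 2 == 0)
    {
        // Safe: the length divides evenly into 2-byte wchar units,
        // so the cast cannot throw the misalignment error.
        wchar[] temp = cast(wchar[]) raw;
        writeln(temp.length, " UTF-16 code units");
    }
    else
    {
        // An odd byte count cannot be well-formed UTF-16 at all,
        // so treat the data as UTF-8/ASCII instead of casting.
        writeln("odd length; not UTF-16");
    }
}
```

The point is simply that the parity check belongs *before* the cast; an odd length already rules UTF-16 out, so the failing cast carries no information you did not have.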

How about casting to "wchar[]" directly, instead of going through "string"?

What would be the correct process to find out a text file encoding?

Any help would be greatly appreciated. This is the code that I have
right now...

I don't know if you use Tango [1], but it has a module [2] to help with this sort of thing.

[1] http://dsource.org/projects/tango
[2] http://dsource.org/projects/tango/docs/stable/tango.io.UnicodeFile.html

Thanks, but I can't use Tango. Historically, Tango (originally Mango) and Phobos did not play well together, and by the time Tango came along my project was already written entirely with Phobos, so I have to continue using Phobos.

josé
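Since the thread settles on a Phobos-only solution, here is a minimal sketch of the detection process being asked about: sniff the BOM first, then fall back to std.utf.validate on the raw bytes. The enum, the function name, and the little-endian host assumption in the BOM-less UTF-16 branch are mine, not from the thread:

```d
import std.algorithm.searching : all;
import std.utf : validate, UTFException;

enum Encoding { ascii, utf8, utf16le, utf16be, unknown }

Encoding detectEncoding(const(ubyte)[] raw)
{
    // 1. A BOM, when present, settles the question immediately.
    if (raw.length >= 3 && raw[0] == 0xEF && raw[1] == 0xBB && raw[2] == 0xBF)
        return Encoding.utf8;
    if (raw.length >= 2 && raw[0] == 0xFF && raw[1] == 0xFE)
        return Encoding.utf16le;
    if (raw.length >= 2 && raw[0] == 0xFE && raw[1] == 0xFF)
        return Encoding.utf16be;

    // 2. No BOM: pure ASCII is a strict subset of UTF-8,
    //    so check the cheap case first.
    if (raw.all!(b => b < 0x80))
        return Encoding.ascii;

    // 3. Fall back to validating the bytes as UTF-8.
    try
    {
        validate(cast(const(char)[]) raw);
        return Encoding.utf8;
    }
    catch (UTFException) {}

    // 4. Try UTF-16 only when the length is even -- this is also
    //    what avoids the "array cast misalignment" error. Without
    //    a BOM we can only guess the byte order; assume the host
    //    (little-endian here) order.
    if (raw.length % 2 == 0)
    {
        try
        {
            validate(cast(const(wchar)[]) raw);
            return Encoding.utf16le;
        }
        catch (UTFException) {}
    }

    return Encoding.unknown;
}
```

Usage would be something like `detectEncoding(cast(const(ubyte)[]) std.file.read("foo.txt"))`. Note the ordering matters: validate-as-UTF-8 must run before the UTF-16 guess, because many UTF-16 byte sequences also happen to validate as UTF-8 garbage-free, but not vice versa.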
