On Friday, 11 January 2019 at 19:45:05 UTC, Head Scratcher wrote:
> How can I read the file and convert the string into proper UTF-8 in memory without an exception?

Use the regular read() instead of readText, and then convert the result using another function.

Phobos has std.encoding which offers a transcode function:

http://dpldocs.info/experimental-docs/std.encoding.transcode.html

You would cast the raw bytes to the string type of the input encoding:

---
import std.encoding;
import std.file;

void main() {
        string s;
        // the read here replaces your readText
        // and the cast tells what encoding it has now
        transcode(cast(Latin1String) read("ooooo.d"), s);
        import std.stdio;
        // and after that, the utf-8 string is in s
        writeln(s);
}
---
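
If some of your files are already UTF-8 and others are not, one possible approach is to validate first and only transcode on failure. This is just a sketch: toUtf8OrLatin1 is a name I made up for illustration, and it assumes anything that is not valid UTF-8 is Latin-1, which may well be wrong for your data (Latin-1 bytes that happen to form valid UTF-8 would slip through unchanged).

---
import std.encoding : Latin1String, transcode;
import std.utf : UTFException, validate;

// Hypothetical helper, not part of Phobos: returns the bytes as UTF-8,
// assuming Latin-1 whenever they do not validate as UTF-8.
string toUtf8OrLatin1(immutable(ubyte)[] raw) {
        auto asUtf8 = cast(string) raw;
        try {
                // validate throws UTFException on malformed UTF-8
                validate(asUtf8);
                return asUtf8;
        } catch (UTFException) {
                string result;
                // same cast-and-transcode step as in the example above
                transcode(cast(Latin1String) raw, result);
                return result;
        }
}

void main() {
        // "café" encoded as Latin-1: the lone 0xE9 is not valid UTF-8
        immutable(ubyte)[] latin1 = [0x63, 0x61, 0x66, 0xE9];
        assert(toUtf8OrLatin1(latin1) == "caf\u00E9");

        // the same word already encoded as UTF-8 passes through untouched
        immutable(ubyte)[] utf8 = [0x63, 0x61, 0x66, 0xC3, 0xA9];
        assert(toUtf8OrLatin1(utf8) == "caf\u00E9");
}
---

With a file you would call it as toUtf8OrLatin1(cast(immutable(ubyte)[]) read("ooooo.d")).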


Or, since I didn't like the Phobos module for my web scraping needs, I made my own:

https://github.com/adamdruppe/arsd/blob/master/characterencodings.d

Just drop that file in your build and call this function:

http://dpldocs.info/experimental-docs/arsd.characterencodings.convertToUtf8Lossy.html

---
import arsd.characterencodings;
import std.file;

void main() {
     // read() returns void[], so cast it to raw bytes before passing it in
     string s = convertToUtf8Lossy(cast(immutable(ubyte)[]) read("ooooo.d"), "iso_8859-1");
     // you can now use s
}
---

Just change the encoding string to whatever encoding the file is actually in.



But it is possible neither my module nor the Phobos one has the encoding you need...
