On Friday, 11 January 2019 at 19:45:05 UTC, Head Scratcher wrote:
> How can I read the file and convert the string into proper
> UTF-8 in memory without an exception?
Use regular read() instead of readText, and then convert it using
another function.
Phobos has std.encoding which offers a transcode function:
http://dpldocs.info/experimental-docs/std.encoding.transcode.html
You would cast the bytes from read() to the string type of the
input encoding:
---
import std.encoding;
import std.file;

void main() {
    string s;
    // the read here replaces your readText
    // and the cast tells what encoding it has now
    transcode(cast(Latin1String) read("ooooo.d"), s);

    import std.stdio;
    // and after that, the utf-8 string is in s
    writeln(s);
}
---
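If some of your files might already be valid UTF-8, a variation on
this is to validate first and only transcode when validation fails.
A rough sketch, assuming Latin-1 is the only other encoding you will
see; toUtf8 is a made-up helper name, not something in Phobos:
---
import std.encoding : Latin1String, transcode;
import std.utf : validate, UTFException;

// Hypothetical helper (not part of Phobos): turns raw file bytes into
// a UTF-8 string, assuming the input is either UTF-8 or Latin-1.
string toUtf8(immutable(ubyte)[] raw) {
    auto asUtf8 = cast(string) raw;
    try {
        // validate throws UTFException on malformed UTF-8
        validate(asUtf8);
        return asUtf8;
    } catch (UTFException) {
        // not valid UTF-8, so treat the bytes as Latin-1 instead
        string result;
        transcode(cast(Latin1String) raw, result);
        return result;
    }
}

void main() {
    import std.file : read;
    import std.stdio : writeln;

    // read returns void[], so cast it to the byte type the helper takes
    writeln(toUtf8(cast(immutable(ubyte)[]) read("ooooo.d")));
}
---
Keep in mind this is only a guess at the encoding: some Latin-1 byte
sequences are also valid UTF-8, and those would be kept as-is.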
Or, since I didn't like the Phobos module for my web scraping
needs, I made my own:
https://github.com/adamdruppe/arsd/blob/master/characterencodings.d
Just drop that file in your build and call this function:
http://dpldocs.info/experimental-docs/arsd.characterencodings.convertToUtf8Lossy.html
---
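import arsd.characterencodings;
import std.file;

void main() {
    // read returns void[]; the cast assumes the function takes
    // immutable(ubyte)[], so check the linked docs if it complains
    string s = convertToUtf8Lossy(cast(immutable(ubyte)[]) read("ooooo.d"), "iso_8859-1");
    // you can now use s
}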
---
Just change the encoding string to whatever encoding the file
actually has right now.
But it is possible neither my module nor the Phobos one has the
encoding you need...
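If it turns out to be a single-byte encoding that neither one
supports, converting by hand is not much code. A minimal sketch,
assuming you can build a 256-entry table that maps each byte to its
Unicode code point; decodeSingleByte is a made-up name, and the table
below is just the identity mapping, which happens to be Latin-1:
---
// Hypothetical helper: decodes a single-byte encoding through a
// 256-entry table mapping each byte value to a Unicode code point.
string decodeSingleByte(const(ubyte)[] raw, const dchar[256] table) {
    string result;
    foreach (b; raw)
        result ~= table[b]; // appending a dchar to a string encodes it as UTF-8
    return result;
}

void main() {
    import std.file : read;
    import std.stdio : writeln;

    // Identity table: in Latin-1 each byte value is its own code point.
    // For another encoding you would fill in that encoding's mapping.
    dchar[256] latin1;
    foreach (i, ref c; latin1)
        c = cast(dchar) i;

    writeln(decodeSingleByte(cast(const(ubyte)[]) read("ooooo.d"), latin1));
}
---
This does not help with multi-byte encodings, which need a real
decoder rather than a lookup table.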