On Monday, 16 January 2017 at 14:47:23 UTC, Era Scarecrow wrote:
On Sunday, 15 January 2017 at 19:48:04 UTC, Nestor wrote:
I see. So, to correct my original question:
How could I parse a UTF-16LE file line by line (producing a
proper string in each iteration) without loading the entire
file into memory?
You could... roll your own? Although if you wanted UTF-8
output instead, it would require a second pass or, better yet,
changing how i is iterated.
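import std.stdio;

// Reads one line of raw UTF-16LE bytes from inp; no transcoding is done.
char[] getLine16LE(File inp = stdin) {
    static char[1024*4] buffer; // 4k reusable buffer, NOT thread safe
    int i;
    // Read one 2-byte code unit at a time until EOF or a full buffer.
    while (i + 2 <= buffer.length && inp.rawRead(buffer[i .. i+2]) != null) {
        if (buffer[i] == '\n') // only the low byte is checked
            break;
        i += 2;
    }
    return buffer[0 .. i];
}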
Thanks, but unfortunately this function does not produce proper
UTF-8 strings. In fact, the output even starts with the
BOM. It also doesn't handle CRLF, and even for LF-terminated lines
it doesn't seem to work for lines other than the first.
I guess I have to code encoding detection, buffered reads, and
transcoding by hand. The only problem is that the result could be
sub-optimal, which is why I was looking for a built-in solution.
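
For reference, the hand-rolled route can stay fairly small. What follows is only a sketch of one possible approach, not a built-in facility: the names Utf16LeLines and decodeLine are made up for illustration, and it assumes the file really is UTF-16LE (it strips a leading BOM rather than detecting the encoding). It reads the file in 4k chunks with File.rawRead, splits on LF code units, drops a trailing CR, and lets std.conv.to transcode each line to a UTF-8 string.

```d
import std.stdio : File;
import std.conv : to;

/// Hypothetical input range: yields each line of a UTF-16LE file as UTF-8.
struct Utf16LeLines
{
    private File file;
    private ubyte[] pending;     // bytes read from disk but not yet emitted
    private size_t scanned;      // even offset up to which pending was searched
    private string current;      // line currently exposed by front
    private bool done;
    private bool atStart = true; // true until the first line has been decoded

    this(File f)
    {
        file = f;
        popFront();              // prime the first line
    }

    @property bool empty() const { return done; }
    @property string front() const { return current; }

    void popFront()
    {
        ubyte[4096] chunk;
        for (;;)
        {
            // Look for an LF code unit (0x0A 0x00) on a 2-byte boundary.
            for (; scanned + 1 < pending.length; scanned += 2)
            {
                if (pending[scanned] == 0x0A && pending[scanned + 1] == 0x00)
                {
                    current = decodeLine(pending[0 .. scanned]);
                    pending = pending[scanned + 2 .. $];
                    scanned = 0;
                    return;
                }
            }
            auto got = file.rawRead(chunk[]);
            if (got.length == 0)
            {
                // End of file: emit trailing text that has no final newline.
                if (pending.length == 0) { done = true; return; }
                current = decodeLine(pending);
                pending = null;
                scanned = 0;
                return;
            }
            pending ~= got;      // copies the chunk, so it can be reused
        }
    }

    private string decodeLine(const(ubyte)[] bytes)
    {
        // Assemble little-endian code units; an odd trailing byte is ignored.
        auto units = new wchar[bytes.length / 2];
        foreach (i, ref u; units)
            u = cast(wchar)(bytes[2 * i] | (bytes[2 * i + 1] << 8));

        const(wchar)[] line = units;
        if (atStart)
        {
            atStart = false;
            if (line.length && line[0] == 0xFEFF)
                line = line[1 .. $];      // strip the BOM
        }
        if (line.length && line[$ - 1] == '\r')
            line = line[0 .. $ - 1];      // CRLF: drop the CR
        return line.to!string;            // transcodes UTF-16 to UTF-8
    }
}
```

Usage would look something like foreach (line; Utf16LeLines(File("input.txt", "rb"))) writeln(line); with the file opened in binary mode. Memory use is bounded by the longest line plus one chunk, and a line with invalid UTF-16 (for example a lone surrogate) makes the conversion throw instead of returning a string.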