On 2015-03-23 at 12:59, krzaq wrote:
> I'd argue that joiner is intuitive enough, but I agree on byChunk. I am also
> baffled why this byLine/byChunk madness is necessary at all, it should be
> something like
> File("path").startsWith(s)
> or
> File("path").data.startswith(s)

Yeah, that would be useful, for example, to test magic values at the beginning
of files:

    string[] scripts;
    foreach (string path; dirEntries(topDir, SpanMode.depth))
        if (isFile(path) && File(path).startsWith("#!"))
            scripts ~= path;

but that's the simplest case of a bigger problem: here you only need the first 
few bytes, so you don't want to read the whole file, nor anything more than a 
single sector.
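For the record, this can already be done by hand with `rawRead` on a small buffer, reading only as many bytes as the magic value needs. A minimal sketch (the helper name `fileStartsWith` is made up, it's not in Phobos):

```d
import std.stdio : File;

/// True if the file at `path` begins with `magic`;
/// reads at most magic.length bytes from disk.
bool fileStartsWith(string path, const(ubyte)[] magic)
{
    auto f = File(path, "rb");
    auto buf = new ubyte[magic.length];
    // rawRead returns the slice actually filled, which is shorter
    // than buf at EOF; array comparison checks length and contents.
    return f.rawRead(buf) == magic;
}
```

It works, but it's exactly the kind of boilerplate the proposed `File("path").startsWith(s)` would hide.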

OTOH, there are also file formats like ZIP that put the metadata at the end of 
the file and scatter the rest of the data all over the place, locating it via 
offset information. You don't need to read everything just to grab the 
metadata. But when I had a look at the sources of libzip, I went crazy seeing 
all the code that performs tons of file seeking, reading into buffers, and 
juggling them[1].
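To illustrate what "read only what you need" means for ZIP: the end-of-central-directory record sits in the last 22 bytes plus an optional comment of at most 65535 bytes, so you seek near the end and scan backwards for its signature. A rough sketch, not production-grade (no validation of the record's comment-length field):

```d
import core.stdc.stdio : SEEK_END;
import std.stdio : File;

/// Absolute offset of the ZIP end-of-central-directory signature
/// ("PK\x05\x06"), or -1 if not found. Reads only the file's tail.
long findEocd(string path)
{
    auto f = File(path, "rb");
    immutable ulong fileSize = f.size;
    // EOCD record: 22 bytes minimum, plus a comment of up to 65535 bytes.
    immutable size_t tailLen =
        cast(size_t) (fileSize < 22 + 65_535 ? fileSize : 22 + 65_535);
    f.seek(-cast(long) tailLen, SEEK_END);
    auto tail = f.rawRead(new ubyte[tailLen]);
    // Scan backwards: the signature closest to the end is the real one.
    foreach_reverse (i; 0 .. tail.length >= 22 ? tail.length - 21 : 0)
        if (tail[i] == 0x50 && tail[i + 1] == 0x4B
                && tail[i + 2] == 0x05 && tail[i + 3] == 0x06)
            return cast(long) (fileSize - tailLen + i);
    return -1;
}
```

Even this tidy version needs explicit seek/rawRead bookkeeping; libzip does the same dance many times over, for every structure in the file.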

D's std.zip takes a simple approach and doesn't deal with that at all; it reads 
the whole file into memory. That makes the algorithm more clearly visible, but 
at the same time it makes the module completely useless for handling archives 
larger than the available memory, and over-the-top if all you wanted was to 
extract a single file from the archive or only read the directory structure.
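Concretely, std.zip's only entry point takes the complete archive as one buffer, so even just listing the directory pays for a full read:

```d
import std.file : read;
import std.stdio : writeln;
import std.zip : ZipArchive;

/// List an archive's directory. Note that read() pulls the ENTIRE
/// file into memory first; std.zip offers no lazier way in.
void listArchive(string path)
{
    auto archive = new ZipArchive(read(path));
    foreach (name, member; archive.directory)
        writeln(name, ": ", member.expandedSize, " bytes");
}
```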

So, how do you envision something representing a file, i.e. a mix of 
"BufferedRange" and "SeekableRange", that would neatly handle buffering and 
seeking without making you drop down to stdc IO, or want to shoot yourself 
when you look at the code?
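One conceivable shape, purely a sketch with made-up names (nothing like this exists in Phobos): a sliding buffered window plus absolute seeking, which would cover both the magic-bytes case and ZIP's jump-to-the-end case. A toy in-memory implementation shows the intended semantics:

```d
/// Hypothetical sketch only: a buffered view with random access.
interface SeekableBuffered
{
    const(ubyte)[] window();         /// currently buffered bytes
    const(ubyte)[] ensure(size_t n); /// buffer at least n bytes (less at EOF)
    void release(size_t n);          /// drop n bytes from the window's front
    void seekTo(ulong offset);       /// reposition at an absolute offset
    ulong length();                  /// total size, for end-relative seeks
}

/// Toy backing store over an in-memory array, just to show the window
/// semantics; a real one would refill the window from disk on demand.
class MemoryStream : SeekableBuffered
{
    private const(ubyte)[] data;
    private size_t pos, end;

    this(const(ubyte)[] data) { this.data = data; }

    const(ubyte)[] window() { return data[pos .. end]; }

    const(ubyte)[] ensure(size_t n)
    {
        import std.algorithm.comparison : max, min;
        end = min(max(end, pos + n), data.length);
        return window();
    }

    void release(size_t n)
    {
        import std.algorithm.comparison : min;
        pos = min(pos + n, end);
    }

    void seekTo(ulong offset) { pos = cast(size_t) offset; end = pos; }
    ulong length() { return data.length; }
}
```

A format parser would then call `ensure` instead of hand-rolling rawRead loops, and `seekTo(length() - tailLen)` for the ZIP case, with the buffering hidden behind the window.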


[1] for your amusement: http://hg.nih.at/libzip/file/78b8e3fa72a0/lib/zip_open.c
