On 2015-03-23 at 12:59, krzaq wrote:
> I'd argue that joiner is intuitive enough, but I agree on byChunk. I am also
> baffled why this byLine/byChunk madness is necessary at all, it should be
> something like
> File("path").startsWith(s)
> or
> File("path").data.startswith(s)
Yeah, that would be useful, for example, for testing magic numbers at the
beginning of files:
string[] scripts;
foreach (string path; dirEntries(topDir, SpanMode.depth))
    if (isFile(path) && File(path).startsWith("#!"))
        scripts ~= path;
but that's the simplest case of a bigger problem: here you only need the
first few bytes, so you don't want to read the whole file, nor anything
more than a single sector.
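In the meantime, a small helper along these lines (the name fileStartsWith is mine, not anything in Phobos) gets you the prefix check while reading only as many bytes as the prefix itself:

```d
import std.file : remove, write;
import std.stdio : File;

/// Hypothetical helper: true if the file at `path` begins with `prefix`.
/// Reads only prefix.length bytes, never the whole file.
bool fileStartsWith(string path, const(ubyte)[] prefix)
{
    auto f = File(path, "rb");
    auto buf = new ubyte[prefix.length];
    auto got = f.rawRead(buf);   // returns the slice actually read;
                                 // shorter than buf on a short file
    return got == prefix;        // array == compares contents in D
}

void main()
{
    write("demo.sh", "#!/bin/sh\necho hi\n");
    scope (exit) remove("demo.sh");
    assert(fileStartsWith("demo.sh", cast(const(ubyte)[]) "#!"));
    assert(!fileStartsWith("demo.sh", cast(const(ubyte)[]) "PK"));
}
```

The same shape works for your dirEntries loop: replace the wished-for File(path).startsWith("#!") with fileStartsWith(path, ...).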
OTOH, there are also file formats like ZIP that put the metadata at the end of
the file and scatter the rest of the data all over the place, tied together by
offset information. You don't need to read everything just to grab the
metadata. But when I had a look at the sources of libzip, I went crazy seeing
all the code that performs tons of file seeking, reads into buffers, and
juggles them[1].
D's std.zip takes a simple approach and doesn't deal with any of that: it reads
the whole file into memory. That makes the algorithm more clearly visible,
but at the same time it makes the module completely useless if you want to
handle archives larger than the available memory, and over-the-top if all you
wanted was to extract a single file from the archive or only read the
directory structure.
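For contrast, here is a seek-based sketch of the first thing any ZIP reader has to do: locate the end-of-central-directory record (signature "PK\x05\x06") by reading only the file's tail. The 64 KiB bound is the format's maximum comment length; findEocd and everything else here is my invention, not std.zip API:

```d
import std.file : remove, write;
import std.stdio : File;

/// Return the offset of the ZIP end-of-central-directory record,
/// or -1 if not found. Reads at most ~64 KiB from the end of the file.
long findEocd(string path)
{
    auto f = File(path, "rb");
    immutable fileSize = f.size;
    immutable tailLen = fileSize < 22 + 65_536 ? fileSize : 22 + 65_536;
    f.seek(cast(long)(fileSize - tailLen));          // skip the bulk of the archive
    auto tail = f.rawRead(new ubyte[cast(size_t) tailLen]);
    if (tail.length < 4)
        return -1;
    // The record may be followed by a comment, so scan backwards.
    foreach_reverse (i; 0 .. tail.length - 3)
        if (tail[i] == 'P' && tail[i + 1] == 'K'
            && tail[i + 2] == 0x05 && tail[i + 3] == 0x06)
            return cast(long)(fileSize - tailLen + i);
    return -1;
}

void main()
{
    // Minimal fake: 10 junk bytes, then a bare EOCD signature.
    auto bytes = new ubyte[](32);
    bytes[10] = 'P'; bytes[11] = 'K'; bytes[12] = 0x05; bytes[13] = 0x06;
    write("fake.zip", bytes);
    scope (exit) remove("fake.zip");
    assert(findEocd("fake.zip") == 10);
}
```

Even this tiny piece already needs seek + buffered read + backwards scan, which is exactly the kind of code libzip is full of.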
So, how do you envision something representing a file, i.e. a mix of
"BufferedRange" and "SeekableRange", that would neatly handle buffering and
seeking, without you dropping down to stdc I/O or wanting to shoot yourself
when you look at the code?
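Not an answer, but to make the question concrete, here is one entirely hypothetical shape such a thing could take: an input range of buffers that can also seek. It's backed by memory here purely to show the interface; a real one would wrap a File and refill its buffer on demand:

```d
import std.algorithm.comparison : min;

/// Hypothetical sketch, not a Phobos proposal: sequential chunked
/// reading (empty/front/popFront) plus random access (seek/tell).
struct SeekableBufferedRange
{
    const(ubyte)[] data;      // stands in for the underlying file
    size_t pos;
    size_t chunkSize = 4096;

    bool empty() const { return pos >= data.length; }
    const(ubyte)[] front() const
    {
        // Current buffer; valid until the next popFront or seek.
        return data[pos .. min(pos + chunkSize, data.length)];
    }
    void popFront() { pos = min(pos + chunkSize, data.length); }

    void seek(size_t offset) { pos = min(offset, data.length); }
    size_t tell() const { return pos; }
}

void main()
{
    auto r = SeekableBufferedRange(cast(const(ubyte)[]) "hello world", 0, 4);
    assert(r.front == cast(const(ubyte)[]) "hell");
    r.seek(6);
    assert(r.front == cast(const(ubyte)[]) "worl"); // 4-byte chunk at offset 6
}
```

The open question is exactly the one above: whether a parser written against such an interface stays readable, or degenerates into the same seek-and-buffer juggling as libzip.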
[1] for your amusement: http://hg.nih.at/libzip/file/78b8e3fa72a0/lib/zip_open.c