Marco Leise wrote:
>I split the discussion with Andrei about the benefit of a
>multi-threaded file copy routine to its own thread.
>This is about copying a file from and to the same HDD - a mechanical
>disk with seek times.
>
>My testing showed that Andrei is correct in assuming that the
>kernel can optimize the small reads and writes in a multi-threaded
>application. I had to use large buffers of up to 64 MB with my
>"single-threaded 100% synchronized writes" version before the simple
>multi-threaded version from Johannes Pfau showed 4.3% overhead during a
>512 MB copy operation.
>
>Some more things I've experimented with:
>
>- using only system API calls instead of D wrappers:
>   The difference is close to background noise
>
>- direct I/O for writing as used by databases:
>   This worked pretty well, but you may not want to use it for
>   reading as it bypasses the file cache. A file that is already
>   cached would be copied more slowly as a result.
>
>- memory maps:
>   Kernel memory is shared with userspace. This approach does
>   not allocate memory in the application. It just makes pages
>   of files directly accessible in user space. Once mapped, the
>   whole copy operation comes down to a single 'memcpy' call.
>
>- splice (zero-copy):
>   This is a Linux system call that allows memory operations inside
>   the kernel to be controlled from user space. The benefit is
>   that the CPU never copies this memory from kernel to
>   user space. Unfortunately the copy operation goes like this:
>   "source file -> pipe, pipe -> destination file"
>   A pipe buffer holds 64 KB by default. So it is not easy to move
>   large chunks of data in a single call to splice(). A 512 MB copy
>   is still divided into 16,000+ calls.
>
>Although splice looks promising it suffers from too many context
>switches. I had the best results with direct I/O and using
>synchronized writes for buffer sizes from 8 MB onwards, but I found
>this to be too complex and probably system dependent. So I settled
>on the memory-mapped version, which I rewrote using Phobos instead of
>POSIX calls, so it should run equally well on all platforms and is 5
>lines of code at its core:
>
>----------------------------------------------------------------------
>
>import std.datetime, std.exception, std.stdio, std.mmfile;
>
>void main(string[] args)
>{
>     if (!enforce(args.length == 3, {
>         stderr.writefln("%s SOURCE DEST", args[0]);
>     })) return;
>
>     auto sw = StopWatch();
>     sw.start();
>
>     auto src = new MmFile(args[1], MmFile.Mode.Read, 0, null, 0);
>     auto dst = new MmFile(args[2], MmFile.Mode.ReadWriteNew,
>                           src.length, null, src.length);
>     auto data = dst[];
>     data[] = src[];
>     dst.flush();
>
>     sw.stop();
>     writefln("Copied %s bytes in %s msec (%s kB/s)", src.length,
>             sw.peek().msecs,
>             1_000_000 * src.length / (1024 * sw.peek().usecs));
>}
>
>----------------------------------------------------------------------
>
>This leaves it up to the kernel how to interleave disk reads and
>writes.
>
>- Marco
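
For reference, the splice path described in the quoted message ("source
file -> pipe, pipe -> destination file") could look roughly like the
sketch below. This is not Marco's code. It is Linux-only, the splice()
prototype is declared by hand from the man page (I am assuming the D
runtime does not ship a binding), and spliceCopy and chunkSize are
hypothetical names. With 64 KB per pipe transfer, a 512 MB file needs
8192 chunks and two splice() calls per chunk, which is where the
16,000+ calls come from.

```d
// Linux-only sketch. splice() is declared by hand (an assumption); the
// prototype follows the splice(2) man page, with loff_t mapped to D's long.
import core.sys.posix.fcntl : O_CREAT, O_RDONLY, O_TRUNC, O_WRONLY, open;
import core.sys.posix.sys.types : ssize_t;
import core.sys.posix.unistd : close, pipe;
import std.conv : octal;
import std.exception : enforce, errnoEnforce;
import std.string : toStringz;

extern (C) ssize_t splice(int fdIn, long* offIn, int fdOut, long* offOut,
                          size_t len, uint flags) nothrow @nogc;

enum size_t chunkSize = 64 * 1024; // default pipe capacity on Linux

/// Hypothetical helper: copies src to dst through a pipe and returns
/// the number of bytes copied.
ulong spliceCopy(string src, string dst)
{
    immutable inFd = open(src.toStringz, O_RDONLY);
    errnoEnforce(inFd >= 0, "cannot open source");
    scope (exit) close(inFd);

    immutable outFd = open(dst.toStringz, O_WRONLY | O_CREAT | O_TRUNC,
                           octal!644);
    errnoEnforce(outFd >= 0, "cannot open destination");
    scope (exit) close(outFd);

    int[2] fds; // fds[0] is the read end, fds[1] the write end
    errnoEnforce(pipe(fds) == 0, "cannot create pipe");
    scope (exit) { close(fds[0]); close(fds[1]); }

    ulong total;
    while (true)
    {
        // Step 1: source file -> pipe, at most one chunk per call.
        immutable n = splice(inFd, null, fds[1], null, chunkSize, 0);
        errnoEnforce(n >= 0, "splice into pipe failed");
        if (n == 0)
            break; // end of file

        // Step 2: pipe -> destination file; drain what step 1 queued.
        for (ssize_t left = n; left > 0;)
        {
            immutable w = splice(fds[0], null, outFd, null, left, 0);
            errnoEnforce(w >= 0, "splice out of pipe failed");
            enforce(w > 0, "unexpected end of pipe");
            left -= w;
        }
        total += n;
    }
    return total;
}
```

Calling spliceCopy(args[1], args[2]) in place of the two MmFile lines
would give a comparable test program. The data never enters user space,
but every chunk still costs two system calls.
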
Related link:
http://www.devshed.com/c/a/BrainDump/Advising-the-Linux-Kernel-on-File-IO/

More related information:

The maximum Linux readahead size is 128 KB by default (but I think
that can be overridden).
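
One way to influence readahead per file is posix_fadvise(). The sketch
below is only an illustration: it assumes 64-bit Linux, declares the C
function and the Linux value of POSIX_FADV_SEQUENTIAL by hand rather
than relying on a runtime binding, and adviseSequential is a made-up
name. According to the posix_fadvise(2) man page, this advice doubles
the readahead window for the file on Linux.

```d
// Sketch for 64-bit Linux. The prototype and the constant are declared by
// hand (assumptions); off_t is taken to be a 64-bit long.
import std.exception : enforce;
import std.stdio : File;

extern (C) int posix_fadvise(int fd, long offset, long len, int advice)
    nothrow @nogc;

enum int POSIX_FADV_SEQUENTIAL = 2; // value used by Linux

/// Hypothetical helper: tells the kernel the file will be read front to back.
void adviseSequential(ref File f)
{
    // offset 0 with len 0 means "from the start to the end of the file".
    // posix_fadvise returns the error number directly, not -1 plus errno.
    immutable rc = posix_fadvise(f.fileno, 0, 0, POSIX_FADV_SEQUENTIAL);
    enforce(rc == 0, "posix_fadvise failed");
}
```

The call is purely advisory: the data read afterwards is the same, only
the kernel's prefetching may change.
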

It seems like there's no per-file limit for the write buffer. The
only limit seems to be the memory available for caching (for example,
in my case, with 3 GB of RAM, 1118 MB are available for the write cache).

-- 
Johannes Pfau
