Greetings,

Many years ago, I developed a set of patches that add a number of features to cp and md5sum, including multi-threading, partial copies, direct I/O, asynchronous reads/writes, checksum during copy, multi-host ssh-/MPI-based copies, Lustre support, preallocation, files over stdin, and stats output. These offer significant performance benefits along with greater flexibility for other uses (in particular, the partial copy and files over stdin features). You can see details here:
https://pkolano.github.io/projects/mutil.html

The code is stable and has been used in production for almost 10 years at the NASA Advanced Supercomputing division to transfer many petabytes of scientific data. It is also used as one of the underlying transports in a separate project (https://pkolano.github.io/projects/shift.html) to provide high-performance tar creation/extraction and integrity verification/rectification.

I do not have time to keep it in sync with every coreutils release, so it is still based on 8.22, but it is usually straightforward to bring it up to date.

I wanted to ask whether there is any interest in incorporating some or all of these patches into the mainline cp/md5sum code so that the wider coreutils user base can benefit from them. I can assist in updating the code to the latest coreutils, separating out the features of interest, etc. Please let me know if there is any interest.

Thanks,
--Paul