On 01/24/2014 02:41 AM, Rodrigo Campos wrote:
> On Fri, Jan 24, 2014 at 01:07:21AM +0000, Pádraig Brady wrote:
>> On 01/24/2014 12:47 AM, Bernhard Voelker wrote:
>>> Inspired by a recent post on util-linux ML [1], talking about turning
>>> a file into a sparse file in-place, i.e. not using a 2-step approach
>>> like `cp --sparse file file2 && mv file2 file`), I thought, hey, don't
>>> we have this in coreutils already?
>>
>>> b)
>>> Then, I tried
>>>   $ dd if=file of=file conv=sparse,notrunc
>>> to avoid truncating the output file. That didn't corrupt the data,
>>> but the file still was not sparse afterward.
>>> What's the reason for conv=sparse not to work in this situation?
>>> BTW: generally, writing to the same file seems to work, e.g.:
>>>   dd if=file of=file conv=ucase,notrunc
>>
>> To deallocate the zeros we'd have to use fallocate(FALLOC_FL_PUNCH_HOLE).
>> Also for efficiency reasons it would be nice to detect holes efficiently.
>> We can do this with the current fiemap code, but really we should try
>> and use the new SEEK_HOLE functionality available in the kernel.
>
> I looked into this, but I think it won't. I even tried (maybe I did it wrong ?)
> when implementing the tool to make a file sparse in-place, but it didn't report
> the '\0's already allocated.

Right, you need to manually detect those, which dd does with is_nul():
http://git.sv.gnu.org/gitweb/?p=coreutils.git;a=blob;f=src/system.h;h=39750e82f#l499

Then detected runs of zeros could be sparsified with FALLOC_FL_PUNCH_HOLE?
I've not tried this myself, but would be optimistic it works on some file systems.
If this wasn't supported then we'd stop immediately when if=of,
or otherwise revert to seeking.

Note also the caveats documented for the conv=sparse option:

  Be careful when using this option in conjunction with
  `conv=notrunc' or `oflag=append'. With `conv=notrunc', existing data
  in the output file corresponding to NUL blocks from the input, will
  be untouched. With `oflag=append' the seeks performed will be
  ineffective. Similarly, when the output is a device rather than a
  file, NUL input blocks are not copied, and therefore this option is
  most useful with virtual or pre zeroed devices.

dd specifying if=of without conv=notrunc couldn't error(), for example,
when dealing with devices.

thanks,
Pádraig.
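
For illustration only, here is a minimal Python sketch of the approach discussed above: scan the data for block-aligned runs of NULs (roughly the job is_nul() does in dd), then deallocate each run with fallocate(FALLOC_FL_PUNCH_HOLE). The names nul_runs and punch_hole are hypothetical and do not come from coreutils, and the ctypes call assumes 64-bit Linux with glibc, where off_t is 64 bits. It is untested against real file systems.

```python
import ctypes
import os

# Flag values from <linux/falloc.h>.
FALLOC_FL_KEEP_SIZE = 0x01
FALLOC_FL_PUNCH_HOLE = 0x02  # must be ORed with FALLOC_FL_KEEP_SIZE


def nul_runs(data, block_size):
    """Return (offset, length) pairs for runs of all-NUL blocks in data."""
    runs = []
    start = None
    for off in range(0, len(data), block_size):
        block = data[off:off + block_size]
        if not any(block):              # every byte in this block is NUL
            if start is None:
                start = off
        elif start is not None:
            runs.append((start, off - start))
            start = None
    if start is not None:               # run extends to the end of data
        runs.append((start, len(data) - start))
    return runs


def punch_hole(fd, offset, length):
    """Deallocate [offset, offset + length) in fd (assumes 64-bit Linux).

    Raises OSError when the call fails, e.g. on file systems that do
    not support hole punching.
    """
    libc = ctypes.CDLL(None, use_errno=True)
    libc.fallocate.argtypes = [ctypes.c_int, ctypes.c_int,
                               ctypes.c_int64, ctypes.c_int64]
    libc.fallocate.restype = ctypes.c_int
    mode = FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE
    if libc.fallocate(fd, mode, offset, length) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
```

A caller would read the file in chunks, pass each chunk to nul_runs(), and call punch_hole() on the resulting ranges (adjusted by the chunk's file offset). An OSError from punch_hole() corresponds to the "not supported" case mentioned above, where one would stop or fall back to seeking.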
