On Fri, Jan 24, 2014 at 02:59:41AM +0000, Pádraig Brady wrote:
> On 01/24/2014 02:41 AM, Rodrigo Campos wrote:
> > On Fri, Jan 24, 2014 at 01:07:21AM +0000, Pádraig Brady wrote:
> >> On 01/24/2014 12:47 AM, Bernhard Voelker wrote:
> >>> Inspired by a recent post on util-linux ML [1], talking about turning
> >>> a file into a sparse file in-place, i.e. not using a 2-step approach
> >>> like `cp --sparse file file2 && mv file2 file`), I thought, hey, don't
> >>> we have this in coreutils already?
> >>
> >>> b)
> >>> Then, I tried
> >>> $ dd if=file of=file conv=sparse,notrunc
> >>> to avoid truncating the output file. That didn't corrupt the data,
> >>> but the file still was not sparse afterward.
> >>> What's the reason for conv=sparse not to work in this situation?
> >>> BTW: generally, writing to the same file seems to work, e.g.:
> >>> dd if=file of=file conv=ucase,notrunc
> >>
> >> To deallocate the zeros we'd have to use fallocate(FALLOC_FL_PUNCH_HOLE).
> >> Also for efficiency reasons it would be nice to detect holes efficiently.
> >> We can do this with the current fiemap code, but really we should try
> >> and use the new SEEK_HOLE functionality available in the kernel.
> >
> > I looked into this, but I think it won't. I even tried (maybe I did it wrong ?)
> > when implementing the tool to make a file sparse in-place, but it didn't report
> > the '\0's already allocated.
>
> Right, you need to manually detect those, which dd does with is_nul():
> http://git.sv.gnu.org/gitweb/?p=coreutils.git;a=blob;f=src/system.h;h=39750e82f#l499
>
> Then detected runs of zeros could be sparsified with FALLOC_FL_PUNCH_HOLE ?

You can use FALLOC_FL_PUNCH_HOLE where you know there are zeros, yes. But
FALLOC_FL_PUNCH_HOLE is not portable. Isn't it a problem to use this only on
Linux in dd?

> I've not tried this myself but would be optimistic it works on some file
> systems.
> If this wasn't supported then we'd stop immediately when if=of,
> or otherwise revert to seeking.
>
> Note also the caveats noted for the conv=sparse option:
>
>   Be careful when using
>   this option in conjunction with `conv=notrunc' or
>   `oflag=append'. With `conv=notrunc', existing data in the
>   output file corresponding to NUL blocks from the input, will
>   be untouched. With `oflag=append' the seeks performed will

Well, I think this promise of them being untouched might block implementing
in-place "sparsify" of files in dd with both flags active.

But, a question about policy: is it okay to implement Linux-only extensions
here?

Thanks a lot,
Rodrigo
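
For reference, the approach discussed above (detect all-zero blocks manually, then deallocate them with FALLOC_FL_PUNCH_HOLE) could look roughly like the sketch below. This is only an illustration under stated assumptions, not dd's code: block_is_zero() is a hypothetical stand-in for is_nul(), sparsify_in_place() is a made-up name, and the whole thing is Linux-only. If the file system cannot punch holes, fallocate() fails (typically with EOPNOTSUPP) and the sketch just reports the error, leaving it to the caller to stop or fall back to something else.

```c
#define _GNU_SOURCE            /* exposes fallocate() on glibc */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/falloc.h>      /* FALLOC_FL_PUNCH_HOLE, FALLOC_FL_KEEP_SIZE */
#include <stdbool.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for dd's is_nul(): true if buf[0..len) is all zero. */
static bool block_is_zero(const char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++)
        if (buf[i] != 0)
            return false;
    return true;
}

/* Hypothetical helper: punch a hole over every all-zero block of fd,
   keeping the file size unchanged.  Returns the number of bytes punched,
   or -1 with errno set (e.g. EOPNOTSUPP if hole punching is unsupported). */
static off_t sparsify_in_place(int fd, size_t block_size)
{
    assert(block_size > 0);

    char *buf = malloc(block_size);
    if (!buf)
        return -1;

    off_t offset = 0, punched = 0;
    ssize_t n;
    while ((n = pread(fd, buf, block_size, offset)) > 0) {
        if (block_is_zero(buf, (size_t) n)) {
            /* The range already reads as zeros, so punching it does not
               change the file's contents, only its allocation. */
            if (fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                          offset, n) != 0) {
                int saved = errno;
                free(buf);
                errno = saved;
                return -1;
            }
            punched += n;
        }
        offset += n;
    }

    int saved = errno;          /* preserve pread()'s errno across free() */
    free(buf);
    errno = saved;
    return n < 0 ? -1 : punched;
}
```

A caller would open the file read-write and call sparsify_in_place(fd, 4096), or use the file system's block size from st_blksize. The data read back afterward should be identical either way; only the allocated block count (as shown by `du` or `stat`) should shrink when hole punching is supported.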
