On 01/24/2014 03:12 AM, Rodrigo Campos wrote:
> On Fri, Jan 24, 2014 at 02:59:41AM +0000, Pádraig Brady wrote:
>> On 01/24/2014 02:41 AM, Rodrigo Campos wrote:
>>> On Fri, Jan 24, 2014 at 01:07:21AM +0000, Pádraig Brady wrote:
>>>> On 01/24/2014 12:47 AM, Bernhard Voelker wrote:
>>>>> Inspired by a recent post on util-linux ML [1], talking about turning
>>>>> a file into a sparse file in-place, i.e. not using a 2-step approach
>>>>> like `cp --sparse file file2 && mv file2 file`, I thought, hey, don't
>>>>> we have this in coreutils already?
>>>>
>>>>> b)
>>>>> Then, I tried
>>>>>   $ dd if=file of=file conv=sparse,notrunc
>>>>> to avoid truncating the output file. That didn't corrupt the data,
>>>>> but the file still was not sparse afterward.
>>>>> What's the reason for conv=sparse not to work in this situation?
>>>>> BTW: generally, writing to the same file seems to work, e.g.:
>>>>>   dd if=file of=file conv=ucase,notrunc
>>>>
>>>> To deallocate the zeros we'd have to use fallocate(FALLOC_FL_PUNCH_HOLE).
>>>> Also for efficiency reasons it would be nice to detect holes efficiently.
>>>> We can do this with the current fiemap code, but really we should try
>>>> and use the new SEEK_HOLE functionality available in the kernel.
>>>
>>> I looked into this, but I think it won't. I even tried (maybe I did it wrong?)
>>> when implementing the tool to make a file sparse in-place, but it didn't report
>>> the '\0's already allocated.
>>
>> Right, you need to manually detect those, which dd does with is_nul():
>> http://git.sv.gnu.org/gitweb/?p=coreutils.git;a=blob;f=src/system.h;h=39750e82f#l499
>>
>> Then detected runs of zeros could be sparsified with FALLOC_FL_PUNCH_HOLE ?
>
> You can use FALLOC_FL_PUNCH_HOLE where you know there are zeros, yes.
>
> But FALLOC_FL_PUNCH_HOLE is not portable. It's not a problem to use this only
> on linux in dd ?
>
>> I've not tried this myself but would be optimistic it works on some file systems.
>> If this wasn't supported then we'd stop immediately when if=of,
>> or otherwise revert to seeking.
>>
>> Note also the caveats noted for the conv=sparse option:
>>
>>   Be careful when using
>>   this option in conjunction with `conv=notrunc' or
>>   `oflag=append'. With `conv=notrunc', existing data in the
>>   output file corresponding to NUL blocks from the input, will
>>   be untouched. With `oflag=append' the seeks performed will
>
> Well, I think this promise of them being untouched might block to implement
> in-place "sparsify" of files on dd with both flags active.
>
> But, a question about policy: is it okay to implement linux-only extensions
> here ?

If the current system doesn't support in-place sparsifying, then we could
document that limitation along the same lines as the conv=notrunc case above.
If one wanted more portable guarantees about sparsifying a file,
then it would be best to use a temporary file anyway.

If there are other methods to punch a hole in a file on other systems,
they can be added as an option to coreutils without changing the interface.

thanks,
Pádraig.
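
For reference, the approach discussed in this thread (detect runs of NULs by hand, then deallocate them with FALLOC_FL_PUNCH_HOLE) could be sketched roughly as below. This is only an illustration under stated assumptions, not what dd does or will do: it is Linux-only, the names all_zero() and punch_zero_blocks() are made up for this example (all_zero() merely stands in for the is_nul() helper mentioned above), and the fixed block size is an arbitrary choice.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* fallocate() and FALLOC_FL_* on glibc */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/falloc.h>       /* FALLOC_FL_PUNCH_HOLE, FALLOC_FL_KEEP_SIZE */
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for is_nul(): nonzero if buf holds only NUL bytes. */
static int all_zero(const char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++)
        if (buf[i] != '\0')
            return 0;
    return 1;
}

/* Hypothetical helper: punch a hole over every blksz-sized chunk of fd
   that reads back as all NULs.  Returns the number of bytes punched, or
   -1 with errno set (typically EOPNOTSUPP when the file system cannot
   punch holes, in which case the caller has to fall back).  */
static off_t punch_zero_blocks(int fd, size_t blksz)
{
    assert(blksz > 0);
    char *buf = malloc(blksz);
    if (!buf)
        return -1;

    off_t off = 0, punched = 0;
    ssize_t n;
    while ((n = pread(fd, buf, blksz, off)) > 0) {
        if (all_zero(buf, (size_t) n)) {
            /* PUNCH_HOLE must be combined with KEEP_SIZE, so the file
               length never changes; only the allocation does.  */
            if (fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                          off, n) != 0) {
                int saved = errno;
                free(buf);
                errno = saved;
                return -1;
            }
            punched += n;
        }
        off += n;
    }

    int saved = errno;
    free(buf);
    errno = saved;
    return n < 0 ? -1 : punched;
}
```

A caller would open the file read-write and call something like punch_zero_blocks(fd, 4096). A -1 return is the "not supported" case, where one would stop or fall back to the temporary-file approach. The sketch also reads ranges that are already holes and punches them again, which is harmless but wasteful; skipping those with SEEK_DATA/SEEK_HOLE would be the obvious refinement.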
