Hello Andreas, Samuel and list,

sorry to pick up such an old thread, but I stumbled upon it while looking for an efficient way to "re-sparse" files that contain a lot of zero blocks but 1) have already been expanded or 2) are being expanded because they pass through pipes.

On Sun, Dec 30, 2007 at 10:19:54AM +0100, Andreas Schwab wrote:
> Samuel Thibault <[email protected]> writes:
>
> > Some time ago, I wrote a conv=sparse option for dd, attached is the
> > patch.
>
> How is it different from cp --sparse=always?

I'd say in enough ways to make such an option highly desirable.

a) "dd" will maintain an existing of=target file including the inode
   number, thus respecting existing hard links. "cp" will, depending on
   the other options given (e.g. "-a"), either maintain or break
   existing hard links to an existing target file.

b) "dd" could read a stream from a device or stdin and write it
   directly to a sparse file. There is no need to "dd" from e.g. a
   block device to a file and afterwards do a
   "cp --sparse=always file sparse-file". This will save a lot of disk
   space, I/O operations and time.

Example transcript for a):

 1 hlan...@jukebox:~/sparse$ ls -lis
 2 total 1984
 3 114692 496 -rw-r--r-- 4 hlangos hlangos 500000 2010-04-10 01:55 non-sparse
 4 114692 496 -rw-r--r-- 4 hlangos hlangos 500000 2010-04-10 01:55 non-sparse2
 5 114692 496 -rw-r--r-- 4 hlangos hlangos 500000 2010-04-10 01:55 non-sparse3
 6 114692 496 -rw-r--r-- 4 hlangos hlangos 500000 2010-04-10 01:55 non-sparse4
 7 114690   0 -rw-r--r-- 1 hlangos hlangos 500000 2010-04-10 01:56 sparse
 8 hlan...@jukebox:~/sparse$ cp sparse non-sparse
 9 hlan...@jukebox:~/sparse$ ls -lis
10 total 0
11 114692   0 -rw-r--r-- 4 hlangos hlangos 500000 2010-04-10 01:56 non-sparse
12 114692   0 -rw-r--r-- 4 hlangos hlangos 500000 2010-04-10 01:56 non-sparse2
13 114692   0 -rw-r--r-- 4 hlangos hlangos 500000 2010-04-10 01:56 non-sparse3
14 114692   0 -rw-r--r-- 4 hlangos hlangos 500000 2010-04-10 01:56 non-sparse4
15 114690   0 -rw-r--r-- 1 hlangos hlangos 500000 2010-04-10 01:56 sparse
16 hlan...@jukebox:~/sparse$ dd if=/dev/zero bs=1 count=500000 of=non-sparse
17 500000+0 records in
18 500000+0 records out
19 500000 bytes (500 kB) copied, 3.96621 s, 126 kB/s
20 hlan...@jukebox:~/sparse$ ls -lis
21 total 1984
22 114692 496 -rw-r--r-- 4 hlangos hlangos 500000 2010-04-10 01:57 non-sparse
23 114692 496 -rw-r--r-- 4 hlangos hlangos 500000 2010-04-10 01:57 non-sparse2
24 114692 496 -rw-r--r-- 4 hlangos hlangos 500000 2010-04-10 01:57 non-sparse3
25 114692 496 -rw-r--r-- 4 hlangos hlangos 500000 2010-04-10 01:57 non-sparse4
26 114690   0 -rw-r--r-- 1 hlangos hlangos 500000 2010-04-10 01:56 sparse
27 hlan...@jukebox:~/sparse$ cp -a sparse non-sparse
28 hlan...@jukebox:~/sparse$ ls -lis
29 total 1488
30 114691   0 -rw-r--r-- 1 hlangos hlangos 500000 2010-04-10 01:56 non-sparse
31 114692 496 -rw-r--r-- 3 hlangos hlangos 500000 2010-04-10 01:57 non-sparse2
32 114692 496 -rw-r--r-- 3 hlangos hlangos 500000 2010-04-10 01:57 non-sparse3
33 114692 496 -rw-r--r-- 3 hlangos hlangos 500000 2010-04-10 01:57 non-sparse4
34 114690   0 -rw-r--r-- 1 hlangos hlangos 500000 2010-04-10 01:56 sparse
35 hlan...@jukebox:~/sparse$

As you can see in line 30, a new "non-sparse" file has been created with a different inode number, while the link count of the other "non-sparse*" files has been reduced from 4 to 3.

I'd very much like to see the patch make it into "dd", though I think it might be better to integrate that function as "oflag=sparse" instead of "conv=sparse". After all, you don't convert data but change the way the output is done.

cheers
-henrik
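
P.S. To make points a) and b) a bit more concrete, here is a rough Python sketch of the behaviour I have in mind. It is not Samuel's patch and not how "dd" is implemented. The function name write_sparse and the 4096-byte block size are just assumptions for illustration. It reads a stream block by block, seeks over blocks that consist only of zero bytes, and writes everything else. The target is opened in place, so an existing inode and its hard links survive. Whether the skipped ranges really end up as holes depends on the filesystem.

```python
def write_sparse(src, target_path, block_size=4096):
    """Copy the binary stream `src` into `target_path`, skipping zero blocks.

    Hypothetical helper for illustration only; the name and the default
    block size are assumptions, not taken from the dd patch.
    """
    total = 0
    # "wb" truncates an existing file in place instead of replacing it,
    # so the inode number and any hard links to it are kept.
    with open(target_path, "wb") as out:
        while True:
            block = src.read(block_size)
            if not block:
                break
            if block.count(0) == len(block):
                # Only zero bytes: move the offset forward without writing.
                out.seek(len(block), 1)
            else:
                out.write(block)
            total += len(block)
        # Seeking alone does not grow the file, so set the final size
        # explicitly in case the stream ended with zero blocks.
        out.truncate(total)
    return total
```

For the pipe case one would pass sys.stdin.buffer as src, e.g. write_sparse(sys.stdin.buffer, "sparse-file") at the receiving end of a pipeline.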
