So I'm cleaning up an old user home directory, archiving it with tar 
--sparse -czf <tarball.tar.gz> <home_dir>, and verifying the archive 
with tar --compare -zf <tarball.tar.gz>.  This user has only about a gig 
or two of data.  I leave and let tar do its archiving, and come back 
the next day when I have time to watch the compare.

But the compare is taking hours, and I suspect the archive took a lot of 
time as well.  I do an strace, and all I see is something like:

read(4,"\0\0\0\0\0\0\0\0\<repeated>",512)=512
read(4,"\0\0\0\0\0\0\0\0\<repeated>",512)=512
read(4,"\0\0\0\0\0\0\0\0\<repeated>",512)=512

These lines repeat over and over; <repeated> stands for a long run of 
NUL bytes whose exact count I don't remember.  Probably enough to fill 
a 512-byte buffer.

I stop the compare and restart it with --verbose, and I find it stalls 
on the firefox profile's cookies.sqlite file.  du shows the file is only 
about 200MB, but du --apparent-size shows it at just over 1 TB.  "Well, 
there's your problem," I think to myself.
So I delete the file, re-archive and compare, and move on with life.  
But it occurs to me, if I ever have a similar situation where time is 
important and I can't just delete the ridiculously large sparse file, 
what can I do instead?
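
For next time, here is a rough sketch of how I might spot such files 
before archiving, by doing the same comparison as du versus 
du --apparent-size across a whole tree.  The function names, the 10x 
ratio, and the 1 MiB minimum size are my own arbitrary choices, and it 
assumes a Unix-like system where st_blocks is counted in 512-byte 
units.  A compressing filesystem could also trip the check, so treat 
the hits as candidates only.

```python
import os
import stat


def allocated_bytes(st):
    # Assumption: st_blocks is counted in 512-byte units, as on Linux.
    return st.st_blocks * 512


def sparseness(apparent, allocated):
    # Ratio of apparent size to allocated size; avoid dividing by zero
    # for files with no allocated blocks at all.
    return apparent / max(allocated, 1)


def find_sparse_files(root, min_ratio=10.0, min_size=1 << 20):
    """Return (path, apparent_size, allocated_size) for suspicious files."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue  # vanished or unreadable; skip it
            # Only regular files of a meaningful size are interesting.
            if not stat.S_ISREG(st.st_mode) or st.st_size < min_size:
                continue
            alloc = allocated_bytes(st)
            if sparseness(st.st_size, alloc) >= min_ratio:
                hits.append((path, st.st_size, alloc))
    return hits
```

Calling find_sparse_files() on the home directory should list files 
like that cookies.sqlite, which could then be dealt with separately 
from the rest of the archive.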

I'm thinking this was so terribly slow because tar was making 1TB / 
512B (roughly two billion) read() system calls, probably yielding to 
another runnable process on each one, thus introducing at least two 
context switches and more latency (it's a single-processor system).  
I'm thinking I might reduce the cost by increasing the 
--blocking-factor of the archive, though that means re-archiving and 
extra complexity when restoring the archive.
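
To put numbers on that guess, here is a back-of-the-envelope sketch.  
It takes "just over 1 TB" as exactly 1 TiB, and it assumes each read() 
covers one full record of blocking-factor x 512 bytes, which is only my 
conjecture about how tar behaves here.  The read_calls helper is just 
for illustration.

```python
def read_calls(total_bytes, bytes_per_read):
    """Number of read() calls needed to cover total_bytes, rounded up."""
    return -(-total_bytes // bytes_per_read)


APPARENT_SIZE = 2**40  # assumption: "just over 1 TB" taken as 1 TiB
TAR_BLOCK = 512        # tar's basic block size in bytes

# Assumption: one read() per record of (blocking factor * 512) bytes.
for factor in (1, 20, 2048):
    per_read = factor * TAR_BLOCK
    calls = read_calls(APPARENT_SIZE, per_read)
    print(f"blocking factor {factor:>4}: {per_read:>8} bytes/read, "
          f"{calls:>13,} calls")
```

If the assumption holds, a larger blocking factor cuts the number of 
calls proportionally, but tar would still have to walk through every 
zero byte of the hole.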

Does anyone have any better ideas on archiving/verifying very large 
sparse files quickly?
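
One direction I can imagine, sketched below, is to ask the filesystem 
where the data actually is instead of reading through the holes, using 
lseek() with SEEK_DATA and SEEK_HOLE.  This is only an illustration of 
the idea, not something I know tar to be doing; data_extents is a name 
I made up, and it assumes a kernel and filesystem that support these 
seek modes (where they don't report holes, the whole file comes back 
as one data extent).

```python
import errno
import os


def data_extents(path):
    """Return (start, end) byte ranges the filesystem reports as data."""
    extents = []
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        offset = 0
        while offset < size:
            try:
                # Jump to the next byte that is backed by real data.
                start = os.lseek(fd, offset, os.SEEK_DATA)
            except OSError as exc:
                if exc.errno == errno.ENXIO:
                    break  # nothing but hole from here to end of file
                raise
            # Jump to the end of this data run (EOF counts as a hole).
            end = os.lseek(fd, start, os.SEEK_HOLE)
            extents.append((start, end))
            offset = end
    finally:
        os.close(fd)
    return extents
```

For a file like that cookies.sqlite, this should take a handful of 
lseek() calls per data run rather than billions of read() calls, and 
only the reported ranges would need to be read and compared.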

Grazie,
;-Daniel Fussell
--------------------
BYU Unix Users Group 
http://uug.byu.edu/ 

The opinions expressed in this message are the responsibility of their
author.  They are not endorsed by BYU, the BYU CS Department or BYU-UUG. 
___________________________________________________________________
List Info (unsubscribe here): http://uug.byu.edu/mailman/listinfo/uug-list
