Re: Backup Size

2009-08-11 Thread Roland Smith
On Mon, Aug 10, 2009 at 09:24:19PM -0500, Jay Hall wrote:
 
> On Aug 10, 2009, at 12:09 PM, Roland Smith wrote:
>
>> The fact that you are using tar also plays a part. Tar has some overhead to
>> store information about the files it contains.
>
> Is it possible to calculate the amount of overhead tar will use?

Just execute the tar command, and dump the output to /dev/null through dd:
tar -cf - /etc |dd of=/dev/null
tar: Removing leading '/' from member names
3160+0 records in
3160+0 records out
1617920 bytes transferred in 0.057690 secs (28045115 bytes/sec)

This will give you the exact size without writing anything to disk.
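
A variant of the same idea, as a minimal sketch: wc -c prints only the byte count, which is easier to use in a script than the dd summary. The function name tar_size is made up for illustration, and 2>/dev/null merely hides tar's "Removing leading '/'" notice.

```shell
#!/bin/sh
# tar_size DIR: print the number of bytes tar would write for DIR,
# without storing the archive anywhere.
# (tar_size is an illustrative name, not a standard utility.)
tar_size() {
    # tar writes the archive to stdout; wc -c counts the bytes;
    # tr strips the padding some wc implementations add.
    tar -cf - "$1" 2>/dev/null | wc -c | tr -d ' '
}
```

For example, tar_size /etc prints a single number. It should match the byte count reported by dd for the same tree.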

Roland
-- 
R.F.Smith   http://www.xs4all.nl/~rsmith/
[plain text _non-HTML_ PGP/GnuPG encrypted/signed email much appreciated]
pgp: 1A2B 477F 9970 BA3C 2914  B7CE 1277 EFB0 C321 A725 (KeyID: C321A725)




Re: Backup Size

2009-08-11 Thread Jay Hall


On Aug 11, 2009, at 12:09 PM, Roland Smith wrote:


> Just execute the tar command, and dump the output to /dev/null through dd:
>
> tar -cf - /etc |dd of=/dev/null
> tar: Removing leading '/' from member names
> 3160+0 records in
> 3160+0 records out
> 1617920 bytes transferred in 0.057690 secs (28045115 bytes/sec)
>
> This will give you the exact size without writing anything to disk.


Thanks.  I had not thought of that.


Jay
___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to freebsd-questions-unsubscr...@freebsd.org


Backup Size

2009-08-10 Thread Jay Hall

I am sure there is an easy explanation for this, but I cannot find it.

I am backing up my /etc directory using the following command.

tar -cvf - /etc | dd of=/dev/nsa1 obs=10240

When the command completes, I receive the following message.

3080+0 records in
154+0 records out
1576960 bytes transferred in 0.179921 secs (8764740 bytes/sec)

What concerns me is when running du -h /etc, the size of the folder is  
reported as 1.7M.


Is the number of bytes written to the tape less than the reported size  
of the directory because of the way the files are written to the  
tape?  If so, how can the amount of space used be calculated?


Thanks for your help.


Jay


Re: Backup Size

2009-08-10 Thread Roland Smith
On Mon, Aug 10, 2009 at 10:21:58AM -0500, Jay Hall wrote:
> I am sure there is an easy explanation for this, but I cannot find it.
>
> I am backing up my /etc directory using the following command.
>
> tar -cvf - /etc | dd of=/dev/nsa1 obs=10240

Why are you using dd? Tar was originally built to write to tape:

tar -cvf /dev/nsa1 /etc

> When the command completes, I receive the following message.
>
> 3080+0 records in
> 154+0 records out
> 1576960 bytes transferred in 0.179921 secs (8764740 bytes/sec)
>
> What concerns me is when running du -h /etc, the size of the folder is
> reported as 1.7M.

du rounds sizes up to the filesystem block size, which is 512 bytes by
default. So you're bound to see differences. And see below.
 
> Is the number of bytes written to the tape less than the reported size
> of the directory because of the way the files are written to the
> tape?  If so, how can the amount of space used be calculated?
 
The fact that you are using tar also plays a part. Tar has some overhead to
store information about the files it contains.

If you want to know the total size of all files:

find /etc -type f -ls | awk \
'BEGIN {t=0; c=0}; END {print t " bytes in " c " files"}; {t=t+$7; c++}'

This returns '1320254 bytes in 362 files' in my case, while the tar/dd combo
returns 1617920 bytes. The difference is the overhead for tar.

If you really want to check if tar does the right thing, restore the backup to
a different place (e.g. /tmp/etc) and check with diff:

# rewind your tape to the correct position (not shown)
cd /tmp; tar xvf /dev/nsa1
diff -ru /etc /tmp/etc

The diff command should give no output.
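
The same check can be rehearsed without a tape drive by archiving to a regular file. This is only a sketch: the directories come from mktemp, and the names conf, rc.conf and backup.tar are placeholders chosen for illustration.

```shell
#!/bin/sh
# Round-trip a small tree through tar and compare it with the original.
src=$(mktemp -d)      # stands in for the directory being backed up
dst=$(mktemp -d)      # stands in for the restore location
mkdir "$src/conf"
echo "hostname=example" > "$src/conf/rc.conf"

# -C changes directory first, so member names are stored relative.
tar -cf "$dst/backup.tar" -C "$src" conf
tar -xf "$dst/backup.tar" -C "$dst"

# diff prints nothing and exits 0 when the two trees are identical.
if diff -ru "$src/conf" "$dst/conf"; then
    result=identical
else
    result=different
fi
echo "$result"

rm -rf "$src" "$dst"
```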

Roland




Re: Backup Size

2009-08-10 Thread Dan Nelson
In the last episode (Aug 10), Jay Hall said:
> I am sure there is an easy explanation for this, but I cannot find it.
>
> I am backing up my /etc directory using the following command.
>
> tar -cvf - /etc | dd of=/dev/nsa1 obs=10240
>
> When the command completes, I receive the following message.
>
> 3080+0 records in
> 154+0 records out
> 1576960 bytes transferred in 0.179921 secs (8764740 bytes/sec)
>
> What concerns me is when running du -h /etc, the size of the folder is
> reported as 1.7M.
>
> Is the number of bytes written to the tape less than the reported size of
> the directory because of the way the files are written to the tape?  If
> so, how can the amount of space used be calculated?

du prints the number of disk blocks used by a directory tree.  Your
filesystem probably was formatted with 16k blocks and 2k fragment size. This
means that the minimum space du will report for each file is 2k.  Tar uses
512-byte blocks internally, so a directory with a lot of small files in it
(/etc for example) will take up less space as a tar file than on disk.
 
Try running du -ha /etc, to see what du reports for each file under /etc.
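
To see the effect on a single small file, a sketch like the following compares the length of the contents with the allocation that du reports. The temporary file and its one-byte content are made up for illustration, and the allocation figure depends on the filesystem and its block/fragment size.

```shell
#!/bin/sh
# Create a throwaway one-byte file and compare its length with its disk usage.
f=$(mktemp)
printf 'x' > "$f"

bytes=$(wc -c < "$f" | tr -d ' ')          # length of the file contents
kbytes=$(du -k "$f" | awk '{print $1}')    # space allocated on disk, in KiB

echo "contents: $bytes byte(s), allocated: $kbytes KiB"
rm -f "$f"
```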

-- 
Dan Nelson
dnel...@allantgroup.com


Re: Backup Size

2009-08-10 Thread Roland Smith
On Mon, Aug 10, 2009 at 06:25:28PM +0200, Roland Smith wrote:
> On Mon, Aug 10, 2009 at 10:21:58AM -0500, Jay Hall wrote:
>> I am sure there is an easy explanation for this, but I cannot find it.
>>
>> I am backing up my /etc directory using the following command.
>>
>> tar -cvf - /etc | dd of=/dev/nsa1 obs=10240
>
> Why are you using dd? Tar was originally built to write to tape:
>
> tar -cvf /dev/nsa1 /etc
>
>> When the command completes, I receive the following message.
>>
>> 3080+0 records in
>> 154+0 records out
>> 1576960 bytes transferred in 0.179921 secs (8764740 bytes/sec)
>>
>> What concerns me is when running du -h /etc, the size of the folder is
>> reported as 1.7M.
>
> du rounds sizes up to the filesystem block size, which is 512 bytes by
> default. So you're bound to see differences. And see below.

Oops, scratch that. Brain fart. du -h uses kilo-, mega- etc. bytes according
to du(1).
  
>> Is the number of bytes written to the tape less than the reported size
>> of the directory because of the way the files are written to the
>> tape?  If so, how can the amount of space used be calculated?
>
> The fact that you are using tar also plays a part. Tar has some overhead to
> store information about the files it contains.
>
> If you want to know the total size of all files:
>
> find /etc -type f -ls | awk \
> 'BEGIN {t=0; c=0}; END {print t " bytes in " c " files"}; {t=t+$7; c++}'
>
> This returns '1320254 bytes in 362 files' in my case, while the tar/dd combo
> returns 1617920 bytes. The difference is the overhead for tar.
>
> If you really want to check if tar does the right thing, restore the backup to
> a different place (e.g. /tmp/etc) and check with diff:
>
> # rewind your tape to the correct position (not shown)
> cd /tmp; tar xvf /dev/nsa1
> diff -ru /etc /tmp/etc
>
> The diff command should give no output.
>
> Roland







Re: Backup Size

2009-08-10 Thread Polytropon
On Mon, 10 Aug 2009 10:21:58 -0500, Jay Hall jh...@socket.net wrote:
> What concerns me is when running du -h /etc, the size of the folder is
> reported as 1.7M.

Excuse me for being pedantic, but please try to use the correct
terminology. There are no folders in FreeBSD. The concept you
are referring to is called a directory. You don't call the files
in /etc sheets of paper, do you? :-)



-- 
Polytropon
From Magdeburg, Germany
Happy FreeBSD user since 4.0
Andra moi ennepe, Mousa, ...


Re: Backup Size

2009-08-10 Thread Jay Hall


On Aug 10, 2009, at 12:09 PM, Roland Smith wrote:

> The fact that you are using tar also plays a part. Tar has some overhead to
> store information about the files it contains.


Is it possible to calculate the amount of overhead tar will use?

Thanks,


Jay


Re: Backup Size

2009-08-10 Thread Mel Flynn
On Monday 10 August 2009 18:24:19 Jay Hall wrote:
> On Aug 10, 2009, at 12:09 PM, Roland Smith wrote:
>> The fact that you are using tar also plays a part. Tar has some overhead to
>> store information about the files it contains.
>
> Is it possible to calculate the amount of overhead tar will use?

Difficult. 512 bytes per entry + 1024 (EOF). See man 5 tar. But since file
data is padded (to a multiple of 512 bytes) there is some extra overhead.
Also, it is hard to account for hard links and sparse files. Tar will handle
these correctly (i.e. preserve hard links, detect sparse files and try not to
archive blocks of nulls), but because of this it is hard to calculate the
size before the archive operation.
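
The rule of thumb above can be turned into a rough estimate. This is a sketch under stated assumptions, not an exact calculation: it assumes one 512-byte header per file, directory or symlink, file data padded to a multiple of 512 bytes, 1024 bytes of end-of-archive marker, and the total rounded up to tar's default 10240-byte record. It ignores hard links, sparse files and names longer than 100 characters (which need extra header blocks), so expect it to come close rather than match. The function name estimate_tar_size is made up for illustration.

```shell
#!/bin/sh
# estimate_tar_size DIR: rough estimate of the size of "tar -cf - DIR".
# (estimate_tar_size is an illustrative name, not a standard utility.)
estimate_tar_size() {
    find "$1" \( -type f -o -type d -o -type l \) -ls | awk '
        # In "find -ls" output, column 3 is the mode and column 7 the size.
        {
            total += 512                                  # header block
            if (substr($3, 1, 1) == "-")                  # regular file only
                total += int(($7 + 511) / 512) * 512      # padded file data
        }
        END {
            total += 1024                                 # end-of-archive
            print int((total + 10239) / 10240) * 10240    # pad to full record
        }'
}
```

Running estimate_tar_size /etc and comparing the result with the tar | dd figure shows how far off the estimate is for a given tree.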
-- 
Mel


Re: Backup Size

2009-08-10 Thread Jay Hall




> Difficult. 512 bytes per entry + 1024 (EOF). See man 5 tar. But since file
> data is padded (to a multiple of 512 bytes) there is some extra overhead.
> Also, it is hard to account for hard links and sparse files. Tar will handle
> these correctly (i.e. preserve hard links, detect sparse files and try not to
> archive blocks of nulls), but because of this it is hard to calculate the
> size before the archive operation.
> --
> Mel


Thanks.  I have been able to come close, but not exact.

Looks like close will have to be good enough.

Thanks again.


Jay
