As my primary OS is not Plan9 but FreeBSD, I am interested in venti for 
archiving data on UFS filesystems.

For most purposes, vac does a good job, but it cannot preserve the full 
semantics of a Unix filesystem, in particular symbolic links. Also, vac does 
not work on a filesystem but on a live file tree, where other filesystems 
mounted inside that tree may yield unwanted results.

There are several ways to handle this drawback. One could put the data into a 
tar or cpio archive and vac that file. Another variant is to copy the data 
into a temporary UFS filesystem and use vbackup from plan9port. Even the classic
Unix dump may be a solution.

To evaluate the deduplication performance, I did a little experiment. On an 
empty arena partition, I started with a vac of a directory (in this case my clone 
of Noam's neoventi). The .git subdirectory contains a lot of incompressible 
files of different sizes.

total arenas=5 active=1
total space=20,094,976 used=914,716
clumps=512 compressed clumps=292 data=1,068,725 compressed data=882,460

After that, I tried my candidates, and after each run I restarted with a 
refreshed arena partition.

1. dump (of course with blocksize 8k)
total space=20,094,976 used=1,749,986
clumps=694 compressed clumps=413 data=2,487,312 compressed data=1,706,264
2. vbackup with UFS 8k/1k fragments
total space=20,094,976 used=1,198,550
clumps=645 compressed clumps=412 data=1,961,732 compressed data=1,157,915
3. vbackup with UFS 8k and no fragments
total space=20,094,976 used=960,511
clumps=659 compressed clumps=436 data=2,099,827 compressed data=918,994
4. tar also did not compare well, because it operates in 512 byte blocks, so 
the chance of proper alignment with 8k lumps on venti is about 1/16.
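
The 1/16 figure can be checked with a few lines of Python. This is only a sketch of the arithmetic, under the simplifying assumption that a file's data is equally likely to start at any 512-byte position within an 8k block; the names used here are illustrative and not part of tar or venti.

```python
from fractions import Fraction

TAR_BLOCK = 512      # tar pads headers and data to 512-byte blocks
VENTI_BLOCK = 8192   # 8k block size used in this experiment


def aligned_fraction(small: int, large: int) -> Fraction:
    """Fraction of `small`-aligned offsets inside one `large` block
    that are also aligned to `large` (assumes uniform distribution)."""
    if large % small != 0:
        raise ValueError("large must be a multiple of small")
    offsets = range(0, large, small)          # 0, 512, ..., 7680
    aligned = sum(1 for off in offsets if off % large == 0)
    return Fraction(aligned, len(offsets))


print(aligned_fraction(TAR_BLOCK, VENTI_BLOCK))  # 1/16
```

A file whose data does not start on an 8k boundary produces blocks that differ from those of the same file stored elsewhere, so venti cannot deduplicate them.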

Interestingly, I needed 4m for the non-fragmented UFS filesystem but only 2m 
for the fragmented version, yet the former fares much better in used space.
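
To make the comparison easier to read, the "used" figures above can be expressed relative to the plain vac run. The following sketch only does that division; the labels and the helper name `overhead` are mine, the numbers are the ones reported above.

```python
# "used" bytes reported by venti after each run (from the figures above)
VAC_USED = 914_716  # baseline: plain vac of the directory

runs = {
    "dump (8k blocksize)": 1_749_986,
    "vbackup, UFS 8k/1k fragments": 1_198_550,
    "vbackup, UFS 8k, no fragments": 960_511,
}


def overhead(used: int, baseline: int = VAC_USED) -> float:
    """Space used relative to the vac baseline (1.0 means equal)."""
    return used / baseline


for name, used in runs.items():
    print(f"{name:32s} {overhead(used):.2f}x")
```

By this measure, the non-fragmented vbackup run stays within about 5% of vac, the fragmented one needs about 31% more, and dump needs about 91% more.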

More information on this is at http://pkeus.de/~wb/bench.out
------------------------------------------
9fans: 9fans
Permalink: 
https://9fans.topicbox.com/groups/9fans/Taf5e3581b0a00ee0-M3eadec60bbc905f2ee67789c