Hi Nathan,
Thanks so much for doing the testing. I had not expected those
results. Of course, the more interesting test is when you do a backup
in which there are dozens of changes scattered across the files. That
is where the rsync algorithm is supposed to be helpful -- it trades
network bandwidth for CPU. With rdiff-backup running over SSH, only
the changes to the files will be transmitted over the network, whereas
with rdiff-backup running over SMB or AFP, the whole file will be sent
over the network. See: http://en.wikipedia.org/wiki/Rsync
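To make the bandwidth-for-CPU trade concrete, here is a minimal sketch of the rsync idea in Python. It is not rdiff-backup's or librsync's actual code: the block size, the function names (signature, delta, patch) and the use of SHA-256 as the strong hash are assumptions chosen for illustration. The receiver summarizes its old copy as per-block checksums; the sender slides a rolling checksum over the new file and sends literal bytes only where no old block matches.

```python
import hashlib

BLOCK = 4  # tiny block size so the demo is easy to follow (assumption)
MOD = 65536


def weak_sum(data):
    """Adler-style weak checksum that can be rolled one byte at a time."""
    a = sum(data) % MOD
    b = sum((len(data) - k) * byte for k, byte in enumerate(data)) % MOD
    return a, b


def signature(old, block=BLOCK):
    """Map weak checksum -> [(strong hash, block index)] for each full block of old."""
    sig = {}
    for index in range(len(old) // block):
        chunk = old[index * block:(index + 1) * block]
        strong = hashlib.sha256(chunk).digest()
        sig.setdefault(weak_sum(chunk), []).append((strong, index))
    return sig


def delta(new, sig, block=BLOCK):
    """Return a list of ("copy", block_index) and ("literal", bytes) operations."""
    ops, literal = [], bytearray()
    i, a, b = 0, None, None
    while i + block <= len(new):
        if a is None:  # window was just repositioned, so recompute from scratch
            a, b = weak_sum(new[i:i + block])
        match = None
        for strong, index in sig.get((a, b), ()):
            # Confirm a weak-checksum hit with the strong hash.
            if hashlib.sha256(new[i:i + block]).digest() == strong:
                match = index
                break
        if match is not None:
            if literal:
                ops.append(("literal", bytes(literal)))
                literal = bytearray()
            ops.append(("copy", match))
            i += block
            a = None
        else:
            literal.append(new[i])
            if i + block < len(new):
                # Roll the window forward by one byte without rescanning it.
                out_byte, in_byte = new[i], new[i + block]
                a = (a - out_byte + in_byte) % MOD
                b = (b - block * out_byte + a) % MOD
            i += 1
    literal.extend(new[i:])  # trailing bytes shorter than one block
    if literal:
        ops.append(("literal", bytes(literal)))
    return ops


def patch(old, ops, block=BLOCK):
    """Rebuild the new file from the old file plus the delta operations."""
    out = bytearray()
    for kind, value in ops:
        if kind == "copy":
            out += old[value * block:(value + 1) * block]
        else:
            out += value
    return bytes(out)


def literal_bytes(ops):
    """Number of bytes that would actually have to cross the network."""
    return sum(len(value) for kind, value in ops if kind == "literal")


if __name__ == "__main__":
    old = b"the quick brown fox jumps over the lazy dog"
    new = b"the quick brown cat jumps over the lazy dog"
    ops = delta(new, signature(old))
    assert patch(old, ops) == new
    print(literal_bytes(ops), "literal bytes out of", len(new))
```

Over SMB or AFP there is no remote process that can compute the signature, so this exchange cannot happen and the whole file crosses the wire.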
Also, when you do the initial backup, what happens if you add the
--ssh-no-compression option to rdiff-backup?
Thanks!
Andrew
On Mar 30, 2009, at 3:35 AM, Nathan Aschbacher wrote:
Well I managed to get SMB access working somehow. A combination of
"unix extensions = no" in the global space of my samba
configuration, building the proper driver for my NIC on the linux
server, and luck.
I also got netatalk set up and running for AFP access, and installed
rdiff-backup on the server to try the direct SSH method.
My findings were interesting:
Client:
Apple MacBook (unibody), 2.4 GHz, 4 GB RAM, w/ 320 GB Hitachi
HTS543232L9A300 drive; Mac OS X 10.5.6
Server:
MSI Wind Nettop (dual-core Atom 330), 1.6 GHz, 2 GB RAM, w/ 2x 1 TB Samsung
Spinpoint F1's, mirrored via mdadm, ext4 filesystem; Ubuntu 9.04 beta
Network:
Gigabit ethernet on both ends connected via Netgear Gig-E switch,
MTU on both machines set to 6144 (the highest the MSI's NIC will
handle)
Test File Set:
2.91 Gigs, 1748 files, a mix of CD image ISO's, media files
(audio/video), and tons of very small documents/resources/source-files
Results: (times are in min:sec)
         Initial Backup   Peak Throughput   Next Backup (no changed files)
SMB      8:02             ~12 MB/s          0:25
AFP      3:28             ~38 MB/s          0:10
SSH      8:36             ~8 MB/s           0:21
local    3:33             ~27 MB/s          0:09
In standard file copying, Samba is typically much closer in
performance than that; it's usually only a few MB/s slower than
netatalk. The typical data rate for directly copying a large contiguous
1 GB file with this setup is ~43 MB/s for AFP, ~39 MB/s for SMB, and
~17 MB/s for SSH (scp). Backing up to the AFP share was essentially as
fast as performing the backup where the target was the same local
disk as the source (of course, local performance would be better with
an external drive, but still).
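For comparison with the peak figures, the average rate over each initial backup can be worked out from the elapsed times. This is only a rough sketch: it assumes "2.91 Gigs" means 2.91 x 1024 MB and that the whole test set was transferred in each run. The helper names are made up for illustration.

```python
# Average throughput implied by the initial-backup times reported above.
# Assumption: the test set is 2.91 * 1024 MB and is transferred in full.
DATA_MB = 2.91 * 1024

# Initial backup times (min:sec) as reported in the results.
INITIAL_BACKUP = {"SMB": "8:02", "AFP": "3:28", "SSH": "8:36", "local": "3:33"}


def seconds(mmss):
    """Convert a 'min:sec' string to a number of seconds."""
    minutes, secs = mmss.split(":")
    return int(minutes) * 60 + int(secs)


def average_mb_per_s(mmss, size_mb=DATA_MB):
    """Average MB/s for moving size_mb in the given elapsed time."""
    return size_mb / seconds(mmss)


if __name__ == "__main__":
    for name, elapsed in INITIAL_BACKUP.items():
        print(f"{name:6s} {average_mb_per_s(elapsed):5.1f} MB/s average")
```

Under those assumptions the averages come out well below the peak rates, which fits a file set dominated by many very small files.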
In that sense I can understand why SSH is so slow: it has all the
encrypt/decrypt overhead going on, as well as compression being
performed. The only thing I can think of to explain the massive
disparity between SMB and AFP using rdiff-backup is how each one
deals with Mac OS X specific data like resource forks, etc. When
using AFP, netatalk is handling all of the Mac-specific data; when
using SMB, rdiff-backup is handling it. Not surprisingly, netatalk
is more efficient in that regard.
I suspect these massive discrepancies in performance are largely
Mac-specific because of the odd filesystem features that have to be
recreated via one means or another on the target filesystem.
However, it was interesting to see the massive difference in
performance. Needless to say, based on these tests, I'll be
sticking with AFP for my rdiff-backups, and finally ditching Time
Machine!
I hope this information is helpful to somebody someday.
Cheers!
-Nathan
On Mar 29, 2009, at 2:25 PM, Andrew Ferguson wrote:
On Mar 29, 2009, at 12:33 PM, Nathan Aschbacher wrote:
The "rm tgt" fails, well not exactly. It runs without an error
the first time, but some kind of problem occurs on OS X where the
file won't stay deleted, it immediately reappears. Once it
reappears, two problems occur. First, if you try to delete it
again the system claims it doesn't exist; second, any check that
looks to see if the directory is empty will fail (because "ls"
still returns tgt). The symlink "tgt" has to be deleted by logging
into the Samba server. If I connect to the same share from an
Ubuntu desktop and try the same set of operations through the GUI
it won't let me even create the symlink, claiming that the target
volume doesn't support it. It seems like there's some very
strange interaction occurring between Mac OS X and the Samba
share. You gotta love Apple's spectacularly bad Samba
implementation.
That description is consistent with the stack trace from
rdiff-backup. The trace points to a file deletion error when attempting
to clear out the temporary directory in which it had tested
filesystem abilities, such as symlinks.
It's certainly a strange situation... I've never heard of that
sequence of events happening before. Of course, it doesn't help
that there are so many ways in which the Samba server can be
configured.
Have you considered other network filesystems such as NFS or
Appleshare/Netatalk? Of course, installing rdiff-backup on the
server and running over SSH will be the fastest, performance-wise.
Andrew
_______________________________________________
rdiff-backup-users mailing list at rdiff-backup-users@nongnu.org
http://lists.nongnu.org/mailman/listinfo/rdiff-backup-users
Wiki URL: http://rdiff-backup.solutionsfirst.com.au/index.php/RdiffBackupWiki