On 2008-01-18, at 20:56, David Miller wrote:

On Jan 18, 2008, at 8:14 AM, Moritz Heckscher wrote:

Hello all,

I'm new to the list, but have already done quite a bit of research regarding support for Mac OS X specific features (resource forks, extended attributes, ACLs, file creation & modification dates).

From reading the archives, I get the impression that the current version, rsync 3.0.0pre8, is quite far along in this respect. At least it sounds that way, and I thank the developers very much for it! I like your approach much more than the (very buggy) one originally pursued by Apple (storing metadata in separate ._ files).


Be careful and test, test, test. I tried using pre8 to sync two local Xserve RAIDs (about 2 TB of data) and I'm seeing these errors:

rsync: writefd_unbuffered failed to write 4 bytes [sender]: Broken pipe (32)
[receiver] internal abbrev error!
rsync error: error in rsync protocol data stream (code 12) at xattrs.c(565) [receiver=3.0.0pre8]
rsync: connection unexpectedly closed (175959 bytes received so far) [sender]
rsync error: error in rsync protocol data stream (code 12) at io.c(600) [sender=3.0.0pre8]

I have another Xserve RAID (about 1.3 TB) and I don't get those errors when syncing with pre8. I'm trying to pin down which files/folders are causing the problem now.

Thanks for your feedback. I will certainly test a lot before going into deployment. (Currently I'm waiting for a hard disk, so my machine isn't even physically built yet.) I have found some interesting resources regarding the metadata problem which might also help you isolate the problem on your machines:

1) A detailed post about which kinds of metadata exist on Mac OS X and about how poorly almost all programs handle them:

<http://blog.plasticsfuture.org/2006/03/05/the-state-of-backup-and-cloning-tools-under-mac-os-x/>

The post is rather old (March 2006, i.e. OS X 10.4.5/10.4.6), but from my research it seems that basically nothing has changed since then (a real shame, I say!). (I am fairly certain nothing substantial has changed in 10.4. I am also not sure whether things have been fixed in the cp and rsync programs shipped with 10.5, or whether Apple has invested all its energy in getting things right in its own "Time Machine" program at the cost of neglecting the standard tools.)

2) Almost a year later, someone built a test set containing files with (almost?) all possible Mac OS X metadata:

<http://inik.net/node/150>

One could transfer this small set of files back and forth and check where the problems are.

3) To help with the checking/comparing, someone else built a little tool. It not only creates a collection of test files but can also compare the original and transferred versions afterwards:

<http://www.n8gray.org/blog/2007/04/27/introducing-backup-bouncer/>

I haven't tested these yet, but will once my hardware is ready.

All in all, the metadata situation on Mac OS X is and has been a total mess. It's almost impossible to make true backups ("true" meaning all metadata is kept intact along with the data). Sure, you usually don't need all the metadata, and people are happy if they can at least recover the data fork. But on the other hand, all this metadata is there, used by all sorts of old and new programs, so keeping it should be possible.

I plan to do the following:

* Run a Linux server (Ubuntu, I guess, on an ext3 partition) with two separate internal ATA hard disks formatted with XFS and configured as software RAID to store the actual backup data. (As I understand it, I should use XFS rather than ext3 because XFS supports extended attributes large enough to also hold larger converted Mac resource forks.)

* Back up from different Mac OS X clients (currently all on 10.4, but I might upgrade them to 10.5 later) to the server using rsync over ssh. This should hopefully preserve (most of) the Mac-specific metadata on the server. (Actually I plan to use rsnapshot, but I believe if I have the newest version of rsync installed and possibly tell rsnapshot to use the appropriate rsync options, things will be the same.)
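The XFS-versus-ext3 point above can be probed directly. Here is a minimal sketch (the file location is illustrative; it uses the Linux attr(1) tools, which must be installed): ext3 keeps all of a file's extended attributes within a single block (roughly 4 KB), while XFS accepts attribute values up to 64 KB, which matters for converted resource forks.

```shell
# Sketch: probe how large an extended attribute a filesystem accepts.
# The file's location is illustrative -- place it on the filesystem
# you want to test.
FILE=/mnt/backup/xattr-probe
touch "$FILE"

# Try a ~16 KB value: well over the ext3 per-block limit, fine on XFS.
VALUE=$(head -c 16384 /dev/zero | tr '\0' 'x')
if setfattr -n user.rsrc -v "$VALUE" "$FILE"; then
    echo "large xattr stored"
else
    echo "filesystem rejected large xattr"
fi
```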

Now my question is the following:

1) What would I have to do to ensure the metadata is also restored correctly? I assume I will have to use rsync for restoring as well, and that if I just copy the data over (using, e.g., scp, or over an AFP, CIFS, or NFS network mount), I will lose this metadata. Is this correct?


Why not use rsync 3 for both backup and restore? Either use ssh (rsync -azXA --delete /path/to/source server:/path/to/target) or set up an rsync daemon. This way you let rsync handle the metadata.

Yes, that's what I meant, using rsync in both directions should (!) keep things intact. I was asking about the other alternatives because while I will set up the server, it might be necessary for users to restore their own backups. And I don't think they'd be able to successfully use rsync on the command line.

So I was thinking of maybe publishing the backup directory on the server as a read-only network share that users could mount on their client machines. If they then restore files (over AFP, SMB, ...), will this destroy the metadata that rsync had previously stored? I assume so. (Yes, I know, I can test this myself, and will when possible; I'm just hoping to get advice from seasoned rsync/network/Mac gurus here beforehand...)

Another problem I'm thinking about is that rsnapshot has to run on the server and "pull" the backups over the network. One cannot run it on the clients and "push" the data to the server -- which is what I'd prefer, because I plan not to leave the server on all day but rather have it woken up by the (laptop) clients when needed, with the clients handling the scheduling of the backups (using anacron or launchd etc.). One could, however, run rsnapshot on the clients to back up onto a locally attached storage device.


You don't need rsnapshot. Use the --link-dest option to create incremental backups.

Thanks for pointing this out. I had looked through the man page of rsync a few times, but, you know, it's a little complicated in the beginning to understand all the options...

Anyway, I think rsnapshot would still be a good (better) solution for me because it handles all the rotating of daily/weekly/monthly backups. I browsed through the Perl code a few days ago and saw that it is more than 5000 lines long. If several smart people have worked on this problem for years and produced a heavily tested script, it should be more reliable than my own 20-line shell script.
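For reference, a minimal rsnapshot.conf fragment for such a setup (paths, host, and user names are illustrative; -X -A in rsync_long_args assumes rsync 3 on both ends; note that rsnapshot insists on tabs, not spaces, between fields):

```
# /etc/rsnapshot.conf (fragment) -- fields must be separated by tabs
snapshot_root	/backups/snapshots/
cmd_rsync	/usr/local/bin/rsync
rsync_long_args	--delete --numeric-ids --relative -X -A

interval	daily	7
interval	weekly	4
interval	monthly	6

backup	moritz@laptop.local:/Users/moritz/	laptop/
```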

My clients are laptop machines which are not always on, so I expect a lot of interrupted or skipped backups. That's difficult to deal with.

This leads me to the second question:

2) If I mount the server as a network drive on the clients using AFP, SMB/CIFS, NFS, ..., and then back up to this 'locally attached' drive with rsync (via rsnapshot), will I lose the metadata because of the transfer through the SMB/... layer?

See above. If anyone can confirm I will lose the metadata, I'd be grateful.

-Moritz

Thanks a lot for a great program!
-Moritz

--
To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync
Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html


David Miller.

