On Fri, Aug 27, 2004 at 04:41:56PM +0100, Jamie Lokier wrote:
>
> It is helpful for the OS, or a naming convention, to indicate what
> _is information_ though.
>
> It makes no sense to backup two or more copies of the _same
> information_, and it makes even less sense to try to restore them as
> it'll either be slow, fail (you can't always write to alternative
> presentations), or cause unwanted side effects.

You sure you don't mean what is data (on disk) and what is information,
i.e. interpreted data?

Following this whole thread is quite interesting, but I think there are
two things that need to be pointed out (which I might've missed). First,
hardlinks would be backed up multiple times if the archiver doesn't know
the semantics of the filesystem. This is the reason why the
<fs-of-choice>dump tools are usually faster than tar/cpio/etc.: they
understand the low level and can optimize at that level.

Also, a user might only have access to a certain subset of the
information, which he might want to back up, but not the rest of the
stream/data, and as such the choice should be the user's.

> Just like when you backup a dynamic web site. You store the files
> which the server is using. You don't use "wget" to store the
> generated pages, that's not a useful backup and you can't restore from it.

That depends ;^) But in the general case it's true.
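
To illustrate the hardlink point: a rough sketch, in Python, of how an
archiver that does know the semantics could store each inode's data only
once. It groups paths by (st_dev, st_ino) and records the remaining names
as links. The function names group_hardlinks and plan_backup are made up
for this example and are not taken from any real dump or tar
implementation.

```python
import os
import tempfile


def group_hardlinks(root):
    """Map (st_dev, st_ino) to the sorted paths under root sharing that inode."""
    groups = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)  # lstat: do not follow symlinks
            groups.setdefault((st.st_dev, st.st_ino), []).append(path)
    return {key: sorted(paths) for key, paths in groups.items()}


def plan_backup(root):
    """Return (store, links).

    store: paths whose data is written to the archive, one per inode.
    links: (alias, target) pairs to be recreated as hardlinks on restore.
    """
    store, links = [], []
    for paths in group_hardlinks(root).values():
        first, rest = paths[0], paths[1:]
        store.append(first)
        links.extend((alias, first) for alias in rest)
    return sorted(store), sorted(links)


if __name__ == "__main__":
    # Hypothetical demo tree: "a" and "b" are two names for one inode.
    with tempfile.TemporaryDirectory() as d:
        a = os.path.join(d, "a")
        with open(a, "w") as f:
            f.write("same data\n")
        os.link(a, os.path.join(d, "b"))
        store, links = plan_backup(d)
        print("store:", store)
        print("links:", links)
```

An archiver that just walks the names and copies each file would write
the data for "a" and "b" twice; with the inode grouping it is written
once and the second name is only recorded as a link.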
