Hi,

Jeffrey J. Kosowsky wrote on 2009-06-02 14:26:44 -0400 [Re: [BackupPC-users] why hard links?]:
> Les Mikesell wrote at about 12:32:14 -0500 on Tuesday, June 2, 2009:
> > Jeffrey J. Kosowsky wrote:
> > > [...]
> > > > If you have to add an extra system call to lock/unlock around some
> > > > other operation you'll triple the overhead.
> > >
> > > I'm not sure how you definitively get to the number "triple". Maybe
> > > more, maybe less.

I agree. It's probably more.

> > Ummm, link(), vs. lock(),link(),unlock() equivalents, looks like 3x the
> > operations to me - and at least the lock/unlock parts have to involve
> > system calls even if you convert the link operation to something else.
>
> 3x operations != 3x worse performance
> Given that disk seek times and input bandwidth are typical
> bottlenecks, I'm not particularly worried about the added
> computational bandwidth of lock/unlock.

Since you can't lock() the file you are about to create (can you?), you'll
probably need a different file - either one big global lock file or one on
the directory level. I'm not familiar with the kernel code, but I wouldn't
be surprised if that got you the disk seeks you are worried about.
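To make the comparison concrete, the wrapped operation would have to look
something like this rough sketch (Python for readability; the function and
the per-directory lock file are made up for illustration, not anything
BackupPC actually implements):

    import fcntl
    import os

    def link_with_lock(src, dst, lock_path):
        # Rough sketch only: serialize pool linking through a lock file.
        # link_with_lock() and lock_path are hypothetical.
        fd = os.open(lock_path, os.O_CREAT | os.O_RDWR)  # extra syscall 1
        try:
            fcntl.flock(fd, fcntl.LOCK_EX)               # extra syscall 2
            os.link(src, dst)                            # the one real operation
        finally:
            fcntl.flock(fd, fcntl.LOCK_UN)               # extra syscall 3
            os.close(fd)

That is at least three extra system calls around every single link(), and
opening the lock file means extra path resolution - i.e. potentially
exactly those seeks.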
> > > Les - I'm really not sure why you seem so intent on picking apart a
> > > database approach.
> >
> > I'm not. I'm encouraging you to show that something more than black
> > magic is involved. [...]
>
> I never claimed performance. My claims have been around flexibility,
> extendability, and transportability.

And I'm worried about complexity and robustness:

1. Complexity
   What additional skills do you need to set up the BackupPC version you
   are imagining and keep it running?

2. Complexity
   Who is going to write and, more importantly, debug the code? How do you
   test all the new cases that can go wrong? How do people feel about
   entrusting vital data to a system they no longer have a basic
   understanding of?

3. Complexity
   When everything goes wrong, what can you still do with the data?
   Currently, you can locate a file in the file system (file name mangling
   is not that complicated - see the sketch after this list) or even with
   an FS debugging tool in an image of an unmountable FS, and
   BackupPC_zcat it to get the contents. Attributes are lost that way, but
   for regaining the contents of a few crucial files, this can work quite
   well. It could even be made to restore the attributes, with only
   slightly more requirements (an intact attrib file).
   With a database, can you do anything at all without a completely
   running BackupPC system? What are the exact requirements? Database
   file? Database engine? Accessible pool file system?

4. Robustness, points of failure
   How do you handle losing single files, or on-disk corruption of a few
   files? Losing/corrupting many files? Your database?
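For the record, the mangling amounts to roughly this (a sketch from
memory, in Python; the authoritative version is the Perl code in
BackupPC::Lib, so check there before relying on the exact escape set):

    import re

    def mangle_path_element(name):
        # Each path component gets awkward characters escaped as %xx
        # and an 'f' prefix - sketch only, not the real implementation.
        name = re.sub(r'[%/\n\r]', lambda m: '%%%02x' % ord(m.group(0)), name)
        return 'f' + name

    # e.g. home/user/notes.txt -> fhome/fuser/fnotes.txt in the pc tree

Reversing that by eye to locate a file is straightforward.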
> I think all (or nearly all) of my 7 claimed advantages are
> self-evident.

Yes, mostly, though they were claimed in a different thread. I hope
everyone has multiple MUAs open ...

1. I don't see how "platform and filesystem independence" fits together
   with the use of a database, though. You are currently dependent on a
   POSIX file system. How is depending on one of a set of databases any
   better?

4. How does backing up the database and *a portion of the pool* work?
   Sure, you can make anything fault-tolerant, but are missing files
   faults of which you *want* to be tolerant? But yes, backing up the
   complete pool would be easier, though it's your responsibility to get
   it right (i.e. consistent), and there's probably no sane way to check.

5.1. Why is file name mangling a kludge, and in what way is storing file
     names in a database better?

5.2. What is non-standard about defining a file format any way you like?
     It's not like compressed pool files would otherwise adhere to a
     particular known file format. But yes, treating compressed and
     uncompressed files alike would be nice.

5.3. I'm not really sure encrypting files *on the server* does much,
     unless you are thinking of a remote storage pool. In particular, you
     need to be able to decrypt files not only for restoration, but also
     for pooling (unless you want an intermediate copy and an extra
     comparison).

5.5. Configuration stored in the database? Is that supposed to be an
     advantage?

6. If you mean access controlled by the database (different database
   users), I don't really see why you are worried about access to the
   *meta data* when the actual contents remain readable (you're not
   saying that it being such a huge amount of data is a security feature,
   are you?). If you mean that a database will make it easier to
   implement file level access control, I honestly don't see how.

7. How would that work? If you are less concerned about how much space
   you use, you can store things in a way that lets them be accessed
   faster. But I still think you are mistaken in that multiple attrib
   files would need to be read. I've had to read so much discussion on
   this today that I won't check the code now, but I'd reason that for
   attrib file pooling to make any sense, the default would be an
   identical attrib file (compared to the reference backup) if no files
   in the directory were changed. Or, put differently: if BackupPC
   *would* need to scan multiple attrib files, your
   delete-file-from-backups script would only ever need to modify one
   attrib file for any file it deletes, right? ;-)

> Plus, I don't want my backup system to be
> filesystem dependent because I might have other reasons for picking
> other filesystems or my OS of the future (or of today) might not even
> support the filesystem features required.

The same arguments hold against incorporating a database.

> I think good system design calls for abstracting the backup software from
> the underlying filesystem.

Well, the only thing you are abstracting away is hard links, which are
part of the POSIX standard. I wouldn't be surprised if there were other
POSIX dependencies. BackupPC currently makes no other assumptions about
the file system, does it? Well, file size maybe - you need a file system
capable of storing large enough files. And long enough paths. I look
forward to the introduction of $Conf{PathSeparator} ...

Regards,
Holger