Re: [rdiff-backup-users] Restarting development ... or starting over
I wasn't really prepared to make this announcement so soon, but now seems like a good time to let the community know. I've been working on a new implementation of rdiff-backup since about a month ago when [snip] I'm not a coder... [snip] Now, on top of that I'd like to have all the fanciness present in all the other programs, particularly a SQL backend for the metadata storage... Out of curiosity, if you're not a coder what do you plan to do with a SQL backend? I actually considered using SQLite as a backend for storing metadata, but I just couldn't justify it since rdiff-backup would have no real benefit from the features of SQL (complex queries, joins, etc.). The only real benefit would be simpler random-access (for restoring selected files, for example). However, random access in a database either requires an index, which takes extra HDD space, or it involves a sequential scan, which is exactly what rdiff-backup already does. Why add another layer of complexity if it's not needed? It would likely just slow things down and add bloat to the repository. Having said all that, I did write the new version with a well-defined database interface, which means other backends (possibly MongoDB) could be added without affecting the frontend. ~ Daniel ___ rdiff-backup-users mailing list at rdiff-backup-users@nongnu.org http://lists.nongnu.org/mailman/listinfo/rdiff-backup-users Wiki URL: http://rdiff-backup.solutionsfirst.com.au/index.php/RdiffBackupWiki
Re: [rdiff-backup-users] Development Status Git/SVN
Josh, FWIW, I have been very impressed with Git and have found this resource extremely helpful: http://progit.org/book/ That being said, Windows support is second-class (or worse) and it's probably my only concern about git. I have only used mercurial a few times, but don't have a problem with it. I agree with you that either option is better than cvs and probably subversion, at least for a project like this. So I would encourage you to go for it. The sooner the better. :) -- Randy Syring Intelicom 502-644-4776 Whether, then, you eat or drink or whatever you do, do all to the glory of God. 1 Cor 10:31 Josh Nisly wrote: Cygwin does not count as good support. :-) I'm biased toward Mercurial for several reasons: * It appears to value ease of use over flexibility. rdiff-backup does not have the same needs as the linux kernel; the development of rdiff-backup has long been one or two core maintainers, with others contributing small patches on and off. Most of these contributions are one or two line patches for bugs in certain situations. * I understand that there are projects to use git on other platforms, but frankly they still seem second-tier. I really don't want to start yet another git-vs-hg debate; I think most of the developers would prefer either one over the current setup. Unless there are good reasons why hg wouldn't work for specific rdiff-backup workflows, I suggest that we go with it and move on to other things (more about this in another email.) Thanks, JoshN Jernej Simončič wrote: On Monday, April 5, 2010, 15:43:07, Josh Nisly wrote: The last time I checked, git didn't have great support for Windows. There's msysgit at http://code.google.com/p/msysgit/ and TortoiseGIT for Explorer integration: http://code.google.com/p/tortoisegit/, and as far as I can see, they work decently (or, you could always use git from cygwin).
Re: [rdiff-backup-users] Restarting development ... or starting over
On 04/06/2010 02:34 PM, Daniel Miller wrote: I wasn't really prepared to make this announcement so soon, but now seems like a good time to let the community know. I've been working on a new implementation of rdiff-backup since about a month ago when [snip] I'm not a coder... [snip] Now, on top of that I'd like to have all the fanciness present in all the other programs, particularly a SQL backend for the metadata storage... Out of curiosity, if you're not a coder what do you plan to do with a SQL backend? I used Bacula for a while and the possibilities of the SQL backend seduced me. Anyway, decoupling the content from all the metadata, including storage path, slicing, encryption and checksum, really appeals to me. So it's not really the SQL backend that I want, but the ability to query the repository by whatever criteria are needed, even simply to show management what the backup system does. Now my use of rdiff-backup is in no way limited by space but by bandwidth; I make two remote copies of the repository in addition to the local one. I actually considered using SQLite as a backend for storing metadata, but I just couldn't justify it since rdiff-backup would have no real benefit from the features of SQL (complex queries, joins, etc.). The only real benefit would be simpler random-access (for restoring selected files, for example). However, random access in a database either requires an index, which takes extra HDD space, or it involves a sequential scan, which is exactly what rdiff-backup already does. Why add another layer of complexity if it's not needed? It would likely just slow things down and add bloat to the repository. Having said all that, I did write the new version with a well-defined database interface, which means other backends (possibly MongoDB) could be added without affecting the frontend. Out of curiosity, is your code accessible and usable?
Nicolas
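The index-versus-sequential-scan trade-off raised in this exchange can be observed directly in SQLite. The following is only a minimal sketch using Python's standard sqlite3 module: the table name file_meta, its columns, and the sample paths are hypothetical and do not reflect rdiff-backup's actual metadata format. It asks SQLite for its query plan before and after an index on the path column exists.

```python
import sqlite3

# Hypothetical metadata table; the schema and names are illustrative only
# and are not taken from rdiff-backup.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE file_meta (path TEXT, size INTEGER, mtime INTEGER)")
conn.executemany(
    "INSERT INTO file_meta VALUES (?, ?, ?)",
    [(f"dir{i % 100}/file{i}.txt", i, 1270500000 + i) for i in range(10000)],
)


def query_plan(connection, sql, params=()):
    """Return the plan detail strings SQLite reports for a statement."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable detail is the last column of each plan row.
    return [row[-1] for row in rows]


lookup = "SELECT size, mtime FROM file_meta WHERE path = ?"
target = ("dir42/file4242.txt",)

# Without an index, a lookup by path has to scan the whole table.
plan_without_index = query_plan(conn, lookup, target)

# With an index, the lookup becomes a search, at the cost of extra storage.
conn.execute("CREATE INDEX idx_file_meta_path ON file_meta (path)")
plan_with_index = query_plan(conn, lookup, target)

print("without index:", plan_without_index)
print("with index:   ", plan_with_index)
```

The exact wording of the plan differs between SQLite versions, but the first plan reports a scan and the second a search using the index, which is the space-versus-speed choice described above.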
Re: [rdiff-backup-users] Restarting development ... or starting over
Daniel Miller wrote: I wasn't really prepared to make this announcement so soon, but now seems like a good time to let the community know. I've been working on a new implementation of rdiff-backup since about a month ago when I dug into the current codebase and discovered its disappointing quality. While what I have right now is functional and works on simple cases, it does not cover the broad range of features currently offered by rdiff-backup. I could use some help in bringing it up to par if others are interested in the path I have taken. While I have used the current codebase for direction and inspiration, I have started with a clean slate for several reasons: I'm interested and am looking forward to seeing the code. - An automated test suite makes adding new features and long-term maintenance much easier. Adding this to the current codebase is both hard and boring. One thing that makes it very hard to write tests for the current codebase is the widespread use of globals. My new implementation has been developed using TDD and minimal use of globals (e.g. for loggers and constants). YAY TDD! :) - The current repository layout has a critical design flaw that causes performance degradation as a repository grows. Most difference information is stored in a single file tree (rdiff-backup-data/increments), that has a very similar structure to the mirror. The problem is that as files get added/deleted/changed the directories in the increments tree are always growing in size, meaning it takes longer and longer to list the contents of directories in the tree. This performance problem is negligible in small-to-medium sized backup sets, but becomes apparent in very large backup sets as the number of increments grows. I have redesigned the repository layout in my new implementation to eliminate this performance issue. 
Note that I do not know for sure if my new layout will completely eliminate this problem since I have not tested it yet with a very large backup set over a long period of time. Can this be tested further? It would suck to get further down the road with this repository structure and find out it didn't really help the problem. -- Randy Syring Intelicom 502-644-4776 Whether, then, you eat or drink or whatever you do, do all to the glory of God. 1 Cor 10:31
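One rough way to start testing the directory-growth concern is to time a directory listing as the number of entries increases. The sketch below is an assumption-laden illustration, not a benchmark of rdiff-backup itself: the file names are a simplified, hypothetical stand-in for increment files, the files are empty, and a freshly created temporary directory is usually served from cache, so the numbers will understate the cost on a large, aged repository.

```python
import os
import tempfile
import time


def time_listing(entry_count):
    """Create a directory holding entry_count empty files and time one os.listdir call."""
    with tempfile.TemporaryDirectory() as tmp:
        for i in range(entry_count):
            # Hypothetical, simplified increment-style name (not the exact on-disk format).
            name = f"report.txt.increment-{i:06d}.diff.gz"
            open(os.path.join(tmp, name), "w").close()
        start = time.perf_counter()
        names = os.listdir(tmp)
        elapsed = time.perf_counter() - start
    return len(names), elapsed


# Each step multiplies the directory size by ten.
results = [time_listing(n) for n in (100, 1000, 10000)]
for count, seconds in results:
    print(f"{count:6d} entries listed in {seconds:.6f} s")
```

A more convincing test would run against a copy of a real long-lived repository on the filesystem actually used for backups, since directory lookup behaviour varies considerably between filesystems.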
Re: [rdiff-backup-users] Restarting development
Hi Josh, Development on rdiff-backup has stagnated for the last while. I think that this is attributable to several reasons: * Andrew has dropped off the face of the earth (he's working on graduate studies, IIRC) and I've been busy with other things. Since we're the two core maintainers, that tends to slow things down. * I started work on unicode support, but realized that it requires a transition period. In current CVS, it's broken just enough to make life difficult. * The code isn't structured all that well. This is at least partially because we don't have good automated tests, so it's hard to refactor without breaking things. * We're still using CVS :-) Thanks for bringing up these issues to get development started again. I didn't intend to hijack your thread... I'm interested in collaborating with you going forward if you're interested in the same. I will attempt to migrate my git repo to hg if that's what you want, although I haven't used hg before, so it will be new for me. I am reluctant to publish the code under the rdiff-backup name if you are not interested in working on the new codebase. How would you like me to handle that? Would you like a copy of the code so you can have a look over it before you make any decisions? It's still pretty new, and there is definitely more work to do to implement the features I have planned and to bring it up to par on all of the currently supported platforms. A little background on why I went down this road: you may remember my posts back in Feb on the full-verify patch. I had a working patch that I had contributed and started using on my system. I was very careful when I developed that patch because I wanted it to be useful on a production system. Then, the first time I tried to start a new repository with that patch applied I got all kinds of strange errors. This really made me worried because I had no idea why it happened--the errors just didn't make any sense.
Since then, I tried again with the same setup, and it worked the second time! That's even more frightening... to me it says there is something non-deterministic in the design of rdiff-backup (it could be something in my setup too, although I've gone over it and even run it by others many times). Since then I've been using a version of rdiff-backup 1.2.8 with my full-verify patch (and some minor tweaks), and it's been working well. Surprisingly, and somewhat unbelievably, the new verify mechanism detected a hard drive going bad on my system, so I'm glad I'm using it. Soon after the verify started failing I had other warnings as well (rsync started failing too) so it's not like I wouldn't have known if I didn't have the full-verify, but it gave me the earliest warning. After that experience I decided that I would try a redesign to see how far I could get. If it never goes anywhere, no problem, it has been fun to solve the problems and learn how the rsync algorithm works and is used in rdiff-backup. But I think this new design is much cleaner, and opens up a lot of potential for long-awaited features to be added to rdiff-backup. I'd be interested to hear your thoughts. ~ Daniel
Re: [rdiff-backup-users] Restarting development
Daniel Miller wrote: Hi Josh, Development on rdiff-backup has stagnated for the last while. I think that this is attributable to several reasons: * Andrew has dropped of the face of the earth, (he's working on graduate studies, IIRC) and I've been busy with other things. Since we're the two core maintainers, that tends to slow things down. * I started work on unicode support, but realized that it requires a transition period. In current CVS, it's broken just enough to make life difficult. * The code isn't structured all that well. This is at least partially because we don't have good automated tests, so it's hard to refactor without breaking things. * We're still using CVS :-) Thanks for bringing up these issues to get development started again. I didn't intend to hijack your thread... I'm interested in collaborating with you going forward if you're interested in the same. I will attempt to migrate my git repo to hg if that's what you want, although I haven't used hg before, so it will be new for me. I am reluctant to publish the code under the rdiff-backup name if you are not interested in working on the new codebase. How would you like me to handle that? Would you like a copy of the code so you can have a look over it before you make any decisions? It's still pretty new, and there is definitely more work to do to implement the features I have planned and to bring it up to par on all of the currently supported platforms. A little background on why I went down this road: you may remember my posts back in Feb on the full-verify patch. I had a working patch that I had contributed and started using on my system. I was very careful when I developed that patch because I wanted it to be useful on a production system. Then, the first time I tried to start a new repository with that patch applied I got all kinds of strange errors. This really made me worried because I had no idea why it happened--the errors just didn't make any sense. 
Since then, I tried again with the same setup, and it worked the second time! That's even more frightening... to me it says there is something non-deterministic in the design of rdiff-backup (it could be something in my setup too, although I've gone over it and even run it by others many times). Since then I've been using a version of rdiff-backup 1.2.8 with my full-verify patch (and some minor tweaks), and it's been working well. Surprisingly, and somewhat unbelievably, the new verify mechanism detected a hard drive going bad on my system, so I'm glad I'm using it. Soon after the verify started failing I had other warnings as well (rsync started failing too) so it's not like I wouldn't have known if I didn't have the full-verify, but it gave me the earliest warning. After that experience I decided that I would try a redesign to see how far I could get. If it never goes anywhere, no problem, it has been fun to solve the problems and learn how the rsync algorithm works and is used in rdiff-backup. But I think this new design is much cleaner, and opens up a lot of potential for long-awaited features to be added to rdiff-backup. I'd be interested to hear your thoughts. First, I'm delighted that you're taking an interest in the project! More creativity (and competition) is a good thing. Here's my personal perspective. Our current users get grumpy when we change the wire protocol across major versions - I suspect that losing backwards compatibility at the repository level would be a deal-killer for many. (I know it would be for me, since I have hundreds of repositories with multiple years of history.) Because of that, I doubt that I'll contribute meaningfully to a new project. I'm also a little hesitant to call it rdiff-backup, since it is a complete rewrite. Maybe rdiff-backup-ng (or something like that) would be a good compromise?
I'm a little torn - I don't want to discourage you from starting from scratch at all, but I think that the current codebase has lots of value in it that make it worth salvaging. I can understand where you're coming from to start a new version, yet I wonder if you might not be underestimating the amount of work required to bring it to parity with the current codebase. Supporting OS X resource forks and Windows ACLs, for example. A lot of value that I see in rdiff-backup is in its myriad of workarounds for handling all sorts of situations, from simple things like chmod'ing unreadable files temporarily to be able to back them up, to handling backups from a unix (case-sensitive) file system to a windows (case-insensitive) one. Personally, I'd like to have your help developing tests for the current codebase. I really think that if we come up with good functionality tests, we can refactor the codebase to the point where we can start writing unit tests. However, that's certainly less enjoyable than starting from scratch(!), so I can't blame you for not getting excited about that.
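The "chmod'ing unreadable files temporarily" workaround mentioned above can be pictured roughly as follows. This is a minimal sketch assuming POSIX permissions and a user who owns the file; the names temporarily_readable and read_for_backup are hypothetical and are not taken from the rdiff-backup source, which may handle this differently.

```python
import os
import stat
import tempfile
from contextlib import contextmanager


@contextmanager
def temporarily_readable(path):
    """Grant the owner read permission for the duration of the block, then restore the original mode."""
    original_mode = stat.S_IMODE(os.stat(path).st_mode)
    needs_change = not (original_mode & stat.S_IRUSR)
    if needs_change:
        os.chmod(path, original_mode | stat.S_IRUSR)
    try:
        yield
    finally:
        # Restore the source file's permissions even if reading failed.
        if needs_change:
            os.chmod(path, original_mode)


def read_for_backup(path):
    """Read a file's bytes, working around a missing owner-read bit (hypothetical helper)."""
    with temporarily_readable(path):
        with open(path, "rb") as handle:
            return handle.read()


# Demonstration on a throwaway file that starts out write-only for its owner.
fd, demo_path = tempfile.mkstemp()
os.write(fd, b"payload")
os.close(fd)
os.chmod(demo_path, 0o200)

data = read_for_backup(demo_path)
mode_after = stat.S_IMODE(os.stat(demo_path).st_mode)
os.remove(demo_path)
```

Restoring the mode in a finally clause matters here: a backup tool that leaves source permissions altered after an error would be changing the very data it is meant to protect.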
Re: [rdiff-backup-users] Restarting development
Josh Nisly spake thusly on 04/06/2010 09:27 AM: I'm a little torn - I don't want to discourage you from starting from scratch at all, but I think that the current codebase has lots of value in it that make it worth salvaging. I can understand where you're coming from to start a new version, yet I wonder if you might not be underestimating the amount of work required to bring it to parity with the current codebase. Supporting OS X resource forks and Windows ACLs, for example. A lot of value that I see in rdiff-backup is in its myriad of workarounds for handling all sorts of situations, from simple things like chmod'ing unreadable files temporarily to be able to back them up, to handling backups from a unix (case-sensitive) file system to a windows (case-insensitive) one. Personally, I'd like to have your help developing tests for the current codebase. I really think that if we come up with good functionality tests, we can refactor the codebase to the point where we can start writing unit tests. However, that's certainly less enjoyable than starting from scratch(!), so I can't blame you for not getting excited about that. JoshN This discussion and your comments here remind me of this old post from Joel Spolsky: Things you should never do, part I http://www.joelonsoftware.com/articles/fog69.html I'll include some excerpts below, but first want to say I'm not necessarily advocating any particular approach to this. I'm a relatively new rdiff-backup user and I love it. It's so much nicer to use than my old approach of my own bash scripts + rsync + hard links. I'm just happy that people are talking about development and maintenance -- I want to see this project thrive. But in the role of a user, I don't want to vote for any development plans. (I will be happy to help with some testing for either version however.) But! I think Joel's points are valid and I always think of them when people talk about wanting to start over. 
From the article (with the acknowledgement that he may be talking about larger programs, but it seems that rdiff-backup has grown to be quite comprehensive...): ___ We're programmers. Programmers are, in their hearts, architects, and the first thing they want to do when they get to a site is to bulldoze the place flat and build something grand. We're not excited by incremental renovation: tinkering, improving, planting flower beds. There's a subtle reason that programmers always want to throw away the code and start over. The reason is that they think the old code is a mess. And here is the interesting observation: they are probably wrong. The reason that they think the old code is a mess is because of a cardinal, fundamental law of programming: It’s harder to read code than to write it. This is why code reuse is so hard. This is why everybody on your team has a different function they like to use for splitting strings into arrays of strings. They write their own function because it's easier and more fun than figuring out how the old function works. As a corollary of this axiom, you can ask almost any programmer today about the code they are working on. It's a big hairy mess, they will tell you. I'd like nothing better than to throw it out and start over. Why is it a mess? Well, they say, look at this function. It is two pages long! None of this stuff belongs in there! I don't know what half of these API calls are for. ... Back to that two page function. Yes, I know, it's just a simple function to display a window, but it has grown little hairs and stuff on it and nobody knows why. Well, I'll tell you why: those are bug fixes. One of them fixes that bug that Nancy had when she tried to install the thing on a computer that didn't have Internet Explorer. Another one fixes that bug that occurs in low memory conditions. Another one fixes that bug that occurred when the file is on a floppy disk and the user yanks out the disk in the middle. 
That LoadLibrary call is ugly but it makes the code work on old versions of Windows 95. Each of these bugs took weeks of real-world usage before they were found. The programmer might have spent a couple of days reproducing the bug in the lab and fixing it. If it's like a lot of bugs, the fix might be one line of code, or it might even be a couple of characters, but a lot of work and time went into those two characters. When you throw away code and start from scratch, you are throwing away all that knowledge. All those collected bug fixes. Years of programming work. ... Is there an alternative? The consensus seems to be that the old Netscape code base was really bad. Well, it might have been bad, but, you know what? It worked pretty darn well on an awful lot of real world computer systems. When programmers say that their code is a holy mess (as they always do), there are three kinds of things that are wrong with it.
Re: [rdiff-backup-users] native VSS (Shadow Copy) support
Josh Nisly wrote: Two things: 1) The rdiff-backup project likely won't accept patches for features that are platform specific. There are exceptions for OS-specific filesystem metadata, but VSS doesn't fall into that category. Now that there is consideration of restarting development, any chance we could reconsider this? Any backup solution has to wrestle with the Windows user base and the key issue on Windows is whether or not a backup solution uses VSS. If rdiff-backup supported VSS, its uptake might be huge. At very best, maybe a plugin system so that VSS can be added easily without needing to touch the core code? 2) I've written a python extension module in C++ to implement VSS. Is that something that you'd be interested in? Very, please share. Have you integrated this with rdiff-backup at all? JoshN Thanks Josh! -- Randy Syring Intelicom 502-644-4776 Whether, then, you eat or drink or whatever you do, do all to the glory of God. 1 Cor 10:31 Randy Syring wrote: I was going to try to add support for Windows Shadow Copy Service based on this code/concept: http://markmail.org/thread/bmyioexeuvnwlmho and am wondering if anyone had suggestions related to this. The concept seems simple enough, make a shadow copy of each drive involved and change the location of the backed up files from the actual drive to the vss image. I was thinking a new flag could be implemented '--vss' in order to tell rdiff-backup that vss images are desired. ___ rdiff-backup-users mailing list at rdiff-backup-users@nongnu.org http://lists.nongnu.org/mailman/listinfo/rdiff-backup-users Wiki URL: http://rdiff-backup.solutionsfirst.com.au/index.php/RdiffBackupWiki
Re: [rdiff-backup-users] Restarting development
I'm looking forward to hearing your responses. Are you sure about that? :) Here's my personal perspective. Our current users get grumpy when we change the wire protocol across major versions - I suspect that losing backwards compatibility at the repository level would be a deal-killer for many. (I know it would be for me, since I have hundreds of repositories with multiple years of history.) Because of that, I doubt that I'll contribute meaningfully to a new project. I'm also a little hesitant to call it rdiff-backup, since it is a complete rewrite. Maybe rdiff-backup-ng (or something like that) would be a good compromise? Hmm, ok, I'll rename it. It will likely be a completely new project. I'd like to explore deduplication anyway, which requires an even bigger divergence from the rdiff-backup design (it will likely eliminate the current mirror in favor of a block-level database, but it solves a host of other cross-platform issues so it's appealing to me). As it is, I believe the current repository design is flawed with a performance issue that gets worse with bigger backup sets and long-term use. I don't know if that can be fixed without changing the repository structure. You also mentioned that unicode support will require a transition period--I'm not sure if this implies changing the repository structure, but that's what it sounds like to me. I'm a little torn - I don't want to discourage you from starting from scratch at all, but I think that the current codebase has lots of value in it that make it worth salvaging. I can understand where you're coming from to start a new version, yet I wonder if you might not be underestimating the amount of work required to bring it to parity with the current codebase. Supporting OS X resource forks and Windows ACLs, for example.
A lot of value that I see in rdiff-backup is in its myriad of workarounds for handling all sorts of situations, from simple things like chmod'ing unreadable files temporarily to be able to back them up, to handling backups from a unix (case-sensitive) file system to a windows (case-insensitive) one. I understand the value in supporting many different systems. I was not intending to drop that feature, although it may take some time to implement (I might point out here that it's going to take time to get tests written for the current system too, which is sorely needed to continue development beyond simple bug fixes). A notable thing that is missing is OS X ACLs. ACLs have been turned on by default since Mac OS 10.5, so technically rdiff-backup can no longer make a complete backup of a modern OS X system. One of the problems with the current design is that many of these platform-specific features were tacked on after the core design had hardened. That contributed to the rather poor quality of the current codebase. Things really need to be more modular. I'm not saying this can't be done with the current codebase, but it's going to be really hard, especially if the current system has to be fortified with tests before anything new can be written. Personally, I'd like to have your help developing tests for the current codebase. I'm not feeling very excited about that... I really think that if we come up with good functionality tests, we can refactor the codebase to the point where we can start writing unit tests. However, that's certainly less enjoyable than starting from scratch(!), so I can't blame you for not getting excited about that. Maybe, but I probably won't be the one getting it to that point. Oh yeah, I've read a lot of Joel Spolsky (including the one on Things you should never do). While he makes some great points in that article, Joel doesn't have all the right answers (did you read his more recent article on distributed version control). 
I know the current codebase could be refactored and have tests written and everything, but I just don't have the drive to do that right now. There are times when what you're starting with takes more work to fix than starting something new. ~ Daniel
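The message above mentions replacing the mirror with a block-level database for deduplication but does not describe the design. As a generic illustration only, and not Daniel's implementation, the sketch below shows one common approach: split data into fixed-size blocks, key each block by its SHA-256 digest, and store identical blocks once. The class name BlockStore, the block size, and the in-memory dictionary are all assumptions made for the example.

```python
import hashlib

BLOCK_SIZE = 4096  # arbitrary fixed block size chosen for illustration


class BlockStore:
    """In-memory content-addressed store: identical blocks are kept only once."""

    def __init__(self):
        self.blocks = {}  # digest -> block bytes

    def put_file(self, data):
        """Split data into fixed-size blocks and return the ordered list of block digests."""
        digests = []
        for offset in range(0, len(data), BLOCK_SIZE):
            block = data[offset:offset + BLOCK_SIZE]
            digest = hashlib.sha256(block).hexdigest()
            # setdefault stores the block only if this digest has not been seen before.
            self.blocks.setdefault(digest, block)
            digests.append(digest)
        return digests

    def get_file(self, digests):
        """Reassemble a file from its ordered list of block digests."""
        return b"".join(self.blocks[digest] for digest in digests)


store = BlockStore()
file_a = b"A" * BLOCK_SIZE + b"B" * BLOCK_SIZE
file_b = b"A" * BLOCK_SIZE + b"C" * BLOCK_SIZE  # shares its first block with file_a

recipe_a = store.put_file(file_a)
recipe_b = store.put_file(file_b)

print("blocks referenced:", len(recipe_a) + len(recipe_b))
print("blocks stored:    ", len(store.blocks))
```

Because files are addressed by digest lists rather than by path, such a store avoids filename issues like case sensitivity, which may be part of the cross-platform appeal mentioned above. Fixed-size blocks are the simplest choice but deduplicate poorly when data is inserted mid-file; content-defined chunking is a frequent alternative.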
Re: [rdiff-backup-users] native VSS (Shadow Copy) support
On Tue, Apr 6, 2010 at 11:27 AM, Randy Syring rsyr...@inteli-com.com wrote: Josh Nisly wrote: Two things: 1) The rdiff-backup project likely won't accept patches for features that are platform specific. There are exceptions for OS-specific filesystem metadata, but VSS doesn't fall into that category. Now that there is consideration of restarting development, any chance we could reconsider this? Any backup solution has to wrestle with the Windows user base and the key issue on Windows is whether or not a backup solution uses VSS. If rdiff-backup supported VSS, its uptake might be huge. At very best, maybe a plugin system so that VSS can be added easily without needing to touch the core code? I don't post here often, but I strongly agree that snapshot technology should be leveraged by the rdiff-backup environment. I'm not as positive that it needs to be in the tool itself. I can't remember the name of the package, but I believe there is a linux package that wraps rdiff-backup and uses LVM snapshots from which it runs rdiff-backup. As to VSS, it is supported in Windows XP SP3, 2003, Vista, 2008, etc. So it is a long term technology that it here to stay. To me the only thing to really discuss is how vss support should be integrated into a windows rdiff-backup environment, not if it should be. 2) I've written a python extension module in C++ to implement VSS. Is that something that you'd be interested in? Very, please share. Have you integrated this with rdiff-backup at all? The great thing about C code, etc. (as opposed to just scripts calling the Vista provided CLI commands) is that VSS CLI commands are not included in XP SP3 iirc. And also with XP SP3 you only have 20 or 30 seconds to mount the VSS snapshot after you initialize it. (again, iirc) So having it in a real program as opposed to just a script makes it easier to have the error / retry logic built in. 
=== And for those that don't know much about VSS snapshots, one really cool feature is that services like Exchange, MS-SQL, etc. have vss knowledge integrated into them. They register themselves with the vss service and whenever a snapshot is created, the registered services are notified to quiesce themselves prior to the snapshot being made. I don't know if people are backing up that class of service via rdiff-backup, but if they are it is pretty critical that vss snapshots be used in order to ensure you have quiesced data. vss also solves the open file issue that is so problematic with windows backups. Thanks for reading Greg
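The point about building error and retry logic into a real program, rather than a script, can be sketched as a small retry wrapper. Nothing below calls VSS: FlakyMount is a stand-in that fails a fixed number of times, and call_with_retry, its parameters, and the choice of OSError as the retryable error are assumptions made for the example.

```python
import time


def call_with_retry(operation, attempts=3, delay_seconds=0.0):
    """Call operation() until it succeeds or attempts run out; re-raise the last error."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except OSError as error:
            last_error = error
            # Wait before the next try, but not after the final failure.
            if attempt < attempts:
                time.sleep(delay_seconds)
    raise last_error


class FlakyMount:
    """Stand-in for a snapshot mount step that fails a few times before succeeding."""

    def __init__(self, failures_before_success):
        self.remaining_failures = failures_before_success
        self.calls = 0

    def __call__(self):
        self.calls += 1
        if self.remaining_failures > 0:
            self.remaining_failures -= 1
            raise OSError("snapshot not ready")
        return "mounted"


flaky = FlakyMount(failures_before_success=2)
result = call_with_retry(flaky, attempts=3)
print(result, "after", flaky.calls, "calls")
```

With a short window in which a snapshot must be mounted, as recalled above for XP SP3, the delay and attempt count would need to be chosen to fit inside that window.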
Re: [rdiff-backup-users] Restarting development
On Tue, Apr 06, 2010 at 11:59:55AM -0400, Daniel Miller wrote: Hmm, ok, I'll rename it. It will likely be a completely new project. I'd like to explore deduplication anyway, which requires an even bigger divergence from the rdiff-backup design (it will likely eliminate the current mirror in favor of a block-level database, but it solves a host of other cross-platform issues so it's appealing to me). If you're doing that, what about encryption while you're at it? -- Matthew Miller mat...@mattdm.org http://mattdm.org/
Re: [rdiff-backup-users] Restarting development
On 04/06/2010 05:59 PM, Daniel Miller wrote: Hmm, ok, I'll rename it. It will likely be a completely new project. I'd like to explore deduplication anyway, which requires an even bigger divergence from the rdiff-backup design (it will likely eliminate the current mirror in favor of a block-level database, but it solves a host of other cross-platform issues so it's appealing to me). Well, if it's open season for wishes, then encryption, sparse files (not for databases but for virtual disks), a flexible exclusion mechanism and a flexible metadata querying interface. Nicolas
Re: [rdiff-backup-users] Restarting development
On 04/06/2010 05:59 PM, Daniel Miller wrote: [snip] As it is, I believe the current repository design is flawed with a performance issue that gets worse with bigger backup sets and long-term use. I don't know if that can be fixed without changing the [snip] Without prejudging, is that true in practice? Josh, you said that you have years of repositories; did you experience performance issues due to the repository design? If yes, on which filesystem? Nicolas
[Fwd: Re: [rdiff-backup-users] Restarting development]
Oops, meant to include the list. ---BeginMessage--- Daniel Miller wrote: I'm looking forward to hearing your responses. Are you sure about that? :) Absolutely! I really would be excited about another project that uses some of the same algorithms. As I said, more competition is good... Here's my personal perspective. Our current users get grumpy when we change the wire protocol across major versions - I suspect that losing backwards compatibility at the repository level would be a deal-killer for many. (I know it would be for me, since I have hundreds of repositories with multiple years of history.) Because of that, I doubt that I'll contribute meaningfully to a new project. I'm also a little hesitant to call it rdiff-backup, since it is a complete rewrite. Maybe rdiff-backup-ng (or something like that) would be a good compromise? Hmm, ok, I'll rename it. It will likely be a completely new project. I'd like to explore deduplication anyway, which requires an even bigger divergence from the rdiff-backup design (it will likely eliminate the current mirror in favor of a block-level database, but it solves a host of other cross-platform issues so it's appealing to me). As it is, I believe the current repository design is flawed with a performance issue that gets worse with bigger backup sets and long-term use. I don't know if that can be fixed without changing the repository structure. You also mentioned that Unicode support will require a transition period--I'm not sure if this implies changing the repository structure, but that's what it sounds like to me. I've not had any problems with performance on large data sets that change often. My backups typically only run once a day though, so I rarely have more than a few million increment files per repository. The Unicode changes will require some repository changes, but the changes will necessarily be backwards compatible, and won't entail structural changes.
The more I think about it, the more I think starting another project isn't an altogether bad solution - I think we may have increasingly divergent goals. For myself, having a current mirror is well worth the cost in disk space; it means that it's much easier to recover from a file corruption or program bug. OTOH, losing this requirement opens the door to other features. Thanks, JoshN ---End Message---
Re: [Fwd: Re: [rdiff-backup-users] Restarting development]
On 04/06/2010 06:35 PM, Josh Nisly wrote: The more I think about it, the more I think starting another project isn't an altogether bad solution - I think we may have increasingly divergent goals. For myself, having a current mirror is well worth the cost in disk space; it means that it's much easier to recover from a file corruption or program bug. OTOH, losing this requirement opens the door to other features. I don't see how relaxing that requirement would save space, unless you're speaking about a compressed baseline backup. OTOH, I agree that immediate access to the last copy is a plus (oops recovery), but in the case of delayed recovery - several days or generations later - I think it's more the exploration of the metadata that counts. And then I'm not convinced that the filesystem is the best interface. Nicolas
Re: [rdiff-backup-users] native VSS (Shadow Copy) support
On 06/04/2010 17:03, Greg Freemyer wrote: On Tue, Apr 6, 2010 at 11:27 AM, Randy Syring rsyr...@inteli-com.com wrote: Josh Nisly wrote: Two things: 1) The rdiff-backup project likely won't accept patches for features that are platform specific. There are exceptions for OS-specific filesystem metadata, but VSS doesn't fall into that category. Now that there is consideration of restarting development, any chance we could reconsider this? Any backup solution has to wrestle with the Windows user base and the key issue on Windows is whether or not a backup solution uses VSS. If rdiff-backup supported VSS, its uptake might be huge. At the very least, maybe a plugin system so that VSS can be added easily without needing to touch the core code? I don't post here often, but I strongly agree that snapshot technology should be leveraged by the rdiff-backup environment. I'm not as positive that it needs to be in the tool itself. I can't remember the name of the package, but I believe there is a Linux package that wraps rdiff-backup and uses LVM snapshots from which it runs rdiff-backup. As to VSS, it is supported in Windows XP SP3, 2003, Vista, 2008, etc. So it is a long-term technology that is here to stay. To me the only thing to really discuss is how VSS support should be integrated into a Windows rdiff-backup environment, not if it should be. 2) I've written a Python extension module in C++ to implement VSS. Is that something that you'd be interested in? Very, please share. Have you integrated this with rdiff-backup at all? The great thing about C code, etc. (as opposed to just scripts calling the Vista-provided CLI commands) is that VSS CLI commands are not included in XP SP3 iirc. And also with XP SP3 you only have 20 or 30 seconds to mount the VSS snapshot after you initialize it. (again, iirc) So having it in a real program as opposed to just a script makes it easier to have the error / retry logic built in.
=== And for those that don't know much about VSS snapshots, one really cool feature is that services like Exchange, MS-SQL, etc. have VSS knowledge integrated into them. They register themselves with the VSS service and whenever a snapshot is created, the registered services are notified to quiesce themselves prior to the snapshot being made. I don't know if people are backing up that class of service via rdiff-backup, but if they are it is pretty critical that VSS snapshots be used in order to ensure you have quiesced data. VSS also solves the open file issue that is so problematic with Windows backups. Thanks for reading Greg I have a Windows utility called TimeDicer (www.timedicer.co.uk) which is a wrapper for rdiff-backup, backing up from Windows to Linux, and it uses VSS. So it is not essential to have VSS built into rdiff-backup, even though it would be nice. Dominic
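As an illustration of the wrapper approach discussed in this thread (take a snapshot, run the backup against it, then release it, with retry logic around the snapshot step), here is a minimal Python sketch. The names with_retries and backup_from_snapshot, and the three callables passed in, are hypothetical; they are not part of rdiff-backup, TimeDicer, or any VSS or LVM API. A real wrapper would plug in the platform-specific snapshot calls.

```python
import time


def with_retries(action, attempts=3, delay=1.0, sleep=time.sleep):
    """Call action() until it succeeds or attempts run out.

    Only OSError is retried; the last error is re-raised on failure.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except OSError as exc:
            last_error = exc
            if attempt < attempts:
                sleep(delay)  # wait before the next try
    raise last_error


def backup_from_snapshot(create_snapshot, run_backup, delete_snapshot,
                         attempts=3, delay=1.0, sleep=time.sleep):
    """Hypothetical wrapper: snapshot, back up from it, always clean up.

    create_snapshot() -> path of the mounted snapshot (may raise OSError)
    run_backup(path)  -> result of the backup run, e.g. an exit code
    delete_snapshot(path) releases the snapshot
    """
    snapshot_path = with_retries(create_snapshot, attempts, delay, sleep)
    try:
        return run_backup(snapshot_path)
    finally:
        # Release the snapshot even if the backup itself failed.
        delete_snapshot(snapshot_path)
```

In such a wrapper, run_backup would typically launch rdiff-backup with the snapshot path as the source directory, so the backup tool itself needs no snapshot support.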
Re: [Fwd: Re: [rdiff-backup-users] Restarting development]
Hi, rdiff-backup user and part-time programmer here. I would love to see some more work on rdiff-backup, and would love to get some bugs fixed and see some performance increase. I'm not going to comment on rewriting versus fixing the current code base, since I haven't really looked at the code. But on encryption and deduplication I would say: why not leave that to the filesystem and stay focused on what rdiff-backup does best, differential backups? You can get deduplicating, compressing and encrypting filesystems, so why not leave it to them to do that (well, at least for Linux and any OS that accepts FUSE filesystems). Alex
Re: [Fwd: Re: [rdiff-backup-users] Restarting development]
On Wed, Apr 07, 2010 at 06:32:04AM +1000, Alexander Samad wrote: But I would say on encryption and deduplication - why not leave that to the filesystem - stay focused on what rdiff-backup does best - differential backups, you can get deduplication, compression and encryption file systems why not leave it to them to do that (well at least for linux and any os that accepts fuse filesystem). For encryption, one could do something like encfs (perhaps over sshfs). That's pretty cool. There's a win for deduplication at the rdiff-backup level, though, because you can transfer over the wire _after_ deduplication. One could of course rdiff-backup to a target deduplication filesystem, sync or snapshot that filesystem, and rsync the underlying store, but that has its own disadvantages. -- Matthew Miller mat...@mattdm.org http://mattdm.org/
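To make the over-the-wire point concrete, here is a toy Python sketch of deduplicating before transfer. It assumes fixed-size blocks identified by SHA-256 digests, and BlockStore, backup and restore are hypothetical names; nothing here reflects an actual rdiff-backup design or wire protocol. The sender asks the store which digests it lacks and sends only those blocks.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed fixed block size, chosen for illustration


def split_blocks(data, size=BLOCK_SIZE):
    """Cut data into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


class BlockStore:
    """Toy stand-in for the remote repository: digest -> block bytes."""

    def __init__(self):
        self.blocks = {}

    def missing(self, digests):
        """Return the digests the store does not hold yet."""
        return [d for d in digests if d not in self.blocks]

    def put(self, digest, block):
        self.blocks[digest] = block


def backup(data, store):
    """Send only unseen blocks; return (recipe, bytes_sent).

    The recipe is the ordered digest list needed to rebuild the data.
    """
    blocks = split_blocks(data)
    digests = [hashlib.sha256(b).hexdigest() for b in blocks]
    by_digest = dict(zip(digests, blocks))  # collapses local duplicates
    sent = 0
    for digest in store.missing(list(by_digest)):
        store.put(digest, by_digest[digest])
        sent += len(by_digest[digest])
    return digests, sent


def restore(recipe, store):
    """Rebuild the original bytes from the ordered digest list."""
    return b"".join(store.blocks[d] for d in recipe)
```

With a filesystem-level deduplicating target, by contrast, every block would still cross the wire before the target discards the duplicates.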