Re: Preliminary Suggestion For Atomic Transactions
On Thu, Jan 06, 2005 at 02:59:44AM +0800, Jeff Pitman wrote:
> The attached patch (extremely ugly) attempts to resolve this by
> foregoing the rename step until the end.

Thanks for the patch. It looks like it implements the basics of the idea, but I think that it will be better to use deterministic temp-file names like the --partial-dir logic gives us. I also chose to use an option name of --delay-renames. So, I've created my own version of this post-transfer renaming heuristic, which is available from here:

    http://rsync.samba.org/ftp/unpacked/rsync/patches/delay-renames.diff

The patch is relative to the very latest CVS or nightly tar file.

One unresolved issue I thought of is how rsync should deal with an absolute --partial-dir pathname: the above implementation would put all the partial files into that one dir (a relative pathname, like the default, ensures that the files get sprinkled around).

> The first big [problem] in running the testsuite is that the parent
> dirs mtime is not getting preserved. [...] Next, hardlinks need some
> work.

These problems stem from a bug in the synchronization between the receiver and the generator that I just discovered: the generator does not wait until all the redo and delete-after work is done before it begins the final hard-link pass and the final directory-time-tweaking pass. Because of this, the extra renaming loop added at the end of the receiver's work didn't get done before the generator got going on the hard-linking and directory-time tweaking. I have checked in a fix for this, so the above delay-renames.diff works fine with a test-suite that has been modified to use --delay-renames on all the possible commands.

Here are some of the design items that the patch deals with:

- Complain if the user combines --inplace with --delay-renames.

- A client sender needs to pass the new option to a server receiver.
- We use the --partial-dir logic to make all the temp-file names deterministic, using the default name of .~tmp~ if the user did not specify the --partial-dir option. Note that these dirs get auto-excluded by the normal --partial-dir handling, so they will not be transferred by default, and they will not be deleted by --delete or --delete-after if the prior transfer did not complete and some temp files got left around (they will also be used to speed up the transfer when it is restarted).

The patch also fixes/changes a few things from your patch:

- The rename loop must happen prior to the delete-after logic in the receiver.

- It is good to avoid twiddling the flist-flags bits in the forked receiver because it will cause all the shared file-list memory (between the generator and the receiver) to become unshared (on systems that use copy-on-write memory when forking).

- It is good to avoid adding large amounts of memory to the flist object (adding MAXPATHLEN chars of extra memory for each file transferred would bloat rsync's memory by a huge amount for large transfers).

I've attached a 3-hunk patch that kluges in the use of the new --delay-renames option into all the (relevant) test-suite rsync commands (for temporary testing). Apply it after the main patch, if desired.

..wayne..
--- options.c 2005-01-10 10:16:54 +
+++ options.c 2005-01-10 10:30:07 +
@@ -456,6 +456,7 @@
   {"protocol",         0,  POPT_ARG_INT,    &protocol_version, 0, 0, 0 },
   {"server",           0,  POPT_ARG_NONE,   &am_server, 0, 0, 0 },
   {"help",            'h', POPT_ARG_NONE,   0, 'h', 0, 0 },
+  {"delay-renames",    0,  POPT_ARG_NONE,   0, 0, 0, 0 },
   {0,0,0,0, 0, 0, 0}
 };

@@ -947,6 +948,8 @@
         bwlimit_writemax = 512;
     }

+    if (inplace)
+        delay_renames = 0;
     if (delay_renames && !partial_dir)
         partial_dir = ".~tmp~";

--- runtests.sh 2004-02-04 05:34:29 +
+++ runtests.sh 2005-01-10 10:27:33 +
@@ -168,7 +168,7 @@
     exit 2
 fi

-RSYNC="$rsync_bin"
+RSYNC="$rsync_bin --delay-renames"

 export rsync_bin RSYNC

--
To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync
Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
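Wayne's point about deterministic temp-file names, and the difference between a relative and an absolute --partial-dir, can be sketched as below. This is only an illustration under stated assumptions, not code from delay-renames.diff or from rsync itself: the function name delayed_tmp_path and its signature are hypothetical, and the real --partial-dir handling may differ in detail.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not an rsync function): build the temp-file
 * path for `fname` under `partial_dir`.
 *
 *   relative partial_dir: the temp file sits in a subdir beside the
 *                         destination, e.g. "a/b/f" -> "a/b/.~tmp~/f"
 *   absolute partial_dir: every temp file lands in that one dir,
 *                         e.g. "a/b/f" -> "/var/partial/f"
 *
 * Returns 0 on success, -1 if the result does not fit in `out`.
 */
int delayed_tmp_path(const char *fname, const char *partial_dir,
                     char *out, size_t outlen)
{
    const char *slash = strrchr(fname, '/');
    const char *base = slash ? slash + 1 : fname;
    /* Length of the directory prefix, including the trailing '/'. */
    int dirlen = slash ? (int)(slash - fname + 1) : 0;
    int n;

    if (partial_dir[0] == '/')
        n = snprintf(out, outlen, "%s/%s", partial_dir, base);
    else
        n = snprintf(out, outlen, "%.*s%s/%s",
                     dirlen, fname, partial_dir, base);

    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

Because the name depends only on the destination path and the partial-dir setting, a restarted transfer can find the leftover temp file again, which is the property the message relies on. The absolute case also shows the unresolved issue: two files with the same basename in different directories would map to the same temp path.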
Preliminary Suggestion For Atomic Transactions
In the past there's been a need to provide consistency between symbolic links or repository metadata during a sync. Currently, rsync renames files piecemeal. The attached patch (extremely ugly) attempts to resolve this by foregoing the rename step until the end. It adds a new option (if we didn't, ls might catch up).

There are several issues to get over. The first big one in running the testsuite is that the parent dirs mtime is not getting preserved. I'll need some insight into how this might work. Next, hardlinks need some work. Finally, there are disk space considerations by leaving all tmp files open until the transaction completes. There might be security issues by leaving a larger timing window .. who knows, I'm sure many will make comments about this stuff.

I give you (drum roll) ... atomic transactions ... (tada). (Faces with perplexed looks go here.)

--
-jeff

? .rsync.c.swp
? atomic.test
Index: options.c
===================================================================
RCS file: /cvsroot/rsync/options.c,v
retrieving revision 1.193
diff -u -3 -p -r1.193 options.c
--- options.c 1 Jan 2005 21:08:14 - 1.193
+++ options.c 5 Jan 2005 18:46:01 -
@@ -98,6 +98,7 @@ int blocking_io = -1;
 int checksum_seed = 0;
 int inplace = 0;
 long block_size = 0; /* long because popt can't set an int32. */
+int atomic = 0;


 /** Network address family. **/
@@ -276,6 +277,7 @@ void usage(enum logcode F)
   rprintf(F,"     --ignore-errors         delete even if there are I/O errors\n");
   rprintf(F,"     --max-delete=NUM        don't delete more than NUM files\n");
   rprintf(F,"     --max-size=SIZE         don't transfer any file larger than SIZE\n");
+  rprintf(F,"     --atomic                ensure atomicity over entire transaction\n");
   rprintf(F,"     --partial               keep partially transferred files\n");
   rprintf(F,"     --partial-dir=DIR       put a partially transferred file into DIR\n");
   rprintf(F,"     --force                 force deletion of directories even if not empty\n");
@@ -388,6 +390,7 @@ static struct poptOption long_options[]
   {"compress",        'z', POPT_ARG_NONE,   &do_compression, 0, 0, 0 },
   {"stats",            0,  POPT_ARG_NONE,   &do_stats, 0, 0, 0 },
   {"progress",         0,  POPT_ARG_NONE,   &do_progress, 0, 0, 0 },
+  {"atomic",           0,  POPT_ARG_NONE,   &atomic, 0, 0, 0 },
   {"partial",          0,  POPT_ARG_NONE,   &keep_partial, 0, 0, 0 },
   {"partial-dir",      0,  POPT_ARG_STRING, &partial_dir, 0, 0, 0 },
   {"ignore-errors",    0,  POPT_ARG_NONE,   &ignore_errors, 0, 0, 0 },
Index: proto.h
===================================================================
RCS file: /cvsroot/rsync/proto.h,v
retrieving revision 1.220
diff -u -3 -p -r1.220 proto.h
--- proto.h 1 Jan 2005 21:08:02 - 1.220
+++ proto.h 5 Jan 2005 18:46:01 -
@@ -198,6 +198,7 @@ int set_perms(char *fname,struct file_st
 void sig_int(void);
 void finish_transfer(char *fname, char *fnametmp, struct file_struct *file,
                      int ok_to_set_time);
+void finish_transaction(struct file_list *flist, char *local_name);
 const char *who_am_i(void);
 void read_sum_head(int f, struct sum_struct *sum);
 void send_files(struct file_list *flist, int f_out, int f_in);
Index: receiver.c
===================================================================
RCS file: /cvsroot/rsync/receiver.c,v
retrieving revision 1.112
diff -u -3 -p -r1.112 receiver.c
--- receiver.c 1 Jan 2005 21:08:17 - 1.112
+++ receiver.c 5 Jan 2005 18:46:02 -
@@ -1,4 +1,4 @@
-/* -*- c-file-style: linux -*-
+/* -*- c-file-style: linux -*-

    Copyright (C) 1996-2000 by Andrew Tridgell
    Copyright (C) Paul Mackerras 1996
@@ -52,6 +52,7 @@ extern int orig_umask;
 extern int keep_partial;
 extern int checksum_seed;
 extern int inplace;
+extern int atomic;

 extern struct exclude_list_struct server_exclude_list;

@@ -547,6 +548,7 @@ int recv_files(int f_in, struct file_lis
             if (partialptr)
                 cleanup_set(fnametmp, partialptr, file, fd1, fd2);
+
         }

         if (!am_server && verbose) /* log the transfer */
@@ -566,6 +568,12 @@ int recv_files(int f_in, struct file_lis
             exit_cleanup(RERR_FILEIO);
         }

+        if (atomic) {
+            if (recv_ok)
+file->flags |= FLAG_RECV_OK;
+            strlcpy(file->tmpname, fnametmp, MAXPATHLEN - 2);
+        }
+
         if (recv_ok || inplace)
             finish_transfer(fname, fnametmp, file, recv_ok);
         else if (keep_partial && partialptr
@@ -618,6 +626,9 @@ int recv_files(int f_in, struct file_lis
     if (delete_after && recurse && !local_name && flist->count > 0)
         delete_files(flist);

+    if (atomic)
+        finish_transaction(flist, local_name);
+
     if (verbose > 2)
         rprintf(FINFO,"recv_files finished\n");

Index: rsync.c
===================================================================
RCS file: /cvsroot/rsync/rsync.c,v
retrieving revision 1.150
diff -u -3 -p -r1.150 rsync.c
--- rsync.c 3 Jan 2005 22:47:55 - 1.150
+++ rsync.c 5 Jan 2005 18:46:02 -
@@ -36,6 +36,7 @@ extern int keep_dirlinks;
 extern int
Re: Preliminary Suggestion For Atomic Transactions
--On Thursday, January 06, 2005 02:59:44 +0800 Jeff Pitman [EMAIL PROTECTED] wrote:

> In the past there's been a need to provide consistency between symbolic
> links or repository metadata during a sync. Currently, rsync renames
> files piecemeal. The attached patch (extremely ugly) attempts to resolve
> this by foregoing the rename step until the end. It adds a new option
> (if we didn't, ls might catch up).
>
> There are several issues to get over. The first big one in running the
> testsuite is that the parent dirs mtime is not getting preserved. I'll
> need some insight into how this might work. Next, hardlinks need some
> work. Finally, there are disk space considerations by leaving all tmp
> files open until the transaction completes. There might be security
> issues by leaving a larger timing window .. who knows, I'm sure many
> will make comments about this stuff.
>
> I give you (drum roll) ... atomic transactions ... (tada). (Faces with
> perplexed looks go here.)

This is _not_ atomic. Please don't call it what it isn't. The only way I can think of to do an actual atomic update involves renaming the top level directory.

--
Carson
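Carson's objection can be made concrete with a small sketch of a deferred rename pass. It is a hypothetical illustration only: struct pending_rename and commit_pending are made-up names, and this is not the finish_transaction() from Jeff's patch nor the loop in Wayne's. The only real call used is the standard rename() from <stdio.h>.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical record: one finished temp file and its final name. */
struct pending_rename {
    const char *tmp_path;
    const char *final_path;
};

/*
 * Move every temp file into place, in list order. Returns the number
 * of renames that succeeded and stops at the first failure so the
 * caller can report it. Each rename() is atomic by itself, but the
 * batch as a whole is not: nothing rolls back the earlier renames.
 */
size_t commit_pending(const struct pending_rename *list, size_t count)
{
    size_t done = 0;

    while (done < count) {
        if (rename(list[done].tmp_path, list[done].final_path) != 0)
            break;
        done++;
    }
    return done;
}
```

If the loop stops partway (or another process looks at the tree between two iterations), some files are already updated and others are not. Deferring the renames shrinks that window compared with renaming each file as it arrives, but it does not remove it, which is why the thread moves toward names like delayed-rename.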
Re: Preliminary Suggestion For Atomic Transactions
--On Wednesday, January 05, 2005 21:51:25 +0100 Dag Wieers [EMAIL PROTECTED] wrote:

> As you picked up correctly from the previous thread, it's not atomic, I
> called it near-atomic. But it's a trade-off between not having to
> hardlink a whole lot of files (in my case 300.000 files for each
> transaction), making it possible to sync with more than one client, not
> having to configure a directory and having it inside rsync (which could
> work both ways and does not require a wrapper).

I have no objection to the option, just to the name - don't call things atomic if they aren't. Call it delayed-rename, or whatever.

--
Carson
Re: Preliminary Suggestion For Atomic Transactions
On Thursday 06 January 2005 07:04, Carson Gaspar wrote:

> I have no objection to the option, just to the name - don't call things
> atomic if they aren't. Call it delayed-rename, or whatever.

How about --rename-after?

--
-jeff
Re: Preliminary Suggestion For Atomic Transactions
On Thu, 6 Jan 2005, Jeff Pitman wrote:

> On Thursday 06 January 2005 07:04, Carson Gaspar wrote:
> > I have no objection to the option, just to the name - don't call
> > things atomic if they aren't. Call it delayed-rename, or whatever.
>
> How about --rename-after?

I'd rather see it called something like --near-atomic or something else abstract (that is explained in the manual), instead of some action that may be mis-interpreted.

But I'm more interested to know if something like this is acceptable to get included or not. Not this patch specifically, but the same functionality.

--   dag wieers,  [EMAIL PROTECTED],  http://dag.wieers.com/   --
[all I want is a warm bed and a kind word and unlimited power]