Re: Preliminary Suggestion For Atomic Transactions

2005-01-10 Thread Wayne Davison
On Thu, Jan 06, 2005 at 02:59:44AM +0800, Jeff Pitman wrote:
> The attached patch (extremely ugly) attempts to resolve this by
> foregoing the rename step until the end.

Thanks for the patch.  It looks like it implements the basics of the
idea, but I think that it will be better to use deterministic temp-file
names like the --partial-dir logic gives us.  I also chose to use an
option name of --delay-renames.  So, I've created my own version of this
post-transfer renaming heuristic, which is available from here:

http://rsync.samba.org/ftp/unpacked/rsync/patches/delay-renames.diff

The patch is relative to the very latest CVS or nightly tar file.

One unresolved issue I thought of is how rsync should deal with an
absolute --partial-dir pathname:  the above implementation would put
all the partial files into that one dir (a relative pathname, like
the default, ensures that the files get sprinkled around).

> The first big [problem] in running the testsuite is that the parent
> dir's mtime is not getting preserved.  [...] Next, hardlinks need some
> work.

These problems stem from a bug in the synchronization between the
receiver and the generator that I just discovered:  the generator does
not wait until all the redo and delete-after work is done before it begins
the final hard-link pass and the final directory-time-tweaking pass.
Because of this, the extra renaming loop added at the end of the
receiver's work didn't get done before the generator got going on the
hard-linking and directory-time tweaking.  I have checked-in a fix for
this, so the above delay-renames.diff works fine with a test-suite that
has been modified to use --delay-renames on all the possible commands.

Here are some of the design items that the patch deals with:

  - Complain if the user combines --inplace with --delay-renames.

  - A client sender needs to pass the new option to a server receiver.

  - We use the --partial-dir logic to make all the temp-file names
    deterministic, using the default name of .~tmp~ if the user did
    not specify the --partial-dir option.  Note that these dirs get
    auto-excluded by the normal --partial-dir handling, so they will not
    be transferred by default, and they will not be deleted by --delete
    or --delete-after if the prior transfer did not complete and some
    temp files got left around (they will also be used to speed up the
    transfer when it is restarted).
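
A rough sketch of the deterministic temp-name idea from the list above,
in C.  The helper names delayed_tmp_name() and rename_all() are
hypothetical and are not taken from the patch; the sketch assumes a
relative partial-dir name such as the default .~tmp~ and a fixed
4096-byte path buffer.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not rsync's actual code): build the deterministic
 * temp-file path "<dir>/<partial_dir>/<basename>" for a destination file,
 * assuming partial_dir is a relative name such as ".~tmp~".
 * Returns 0 on success, -1 if the result would not fit in buf. */
int delayed_tmp_name(const char *fname, const char *partial_dir,
                     char *buf, size_t buflen)
{
    const char *slash = strrchr(fname, '/');
    int n;

    if (slash) {
        /* Keep the destination's own directory, then add partial_dir. */
        n = snprintf(buf, buflen, "%.*s/%s/%s",
                     (int)(slash - fname), fname, partial_dir, slash + 1);
    } else {
        n = snprintf(buf, buflen, "%s/%s", partial_dir, fname);
    }
    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}

/* After the whole transfer: move every finished temp file into place.
 * Returns the number of files that could not be renamed. */
int rename_all(const char *const *fnames, size_t count,
               const char *partial_dir)
{
    char tmp[4096];
    size_t i;
    int failures = 0;

    for (i = 0; i < count; i++) {
        if (delayed_tmp_name(fnames[i], partial_dir, tmp, sizeof tmp) < 0
            || rename(tmp, fnames[i]) < 0)
            failures++;
    }
    return failures;
}
```

Because the temp name can be recomputed from the destination name at any
time, nothing extra has to be remembered per file between the transfer
and the final rename loop.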

The patch also fixes/changes a few things from your patch:

  - The rename loop must happen prior to the delete-after logic in the
    receiver.

  - It is good to avoid twiddling the flist-flags bits in the forked
    receiver because it will cause all the shared file-list memory
    (between the generator and the receiver) to become unshared (on
    systems that use copy-on-write memory when forking).

  - It is good to avoid adding large amounts of memory to the flist
    object (adding MAXPATHLEN chars of extra memory for each file
    transferred would bloat rsync's memory by a huge amount for large
    transfers).
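
To put a number on the memory point, here is a back-of-the-envelope
calculation in C.  The function name extra_flist_bytes() is made up for
illustration, and the 4096-byte buffer size is an assumption (the actual
MAXPATHLEN depends on the platform).

```c
/* Illustrative arithmetic only, not a measurement of rsync itself:
 * extra bytes needed if every file-list entry carried a fixed path
 * buffer of path_buf_len bytes. */
unsigned long long extra_flist_bytes(unsigned long long nfiles,
                                     unsigned long long path_buf_len)
{
    return nfiles * path_buf_len;
}
```

Under those assumptions, extra_flist_bytes(300000, 4096) comes to
1,228,800,000 bytes, i.e. roughly 1.2 GB of extra memory for a
300,000-file transfer.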

I've attached a 3-hunk patch that kluges in the use of the new
--delay-renames option into all the (relevant) test-suite rsync
commands (for temporary testing).  Apply it after the main patch,
if desired.

..wayne..
--- options.c   2005-01-10 10:16:54 +
+++ options.c   2005-01-10 10:30:07 +
@@ -456,6 +456,7 @@
   {"protocol", 0,  POPT_ARG_INT,    &protocol_version, 0, 0, 0 },
   {"server",   0,  POPT_ARG_NONE,   &am_server, 0, 0, 0 },
   {"help",     'h', POPT_ARG_NONE,   0, 'h', 0, 0 },
+  {"delay-renames",0,  POPT_ARG_NONE,   0, 0, 0, 0 },
   {0,0,0,0, 0, 0, 0}
 };
 
@@ -947,6 +948,8 @@
 			bwlimit_writemax = 512;
 	}
 
+	if (inplace)
+		delay_renames = 0;
 	if (delay_renames && !partial_dir)
 		partial_dir = ".~tmp~";
 
--- runtests.sh 2004-02-04 05:34:29 +
+++ runtests.sh 2005-01-10 10:27:33 +
@@ -168,7 +168,7 @@
 exit 2
 fi
 
-RSYNC="$rsync_bin"
+RSYNC="$rsync_bin --delay-renames"
 
 export rsync_bin RSYNC
 
-- 
To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync
Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html

Preliminary Suggestion For Atomic Transactions

2005-01-05 Thread Jeff Pitman
In the past there's been a need to provide consistency between symbolic 
links or repository metadata during a sync.  Currently, rsync renames 
files piecemeal.  The attached patch (extremely ugly) attempts to 
resolve this by foregoing the rename step until the end.  It adds a new 
option (if we didn't, ls might catch up).

There are several issues to get over.  The first big one in running the
testsuite is that the parent dir's mtime is not getting preserved.  I'll
need some insight into how this might work.  Next, hardlinks need some
work.  Finally, there are disk space considerations by leaving all tmp
files open until the transaction completes.  There might be security
issues by leaving a larger timing window .. who knows, I'm sure many
will make comments about this stuff.

I give you (drum roll) ... atomic transactions ... (tada).  (Faces
with perplexed looks go here.)

-- 
-jeff
? .rsync.c.swp
? atomic.test
Index: options.c
===
RCS file: /cvsroot/rsync/options.c,v
retrieving revision 1.193
diff -u -3 -p -r1.193 options.c
--- options.c	1 Jan 2005 21:08:14 -	1.193
+++ options.c	5 Jan 2005 18:46:01 -
@@ -98,6 +98,7 @@ int blocking_io = -1;
 int checksum_seed = 0;
 int inplace = 0;
 long block_size = 0; /* long because popt can't set an int32. */
+int atomic = 0;
 
 
 /** Network address family. **/
@@ -276,6 +277,7 @@ void usage(enum logcode F)
   rprintf(F,"     --ignore-errors         delete even if there are I/O errors\n");
   rprintf(F,"     --max-delete=NUM        don't delete more than NUM files\n");
   rprintf(F,"     --max-size=SIZE         don't transfer any file larger than SIZE\n");
+  rprintf(F,"     --atomic                ensure atomicity over entire transaction\n");
   rprintf(F,"     --partial               keep partially transferred files\n");
   rprintf(F,"     --partial-dir=DIR       put a partially transferred file into DIR\n");
   rprintf(F,"     --force                 force deletion of directories even if not empty\n");
@@ -388,6 +390,7 @@ static struct poptOption long_options[] 
   {"compress",        'z', POPT_ARG_NONE,   &do_compression, 0, 0, 0 },
   {"stats",            0,  POPT_ARG_NONE,   &do_stats, 0, 0, 0 },
   {"progress",         0,  POPT_ARG_NONE,   &do_progress, 0, 0, 0 },
+  {"atomic",           0,  POPT_ARG_NONE,   &atomic, 0, 0, 0 },
   {"partial",          0,  POPT_ARG_NONE,   &keep_partial, 0, 0, 0 },
   {"partial-dir",      0,  POPT_ARG_STRING, &partial_dir, 0, 0, 0 },
   {"ignore-errors",    0,  POPT_ARG_NONE,   &ignore_errors, 0, 0, 0 },
Index: proto.h
===
RCS file: /cvsroot/rsync/proto.h,v
retrieving revision 1.220
diff -u -3 -p -r1.220 proto.h
--- proto.h	1 Jan 2005 21:08:02 -	1.220
+++ proto.h	5 Jan 2005 18:46:01 -
@@ -198,6 +198,7 @@ int set_perms(char *fname,struct file_st
 void sig_int(void);
 void finish_transfer(char *fname, char *fnametmp, struct file_struct *file,
 		 int ok_to_set_time);
+void finish_transaction(struct file_list *flist, char *local_name);
 const char *who_am_i(void);
 void read_sum_head(int f, struct sum_struct *sum);
 void send_files(struct file_list *flist, int f_out, int f_in);
Index: receiver.c
===
RCS file: /cvsroot/rsync/receiver.c,v
retrieving revision 1.112
diff -u -3 -p -r1.112 receiver.c
--- receiver.c	1 Jan 2005 21:08:17 -	1.112
+++ receiver.c	5 Jan 2005 18:46:02 -
@@ -1,4 +1,4 @@
-/* -*- c-file-style: "linux" -*-
+/* -*- c-file-style: "linux" -*- 
 
    Copyright (C) 1996-2000 by Andrew Tridgell
    Copyright (C) Paul Mackerras 1996
@@ -52,6 +52,7 @@ extern int orig_umask;
 extern int keep_partial;
 extern int checksum_seed;
 extern int inplace;
+extern int atomic;
 
 extern struct exclude_list_struct server_exclude_list;
 
@@ -547,6 +548,7 @@ int recv_files(int f_in, struct file_lis
 
 			if (partialptr)
 				cleanup_set(fnametmp, partialptr, file, fd1, fd2);
+
 		}
 
 		if (!am_server && verbose) /* log the transfer */
@@ -566,6 +568,12 @@ int recv_files(int f_in, struct file_lis
 			exit_cleanup(RERR_FILEIO);
 		}
 
+		if (atomic) {
+			if (recv_ok) 
+				file->flags |= FLAG_RECV_OK;
+			strlcpy(file->tmpname, fnametmp, MAXPATHLEN - 2);
+		}
+
 		if (recv_ok || inplace)
 			finish_transfer(fname, fnametmp, file, recv_ok);
 		else if (keep_partial && partialptr
@@ -618,6 +626,9 @@ int recv_files(int f_in, struct file_lis
 	if (delete_after && recurse && !local_name && flist->count > 0)
 		delete_files(flist);
 
+	if (atomic)
+		finish_transaction(flist, local_name);
+
 	if (verbose > 2)
 		rprintf(FINFO,"recv_files finished\n");
 
Index: rsync.c
===
RCS file: /cvsroot/rsync/rsync.c,v
retrieving revision 1.150
diff -u -3 -p -r1.150 rsync.c
--- rsync.c	3 Jan 2005 22:47:55 -	1.150
+++ rsync.c	5 Jan 2005 18:46:02 -
@@ -36,6 +36,7 @@ extern int keep_dirlinks;
 extern int 

Re: Preliminary Suggestion For Atomic Transactions

2005-01-05 Thread Carson Gaspar
--On Thursday, January 06, 2005 02:59:44 +0800 Jeff Pitman 
[EMAIL PROTECTED] wrote:

> In the past there's been a need to provide consistency between symbolic
> links or repository metadata during a sync.  Currently, rsync renames
> files piecemeal.  The attached patch (extremely ugly) attempts to
> resolve this by foregoing the rename step until the end.  It adds a new
> option (if we didn't, ls might catch up).
>
> There are several issues to get over.  The first big one in running the
> testsuite is that the parent dir's mtime is not getting preserved.  I'll
> need some insight into how this might work.  Next, hardlinks need some
> work.  Finally, there are disk space considerations by leaving all tmp
> files open until the transaction completes.  There might be security
> issues by leaving a larger timing window .. who knows, I'm sure many
> will make comments about this stuff.
>
> I give you (drum roll) ... atomic transactions ... (tada).  (Faces
> with perplexed looks go here.)

This is _not_ atomic. Please don't call it what it isn't. The only way I 
can think of to do an actual atomic update involves renaming the top level 
directory.
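
One common variant of that idea, sketched in C under stated assumptions:
the published tree is reached through a symlink, and the symlink is
replaced with a single rename(), which POSIX specifies as atomic.  The
function name switch_tree() and the ".new" temp suffix are hypothetical;
nothing like this is part of the patch under discussion.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: atomically repoint the symlink `link_path` at
 * `new_target` by creating a temporary symlink next to it and then
 * rename()ing the temporary link over the old one.
 * Returns 0 on success, -1 on failure. */
int switch_tree(const char *link_path, const char *new_target)
{
    char tmp[4096];
    int n = snprintf(tmp, sizeof tmp, "%s.new", link_path);

    if (n < 0 || (size_t)n >= sizeof tmp)
        return -1;
    unlink(tmp);                       /* discard a stale temp link, if any */
    if (symlink(new_target, tmp) < 0)
        return -1;
    if (rename(tmp, link_path) < 0) {  /* the single atomic step */
        unlink(tmp);
        return -1;
    }
    return 0;
}
```

Readers that open paths through the symlink see either the old tree or
the new one, never a mixture, which is the property a per-file rename
loop cannot give.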

--
Carson


Re: Preliminary Suggestion For Atomic Transactions

2005-01-05 Thread Carson Gaspar

--On Wednesday, January 05, 2005 21:51:25 +0100 Dag Wieers [EMAIL PROTECTED] 
wrote:

> As you picked up correctly from the previous thread, it's not atomic, I
> called it near-atomic.
> But it's a trade-off between not having to hardlink a whole lot of files
> (in my case 300.000 files for each transaction), making it possible to
> sync with more than one client, not having to configure a directory and
> having it inside rsync (which could work both ways and does not require a
> wrapper).

I have no objection to the option, just to the name - don't call things 
atomic if they aren't. Call it delayed-rename, or whatever.

--
Carson


Re: Preliminary Suggestion For Atomic Transactions

2005-01-05 Thread Jeff Pitman
On Thursday 06 January 2005 07:04, Carson Gaspar wrote:
> I have no objection to the option, just to the name - don't call
> things atomic if they aren't. Call it delayed-rename, or whatever.

How about --rename-after?

-- 
-jeff


Re: Preliminary Suggestion For Atomic Transactions

2005-01-05 Thread Dag Wieers
On Thu, 6 Jan 2005, Jeff Pitman wrote:

> On Thursday 06 January 2005 07:04, Carson Gaspar wrote:
> > I have no objection to the option, just to the name - don't call
> > things atomic if they aren't. Call it delayed-rename, or whatever.
>
> How about --rename-after?

I'd rather see it called something like --near-atomic or something else 
abstract (that is explained in the manual), instead of some action that 
may be mis-interpreted.

But I'm more interested to know if something like this is acceptable to 
get included or not. Not this patch specifically, but the same 
functionality.

--   dag wieers,  [EMAIL PROTECTED],  http://dag.wieers.com/   --
[all I want is a warm bed and a kind word and unlimited power]