Re: To inc_recurse or not to inc_recurse? [Re: 3.0.0pre2: bookend breakage (2 different errors)]
On Mon, 15 Oct 2007 23:15:34 -0400 Matt McCutchen [EMAIL PROTECTED] wrote: On 10/15/07, Erik Jan Tromp [EMAIL PROTECTED] wrote: # The second error Invalid file index: -101 (-1 - 0) with iflags 0 [receiver] rsync error: protocol incompatibility (code 2) at rsync.c(273) [receiver=3.0.0pre2] rsync: connection unexpectedly closed (21 bytes received so far) [generator] rsync error: error in rsync protocol data stream (code 12) at io.c(596) [generator=3.0.0pre2] # Sample commands, obfuscated to protect the guilty cp -alf $BACKUPPATH/fluorine/sunday/. $BACKUPPATH/fluorine/monday/ rsync [...] --delete-after [...] rsync://[EMAIL PROTECTED]:$BACKUPPORT/$BACKUPMODULE/ \ $BACKUPPATH/fluorine/monday/ [...] For the record, I no longer get this error after updating from cvs (Wed Oct 17 10:28:30 UTC 2007). Erik -- Failure is not an option. (It comes bundled with Windows.) If at first you don't succeed, redefine success. -- To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
Re: To inc_recurse or not to inc_recurse? [Re: 3.0.0pre2: bookend breakage (2 different errors)]
On Tue, Oct 16, 2007 at 12:30:34AM -0400, Matt McCutchen wrote: In this situation, automatically disabling incremental recursion would be better than exiting with an error that really isn't the user's fault. Yeah. I hadn't wanted to add an extra round-trip delay that would be required by a direct exchange of bytes. After some more contemplation I've come up with a good way to convey the inc_recurse decision back and forth between the two sides without slowing things down. And the latest code does let the affected side make the determination of which options are not support in incremental recursive mode. ..wayne.. -- To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
Re: To inc_recurse or not to inc_recurse? [Re: 3.0.0pre2: bookend breakage (2 different errors)]
On Mon, Oct 15, 2007 at 11:15:34PM -0400, Matt McCutchen wrote: However, a more discreet and much more robust solution would be for the client and server to negotiate whether to incrementally recurse just after negotiating the protocol version. Yeah, I agree that it is better for the client to explicitly tell the server what is going on (and allows a batch file to indicate what is happening too). That way there is no confusion. It also allows for future expansion in certain situations -- e.g. I can imagine making a future version of --prune-empty-dirs and/or --delay-updates compatible with inc-recursion, and this will allow a more modern rsync to try to tell a remote receiver to use that option and stay in inc_recurse mode, and the receiver can say no, I don't support that. The CVS version (and latest nightly) have this fixed. ..wayne.. -- To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
Re: To inc_recurse or not to inc_recurse? [Re: 3.0.0pre2: bookend breakage (2 different errors)]
On 10/16/07, Wayne Davison [EMAIL PROTECTED] wrote: It also allows for future expansion in certain situations -- e.g. I can imagine making a future version of --prune-empty-dirs and/or --delay-updates compatible with inc-recursion, and this will allow a more modern rsync to try to tell a remote receiver to use that option and stay in inc_recurse mode, and the receiver can say no, I don't support that. In this situation, automatically disabling incremental recursion would be better than exiting with an error that really isn't the user's fault. (That's why I said negotiate.) Matt -- To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
Re: To inc_recurse or not to inc_recurse? [Re: 3.0.0pre2: bookend breakage (2 different errors)]
On 10/16/07, Wayne Davison [EMAIL PROTECTED] wrote: Yeah, I agree that it is better for the client to explicitly tell the server what is going on (and allows a batch file to indicate what is happening too). Now you could revert the unconditional sending of --detect-renamed in detect-renamed.diff; the new detect-renamed.diff is attached. Matt This patch adds the --detect-renamed option which makes rsync notice files that either (1) match in size modify-time (plus the basename, if possible) or (2) match in size checksum (when --checksum was also specified) and use each match as an alternate basis file to speed up the transfer. The algorithm attempts to scan the receiving-side's files in an efficient manner. If --delete[-before] is enabled, we'll take advantage of the pre-transfer delete pass to prepare any alternate-basis-file matches we might find. If --delete-before is not enabled, rsync does the rename scan during the regular file-sending scan (scanning each directory right before the generator starts updating files from that dir). In this latter mode, rsync might delay the updating of a file (if no alternate-basis match was yet found) until the full scan of the receiving side is complete, at which point any delayed files are processed. I chose to hard-link the alternate-basis files into a .~tmp~ subdir that takes advantage of rsync's pre-existing partial-dir logic. This uses less memory than trying to keep track of the matches internally, and also allows any deletions or file-updates to occur normally without interfering with these alternate-basis discoveries. To use this patch, run these commands for a successful build: patch -p1 patches/detect-renamed.diff ./configure (optional if already run) make TODO: We need to never return a match from fattr_find() that has a basis file. This will ensure that we don't try to give a renamed file to a file that can't use it, while missing out on giving it to a file that could use it. --- old/compat.c +++ new/compat.c @@ -41,6 +41,7 @@ extern int checksum_seed; extern int basis_dir_cnt; extern int prune_empty_dirs; extern int protocol_version; +extern int detect_renamed; extern int protect_args; extern int preserve_uid; extern int preserve_gid; @@ -218,7 +219,7 @@ void setup_protocol(int f_out,int f_in) } else if (protocol_version = 30) { if (recurse allow_inc_recurse !delete_before !delete_after !delay_updates - !use_qsort !prune_empty_dirs) + !use_qsort !prune_empty_dirs !detect_renamed) inc_recurse = 1; if (am_server || read_batch) { int i_r = read_byte(f_in); --- old/flist.c +++ new/flist.c @@ -61,6 +61,7 @@ extern int non_perishable_cnt; extern int prune_empty_dirs; extern int copy_links; extern int copy_unsafe_links; +extern int detect_renamed; extern int protocol_version; extern int sanitize_paths; extern struct stats stats; @@ -113,6 +114,8 @@ static int64 tmp_dev, tmp_ino; #endif static char tmp_sum[MAX_DIGEST_LEN]; +struct file_list the_fattr_list; + static char empty_sum[MAX_DIGEST_LEN]; static int flist_count_offset; /* for --delete --progress */ static int dir_count = 0; @@ -252,6 +255,45 @@ static int is_excluded(char *fname, int return 0; } +static int fattr_compare(struct file_struct **file1, struct file_struct **file2) +{ + struct file_struct *f1 = *file1; + struct file_struct *f2 = *file2; + int64 len1 = F_LENGTH(f1), len2 = F_LENGTH(f2); + int diff; + + if (!f1-basename || !S_ISREG(f1-mode) || !len1) { + if (!f2-basename || !S_ISREG(f2-mode) || !len2) + return 0; + return 1; + } + if (!f2-basename || !S_ISREG(f2-mode) || !len2) + return -1; + + /* Don't use diff for values that are longer than an int. */ + if (len1 != len2) + return len1 len2 ? -1 : 1; + + if (always_checksum) { + diff = u_memcmp(F_SUM(f1), F_SUM(f2), checksum_len); + if (diff) + return diff; + } else if (f1-modtime != f2-modtime) + return f1-modtime f2-modtime ? -1 : 1; + + diff = u_strcmp(f1-basename, f2-basename); + if (diff) + return diff; + + if (f1-dirname == f2-dirname) + return 0; + if (!f1-dirname) + return -1; + if (!f2-dirname) + return 1; + return u_strcmp(f1-dirname, f2-dirname); +} + static void send_directory(int f, struct file_list *flist, char *fbuf, int len, int flags); @@ -2154,6 +2196,25 @@ struct file_list *recv_file_list(int f) clean_flist(flist, relative_paths); + if (detect_renamed) { + int j = flist-used; + the_fattr_list.used = j; + the_fattr_list.files = new_array(struct file_struct *, j); + if (!the_fattr_list.files) + out_of_memory(recv_file_list); + memcpy(the_fattr_list.files, flist-files, + j * sizeof (struct file_struct *)); + qsort(the_fattr_list.files, j, + sizeof the_fattr_list.files[0], (int (*)())fattr_compare); + the_fattr_list.low = 0; + while (j-- 0) { + struct file_struct *fp = the_fattr_list.files[j]; + if (fp-basename S_ISREG(fp-mode)
Re: To inc_recurse or not to inc_recurse? [Re: 3.0.0pre2: bookend breakage (2 different errors)]
On 10/16/07, Matt McCutchen [EMAIL PROTECTED] wrote: (That's why I said negotiate.) One more thing I want to point out in case you haven't already thought of it. Once two-way negotiation is in place, each side should refuse incremental recursion only if it can't fulfill its *own* duties under incremental recursion; the sides should not second-guess each other. This way, e.g., incremental recursion can be used on a push with --delay-updates if the server supports the combination, regardless of whether the client would if it were the receiver. Matt -- To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html