Re: --detect-renamed question
Matt,

That was too quick! I think --trust-move is a really good thing and I'll test it out soon.

Some thoughts on all this now that I've had my caffeine this morning. It would take one very crafty user to delete a file and create one with the same name, mtime, and size. The only issue I can see is if there were 2 files with the same name, mtime, and size but different data. Highly unlikely but still possible, right? Let's call these files fileA and fileB. If fileA is deleted and fileB is copied to another directory, what happens? Rsync would hard link fileA as the new fileB when using --trust-move. We end up with fileA and fileB on the destination but fileB and fileB on the source.

So, how do we fix this situation? Is there a way to check for duplicate entries? If rsync finds that the file it's about to hard link is non-unique (same name, mtime, and size as another file), then it should copy from the source fileB instead of hard linking from the deleted fileA. Does this make sense? It would require rsync to complete a scan of the source prior to doing anything.

This should help those situations where someone does an upper level directory move with lots of files and data underneath. I recall someone else was asking about this on the list.

Greg

On Oct 12, 2007, at 6:43 PM, Matt McCutchen wrote:
> On 10/12/07, Greg Siekas [EMAIL PROTECTED] wrote:
> > The other option I thought of was to only do the move when the mtime, size, and filename match. Not really a 'detect-renamed' but a 'detected-moved' type operation.
>
> That's a good idea, and easy to implement too! I have improved the patch (attached) to provide separate --trust-rename and --trust-move options. Wayne, please consider adding this to patches/ .
>
> Matt
>
> trust-rename.diff

--
To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync
Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
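The duplicate check Greg describes could be sketched roughly as follows. This is only an illustration under stated assumptions, not code from rsync or from the patch: `struct entry`, `key_cmp`, and `mark_ambiguous` are hypothetical names, and the entry type is a simplified stand-in for a file-list record. The idea is to sort the scanned entries by (basename, mtime, size) and flag every entry whose key occurs more than once, so that a flagged file would be copied from the source rather than adopted via a trusted move.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified file-list entry; NOT rsync's file_struct. */
struct entry {
    const char *basename;
    long mtime;
    long long size;
    int ambiguous;   /* set to 1 when another entry shares the same key */
};

/* Order entries by the (basename, mtime, size) key. */
static int key_cmp(const struct entry *a, const struct entry *b)
{
    int c = strcmp(a->basename, b->basename);
    if (c != 0)
        return c;
    if (a->mtime != b->mtime)
        return a->mtime < b->mtime ? -1 : 1;
    if (a->size != b->size)
        return a->size < b->size ? -1 : 1;
    return 0;
}

/* qsort() adapter: the array being sorted holds pointers to entries. */
static int ptr_cmp(const void *pa, const void *pb)
{
    const struct entry *a = *(const struct entry *const *)pa;
    const struct entry *b = *(const struct entry *const *)pb;
    return key_cmp(a, b);
}

/*
 * Flag every entry whose key is shared with at least one other entry.
 * Returns the number of flagged entries, or -1 if allocation fails.
 */
static int mark_ambiguous(struct entry *list, size_t n)
{
    struct entry **sorted;
    size_t i;
    int count = 0;

    if (n == 0)
        return 0;
    sorted = malloc(n * sizeof *sorted);
    if (!sorted)
        return -1;
    for (i = 0; i < n; i++) {
        list[i].ambiguous = 0;
        sorted[i] = &list[i];
    }
    qsort(sorted, n, sizeof *sorted, ptr_cmp);

    /* After sorting, equal keys are adjacent, so compare neighbours. */
    for (i = 1; i < n; i++) {
        if (key_cmp(sorted[i - 1], sorted[i]) == 0) {
            sorted[i - 1]->ambiguous = 1;
            sorted[i]->ambiguous = 1;
        }
    }
    for (i = 0; i < n; i++)
        count += list[i].ambiguous;
    free(sorted);
    return count;
}
```

A caller would run `mark_ambiguous()` once over the complete list before any move is trusted, which is why the approach needs the full scan mentioned above; it would therefore not fit well with an incrementally built file list.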
Re: --detect-renamed question
On 10/11/07, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:
> I've started testing the detect-renamed patch with 2.6.9 and soon 3.0.0pre1. I have a unique situation where I'm rsync'ing to an HSM-based filesystem. I've found that the detect-renamed patch works but it appears to do a copy of the file to the new destination. This is particularly slow since the file in the HSM-based filesystem may only be a stub and all the data is only resident on tape. The copy waits for the data to be recalled from tape, which, depending on the file size, can take a long time. I've looked through the patch code and am wondering if there is an easy way to have rsync do a move from the ~.tmp. directory.

This is easy to do, and I have implemented a --trust-detect-renamed option to do it in the attached patch to the current CVS rsync. However, it is risky because a false rename detection could cause rsync to substitute an unrelated but similar-looking destination file for a new source file. Don't use the option unless you are prepared for the consequences.

Matt

In combination with detect-renamed.diff, this patch adds an option --trust-detect-renamed that adopts an apparently unrenamed destination file without verifying that its data matches that of the source file. This is risky but it is what Greg [EMAIL PROTECTED] wanted:

http://lists.samba.org/archive/rsync/2007-October/018827.html

This patch is EXPERIMENTAL, though it did work correctly in my single test.

-- Matt McCutchen [EMAIL PROTECTED]

--- old/generator.c
+++ new/generator.c
@@ -80,6 +80,7 @@ extern int compare_dest;
 extern int copy_dest;
 extern int link_dest;
 extern int detect_renamed;
+extern int trust_detect_renamed;
 extern int whole_file;
 extern int list_only;
 extern int read_batch;
@@ -1828,6 +1829,22 @@ static void recv_generator(char *fname,
             fnamecmp = partialptr;
             fnamecmp_type = FNAMECMP_PARTIAL_DIR;
             statret = 0;
+            if (detect_renamed && trust_detect_renamed
+             && unchanged_file(fnamecmp, file, &sx.st)) {
+                /* Adopt the partial file. */
+                finish_transfer(fname, fnamecmp, NULL, NULL, file, 1, 1);
+                handle_partial_dir(partialptr, PDIR_DELETE);
+                if (itemizing)
+                    itemize(fnamecmp, file, ndx, -1, &sx,
+                            ITEM_LOCAL_CHANGE, fnamecmp_type, NULL);
+#ifdef SUPPORT_HARD_LINKS
+                if (preserve_hard_links && F_IS_HLINKED(file))
+                    finish_hard_link(file, fname, ndx, &sx.st, itemizing, code, -1);
+#endif
+                if (remove_source_files == 1)
+                    goto return_with_success;
+                goto cleanup;
+            }
         }
 
     if (!do_xfers)
--- old/options.c
+++ new/options.c
@@ -81,6 +81,7 @@ int am_starting_up = 1;
 int relative_paths = -1;
 int implied_dirs = 1;
 int detect_renamed = 0;
+int trust_detect_renamed = 0;
 int numeric_ids = 0;
 int allow_8bit_chars = 0;
 int force_delete = 0;
@@ -385,6 +386,7 @@ void usage(enum logcode F)
   rprintf(F," -T, --temp-dir=DIR          create temporary files in directory DIR\n");
   rprintf(F," -y, --fuzzy                 find similar file for basis if no dest file\n");
   rprintf(F,"     --detect-renamed        try to find renamed files to speed up the transfer\n");
+  rprintf(F,"     --trust-detect-renamed  ... and assume identical to source files (risky!)\n");
   rprintf(F,"     --compare-dest=DIR      also compare destination files relative to DIR\n");
   rprintf(F,"     --copy-dest=DIR         ... and include copies of unchanged files\n");
   rprintf(F,"     --link-dest=DIR         hardlink to files in DIR when unchanged\n");
@@ -564,6 +566,7 @@ static struct poptOption long_options[] 
   {"copy-dest",        0,  POPT_ARG_STRING, 0, OPT_COPY_DEST, 0, 0 },
   {"link-dest",        0,  POPT_ARG_STRING, 0, OPT_LINK_DEST, 0, 0 },
   {"detect-renamed",   0,  POPT_ARG_NONE,   &detect_renamed, 0, 0, 0 },
+  {"trust-detect-renamed",0,POPT_ARG_NONE,  &trust_detect_renamed, 0, 0, 0 },
   {"fuzzy",           'y', POPT_ARG_NONE,   &fuzzy_basis, 0, 0, 0 },
   {"compress",        'z', POPT_ARG_NONE,   0, 'z', 0, 0 },
   {"no-compress",      0,  POPT_ARG_VAL,    &do_compression, 0, 0, 0 },
@@ -1895,8 +1898,12 @@ void server_options(char **args, int *ar
         }
     }
     /* Both sides need to know in case this disables incremental recursion. */
-    if (detect_renamed)
+    if (detect_renamed) {
         args[ac++] = "--detect-renamed";
+        /* But the addition of --trust-detect-renamed is only the receiver's business. */
+        if (am_sender && trust_detect_renamed)
+            args[ac++] = "--trust-detect-renamed";
+    }
 
     if (modify_window_set) {
         if (asprintf(&arg, "--modify-window=%d", modify_window) < 0)
--- old/rsync.yo
+++ new/rsync.yo
@@ -385,6 +385,7 @@ to the detailed description below for a 
  -T, --temp-dir=DIR          create temporary files in directory DIR
  -y, --fuzzy                 find similar file for basis if no dest file
      --detect-renamed        try to find renamed files to speed the xfer
+     --trust-detect-renamed  .. assume identical to src files (risky!)
      --compare-dest=DIR      also compare received files relative to DIR
      --copy-dest=DIR ...
Re: --detect-renamed question
On 10/12/07, Greg Siekas [EMAIL PROTECTED] wrote:
> The other option I thought of was to only do the move when the mtime, size, and filename match. Not really a 'detect-renamed' but a 'detected-moved' type operation.

That's a good idea, and easy to implement too! I have improved the patch (attached) to provide separate --trust-rename and --trust-move options. Wayne, please consider adding this to patches/ .

Matt

In combination with detect-renamed.diff, this patch adds an option --trust-rename that adopts the pre-rename destination file found for a new source file without verifying that the data is actually the same. It also adds a variant --trust-move that requires that the basenames match. These options are somewhat risky but were what Greg Siekas wanted:

http://lists.samba.org/archive/rsync/2007-October/018827.html

This patch is EXPERIMENTAL, though it did work correctly in my light testing.

FIXME: If a run with --trust-rename stages a different-basename destination file and then gets interrupted, a subsequent run with --trust-move trusts the staged file.

-- Matt McCutchen [EMAIL PROTECTED]

--- old/generator.c
+++ new/generator.c
@@ -80,6 +80,7 @@ extern int compare_dest;
 extern int copy_dest;
 extern int link_dest;
 extern int detect_renamed;
+extern int trust_rename;
 extern int whole_file;
 extern int list_only;
 extern int read_batch;
@@ -212,7 +213,9 @@ static int fattr_find(struct file_struct
             high = mid - 1;
     }
 
-    return good_match >= 0 ? good_match : ok_match;
+    return good_match >= 0 ? good_match :
+        /* --trust-move doesn't allow non-basename matches */
+        (trust_rename == 1) ? -1 : ok_match;
 }
 
 static void look_for_rename(struct file_struct *file, char *fname)
@@ -1826,6 +1829,22 @@ static void recv_generator(char *fname,
             fnamecmp = partialptr;
             fnamecmp_type = FNAMECMP_PARTIAL_DIR;
             statret = 0;
+            if (detect_renamed && trust_rename
+             && unchanged_file(fnamecmp, file, &sx.st)) {
+                /* Adopt the partial file. */
+                finish_transfer(fname, fnamecmp, NULL, NULL, file, 1, 1);
+                handle_partial_dir(partialptr, PDIR_DELETE);
+                if (itemizing)
+                    itemize(fnamecmp, file, ndx, -1, &sx,
+                            ITEM_LOCAL_CHANGE, fnamecmp_type, NULL);
+#ifdef SUPPORT_HARD_LINKS
+                if (preserve_hard_links && F_IS_HLINKED(file))
+                    finish_hard_link(file, fname, ndx, &sx.st, itemizing, code, -1);
+#endif
+                if (remove_source_files == 1)
+                    goto return_with_success;
+                goto cleanup;
+            }
         }
 
     if (!do_xfers)
--- old/options.c
+++ new/options.c
@@ -81,6 +81,7 @@ int am_starting_up = 1;
 int relative_paths = -1;
 int implied_dirs = 1;
 int detect_renamed = 0;
+int trust_rename = 0;
 int numeric_ids = 0;
 int allow_8bit_chars = 0;
 int force_delete = 0;
@@ -385,6 +386,8 @@ void usage(enum logcode F)
   rprintf(F," -T, --temp-dir=DIR          create temporary files in directory DIR\n");
   rprintf(F," -y, --fuzzy                 find similar file for basis if no dest file\n");
   rprintf(F,"     --detect-renamed        try to find renamed files to speed up the transfer\n");
+  rprintf(F,"     --trust-rename          ... and assume identical to source files (risky!)\n");
+  rprintf(F,"     --trust-move            ... only if basenames match (less risky)\n");
   rprintf(F,"     --compare-dest=DIR      also compare destination files relative to DIR\n");
   rprintf(F,"     --copy-dest=DIR         ... and include copies of unchanged files\n");
   rprintf(F,"     --link-dest=DIR         hardlink to files in DIR when unchanged\n");
@@ -564,6 +567,8 @@ static struct poptOption long_options[] 
   {"copy-dest",        0,  POPT_ARG_STRING, 0, OPT_COPY_DEST, 0, 0 },
   {"link-dest",        0,  POPT_ARG_STRING, 0, OPT_LINK_DEST, 0, 0 },
   {"detect-renamed",   0,  POPT_ARG_NONE,   &detect_renamed, 0, 0, 0 },
+  {"trust-rename",     0,  POPT_ARG_VAL,    &trust_rename, 2, 0, 0 },
+  {"trust-move",       0,  POPT_ARG_VAL,    &trust_rename, 1, 0, 0 },
   {"fuzzy",           'y', POPT_ARG_NONE,   &fuzzy_basis, 0, 0, 0 },
   {"compress",        'z', POPT_ARG_NONE,   0, 'z', 0, 0 },
   {"no-compress",      0,  POPT_ARG_VAL,    &do_compression, 0, 0, 0 },
@@ -1895,8 +1900,13 @@ void server_options(char **args, int *ar
         }
     }
     /* Both sides need to know in case this disables incremental recursion. */
-    if (detect_renamed)
+    if (detect_renamed) {
         args[ac++] = "--detect-renamed";
+        /* But the addition of --trust-* is only the receiver's business. */
+        if (am_sender && trust_rename)
+            args[ac++] = (trust_rename == 2) ?
+                "--trust-rename" : "--trust-move";
+    }
 
     if (modify_window_set) {
         if (asprintf(&arg, "--modify-window=%d", modify_window) < 0)
--- old/rsync.yo
+++ new/rsync.yo
@@ -385,6 +385,8 @@ to the detailed description below for a 
  -T, --temp-dir=DIR          create temporary files in directory DIR
  -y, --fuzzy                 find similar file for basis if no dest file
      --detect-renamed        try to find renamed files to speed the xfer
+     --trust-rename          ... assume identical to src files (risky!)
+     --trust-move
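To make the difference between the two options concrete, here is a small standalone sketch of the selection rule that the fattr_find() change appears to implement. It is an illustration only: `struct candidate` and `choose_candidate` are hypothetical names, a linear scan stands in for the patch's binary search, and only the meaning of the `trust_rename` values (0 = plain --detect-renamed, 1 = --trust-move, 2 = --trust-rename) is taken from the patch. A candidate with matching size and mtime and the same basename counts as a good match; one with a different basename is only an ok match, which --trust-move refuses.

```c
#include <string.h>

/* Hypothetical, simplified view of a deleted destination file. */
struct candidate {
    const char *basename;
    long mtime;
    long long size;
};

/*
 * Return the index of the candidate to use for `want`, or -1 if none.
 * trust_rename: 0 = --detect-renamed only, 1 = --trust-move,
 * 2 = --trust-rename (values as set in the patch's option table).
 */
static int choose_candidate(const struct candidate *c, int n,
                            const struct candidate *want, int trust_rename)
{
    int good_match = -1, ok_match = -1;
    int i;

    for (i = 0; i < n; i++) {
        if (c[i].size != want->size || c[i].mtime != want->mtime)
            continue;                      /* not a candidate at all */
        if (strcmp(c[i].basename, want->basename) == 0) {
            if (good_match < 0)
                good_match = i;            /* same basename: a move */
        } else if (ok_match < 0) {
            ok_match = i;                  /* different basename: a rename */
        }
    }

    if (good_match >= 0)
        return good_match;
    /* --trust-move doesn't allow non-basename matches */
    return (trust_rename == 1) ? -1 : ok_match;
}
```

With `trust_rename == 0` the ok match is still returned, which fits the untrusted case where the file found is only staged for the normal transfer and its data is checked against the source.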
--detect-renamed question
I've started testing the detect-renamed patch with 2.6.9 and soon 3.0.0pre1. I have a unique situation where I'm rsync'ing to an HSM-based filesystem. I've found that the detect-renamed patch works but it appears to do a copy of the file to the new destination. This is particularly slow since the file in the HSM-based filesystem may only be a stub and all the data is only resident on tape. The copy waits for the data to be recalled from tape, which, depending on the file size, can take a long time. I've looked through the patch code and am wondering if there is an easy way to have rsync do a move from the ~.tmp. directory.

thanks,
Greg