Re: Behavior of git rm
On Wed, Apr 03, 2013 at 10:35:52AM -0700, Junio C Hamano wrote:

> Jeff King <p...@peff.net> writes:
>
> > Of the two situations, I think the first one is less likely to be
> > destructive (noticing that a file is already gone via ENOTDIR), as we
> > are only proceeding with the index deletion, and we end up not touching
> > the filesystem at all.
>
> Nice to see sound reasoning.

Here's a patch series which I think covers what we've discussed.

  [1/3]: rm: do not complain about d/f conflicts during deletion
  [2/3]: t3600: test behavior of reverse-d/f conflict
  [3/3]: t3600: test rm of path with changed leading symlinks

The first one is the code change, and the rest just document the cases
we discussed.

The third one is a little subtle. For the most part it is just testing
the normal "changed content requires --force" behavior of rm. But I
think it is worth having because it also makes sure that after deleting
d/f when d is a symlink to e, we remove neither the new directory e nor
the symlink d. I do not think this case was explicitly planned for, but
it does do the right thing now, and given the subtlety, I'd rather
somebody who changes it notice the breakage in the test suite.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Behavior of git rm
On Wed, Apr 03, 2013 at 07:50:24AM -0700, jpinheiro wrote:

> While experimenting with git we found an unexpected behavior with git rm.
> Here is a trace of the unexpected behavior:
>
> $ git init
> $ mkdir D
> $ echo Hi >D/F
> $ git add D/F
> $ rm -r D
> $ echo Hey >D
> $ git rm D/F
> warning: 'D/F': Not a directory
> rm 'D/F'
> fatal: git rm: 'D/F': Not a directory

We drop the D/F entry from the index, but then fail to actually remove
it from the filesystem, because it has already been replaced. It is
impossible to tell from this toy example what the true intent was, but
in such a situation, there is a reasonable chance that the user should
have invoked "git rm --cached" in the first place.

That being said, we do try to handle files which have already gone
missing; when unlink() fails, we do not consider it an error if we got
ENOENT. We could perhaps add ENOTDIR to that list, as it also indicates
that the file is gone (it just happens that one of its prefix
directories was replaced with something else).

The opposite case is also interesting:

  $ git init
  $ echo 1 >D
  $ git add D
  $ rm D
  $ mkdir D
  $ echo 2 >D/F
  $ git rm D
  rm 'D'
  fatal: git rm: 'D': Is a directory

We expect to see 'D' as a file, but it is now a directory. We _could_
recursively remove the directory, but that has the potential to delete
files that the user does not expect.

So in both cases, git rm could certainly detect the situation and
proceed with the destructive operation. But when there is such a
conflict between what's in the working tree and what's in the index, I
think we may be better off erring on the conservative side and bailing,
and letting the user reconcile the differences themselves (using either
"git add" or "git rm --cached" to update the index, or deciding how to
handle the working tree contents themselves with regular "rm").

Of the two situations, I think the first one is less likely to be
destructive (noticing that a file is already gone via ENOTDIR), as we
are only proceeding with the index deletion, and we end up not touching
the filesystem at all.

That patch would look something like:

diff --git a/builtin/rm.c b/builtin/rm.c
index dabfcf6..7b91d52 100644
--- a/builtin/rm.c
+++ b/builtin/rm.c
@@ -110,7 +110,7 @@ static int check_local_mod(unsigned char *head, int index_only)
 		ce = active_cache[pos];
 
 		if (lstat(ce->name, &st) < 0) {
-			if (errno != ENOENT)
+			if (errno != ENOENT && errno != ENOTDIR)
 				warning("'%s': %s", ce->name, strerror(errno));
 			/* It already vanished from the working tree */
 			continue;
diff --git a/dir.c b/dir.c
index 57394e4..f9e7355 100644
--- a/dir.c
+++ b/dir.c
@@ -1603,7 +1603,7 @@ int remove_path(const char *name)
 {
 	char *slash;
 
-	if (unlink(name) && errno != ENOENT)
+	if (unlink(name) && errno != ENOENT && errno != ENOTDIR)
 		return -1;
 
 	slash = strrchr(name, '/');
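
The dir.c change above treats two errno values as "the file is already
gone". A minimal standalone sketch of that idea follows, assuming a
POSIX system. The name unlink_if_present is hypothetical and chosen for
illustration only; it is not git's remove_path(), which continues past
the unlink() call.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper for illustration, not a git function.
 *
 * Unlink `name`, treating "already gone" as success:
 *   ENOENT  - nothing exists at that path
 *   ENOTDIR - a leading component is not a directory, so the file
 *             cannot exist at that path either
 *
 * Returns 0 if the file was removed or was already gone, and -1 on any
 * other failure (errno is left as set by unlink()).
 */
int unlink_if_present(const char *name)
{
	assert(name != NULL);
	if (unlink(name) && errno != ENOENT && errno != ENOTDIR)
		return -1;
	return 0;
}
```

With this shape, a path such as D/F where D has been replaced by a
regular file counts as success, because unlink() fails with ENOTDIR and
nothing on the filesystem is touched.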
Re: Behavior of git rm
Jeff King <p...@peff.net> writes:

> Of the two situations, I think the first one is less likely to be
> destructive (noticing that a file is already gone via ENOTDIR), as we
> are only proceeding with the index deletion, and we end up not touching
> the filesystem at all.

Nice to see sound reasoning.

> diff --git a/builtin/rm.c b/builtin/rm.c
> index dabfcf6..7b91d52 100644
> --- a/builtin/rm.c
> +++ b/builtin/rm.c
> @@ -110,7 +110,7 @@ static int check_local_mod(unsigned char *head, int index_only)
>  		ce = active_cache[pos];
>
>  		if (lstat(ce->name, &st) < 0) {
> -			if (errno != ENOENT)
> +			if (errno != ENOENT && errno != ENOTDIR)

OK. We may be running lstat() on D/F but there may be D that is not a
directory. If it is a file, we get ENOTDIR.

By the way, if D is a dangling symlink, we get ENOENT; in such a case,
we report "rm 'D/F'" on the output and remove the index entry.

    $ rm -f .git/index && rm -fr D E
    $ mkdir D && >D/F && git add D && rm -fr D
    $ ln -s erewhon D && git rm D/F && git ls-files
    rm 'D/F'

Also if D is a symlink that points at a directory E, "git rm" does
something interesting.

 (1) Perhaps we want a complaint in this case.

    $ rm -f .git/index && rm -fr D E
    $ mkdir D && >D/F && git add D && rm -fr D
    $ mkdir E && ln -s E D && git rm D/F

 (2) Perhaps we want to make sure D/F is not beyond a symlink in this
     case.

    $ rm -f .git/index && rm -fr D E
    $ mkdir D && >D/F && git add D && rm -fr D
    $ mkdir E && ln -s E D && date >E/F && git rm D/F

    $ git rm -f D/F

> diff --git a/dir.c b/dir.c
> index 57394e4..f9e7355 100644
> --- a/dir.c
> +++ b/dir.c
> @@ -1603,7 +1603,7 @@ int remove_path(const char *name)
>  {
>  	char *slash;
>
> -	if (unlink(name) && errno != ENOENT)
> +	if (unlink(name) && errno != ENOENT && errno != ENOTDIR)
>  		return -1;

Ditto.

>
>  	slash = strrchr(name, '/');
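
The two symlink cases above hinge on the fact that lstat() declines to
follow a symbolic link only in the last component of a path; links in
leading components are followed, so lstat("D/F") looks inside E when D
points at E. Option (2) could be sketched roughly as below. This is an
illustration under stated assumptions, not a description of what git
does: beyond_symlink and SKETCH_PATH_MAX are hypothetical names, and the
helper simply calls lstat() on each leading prefix in turn.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical buffer limit for this sketch only. */
#define SKETCH_PATH_MAX 4096

/*
 * Hypothetical helper for illustration, not a git function.
 *
 * Returns 1 if any leading component of `path` is a symbolic link,
 * 0 if none is, and -1 if `path` does not fit in the scratch buffer.
 * The final component itself is never examined.
 */
int beyond_symlink(const char *path)
{
	char prefix[SKETCH_PATH_MAX];
	struct stat st;
	size_t len, i;

	assert(path != NULL);
	len = strlen(path);
	if (len >= sizeof(prefix))
		return -1;

	/* Start at 1 so the leading '/' of an absolute path is skipped. */
	for (i = 1; i < len; i++) {
		if (path[i] != '/')
			continue;
		memcpy(prefix, path, i);
		prefix[i] = '\0';
		/* lstat() reports on the link itself for the last component. */
		if (lstat(prefix, &st) == 0 && S_ISLNK(st.st_mode))
			return 1;
	}
	return 0;
}
```

For the layout in case (2), beyond_symlink("D/F") would return 1 because
D is a symlink, which a caller could use to refuse to touch E/F.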
Re: Behavior of git rm
On Wed, Apr 03, 2013 at 10:35:52AM -0700, Junio C Hamano wrote:

> > diff --git a/builtin/rm.c b/builtin/rm.c
> > index dabfcf6..7b91d52 100644
> > --- a/builtin/rm.c
> > +++ b/builtin/rm.c
> > @@ -110,7 +110,7 @@ static int check_local_mod(unsigned char *head, int index_only)
> >  		ce = active_cache[pos];
> >
> >  		if (lstat(ce->name, &st) < 0) {
> > -			if (errno != ENOENT)
> > +			if (errno != ENOENT && errno != ENOTDIR)
>
> OK. We may be running lstat() on D/F but there may be D that is not a
> directory. If it is a file, we get ENOTDIR.
>
> By the way, if D is a dangling symlink, we get ENOENT; in such a case,
> we report "rm 'D/F'" on the output and remove the index entry.
>
>     $ rm -f .git/index && rm -fr D E
>     $ mkdir D && >D/F && git add D && rm -fr D
>     $ ln -s erewhon D && git rm D/F && git ls-files
>     rm 'D/F'

That seems sane to me, and makes me feel like handling ENOTDIR here is
the right direction. What that conditional is trying to say is "if it is
because the file is not there...", and so far we know of three
conditions where it is not there:

  1. There is no entry at that path.

  2. There is a non-directory in the prefix of that path.

  3. There is a dangling symlink in the prefix of that path.

(1) and (3) we already handle via ENOENT. I think it is sane to handle
(2) the same as (3), but we do not do so currently.

> Also if D is a symlink that points at a directory E, "git rm" does
> something interesting.
>
>  (1) Perhaps we want a complaint in this case.
>
>     $ rm -f .git/index && rm -fr D E
>     $ mkdir D && >D/F && git add D && rm -fr D
>     $ mkdir E && ln -s E D && git rm D/F

I think that is OK without complaint; the user asked to get rid of D/F,
and it is indeed gone (as well as its index entry) after the call
finishes. And we did not even need to delete anything, so we cannot be
losing data.

I am much more concerned about this case:

>  (2) Perhaps we want to make sure D/F is not beyond a symlink in this
>      case.
>
>     $ rm -f .git/index && rm -fr D E
>     $ mkdir D && >D/F && git add D && rm -fr D
>     $ mkdir E && ln -s E D && date >E/F && git rm D/F

where the user is deleting something that may or may not be related to
the original D/F. On the other hand, I don't have that much sympathy;
"rm" would make the same deletion. But hmm...shouldn't we be doing an
up-to-date check? Indeed:

  $ git rm D/F
  error: 'D/F' has staged content different from both the file and the HEAD
  (use -f to force removal)

  $ git commit -m foo && git rm D/F
  $ git rm D/F
  error: 'D/F' has local modifications
  (use --cached to keep the file, or -f to force removal)

So I do not think we need any extra safety; the content-level checks
should be enough to make sure we are not losing anything.

-Peff
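
The three "not there" conditions listed in this message map onto two
errno values: per the discussion, a missing entry and a dangling symlink
in the prefix both give ENOENT, while a non-directory in the prefix
gives ENOTDIR. The sketch below folds them into one classification,
assuming POSIX lstat(). The enum and function names are hypothetical and
do not come from git.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/stat.h>

/* Hypothetical names for this sketch; none of them come from git. */
enum worktree_state {
	WT_PRESENT,	/* lstat() succeeded: something exists at the path */
	WT_GONE,	/* ENOENT or ENOTDIR: nothing can exist at the path */
	WT_ERROR	/* any other failure; errno says why */
};

/*
 * Classify `path` the way the patched conditional reads it: both
 * ENOENT and ENOTDIR mean "the file is not there", and every other
 * lstat() failure is a real error worth reporting.
 */
enum worktree_state classify_path(const char *path)
{
	struct stat st;

	assert(path != NULL);
	if (lstat(path, &st) == 0)
		return WT_PRESENT;
	if (errno == ENOENT || errno == ENOTDIR)
		return WT_GONE;
	return WT_ERROR;
}
```

A caller in the spirit of check_local_mod() would skip the warning for
WT_GONE and keep it for WT_ERROR.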