Re: [PATCH] git stash: Avoid data loss when saving a stash

2013-07-06 Thread Petr Baudis
  Hi!

  (tl;dr - I disagree but this issue is perhaps not so important
in practice)

On Sun, Jun 30, 2013 at 12:14:26PM -0700, Junio C Hamano wrote:
 I do not agree with your `git reset --hard` at all.  With the
 command, the user demands no matter what, I want get rid of any
 funny state in my working tree so that I can start my work from that
 specified commit (default to HEAD).

  Yeah, but this normally concerns only tracked files; `git reset
--hard` does not imply `git clean`. I'm worried when a tool normally
behaves in a way that follows an apparent rule but its behavior is
defined in such a way that in a corner case this rule is violated (but
it's ok since it's a - non-obvious - implication of the specification).

 Imagine that this is you did to arrive that funny state:
 
   $ git rm foo ;# foo used to be tracked and in HEAD
 $ cp /somewhere/else/foo foo
   $ cp /somewhere/else/bar bar ;# bar is not in HEAD
   $ cp /somewhere/else/bar baz ;# baz is in HEAD
 ... do various other things ...
 
 and then git reset --hard.  At that point, foo and bar are not
 tracked and completely unrelated to the project.  baz is tracked
 and have unrelated contents from that of HEAD.
 
 In order to satisfy your desire to go back to the state of HEAD with
 minimal collateral amage, we need to get rid of the updated foo
 and baz and replace them with those from HEAD.  We do not have to
 touch bar so we leave it as-is.

  Perhaps we misundertood each other here. I certainly don't care to
keep local changes as a whole - a command behaving like that wouldn't
be very useful for me; for me, the crucial distinction is between
tracked and untracked files. Therefore, from my viewpoint it's fine
to overwrite baz, but not to overwrite foo.

 And the killed case is just like foo and baz.  If the state
 you want to go back to with --hard has a directory (a file) where
 your working tree's funny state has a file (a directory), the local
 cruft needs to go away to satisify your request.
 
 I do not mind if you are proposing a different and new kind of reset
 that fails if it has to overwrite any local changes (be it tracked
 or untracked), but that is not reset --hard.  It is something else.

  Hmm, I suppose the assumption I would prefer is that the only command
that will destroy (currently) untracked data without warning is `git
clean`; even though (unlike in case of git stash) the current reset
--hard behavior wouldn't surprise me, I suspect it can be a bad surprise
for many Git users when they hit this situation; but since I didn't
notice any actual complaint yet, so I don't care enough to press this
further for now anyway. :-)

-- 
Petr Pasky Baudis
For every complex problem there is an answer that is clear,
simple, and wrong.  -- H. L. Mencken
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] git stash: Avoid data loss when saving a stash

2013-06-30 Thread Petr Baudis
  Hi!

On Fri, Jun 28, 2013 at 11:39:16AM -0700, Junio C Hamano wrote:
 Thanks.  I'll queue it with a pair of fix-up commits on top, so that
 they can later be squashed in.
 
 The result of squashing the fix-ups would look like this.

  Thanks! I agree with all of your changes.

 -- 8 --
 From: Petr Baudis pa...@ucw.cz
 Date: Fri, 28 Jun 2013 17:05:32 +0200
 Subject: [PATCH] git stash: avoid data loss when git stash save kills a 
 directory

  Hmm, it's a pity that the note that `git reset --hard` itself should
perhaps also abort in that case got lost. I don't insist on mentioning
it in the commit message, though.


On Fri, Jun 28, 2013 at 02:30:15PM -0700, Junio C Hamano wrote:
 -- 8 --
 Subject: treat_directory(): do not declare submodules in index to be untracked

  Oh, you are truly awesome! I admit that properly reviewing this patch
is a little out of my depth right now as I'm not familiar with this
infrastructure. I'd just like to note...

   case index_gitdir:
   if (dir-flags  DIR_SHOW_OTHER_DIRECTORIES)
   return path_none;
 - return path_untracked;
 + return path_none;

...that the if-test can be removed now as both branches are the same.

Petr Pasky Baudis
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] git stash: Avoid data loss when saving a stash

2013-06-30 Thread Junio C Hamano
Petr Baudis pa...@ucw.cz writes:

   Hi!

 On Fri, Jun 28, 2013 at 11:39:16AM -0700, Junio C Hamano wrote:
 Thanks.  I'll queue it with a pair of fix-up commits on top, so that
 they can later be squashed in.
 
 The result of squashing the fix-ups would look like this.

   Thanks! I agree with all of your changes.

 -- 8 --
 From: Petr Baudis pa...@ucw.cz
 Date: Fri, 28 Jun 2013 17:05:32 +0200
 Subject: [PATCH] git stash: avoid data loss when git stash save kills a 
 directory

   Hmm, it's a pity that the note that `git reset --hard` itself should
 perhaps also abort in that case got lost. I don't insist on mentioning
 it in the commit message, though.

I do not agree with your `git reset --hard` at all.  With the
command, the user demands no matter what, I want get rid of any
funny state in my working tree so that I can start my work from that
specified commit (default to HEAD).

Imagine that this is you did to arrive that funny state:

$ git rm foo ;# foo used to be tracked and in HEAD
$ cp /somewhere/else/foo foo
$ cp /somewhere/else/bar bar ;# bar is not in HEAD
$ cp /somewhere/else/bar baz ;# baz is in HEAD
... do various other things ...

and then git reset --hard.  At that point, foo and bar are not
tracked and completely unrelated to the project.  baz is tracked
and have unrelated contents from that of HEAD.

In order to satisfy your desire to go back to the state of HEAD with
minimal collateral amage, we need to get rid of the updated foo
and baz and replace them with those from HEAD.  We do not have to
touch bar so we leave it as-is.

And the killed case is just like foo and baz.  If the state
you want to go back to with --hard has a directory (a file) where
your working tree's funny state has a file (a directory), the local
cruft needs to go away to satisify your request.

I do not mind if you are proposing a different and new kind of reset
that fails if it has to overwrite any local changes (be it tracked
or untracked), but that is not reset --hard.  It is something else.

 On Fri, Jun 28, 2013 at 02:30:15PM -0700, Junio C Hamano wrote:
 -- 8 --
 Subject: treat_directory(): do not declare submodules in index to be 
 untracked

   Oh, you are truly awesome! I admit that properly reviewing this patch
 is a little out of my depth right now as I'm not familiar with this
 infrastructure. I'd just like to note...

  case index_gitdir:
  if (dir-flags  DIR_SHOW_OTHER_DIRECTORIES)
  return path_none;
 -return path_untracked;
 +return path_none;

 ...that the if-test can be removed now as both branches are the same.

Thanks for noticing.  What was queued on 'pu' should already have
fixed that one.

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] git stash: Avoid data loss when saving a stash

2013-06-28 Thread Petr Baudis
I have been recently again bitten by git stash versus an uncommitted
file-to-directory change that happily threw away few gigabytes of my
data. This has been recently discussed in the past, e.g.

http://thread.gmane.org/gmane.comp.version-control.git/202332/

and I implemented Junio's suggestion in this patch - if ls-files --killed
produced any output, the stash save is stopped and the user is required
to pass --force to really have the data removed.

A test case is included. I think that the (currently failing) tests
'stash file to directory' and 'stash directory to file' are related to
this, but I consider their expectation to be bogus - I would personally
be surprised by `git stash` suddenly importing the complete
never-meant-to-be-tracked contents of my directory to Git during a stash
save, and I think the approach of aborting in this situation is more
reasonable.

Other parts of Git also behave in a troublesome way in case of tracked
file being replaced by an untracked directory - I wouldn't expect `git
reset --hard` alone to remove the directory (i.e. touch untracked files)
either. However, this matter is much more complicated since `git reset
--hard` is assumed to never fail in ordinary circumstances (see e.g.
git-stash code ;-) and I'm unable to devote sufficient effort to seeing
such a change through.

Signed-off-by: Petr Baudis pa...@ucw.cz
---

Please Cc me, I'm currently not subscribed on the list.

 Documentation/git-stash.txt |   12 ++--
 git-stash.sh|   10 ++
 t/t3903-stash.sh|   16 
 3 files changed, 36 insertions(+), 2 deletions(-)

diff --git a/Documentation/git-stash.txt b/Documentation/git-stash.txt
index db7e803..52d4def 100644
--- a/Documentation/git-stash.txt
+++ b/Documentation/git-stash.txt
@@ -14,7 +14,8 @@ SYNOPSIS
 'git stash' ( pop | apply ) [--index] [-q|--quiet] [stash]
 'git stash' branch branchname [stash]
 'git stash' [save [-p|--patch] [-k|--[no-]keep-index] [-q|--quiet]
-[-u|--include-untracked] [-a|--all] [message]]
+[-u|--include-untracked] [-a|--all] [-f|--force]
+[message]]
 'git stash' clear
 'git stash' create [message]
 'git stash' store [-m|--message message] [-q|--quiet] commit
@@ -44,7 +45,7 @@ is also possible).
 OPTIONS
 ---
 
-save [-p|--patch] [--[no-]keep-index] [-u|--include-untracked] [-a|--all] 
[-q|--quiet] [message]::
+save [-p|--patch] [--[no-]keep-index] [-u|--include-untracked] [-a|--all] 
[-q|--quiet] [-f|--force] [message]::
 
Save your local modifications to a new 'stash', and run `git reset
--hard` to revert them.  The message part is optional and gives
@@ -71,6 +72,13 @@ linkgit:git-add[1] to learn how to operate the `--patch` 
mode.
 +
 The `--patch` option implies `--keep-index`.  You can use
 `--no-keep-index` to override this.
++
+In some cases, saving a stash could mean irretrievably removing some
+data - if a directory with untracked files replaces a tracked file of
+the same name, the new untracked files are not saved (except in case
+of `--include-untracked`) but the original tracked file shall be restored.
+Normally, stash save will abort; `--force` will make it remove the
+untracked files.
 
 list [options]::
 
diff --git a/git-stash.sh b/git-stash.sh
index 1e541a2..3cb9b05 100755
--- a/git-stash.sh
+++ b/git-stash.sh
@@ -195,6 +195,7 @@ save_stash () {
keep_index=
patch_mode=
untracked=
+   force=
while test $# != 0
do
case $1 in
@@ -215,6 +216,9 @@ save_stash () {
-u|--include-untracked)
untracked=untracked
;;
+   -f|--force)
+   force=t
+   ;;
-a|--all)
untracked=all
;;
@@ -258,6 +262,12 @@ save_stash () {
say $(gettext No local changes to save)
exit 0
fi
+   if test -z $untracked$force -a -n $(git ls-files --killed | head -n 
1); then
+   say $(gettext The following untracked files would NOT be 
saved but need to be removed by stash save:)
+   test -n $GIT_QUIET || git ls-files --killed | sed 's/^/\t/'
+   say $(gettext Abording. Consider using either the --force or 
--include-untracked switches.) 2
+   exit 1
+   fi
test -f $GIT_DIR/logs/$ref_stash ||
clear_stash || die $(gettext Cannot initialize stash)
 
diff --git a/t/t3903-stash.sh b/t/t3903-stash.sh
index debda7a..4ac4ebe 100755
--- a/t/t3903-stash.sh
+++ b/t/t3903-stash.sh
@@ -673,4 +673,19 @@ test_expect_success 'store updates stash ref and reflog' '
grep quux bazzy
 '
 
+test_expect_success SYMLINKS 'stash symlink to non-empty directory' '
+   git reset --hard 
+   ln -s file2 linkdir 
+   git add linkdir 
+   git commit -m+linkdir as symlink 
+   rm linkdir  mkdir 

Re: [PATCH] git stash: Avoid data loss when saving a stash

2013-06-28 Thread Junio C Hamano
Petr Baudis pa...@ucw.cz writes:

 Signed-off-by: Petr Baudis pa...@ucw.cz
 ---

 Please Cc me, I'm currently not subscribed on the list.

Heh, long time no see.

 @@ -71,6 +72,13 @@ linkgit:git-add[1] to learn how to operate the `--patch` 
 mode.
  +
  The `--patch` option implies `--keep-index`.  You can use
  `--no-keep-index` to override this.
 ++
 +In some cases, saving a stash could mean irretrievably removing some
 +data - if a directory with untracked files replaces a tracked file of
 +the same name, the new untracked files are not saved (except in case
 +of `--include-untracked`) but the original tracked file shall be restored.
 +Normally, stash save will abort; `--force` will make it remove the
 +untracked files.

It _might_ look obvious from the context after somebody spends
enough time thinking about this case (and only this case) that
in such a case is implied in Normally, ... will abort, but for a
casual reader who encounters this paragraph for the first time, I do
not think it is obvious.  I'd rephrase to:

By default, `stash save` will abort in such a case; `--force` will
allow it to remove the untracked files.

 @@ -258,6 +262,12 @@ save_stash () {
   say $(gettext No local changes to save)
   exit 0
   fi
 + if test -z $untracked$force -a -n $(git ls-files --killed | head -n 
 1); then

Split the line at the semicolon in ; then.  Also git grep will
tell us that we tend to avoid -a in test.

if test -z $untracked$force 
   test -n $(git ls-files --killed | head -n 1)
then
...

 + say $(gettext The following untracked files would NOT be 
 saved but need to be removed by stash save:)
 + test -n $GIT_QUIET || git ls-files --killed | sed 's/^/\t/'
 + say $(gettext Abording. Consider using either the --force or 
 --include-untracked switches.) 2

s/Abord/Abort/;

 + exit 1
 + fi
   test -f $GIT_DIR/logs/$ref_stash ||
   clear_stash || die $(gettext Cannot initialize stash)
  
 diff --git a/t/t3903-stash.sh b/t/t3903-stash.sh
 index debda7a..4ac4ebe 100755
 --- a/t/t3903-stash.sh
 +++ b/t/t3903-stash.sh
 @@ -673,4 +673,19 @@ test_expect_success 'store updates stash ref and reflog' 
 '
   grep quux bazzy
  '
  
 +test_expect_success SYMLINKS 'stash symlink to non-empty directory' '
 + git reset --hard 
 + ln -s file2 linkdir 
 + git add linkdir 
 + git commit -m+linkdir as symlink 
 + rm linkdir  mkdir linkdir  touch linkdir/file 

Use linkdir/file instead for clarity, when you are not interested
in modifying the timestamp of an existing file.

 + ! git stash save symlink to non-empty directory 

Use test_must_fail

 + [ -e linkdir/file ]

test -f linkdir/file  You not only want to see it exists, you know
it must be a regular file.

 +'
 +
 +test_expect_success SYMLINKS 'stash symlink to non-empty directory (forced)' 
 '
 + git stash save --force symlink to non-empty directory (forced) 
 + [ ! -e linkdir/file ]  [ -L linkdir ]
 +'

git grep will tell us that test -h is preferred over test -L
in our codebase.

 +
  test_done

Also I do not think you need to limit the tests on symlink-capable
platforms.  You can create the linkdir as a regular file and
commit, and make a local change to turn it into a directory, and try
to stash save to recover that original regular file.

Thanks.  I'll queue it with a pair of fix-up commits on top, so that
they can later be squashed in.

The result of squashing the fix-ups would look like this.

-- 8 --
From: Petr Baudis pa...@ucw.cz
Date: Fri, 28 Jun 2013 17:05:32 +0200
Subject: [PATCH] git stash: avoid data loss when git stash save kills a 
directory

stash save is about saving the local change to the working tree,
but also about restoring the state of the last commit to the working
tree.  When a local change is to turn a non-directory to a directory,
in order to restore the non-directory, everything in the directory
needs to be removed.

Which is fine when running git stash save --include-untracked,
but without that option, untracked, newly created files in the
directory will have to be discarded.

Introduce a safety valve to fail the operation in such case, using
the ls-files --killed which was designed for this exact purpose.

The stash save is stopped when untracked files need to be
discarded because their leading path ceased to be a directory, and
the user is required to pass --force to really have the data
removed.

Signed-off-by: Petr Baudis pa...@ucw.cz
Signed-off-by: Junio C Hamano gits...@pobox.com
---
 Documentation/git-stash.txt | 12 ++--
 git-stash.sh| 12 
 t/t3903-stash.sh| 18 ++
 3 files changed, 40 insertions(+), 2 deletions(-)

diff --git a/Documentation/git-stash.txt b/Documentation/git-stash.txt
index db7e803..7c8b648 100644
--- a/Documentation/git-stash.txt
+++ b/Documentation/git-stash.txt
@@ 

Re: [PATCH] git stash: Avoid data loss when saving a stash

2013-06-28 Thread Junio C Hamano
Petr Baudis pa...@ucw.cz writes:

 diff --git a/git-stash.sh b/git-stash.sh
 index 1e541a2..3cb9b05 100755
 --- a/git-stash.sh
 +++ b/git-stash.sh
 @@ -258,6 +262,12 @@ save_stash () {
   say $(gettext No local changes to save)
   exit 0
   fi
 + if test -z $untracked$force -a -n $(git ls-files --killed | head -n 
 1); then
 + say $(gettext The following untracked files would NOT be 
 saved but need to be removed by stash save:)

I think ls-files --killed was not adjusted for the new world order
when submodules were introduced.  With this change, you see t7402
break, even though git status would say

# Changes not staged for commit:
#   (use git add file... to update what will be committed)
#   (use git checkout -- file... to discard changes in working
#   directory)
#
#   modified:   file
#   modified:   submodule (new commits)

The path submodule in HEAD and the index is already submodule, so
stash save that reverts to the original state will _not_ have to
kill it, but the new check triggers it.

Exactly the same breakage this patch introduces triggers in t7610,
too.

I think another patch to teach ls-files --killed what to do with
submodules is needed as a preliminary step before this patch.

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] git stash: Avoid data loss when saving a stash

2013-06-28 Thread Junio C Hamano
Junio C Hamano gits...@pobox.com writes:

 Petr Baudis pa...@ucw.cz writes:

 diff --git a/git-stash.sh b/git-stash.sh
 index 1e541a2..3cb9b05 100755
 --- a/git-stash.sh
 +++ b/git-stash.sh
 @@ -258,6 +262,12 @@ save_stash () {
  say $(gettext No local changes to save)
  exit 0
  fi
 +if test -z $untracked$force -a -n $(git ls-files --killed | head -n 
 1); then
 +say $(gettext The following untracked files would NOT be 
 saved but need to be removed by stash save:)

 I think ls-files --killed was not adjusted for the new world order
 when submodules were introduced.  With this change, you see t7402
 break,...
 Exactly the same breakage this patch introduces triggers in t7610,
 too.

 I think another patch to teach ls-files --killed what to do with
 submodules is needed as a preliminary step before this patch.

Which may be just the matter of doing this.

-- 8 --
Subject: treat_directory(): do not declare submodules in index to be untracked

When the working tree walker encounters a directory, it asks this
function if it should descend into it, show it as an untracked
directory, or do something else.  When the directory is the top of
the submodule working tree, we used to say That is an untracked
directory, which was quite bogus.  It is an entity that is tracked
in the index of the repository we are looking at, and that is not to
be descended into it.  Return path_none.

The existing case that path_untracked is returned for a newly
discovered submodule that is not tracked in the index (this only
happens when DIR_NO_GITLINKS option is not used) is unchanged and
returns path_untracked, but that is exactly because the submodule is
not tracked in the index.

Signed-off-by: Junio C Hamano gits...@pobox.com
---
 dir.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/dir.c b/dir.c
index 897c874..b99c40e 100644
--- a/dir.c
+++ b/dir.c
@@ -1038,7 +1038,7 @@ static enum path_treatment treat_directory(struct 
dir_struct *dir,
case index_gitdir:
if (dir-flags  DIR_SHOW_OTHER_DIRECTORIES)
return path_none;
-   return path_untracked;
+   return path_none;
 
case index_nonexistent:
if (dir-flags  DIR_SHOW_OTHER_DIRECTORIES)
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html