Re: [PATCH v2 1/1] bundle: cleanup lock files on error

2018-11-15 Thread Johannes Schindelin
Hi Peff,

On Thu, 15 Nov 2018, Jeff King wrote:

> On Thu, Nov 15, 2018 at 05:32:15PM +0100, Johannes Schindelin wrote:
> 
> > I cannot claim that I wrapped my head around your explanation or your
> > diff (I am busy trying to prepare Git for Windows' `master` for
> > rebasing to v2.20.0-rc0), but it does fix the problem. Thank you so
> > much!
> > 
> > The line `test_expect_code 1 ...` needs to be adjusted to
> > `test_expect_code 128`, of course, and to discern from the fixed
> > problem (which also exits with code 128), the error output should be
> > verified, like so:
> > 
> > -- snip --
> > test_expect_success 'try to create a bundle with empty ref count' '
> > test_must_fail git bundle create foobar.bundle master..master 2>err &&
> > test_i18ngrep "Refusing to create empty bundle" err
> > '
> > -- snap --
> 
> It seems like we should be checking that the stale lockfile isn't left,
> which is the real problem (the warning is annoying, but is a symptom of
> the same thing). I.e., something like:
> 
>   test_must_fail git bundle create foobar.bundle master..master &&
>   test_path_is_missing foobar.bundle.lock
> 
> That would already pass on non-Windows platforms, but that's OK. It will
> never give a false failure.
> 
> If you don't mind, can you confirm that the test above fails without
> either of the two patches under discussion?

This test succeeds with your patch as well as with Gaël's, and fails when
neither patch is applied. So you're right, it is the much better test.

> > Do you want to integrate this test into your patch and run with it, or
> > do you want me to shepherd your patch?
> 
> I'll wrap it up with a commit message and a test.

Thank you so much!
Dscho

Re: [PATCH v2 1/1] bundle: cleanup lock files on error

2018-11-15 Thread Duy Nguyen
On Thu, Nov 15, 2018 at 1:59 PM Johannes Schindelin
 wrote:
> So yes, we are trying to unlink the `.lock` file, and as far as I can tell 
> that
> `unlink()` call comes from the tempfile cleanup asked for by Martin. However, 
> as
> we still have a handle open to that file, that call fails.
>
> I do not think that there is any better way to fix this than to close the file
> explicitly.

I may be talking nonsense here but I remember someone mentioned
something about deleting a file while it's still open on Windows and
after a quick internet search, it may be FILE_SHARE_DELETE. Since _we_
open these files, can we just open them with this to be able to delete
them?
-- 
Duy


Re: [PATCH v2 1/1] bundle: cleanup lock files on error

2018-11-15 Thread Jeff King
On Thu, Nov 15, 2018 at 05:32:15PM +0100, Johannes Schindelin wrote:

> I cannot claim that I wrapped my head around your explanation or your diff (I 
> am
> busy trying to prepare Git for Windows' `master` for rebasing to v2.20.0-rc0),
> but it does fix the problem. Thank you so much!
> 
> The line `test_expect_code 1 ...` needs to be adjusted to `test_expect_code
> 128`, of course, and to discern from the fixed problem (which also exits with
> code 128), the error output should be verified, like so:
> 
> -- snip --
> test_expect_success 'try to create a bundle with empty ref count' '
>   test_must_fail git bundle create foobar.bundle master..master 2>err &&
>   test_i18ngrep "Refusing to create empty bundle" err
> '
> -- snap --

It seems like we should be checking that the stale lockfile isn't left,
which is the real problem (the warning is annoying, but is a symptom of
the same thing). I.e., something like:

  test_must_fail git bundle create foobar.bundle master..master &&
  test_path_is_missing foobar.bundle.lock

That would already pass on non-Windows platforms, but that's OK. It will
never give a false failure.

If you don't mind, can you confirm that the test above fails without
either of the two patches under discussion?

> Do you want to integrate this test into your patch and run with it, or do you
> want me to shepherd your patch?

I'll wrap it up with a commit message and a test.

-Peff


Re: [PATCH v2 1/1] bundle: cleanup lock files on error

2018-11-15 Thread Johannes Schindelin
Hi Peff,

On Thu, 15 Nov 2018, Jeff King wrote:

> On Thu, Nov 15, 2018 at 01:57:45PM +0100, Johannes Schindelin wrote:
> 
> > > I looked at the test to see if it would pass, but it is not even
> > > checking anything about lockfiles! It just checks that we exit 1 by
> > > returning up the callstack instead of calling die(). And of course it
> > > would not have a problem under Linux either way. But if I run something
> > > similar under strace, I see:
> > > 
> > >   $ strace ./git bundle create foobar.bundle HEAD..HEAD
> > >   [...]
> > >   openat(AT_FDCWD, "/home/peff/compile/git/foobar.bundle.lock", 
> > > O_RDWR|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 3
> > >   [...]
> > >   close(3)= 0
> > >   unlink("/home/peff/compile/git/foobar.bundle.lock") = 0
> > >   exit_group(128) = ?
> > > 
> > > which seems right.
> > 
> > Without the fix, the added regression test fails thusly:
> > 
> > -- snip --
> > [...]
> > ++ test_expect_code 1 git bundle create foobar.bundle master..master
> > ++ want_code=1
> > ++ shift
> > ++ git bundle create foobar.bundle master..master
> > fatal: Refusing to create empty bundle.
> > warning: unable to unlink 'C:/git-sdk-64/usr/src/git/wip2/t/trash 
> > directory.t5607-clone-bundle/foobar.bundle.lock': Permission denied
> 
> Hmph. So who has it open, and why isn't the tempfile code working as
> designed?
> 
> Aha, I see the problem. We dup() the descriptor in create_bundle(). So
> the patch _is_ necessary and (fairly) correct. But the explanation
> probably ought to be something like:
> 
>   In create_bundle(), we duplicate the lockfile descriptor via dup().
>   This means that even though the lockfile code carefully calls close()
>   before unlinking the lockfile, we may still have the file open. Unix
>   systems don't care, but under Windows, this prevents the unlink
>   (causing an annoying warning and a stale lockfile).
> 
> But that also means that all of the other places we could die (e.g., in
> write_or_die) are going to have the same problem. We've fixed only one.
> Is there a way we can avoid doing the dup() in the first place?
> 
> The comment there explains that we duplicate because write_pack_data()
> will close the descriptor. Unfortunately, that's hard to change because
> it comes from run-command. But we don't actually need the descriptor
> ourselves after it's closed; we're just trying to appease the lockfile
> code; see e54c347c1c (create_bundle(): duplicate file descriptor to
> avoid closing it twice, 2015-08-10).
> 
> We just need some reasonable way of telling the lock code what's
> happening. Something like the patch below, which is a moral revert of
> e54c347c1c, but instead wrapping the "lock->fd = -1" in an official API.
> 
> Does this make your warning go away?
> 
> diff --git a/bundle.c b/bundle.c
> [...]

I cannot claim that I wrapped my head around your explanation or your diff (I am
busy trying to prepare Git for Windows' `master` for rebasing to v2.20.0-rc0),
but it does fix the problem. Thank you so much!

The line `test_expect_code 1 ...` needs to be adjusted to `test_expect_code
128`, of course, and to discern from the fixed problem (which also exits with
code 128), the error output should be verified, like so:

-- snip --
test_expect_success 'try to create a bundle with empty ref count' '
test_must_fail git bundle create foobar.bundle master..master 2>err &&
test_i18ngrep "Refusing to create empty bundle" err
'
-- snap --

Do you want to integrate this test into your patch and run with it, or do you
want me to shepherd your patch?

Ciao,
Dscho


Re: [PATCH v2 1/1] bundle: cleanup lock files on error

2018-11-15 Thread Jeff King
On Thu, Nov 15, 2018 at 01:57:45PM +0100, Johannes Schindelin wrote:

> > I looked at the test to see if it would pass, but it is not even
> > checking anything about lockfiles! It just checks that we exit 1 by
> > returning up the callstack instead of calling die(). And of course it
> > would not have a problem under Linux either way. But if I run something
> > similar under strace, I see:
> > 
> >   $ strace ./git bundle create foobar.bundle HEAD..HEAD
> >   [...]
> >   openat(AT_FDCWD, "/home/peff/compile/git/foobar.bundle.lock", 
> > O_RDWR|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 3
> >   [...]
> >   close(3)= 0
> >   unlink("/home/peff/compile/git/foobar.bundle.lock") = 0
> >   exit_group(128) = ?
> > 
> > which seems right.
> 
> Without the fix, the added regression test fails thusly:
> 
> -- snip --
> [...]
> ++ test_expect_code 1 git bundle create foobar.bundle master..master
> ++ want_code=1
> ++ shift
> ++ git bundle create foobar.bundle master..master
> fatal: Refusing to create empty bundle.
> warning: unable to unlink 'C:/git-sdk-64/usr/src/git/wip2/t/trash 
> directory.t5607-clone-bundle/foobar.bundle.lock': Permission denied

Hmph. So who has it open, and why isn't the tempfile code working as
designed?

Aha, I see the problem. We dup() the descriptor in create_bundle(). So
the patch _is_ necessary and (fairly) correct. But the explanation
probably ought to be something like:

  In create_bundle(), we duplicate the lockfile descriptor via dup().
  This means that even though the lockfile code carefully calls close()
  before unlinking the lockfile, we may still have the file open. Unix
  systems don't care, but under Windows, this prevents the unlink
  (causing an annoying warning and a stale lockfile).

But that also means that all of the other places we could die (e.g., in
write_or_die) are going to have the same problem. We've fixed only one.
Is there a way we can avoid doing the dup() in the first place?

The comment there explains that we duplicate because write_pack_data()
will close the descriptor. Unfortunately, that's hard to change because
it comes from run-command. But we don't actually need the descriptor
ourselves after it's closed; we're just trying to appease the lockfile
code; see e54c347c1c (create_bundle(): duplicate file descriptor to
avoid closing it twice, 2015-08-10).

We just need some reasonable way of telling the lock code what's
happening. Something like the patch below, which is a moral revert of
e54c347c1c, but instead wrapping the "lock->fd = -1" in an official API.

Does this make your warning go away?

diff --git a/bundle.c b/bundle.c
index 1ef584b93b..dc26551b83 100644
--- a/bundle.c
+++ b/bundle.c
@@ -244,7 +244,7 @@ static int is_tag_in_date_range(struct object *tag, struct 
rev_info *revs)
 
 
 /* Write the pack data to bundle_fd, then close it if it is > 1. */
-static int write_pack_data(int bundle_fd, struct rev_info *revs)
+static int write_pack_data(int bundle_fd, struct lock_file *lock, struct 
rev_info *revs)
 {
struct child_process pack_objects = CHILD_PROCESS_INIT;
int i;
@@ -256,6 +256,14 @@ static int write_pack_data(int bundle_fd, struct rev_info 
*revs)
pack_objects.in = -1;
pack_objects.out = bundle_fd;
pack_objects.git_cmd = 1;
+
+   /*
+* At this point we know that start_command is going to close our
+* bundle_fd, whether successful or not. Tell the lock code that
+* it is no longer in charge of it, so we don't try to double-close.
+*/
+   lock_file_release_descriptor(lock);
+
if (start_command(_objects))
return error(_("Could not spawn pack-objects"));
 
@@ -421,21 +429,10 @@ int create_bundle(struct bundle_header *header, const 
char *path,
bundle_to_stdout = !strcmp(path, "-");
if (bundle_to_stdout)
bundle_fd = 1;
-   else {
+   else
bundle_fd = hold_lock_file_for_update(, path,
  LOCK_DIE_ON_ERROR);
 
-   /*
-* write_pack_data() will close the fd passed to it,
-* but commit_lock_file() will also try to close the
-* lockfile's fd. So make a copy of the file
-* descriptor to avoid trying to close it twice.
-*/
-   bundle_fd = dup(bundle_fd);
-   if (bundle_fd < 0)
-   die_errno("unable to dup file descriptor");
-   }
-
/* write signature */
write_or_die(bundle_fd, bundle_signature, strlen(bundle_signature));
 
@@ -463,10 +460,8 @@ int create_bundle(struct bundle_header *header, const char 
*path,
goto err;
 
/* write pack */
-   if (write_pack_data(bundle_fd, )) {
-   bundle_fd = -1; /* already closed by the above call */
+   if (write_pack_data(bundle_fd, , ))
goto err;

Re: [PATCH v2 1/1] bundle: cleanup lock files on error

2018-11-15 Thread Johannes Schindelin
Hi Peff,

On Wed, 14 Nov 2018, Jeff King wrote:

> On Wed, Nov 14, 2018 at 02:08:48PM -0800, Stefan Beller wrote:
> 
> > On Wed, Nov 14, 2018 at 1:43 PM Martin Ågren  wrote:
> > >
> > > On Wed, 14 Nov 2018 at 16:26, Gaël Lhez via GitGitGadget
> > >  wrote:
> > > > However, the `.lock` file was still open and on Windows that means
> > > > that it could not be deleted properly. This patch fixes that issue.
> > >
> > > Hmmm, doesn't the tempfile machinery remove the lock file when we die?
> > 
> > On Windows this seems not to be the case. (Open files cannot be deleted
> > as the open file is not kept by inode or similar but by the file path 
> > there?)
> > 
> > Rewording your concern: Could the tempfile machinery be taught to
> > work properly on Windows, e.g. by first closing all files and then deleting
> > them afterwards?
> 
> It already tries to do so. See delete_tempfile(), or more likely in the
> die() case, the remove_tempfiles() handler which is called at exit.
> 
> Are we sure this is still a problem?
> 
> I looked at the test to see if it would pass, but it is not even
> checking anything about lockfiles! It just checks that we exit 1 by
> returning up the callstack instead of calling die(). And of course it
> would not have a problem under Linux either way. But if I run something
> similar under strace, I see:
> 
>   $ strace ./git bundle create foobar.bundle HEAD..HEAD
>   [...]
>   openat(AT_FDCWD, "/home/peff/compile/git/foobar.bundle.lock", 
> O_RDWR|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 3
>   [...]
>   close(3)= 0
>   unlink("/home/peff/compile/git/foobar.bundle.lock") = 0
>   exit_group(128) = ?
> 
> which seems right.

Without the fix, the added regression test fails thusly:

-- snip --
[...]
++ test_expect_code 1 git bundle create foobar.bundle master..master
++ want_code=1
++ shift
++ git bundle create foobar.bundle master..master
fatal: Refusing to create empty bundle.
warning: unable to unlink 'C:/git-sdk-64/usr/src/git/wip2/t/trash 
directory.t5607-clone-bundle/foobar.bundle.lock': Permission denied
++ exit_code=128
++ test 128 = 1
++ echo 'test_expect_code: command exited with 128, we wanted 1 git bundle 
create foobar.bundle master..master'
test_expect_code: command exited with 128, we wanted 1 git bundle create 
foobar.bundle master..master
++ return 1
error: last command exited with $?=1
not ok 9 - try to create a bundle with empty ref count
#
#   test_expect_code 1 git bundle create foobar.bundle 
master..master
#
-- snap --

So yes, we are trying to unlink the `.lock` file, and as far as I can tell that
`unlink()` call comes from the tempfile cleanup asked for by Martin. However, as
we still have a handle open to that file, that call fails.

I do not think that there is any better way to fix this than to close the file
explicitly. If we tried to just close whatever file descriptor is still open to
that file before deleting it, we would possibly cause problems in code that is
still to be executed and assumes that it has a perfectly valid file descriptor.
Besides, trying to do this kind of "automatically" won't work, like, at all,
when it is one child process that holds an open file descriptor while another
process wants to delete the file.

Ciao,
Dscho

Re: [PATCH v2 1/1] bundle: cleanup lock files on error

2018-11-14 Thread Jeff King
On Wed, Nov 14, 2018 at 02:08:48PM -0800, Stefan Beller wrote:

> On Wed, Nov 14, 2018 at 1:43 PM Martin Ågren  wrote:
> >
> > On Wed, 14 Nov 2018 at 16:26, Gaël Lhez via GitGitGadget
> >  wrote:
> > > However, the `.lock` file was still open and on Windows that means
> > > that it could not be deleted properly. This patch fixes that issue.
> >
> > Hmmm, doesn't the tempfile machinery remove the lock file when we die?
> 
> On Windows this seems not to be the case. (Open files cannot be deleted
> as the open file is not kept by inode or similar but by the file path there?)
> 
> Rewording your concern: Could the tempfile machinery be taught to
> work properly on Windows, e.g. by first closing all files and then deleting
> them afterwards?

It already tries to do so. See delete_tempfile(), or more likely in the
die() case, the remove_tempfiles() handler which is called at exit.

Are we sure this is still a problem?

I looked at the test to see if it would pass, but it is not even
checking anything about lockfiles! It just checks that we exit 1 by
returning up the callstack instead of calling die(). And of course it
would not have a problem under Linux either way. But if I run something
similar under strace, I see:

  $ strace ./git bundle create foobar.bundle HEAD..HEAD
  [...]
  openat(AT_FDCWD, "/home/peff/compile/git/foobar.bundle.lock", 
O_RDWR|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 3
  [...]
  close(3)= 0
  unlink("/home/peff/compile/git/foobar.bundle.lock") = 0
  exit_group(128) = ?

which seems right.

-Peff


Re: [PATCH v2 1/1] bundle: cleanup lock files on error

2018-11-14 Thread Stefan Beller
On Wed, Nov 14, 2018 at 1:43 PM Martin Ågren  wrote:
>
> On Wed, 14 Nov 2018 at 16:26, Gaël Lhez via GitGitGadget
>  wrote:
> > However, the `.lock` file was still open and on Windows that means
> > that it could not be deleted properly. This patch fixes that issue.
>
> Hmmm, doesn't the tempfile machinery remove the lock file when we die?

On Windows this seems not to be the case. (Open files cannot be deleted
as the open file is not kept by inode or similar but by the file path there?)

Rewording your concern: Could the tempfile machinery be taught to
work properly on Windows, e.g. by first closing all files and then deleting
them afterwards?

There was a refactoring of tempfiles merged in 89563ec379
(Merge branch 'jk/incore-lockfile-removal', 2017-09-19), which sounded
promising at first, as it is dated after the original patch[1] date
(June 2016), but it has no references for Windows being different,
so we might still have the original issue; most of the lockfile
infrastructure was done in 2015 via db86e61cbb (Merge branch
'mh/tempfile', 2015-08-25)

[1] https://github.com/git-for-windows/git/pull/797


Re: [PATCH v2 1/1] bundle: cleanup lock files on error

2018-11-14 Thread Martin Ågren
On Wed, 14 Nov 2018 at 16:26, Gaël Lhez via GitGitGadget
 wrote:
> However, the `.lock` file was still open and on Windows that means
> that it could not be deleted properly. This patch fixes that issue.

Hmmm, doesn't the tempfile machinery remove the lock file when we die?

> ref_count = write_bundle_refs(bundle_fd, );
> -   if (!ref_count)
> -   die(_("Refusing to create empty bundle."));
> -   else if (ref_count < 0)
> +   if (ref_count <= 0)  {
> +   if (!ref_count)
> +   error(_("Refusing to create empty bundle."));
> goto err;
> +   }

One less `die()` in libgit -- nice.

> +test_expect_success 'try to create a bundle with empty ref count' '
> +   test_expect_code 1 git bundle create foobar.bundle master..master
> +'

This tries to make sure that we don't just die, but that we exit in a
"controlled" way with exit code 1. After reading the log message, I had
perhaps expected something like

  test_must_fail git bundle [...] &&
  test_path_is_missing foobar.bundle.lock

That relies on magically knowing the ".lock" suffix, but my point is
that for me (on Linux), this alternative test passes both before and
after the patch. Before, because tempfile.c cleans up; after, because
bundle.c does so. Doesn't this happen on Windows too? What am I missing?

Martin


[PATCH v2 1/1] bundle: cleanup lock files on error

2018-11-14 Thread Gaël Lhez via GitGitGadget
From: =?UTF-8?q?Ga=C3=ABl=20Lhez?= 

When an user tries to create an empty bundle via `git bundle create
 ` where `` resolves to an empty list (for
example, like `master..master`), the command fails and warns the user
about how it does not want to create empty bundle.

However, the `.lock` file was still open and on Windows that means
that it could not be deleted properly. This patch fixes that issue.

This closes https://github.com/git-for-windows/git/issues/790

Signed-off-by: Gaël Lhez 
Signed-off-by: Johannes Schindelin 
---
 bundle.c| 7 ---
 t/t5607-clone-bundle.sh | 4 
 2 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/bundle.c b/bundle.c
index 1ef584b93b..4e349feff9 100644
--- a/bundle.c
+++ b/bundle.c
@@ -457,10 +457,11 @@ int create_bundle(struct bundle_header *header, const 
char *path,
object_array_remove_duplicates();
 
ref_count = write_bundle_refs(bundle_fd, );
-   if (!ref_count)
-   die(_("Refusing to create empty bundle."));
-   else if (ref_count < 0)
+   if (ref_count <= 0)  {
+   if (!ref_count)
+   error(_("Refusing to create empty bundle."));
goto err;
+   }
 
/* write pack */
if (write_pack_data(bundle_fd, )) {
diff --git a/t/t5607-clone-bundle.sh b/t/t5607-clone-bundle.sh
index 348d9b3bc7..f84b875950 100755
--- a/t/t5607-clone-bundle.sh
+++ b/t/t5607-clone-bundle.sh
@@ -71,4 +71,8 @@ test_expect_success 'prerequisites with an empty commit 
message' '
git bundle verify bundle
 '
 
+test_expect_success 'try to create a bundle with empty ref count' '
+   test_expect_code 1 git bundle create foobar.bundle master..master
+'
+
 test_done
-- 
gitgitgadget