Re: [PATCH v2 1/1] bundle: cleanup lock files on error
Hi Peff, On Thu, 15 Nov 2018, Jeff King wrote: > On Thu, Nov 15, 2018 at 05:32:15PM +0100, Johannes Schindelin wrote: > > > I cannot claim that I wrapped my head around your explanation or your > > diff (I am busy trying to prepare Git for Windows' `master` for > > rebasing to v2.20.0-rc0), but it does fix the problem. Thank you so > > much! > > > > The line `test_expect_code 1 ...` needs to be adjusted to > > `test_expect_code 128`, of course, and to discern from the fixed > > problem (which also exits with code 128), the error output should be > > verified, like so: > > > > -- snip -- > > test_expect_success 'try to create a bundle with empty ref count' ' > > test_must_fail git bundle create foobar.bundle master..master 2>err && > > test_i18ngrep "Refusing to create empty bundle" err > > ' > > -- snap -- > > It seems like we should be checking that the stale lockfile isn't left, > which is the real problem (the warning is annoying, but is a symptom of > the same thing). I.e., something like: > > test_must_fail git bundle create foobar.bundle master..master && > test_path_is_missing foobar.bundle.lock > > That would already pass on non-Windows platforms, but that's OK. It will > never give a false failure. > > If you don't mind, can you confirm that the test above fails without > either of the two patches under discussion? This test succeeds with your patch as well as with Gaël's, and fails when neither patch is applied. So you're right, it is the much better test. > > Do you want to integrate this test into your patch and run with it, or > > do you want me to shepherd your patch? > > I'll wrap it up with a commit message and a test. Thank you so much! Dscho
Re: [PATCH v2 1/1] bundle: cleanup lock files on error
On Thu, Nov 15, 2018 at 1:59 PM Johannes Schindelin wrote: > So yes, we are trying to unlink the `.lock` file, and as far as I can tell > that > `unlink()` call comes from the tempfile cleanup asked for by Martin. However, > as > we still have a handle open to that file, that call fails. > > I do not think that there is any better way to fix this than to close the file > explicitly. I may be talking nonsense here but I remember someone mentioned something about deleting a file while it's still open on Windows and after a quick internet search, it may be FILE_SHARE_DELETE. Since _we_ open these files, can we just open them with this to be able to delete them? -- Duy
Re: [PATCH v2 1/1] bundle: cleanup lock files on error
On Thu, Nov 15, 2018 at 05:32:15PM +0100, Johannes Schindelin wrote: > I cannot claim that I wrapped my head around your explanation or your diff (I > am > busy trying to prepare Git for Windows' `master` for rebasing to v2.20.0-rc0), > but it does fix the problem. Thank you so much! > > The line `test_expect_code 1 ...` needs to be adjusted to `test_expect_code > 128`, of course, and to discern from the fixed problem (which also exits with > code 128), the error output should be verified, like so: > > -- snip -- > test_expect_success 'try to create a bundle with empty ref count' ' > test_must_fail git bundle create foobar.bundle master..master 2>err && > test_i18ngrep "Refusing to create empty bundle" err > ' > -- snap -- It seems like we should be checking that the stale lockfile isn't left, which is the real problem (the warning is annoying, but is a symptom of the same thing). I.e., something like: test_must_fail git bundle create foobar.bundle master..master && test_path_is_missing foobar.bundle.lock That would already pass on non-Windows platforms, but that's OK. It will never give a false failure. If you don't mind, can you confirm that the test above fails without either of the two patches under discussion? > Do you want to integrate this test into your patch and run with it, or do you > want me to shepherd your patch? I'll wrap it up with a commit message and a test. -Peff
Re: [PATCH v2 1/1] bundle: cleanup lock files on error
Hi Peff, On Thu, 15 Nov 2018, Jeff King wrote: > On Thu, Nov 15, 2018 at 01:57:45PM +0100, Johannes Schindelin wrote: > > > > I looked at the test to see if it would pass, but it is not even > > > checking anything about lockfiles! It just checks that we exit 1 by > > > returning up the callstack instead of calling die(). And of course it > > > would not have a problem under Linux either way. But if I run something > > > similar under strace, I see: > > > > > > $ strace ./git bundle create foobar.bundle HEAD..HEAD > > > [...] > > > openat(AT_FDCWD, "/home/peff/compile/git/foobar.bundle.lock", > > > O_RDWR|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 3 > > > [...] > > > close(3)= 0 > > > unlink("/home/peff/compile/git/foobar.bundle.lock") = 0 > > > exit_group(128) = ? > > > > > > which seems right. > > > > Without the fix, the added regression test fails thusly: > > > > -- snip -- > > [...] > > ++ test_expect_code 1 git bundle create foobar.bundle master..master > > ++ want_code=1 > > ++ shift > > ++ git bundle create foobar.bundle master..master > > fatal: Refusing to create empty bundle. > > warning: unable to unlink 'C:/git-sdk-64/usr/src/git/wip2/t/trash > > directory.t5607-clone-bundle/foobar.bundle.lock': Permission denied > > Hmph. So who has it open, and why isn't the tempfile code working as > designed? > > Aha, I see the problem. We dup() the descriptor in create_bundle(). So > the patch _is_ necessary and (fairly) correct. But the explanation > probably ought to be something like: > > In create_bundle(), we duplicate the lockfile descriptor via dup(). > This means that even though the lockfile code carefully calls close() > before unlinking the lockfile, we may still have the file open. Unix > systems don't care, but under Windows, this prevents the unlink > (causing an annoying warning and a stale lockfile). > > But that also means that all of the other places we could die (e.g., in > write_or_die) are going to have the same problem. We've fixed only one. > Is there a way we can avoid doing the dup() in the first place? > > The comment there explains that we duplicate because write_pack_data() > will close the descriptor. Unfortunately, that's hard to change because > it comes from run-command. But we don't actually need the descriptor > ourselves after it's closed; we're just trying to appease the lockfile > code; see e54c347c1c (create_bundle(): duplicate file descriptor to > avoid closing it twice, 2015-08-10). > > We just need some reasonable way of telling the lock code what's > happening. Something like the patch below, which is a moral revert of > e54c347c1c, but instead wrapping the "lock->fd = -1" in an official API. > > Does this make your warning go away? > > diff --git a/bundle.c b/bundle.c > [...] I cannot claim that I wrapped my head around your explanation or your diff (I am busy trying to prepare Git for Windows' `master` for rebasing to v2.20.0-rc0), but it does fix the problem. Thank you so much! The line `test_expect_code 1 ...` needs to be adjusted to `test_expect_code 128`, of course, and to discern from the fixed problem (which also exits with code 128), the error output should be verified, like so: -- snip -- test_expect_success 'try to create a bundle with empty ref count' ' test_must_fail git bundle create foobar.bundle master..master 2>err && test_i18ngrep "Refusing to create empty bundle" err ' -- snap -- Do you want to integrate this test into your patch and run with it, or do you want me to shepherd your patch? Ciao, Dscho
Re: [PATCH v2 1/1] bundle: cleanup lock files on error
On Thu, Nov 15, 2018 at 01:57:45PM +0100, Johannes Schindelin wrote: > > I looked at the test to see if it would pass, but it is not even > > checking anything about lockfiles! It just checks that we exit 1 by > > returning up the callstack instead of calling die(). And of course it > > would not have a problem under Linux either way. But if I run something > > similar under strace, I see: > > > > $ strace ./git bundle create foobar.bundle HEAD..HEAD > > [...] > > openat(AT_FDCWD, "/home/peff/compile/git/foobar.bundle.lock", > > O_RDWR|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 3 > > [...] > > close(3)= 0 > > unlink("/home/peff/compile/git/foobar.bundle.lock") = 0 > > exit_group(128) = ? > > > > which seems right. > > Without the fix, the added regression test fails thusly: > > -- snip -- > [...] > ++ test_expect_code 1 git bundle create foobar.bundle master..master > ++ want_code=1 > ++ shift > ++ git bundle create foobar.bundle master..master > fatal: Refusing to create empty bundle. > warning: unable to unlink 'C:/git-sdk-64/usr/src/git/wip2/t/trash > directory.t5607-clone-bundle/foobar.bundle.lock': Permission denied Hmph. So who has it open, and why isn't the tempfile code working as designed? Aha, I see the problem. We dup() the descriptor in create_bundle(). So the patch _is_ necessary and (fairly) correct. But the explanation probably ought to be something like: In create_bundle(), we duplicate the lockfile descriptor via dup(). This means that even though the lockfile code carefully calls close() before unlinking the lockfile, we may still have the file open. Unix systems don't care, but under Windows, this prevents the unlink (causing an annoying warning and a stale lockfile). But that also means that all of the other places we could die (e.g., in write_or_die) are going to have the same problem. We've fixed only one. Is there a way we can avoid doing the dup() in the first place? The comment there explains that we duplicate because write_pack_data() will close the descriptor. Unfortunately, that's hard to change because it comes from run-command. But we don't actually need the descriptor ourselves after it's closed; we're just trying to appease the lockfile code; see e54c347c1c (create_bundle(): duplicate file descriptor to avoid closing it twice, 2015-08-10). We just need some reasonable way of telling the lock code what's happening. Something like the patch below, which is a moral revert of e54c347c1c, but instead wrapping the "lock->fd = -1" in an official API. Does this make your warning go away? diff --git a/bundle.c b/bundle.c index 1ef584b93b..dc26551b83 100644 --- a/bundle.c +++ b/bundle.c @@ -244,7 +244,7 @@ static int is_tag_in_date_range(struct object *tag, struct rev_info *revs) /* Write the pack data to bundle_fd, then close it if it is > 1. */ -static int write_pack_data(int bundle_fd, struct rev_info *revs) +static int write_pack_data(int bundle_fd, struct lock_file *lock, struct rev_info *revs) { struct child_process pack_objects = CHILD_PROCESS_INIT; int i; @@ -256,6 +256,14 @@ static int write_pack_data(int bundle_fd, struct rev_info *revs) pack_objects.in = -1; pack_objects.out = bundle_fd; pack_objects.git_cmd = 1; + + /* +* At this point we know that start_command is going to close our +* bundle_fd, whether successful or not. Tell the lock code that +* it is no longer in charge of it, so we don't try to double-close. +*/ + lock_file_release_descriptor(lock); + if (start_command(_objects)) return error(_("Could not spawn pack-objects")); @@ -421,21 +429,10 @@ int create_bundle(struct bundle_header *header, const char *path, bundle_to_stdout = !strcmp(path, "-"); if (bundle_to_stdout) bundle_fd = 1; - else { + else bundle_fd = hold_lock_file_for_update(, path, LOCK_DIE_ON_ERROR); - /* -* write_pack_data() will close the fd passed to it, -* but commit_lock_file() will also try to close the -* lockfile's fd. So make a copy of the file -* descriptor to avoid trying to close it twice. -*/ - bundle_fd = dup(bundle_fd); - if (bundle_fd < 0) - die_errno("unable to dup file descriptor"); - } - /* write signature */ write_or_die(bundle_fd, bundle_signature, strlen(bundle_signature)); @@ -463,10 +460,8 @@ int create_bundle(struct bundle_header *header, const char *path, goto err; /* write pack */ - if (write_pack_data(bundle_fd, )) { - bundle_fd = -1; /* already closed by the above call */ + if (write_pack_data(bundle_fd, , )) goto err;
Re: [PATCH v2 1/1] bundle: cleanup lock files on error
Hi Peff, On Wed, 14 Nov 2018, Jeff King wrote: > On Wed, Nov 14, 2018 at 02:08:48PM -0800, Stefan Beller wrote: > > > On Wed, Nov 14, 2018 at 1:43 PM Martin Ågren wrote: > > > > > > On Wed, 14 Nov 2018 at 16:26, Gaël Lhez via GitGitGadget > > > wrote: > > > > However, the `.lock` file was still open and on Windows that means > > > > that it could not be deleted properly. This patch fixes that issue. > > > > > > Hmmm, doesn't the tempfile machinery remove the lock file when we die? > > > > On Windows this seems not to be the case. (Open files cannot be deleted > > as the open file is not kept by inode or similar but by the file path > > there?) > > > > Rewording your concern: Could the tempfile machinery be taught to > > work properly on Windows, e.g. by first closing all files and then deleting > > them afterwards? > > It already tries to do so. See delete_tempfile(), or more likely in the > die() case, the remove_tempfiles() handler which is called at exit. > > Are we sure this is still a problem? > > I looked at the test to see if it would pass, but it is not even > checking anything about lockfiles! It just checks that we exit 1 by > returning up the callstack instead of calling die(). And of course it > would not have a problem under Linux either way. But if I run something > similar under strace, I see: > > $ strace ./git bundle create foobar.bundle HEAD..HEAD > [...] > openat(AT_FDCWD, "/home/peff/compile/git/foobar.bundle.lock", > O_RDWR|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 3 > [...] > close(3)= 0 > unlink("/home/peff/compile/git/foobar.bundle.lock") = 0 > exit_group(128) = ? > > which seems right. Without the fix, the added regression test fails thusly: -- snip -- [...] ++ test_expect_code 1 git bundle create foobar.bundle master..master ++ want_code=1 ++ shift ++ git bundle create foobar.bundle master..master fatal: Refusing to create empty bundle. warning: unable to unlink 'C:/git-sdk-64/usr/src/git/wip2/t/trash directory.t5607-clone-bundle/foobar.bundle.lock': Permission denied ++ exit_code=128 ++ test 128 = 1 ++ echo 'test_expect_code: command exited with 128, we wanted 1 git bundle create foobar.bundle master..master' test_expect_code: command exited with 128, we wanted 1 git bundle create foobar.bundle master..master ++ return 1 error: last command exited with $?=1 not ok 9 - try to create a bundle with empty ref count # # test_expect_code 1 git bundle create foobar.bundle master..master # -- snap -- So yes, we are trying to unlink the `.lock` file, and as far as I can tell that `unlink()` call comes from the tempfile cleanup asked for by Martin. However, as we still have a handle open to that file, that call fails. I do not think that there is any better way to fix this than to close the file explicitly. If we tried to just close whatever file descriptor is still open to that file before deleting it, we would possibly cause problems in code that is still to be executed and assumes that it has a perfectly valid file descriptor. Besides, trying to do this kind of "automatically" won't work, like, at all, when it is one child process that holds an open file descriptor while another process wants to delete the file. Ciao, Dscho
Re: [PATCH v2 1/1] bundle: cleanup lock files on error
On Wed, Nov 14, 2018 at 02:08:48PM -0800, Stefan Beller wrote: > On Wed, Nov 14, 2018 at 1:43 PM Martin Ågren wrote: > > > > On Wed, 14 Nov 2018 at 16:26, Gaël Lhez via GitGitGadget > > wrote: > > > However, the `.lock` file was still open and on Windows that means > > > that it could not be deleted properly. This patch fixes that issue. > > > > Hmmm, doesn't the tempfile machinery remove the lock file when we die? > > On Windows this seems not to be the case. (Open files cannot be deleted > as the open file is not kept by inode or similar but by the file path there?) > > Rewording your concern: Could the tempfile machinery be taught to > work properly on Windows, e.g. by first closing all files and then deleting > them afterwards? It already tries to do so. See delete_tempfile(), or more likely in the die() case, the remove_tempfiles() handler which is called at exit. Are we sure this is still a problem? I looked at the test to see if it would pass, but it is not even checking anything about lockfiles! It just checks that we exit 1 by returning up the callstack instead of calling die(). And of course it would not have a problem under Linux either way. But if I run something similar under strace, I see: $ strace ./git bundle create foobar.bundle HEAD..HEAD [...] openat(AT_FDCWD, "/home/peff/compile/git/foobar.bundle.lock", O_RDWR|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 3 [...] close(3)= 0 unlink("/home/peff/compile/git/foobar.bundle.lock") = 0 exit_group(128) = ? which seems right. -Peff
Re: [PATCH v2 1/1] bundle: cleanup lock files on error
On Wed, Nov 14, 2018 at 1:43 PM Martin Ågren wrote: > > On Wed, 14 Nov 2018 at 16:26, Gaël Lhez via GitGitGadget > wrote: > > However, the `.lock` file was still open and on Windows that means > > that it could not be deleted properly. This patch fixes that issue. > > Hmmm, doesn't the tempfile machinery remove the lock file when we die? On Windows this seems not to be the case. (Open files cannot be deleted as the open file is not kept by inode or similar but by the file path there?) Rewording your concern: Could the tempfile machinery be taught to work properly on Windows, e.g. by first closing all files and then deleting them afterwards? There was a refactoring of tempfiles merged in 89563ec379 (Merge branch 'jk/incore-lockfile-removal', 2017-09-19), which sounded promising at first, as it is dated after the original patch[1] date (June 2016), but it has no references for Windows being different, so we might still have the original issue; most of the lockfile infrastructure was done in 2015 via db86e61cbb (Merge branch 'mh/tempfile', 2015-08-25) [1] https://github.com/git-for-windows/git/pull/797
Re: [PATCH v2 1/1] bundle: cleanup lock files on error
On Wed, 14 Nov 2018 at 16:26, Gaël Lhez via GitGitGadget wrote: > However, the `.lock` file was still open and on Windows that means > that it could not be deleted properly. This patch fixes that issue. Hmmm, doesn't the tempfile machinery remove the lock file when we die? > ref_count = write_bundle_refs(bundle_fd, ); > - if (!ref_count) > - die(_("Refusing to create empty bundle.")); > - else if (ref_count < 0) > + if (ref_count <= 0) { > + if (!ref_count) > + error(_("Refusing to create empty bundle.")); > goto err; > + } One less `die()` in libgit -- nice. > +test_expect_success 'try to create a bundle with empty ref count' ' > + test_expect_code 1 git bundle create foobar.bundle master..master > +' This tries to make sure that we don't just die, but that we exit in a "controlled" way with exit code 1. After reading the log message, I had perhaps expected something like test_must_fail git bundle [...] && test_path_is_missing foobar.bundle.lock That relies on magically knowing the ".lock" suffix, but my point is that for me (on Linux), this alternative test passes both before and after the patch. Before, because tempfile.c cleans up; after, because bundle.c does so. Doesn't this happen on Windows too? What am I missing? Martin
[PATCH v2 1/1] bundle: cleanup lock files on error
From: =?UTF-8?q?Ga=C3=ABl=20Lhez?= When an user tries to create an empty bundle via `git bundle create ` where `` resolves to an empty list (for example, like `master..master`), the command fails and warns the user about how it does not want to create empty bundle. However, the `.lock` file was still open and on Windows that means that it could not be deleted properly. This patch fixes that issue. This closes https://github.com/git-for-windows/git/issues/790 Signed-off-by: Gaël Lhez Signed-off-by: Johannes Schindelin --- bundle.c| 7 --- t/t5607-clone-bundle.sh | 4 2 files changed, 8 insertions(+), 3 deletions(-) diff --git a/bundle.c b/bundle.c index 1ef584b93b..4e349feff9 100644 --- a/bundle.c +++ b/bundle.c @@ -457,10 +457,11 @@ int create_bundle(struct bundle_header *header, const char *path, object_array_remove_duplicates(); ref_count = write_bundle_refs(bundle_fd, ); - if (!ref_count) - die(_("Refusing to create empty bundle.")); - else if (ref_count < 0) + if (ref_count <= 0) { + if (!ref_count) + error(_("Refusing to create empty bundle.")); goto err; + } /* write pack */ if (write_pack_data(bundle_fd, )) { diff --git a/t/t5607-clone-bundle.sh b/t/t5607-clone-bundle.sh index 348d9b3bc7..f84b875950 100755 --- a/t/t5607-clone-bundle.sh +++ b/t/t5607-clone-bundle.sh @@ -71,4 +71,8 @@ test_expect_success 'prerequisites with an empty commit message' ' git bundle verify bundle ' +test_expect_success 'try to create a bundle with empty ref count' ' + test_expect_code 1 git bundle create foobar.bundle master..master +' + test_done -- gitgitgadget