Hi Peff,
On Thu, 15 Nov 2018, Jeff King wrote:
> On Thu, Nov 15, 2018 at 01:57:45PM +0100, Johannes Schindelin wrote:
>
> > > I looked at the test to see if it would pass, but it is not even
> > > checking anything about lockfiles! It just checks that we exit 1 by
> > > returning up the callstack instead of calling die(). And of course it
> > > would not have a problem under Linux either way. But if I run something
> > > similar under strace, I see:
> > >
> > > $ strace ./git bundle create foobar.bundle HEAD..HEAD
> > > [...]
> > > openat(AT_FDCWD, "/home/peff/compile/git/foobar.bundle.lock",
> > > O_RDWR|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 3
> > > [...]
> > > close(3) = 0
> > > unlink("/home/peff/compile/git/foobar.bundle.lock") = 0
> > > exit_group(128) = ?
> > >
> > > which seems right.
> >
> > Without the fix, the added regression test fails thusly:
> >
> > -- snip --
> > [...]
> > ++ test_expect_code 1 git bundle create foobar.bundle master..master
> > ++ want_code=1
> > ++ shift
> > ++ git bundle create foobar.bundle master..master
> > fatal: Refusing to create empty bundle.
> > warning: unable to unlink 'C:/git-sdk-64/usr/src/git/wip2/t/trash
> > directory.t5607-clone-bundle/foobar.bundle.lock': Permission denied
>
> Hmph. So who has it open, and why isn't the tempfile code working as
> designed?
>
> Aha, I see the problem. We dup() the descriptor in create_bundle(). So
> the patch _is_ necessary and (fairly) correct. But the explanation
> probably ought to be something like:
>
> In create_bundle(), we duplicate the lockfile descriptor via dup().
> This means that even though the lockfile code carefully calls close()
> before unlinking the lockfile, we may still have the file open. Unix
> systems don't care, but under Windows, this prevents the unlink
> (causing an annoying warning and a stale lockfile).
>
> But that also means that all of the other places we could die (e.g., in
> write_or_die) are going to have the same problem. We've fixed only one.
> Is there a way we can avoid doing the dup() in the first place?
>
> The comment there explains that we duplicate because write_pack_data()
> will close the descriptor. Unfortunately, that's hard to change because
> it comes from run-command. But we don't actually need the descriptor
> ourselves after it's closed; we're just trying to appease the lockfile
> code; see e54c347c1c (create_bundle(): duplicate file descriptor to
> avoid closing it twice, 2015-08-10).
>
> We just need some reasonable way of telling the lock code what's
> happening. Something like the patch below, which is a moral revert of
> e54c347c1c, but instead wrapping the "lock->fd = -1" in an official API.
>
> Does this make your warning go away?
>
> diff --git a/bundle.c b/bundle.c
> [...]
I cannot claim that I wrapped my head around your explanation or your diff (I am
busy trying to prepare Git for Windows' `master` for rebasing to v2.20.0-rc0),
but it does fix the problem. Thank you so much!
The line `test_expect_code 1 ...` needs to be adjusted to `test_expect_code
128`, of course, and to discern from the fixed problem (which also exits with
code 128), the error output should be verified, like so:
-- snip --
test_expect_success 'try to create a bundle with empty ref count' '
test_must_fail git bundle create foobar.bundle master..master 2>err &&
test_i18ngrep "Refusing to create empty bundle" err
'
-- snap --
Do you want to integrate this test into your patch and run with it, or do you
want me to shepherd your patch?
Ciao,
Dscho