Re: on Amazon EFS (NFS): "Reference directory conflict: refs/heads/" with status code 128
On 08/25/2016 06:01 PM, Alex Nauda wrote:
> On Thu, Aug 25, 2016 at 2:28 AM, Michael Haggerty wrote:
>> On 08/24/2016 11:39 PM, Jeff King wrote:
>>> On Wed, Aug 24, 2016 at 04:52:33PM -0400, Alex Nauda wrote:
>>>
>>>> Elastic File System (EFS) is Amazon's scalable filesystem product that
>>>> is exposed to the OS as an NFS mount. We're using EFS to host the
>>>> filesystem used by a Jenkins CI server. Sometimes when Jenkins tries
>>>> to git fetch, we get this error:
>>>> $ git -c core.askpass=true fetch --tags --progress g...@github.com:mediasilo/dodo.git +refs/pull/*:refs/remotes/origin/pr/*
>>>> fatal: Reference directory conflict: refs/heads/
>>>> $ echo $?
>>>> 128
>>>>
>>>> Has anyone seen anything like this before? Any tips on how to
>>>> troubleshoot it?
>>>
>>> No, I haven't seen it before. That's an internal assertion in the refs
>>> code that shouldn't ever happen. It looks like it happens when the loose
>>> refs end up with duplicate directory entries. While a bug in git is an
>>> obvious culprit, I wonder if it's possible that your filesystem might
>>> expose the same name twice in one set of readdir() results.
>>>
>>> +cc Michael, who added this assertion long ago (and since this is the
>>> first report in all these years, it does make me suspect that the
>>> filesystem is a critical part of reproducing).
>>
>> Thanks for the CC.
>>
>> I've never heard of this problem before.
>>
>> What Git version are you using?
> Git client 2.7.4 against GitHub (Git 2.6.5)
>
>>
>> I tried to provoke the problem by hand-corrupting the packed-refs file,
>> but wasn't successful.
>>
>> So Peff's suggestion that the problem originates in your filesystem
>> seems to me to be the most likely cause. A quick Google search found,
>> for example,
>>
>> https://bugzilla.redhat.com/show_bug.cgi?id=739222
>>
>> http://superuser.com/questions/640419/how-can-i-have-two-files-with-the-same-name-in-a-directory-when-mounted-with-nfs
>>
>> though these reports seem connected with having lots of files in the
>> directory, which seems unlikely for `$GIT_DIR/refs/`. But I didn't do a
>> more careful search, and it is easily possible that there are other bugs
>> in NFS (or EFS) that could be affecting you.
>>
>> If this were repeatable, you could run Git under strace to test Peff's
>> hypothesis. But I suppose it only happens rarely, right?
> Actually it seems to be reproducible. Here's the last portion of an strace:
>
> [...]
> stat(".git/refs/remotes/origin/pr/7/head", {st_mode=S_IFREG|0644, st_size=41, ...}) = 0
> lstat(".git/refs/remotes/origin/pr/7/head", {st_mode=S_IFREG|0644, st_size=41, ...}) = 0
> open(".git/refs/remotes/origin/pr/7/head", O_RDONLY) = 4
> read(4, "5d82811a248900efd8e201c6d9232de5"..., 256) = 41
> read(4, "", 215) = 0
> close(4) = 0
> getdents(3, /* 0 entries */, 32768) = 0
> close(3) = 0
> open(".git/refs/remotes/origin/pr/16/", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3
> fstat(3, {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
> getdents(3, /* 3 entries */, 32768) = 72
> stat(".git/refs/remotes/origin/pr/16/head", {st_mode=S_IFREG|0644, st_size=41, ...}) = 0
> lstat(".git/refs/remotes/origin/pr/16/head", {st_mode=S_IFREG|0644, st_size=41, ...}) = 0
> open(".git/refs/remotes/origin/pr/16/head", O_RDONLY) = 4
> read(4, "2886c4f3ba8c3b5c2306029f6e39498d"..., 256) = 41
> read(4, "", 215) = 0
> close(4) = 0
> getdents(3, /* 0 entries */, 32768) = 0
> close(3) = 0
> open(".git/refs/tags/", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3
> fstat(3, {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
> getdents(3, /* 2 entries */, 32768) = 48
> getdents(3, /* 0 entries */, 32768) = 0
> close(3) = 0
> open(".git/refs/bisect/", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
> open(".git/packed-refs", O_RDONLY) = -1 ENOENT (No such file or directory)
> fstat(2, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 3), ...}) = 0
> write(2, "fatal: Reference directory confl"..., 58fatal: Reference directory conflict: refs/remotes/origin/
> ) = 58
> exit_group(128) = ?
> +++ exited with 128 +++

Thanks for the additional information.

From the strace output it is clear that there is no packed-refs file at
the time of the problem, so the problem must be among the loose refs.

The error is a "Reference directory conflict", which suggests that
"refs/remotes/origin/" appears in two entries: once as a reference
directory and once as a reference. But in fact it could also mean that
"refs/remotes/origin/" appears twice, both times as a directory. Neither
one should happen in normal operation.

Unfortunately there is not enough strace output to see whether (in this
case) path `refs/remotes/origin` was
Re: on Amazon EFS (NFS): "Reference directory conflict: refs/heads/" with status code 128
On Thu, Aug 25, 2016 at 2:28 AM, Michael Haggerty wrote:
> On 08/24/2016 11:39 PM, Jeff King wrote:
>> On Wed, Aug 24, 2016 at 04:52:33PM -0400, Alex Nauda wrote:
>>
>>> Elastic File System (EFS) is Amazon's scalable filesystem product that
>>> is exposed to the OS as an NFS mount. We're using EFS to host the
>>> filesystem used by a Jenkins CI server. Sometimes when Jenkins tries
>>> to git fetch, we get this error:
>>> $ git -c core.askpass=true fetch --tags --progress g...@github.com:mediasilo/dodo.git +refs/pull/*:refs/remotes/origin/pr/*
>>> fatal: Reference directory conflict: refs/heads/
>>> $ echo $?
>>> 128
>>>
>>> Has anyone seen anything like this before? Any tips on how to
>>> troubleshoot it?
>>
>> No, I haven't seen it before. That's an internal assertion in the refs
>> code that shouldn't ever happen. It looks like it happens when the loose
>> refs end up with duplicate directory entries. While a bug in git is an
>> obvious culprit, I wonder if it's possible that your filesystem might
>> expose the same name twice in one set of readdir() results.
>>
>> +cc Michael, who added this assertion long ago (and since this is the
>> first report in all these years, it does make me suspect that the
>> filesystem is a critical part of reproducing).
>
> Thanks for the CC.
>
> I've never heard of this problem before.
>
> What Git version are you using?

Git client 2.7.4 against GitHub (Git 2.6.5)

>
> I tried to provoke the problem by hand-corrupting the packed-refs file,
> but wasn't successful.
>
> So Peff's suggestion that the problem originates in your filesystem
> seems to me to be the most likely cause. A quick Google search found,
> for example,
>
> https://bugzilla.redhat.com/show_bug.cgi?id=739222
>
> http://superuser.com/questions/640419/how-can-i-have-two-files-with-the-same-name-in-a-directory-when-mounted-with-nfs
>
> though these reports seem connected with having lots of files in the
> directory, which seems unlikely for `$GIT_DIR/refs/`. But I didn't do a
> more careful search, and it is easily possible that there are other bugs
> in NFS (or EFS) that could be affecting you.
>
> If this were repeatable, you could run Git under strace to test Peff's
> hypothesis. But I suppose it only happens rarely, right?

Actually it seems to be reproducible. Here's the last portion of an strace:

[...]
stat(".git/refs/remotes/origin/pr/7/head", {st_mode=S_IFREG|0644, st_size=41, ...}) = 0
lstat(".git/refs/remotes/origin/pr/7/head", {st_mode=S_IFREG|0644, st_size=41, ...}) = 0
open(".git/refs/remotes/origin/pr/7/head", O_RDONLY) = 4
read(4, "5d82811a248900efd8e201c6d9232de5"..., 256) = 41
read(4, "", 215) = 0
close(4) = 0
getdents(3, /* 0 entries */, 32768) = 0
close(3) = 0
open(".git/refs/remotes/origin/pr/16/", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
getdents(3, /* 3 entries */, 32768) = 72
stat(".git/refs/remotes/origin/pr/16/head", {st_mode=S_IFREG|0644, st_size=41, ...}) = 0
lstat(".git/refs/remotes/origin/pr/16/head", {st_mode=S_IFREG|0644, st_size=41, ...}) = 0
open(".git/refs/remotes/origin/pr/16/head", O_RDONLY) = 4
read(4, "2886c4f3ba8c3b5c2306029f6e39498d"..., 256) = 41
read(4, "", 215) = 0
close(4) = 0
getdents(3, /* 0 entries */, 32768) = 0
close(3) = 0
open(".git/refs/tags/", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
getdents(3, /* 2 entries */, 32768) = 48
getdents(3, /* 0 entries */, 32768) = 0
close(3) = 0
open(".git/refs/bisect/", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open(".git/packed-refs", O_RDONLY) = -1 ENOENT (No such file or directory)
fstat(2, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 3), ...}) = 0
write(2, "fatal: Reference directory confl"..., 58fatal: Reference directory conflict: refs/remotes/origin/
) = 58
exit_group(128) = ?
+++ exited with 128 +++

>
> Is it possible that multiple clients have the same NFS filesystem
> mounted while Git is running? That would seem like an especially bad
> idea and I could imagine it leading to problems like this.
>
> It's surprising that you are seeing this problem in directory `refs`,
> because (1) that directory is unlikely to have very many entries, and
> (2) as far as I remember, Git will never delete the directories
> `refs/heads` and `refs/tags`.

Seems like sometimes it happens on other directories:
refs/remotes/origin/ or refs/remotes/origin/pr/1

Then as I was stracing it again, suddenly it succeeded. Some kind of
race condition?

>
> Michael
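One way to look into the race-condition question raised above is to list the affected directory repeatedly and record whether its contents change between samples, for instance because another NFS client is writing to it. The following is only a rough Python sketch under that assumption. The helper names `listing_changes` and `watch_directory` are made up for illustration and are not part of Git or Jenkins, and the sample count and interval are arbitrary.

```python
import os
import time


def listing_changes(previous, current):
    """Return (added, removed) names between two directory listings."""
    before, after = set(previous), set(current)
    return sorted(after - before), sorted(before - after)


def watch_directory(path, samples=20, interval=0.5):
    """List `path` repeatedly and collect every change seen between samples.

    Each event is (sample_index, added_names, removed_names).
    """
    events = []
    previous = os.listdir(path)
    for index in range(1, samples):
        time.sleep(interval)
        current = os.listdir(path)
        added, removed = listing_changes(previous, current)
        if added or removed:
            events.append((index, added, removed))
        previous = current
    return events


if __name__ == "__main__":
    import sys

    # The default path is an assumption based on the directories named above.
    target = sys.argv[1] if len(sys.argv) > 1 else ".git/refs/remotes/origin"
    for index, added, removed in watch_directory(target):
        print(f"sample {index}: added={added} removed={removed}")
```

If this reports changes while no local Git process is running, that would hint at another writer on the same mount. An empty result proves little, since the failure described above comes and goes.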
Re: on Amazon EFS (NFS): "Reference directory conflict: refs/heads/" with status code 128
On 08/24/2016 11:39 PM, Jeff King wrote:
> On Wed, Aug 24, 2016 at 04:52:33PM -0400, Alex Nauda wrote:
>
>> Elastic File System (EFS) is Amazon's scalable filesystem product that
>> is exposed to the OS as an NFS mount. We're using EFS to host the
>> filesystem used by a Jenkins CI server. Sometimes when Jenkins tries
>> to git fetch, we get this error:
>> $ git -c core.askpass=true fetch --tags --progress g...@github.com:mediasilo/dodo.git +refs/pull/*:refs/remotes/origin/pr/*
>> fatal: Reference directory conflict: refs/heads/
>> $ echo $?
>> 128
>>
>> Has anyone seen anything like this before? Any tips on how to
>> troubleshoot it?
>
> No, I haven't seen it before. That's an internal assertion in the refs
> code that shouldn't ever happen. It looks like it happens when the loose
> refs end up with duplicate directory entries. While a bug in git is an
> obvious culprit, I wonder if it's possible that your filesystem might
> expose the same name twice in one set of readdir() results.
>
> +cc Michael, who added this assertion long ago (and since this is the
> first report in all these years, it does make me suspect that the
> filesystem is a critical part of reproducing).

Thanks for the CC.

I've never heard of this problem before.

What Git version are you using?

I tried to provoke the problem by hand-corrupting the packed-refs file,
but wasn't successful.

So Peff's suggestion that the problem originates in your filesystem
seems to me to be the most likely cause. A quick Google search found,
for example,

https://bugzilla.redhat.com/show_bug.cgi?id=739222

http://superuser.com/questions/640419/how-can-i-have-two-files-with-the-same-name-in-a-directory-when-mounted-with-nfs

though these reports seem connected with having lots of files in the
directory, which seems unlikely for `$GIT_DIR/refs/`. But I didn't do a
more careful search, and it is easily possible that there are other bugs
in NFS (or EFS) that could be affecting you.

If this were repeatable, you could run Git under strace to test Peff's
hypothesis. But I suppose it only happens rarely, right?

Is it possible that multiple clients have the same NFS filesystem
mounted while Git is running? That would seem like an especially bad
idea and I could imagine it leading to problems like this.

It's surprising that you are seeing this problem in directory `refs`,
because (1) that directory is unlikely to have very many entries, and
(2) as far as I remember, Git will never delete the directories
`refs/heads` and `refs/tags`.

Michael

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: on Amazon EFS (NFS): "Reference directory conflict: refs/heads/" with status code 128
On Wed, Aug 24, 2016 at 04:52:33PM -0400, Alex Nauda wrote:

> Elastic File System (EFS) is Amazon's scalable filesystem product that
> is exposed to the OS as an NFS mount. We're using EFS to host the
> filesystem used by a Jenkins CI server. Sometimes when Jenkins tries
> to git fetch, we get this error:
> $ git -c core.askpass=true fetch --tags --progress g...@github.com:mediasilo/dodo.git +refs/pull/*:refs/remotes/origin/pr/*
> fatal: Reference directory conflict: refs/heads/
> $ echo $?
> 128
>
> Has anyone seen anything like this before? Any tips on how to
> troubleshoot it?

No, I haven't seen it before. That's an internal assertion in the refs
code that shouldn't ever happen. It looks like it happens when the loose
refs end up with duplicate directory entries. While a bug in git is an
obvious culprit, I wonder if it's possible that your filesystem might
expose the same name twice in one set of readdir() results.

+cc Michael, who added this assertion long ago (and since this is the
first report in all these years, it does make me suspect that the
filesystem is a critical part of reproducing).

-Peff
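The readdir() hypothesis above can be checked outside of Git. What follows is a minimal Python sketch, not anything taken from Git itself: it walks a refs tree and reports any directory whose listing contains the same name more than once. It assumes that `os.listdir` passes through whatever names the operating system returns without removing duplicates. The helper names `duplicate_names` and `scan_for_duplicates` are hypothetical, and the default path `.git/refs` is an assumption based on the error message.

```python
import os
from collections import Counter


def duplicate_names(names):
    """Return the names that occur more than once, in sorted order."""
    counts = Counter(names)
    return sorted(name for name, count in counts.items() if count > 1)


def scan_for_duplicates(root):
    """Walk `root` and map each directory to the names it listed twice."""
    problems = {}
    pending = [root]
    while pending:
        directory = pending.pop()
        names = os.listdir(directory)  # one full pass over the entries
        dupes = duplicate_names(names)
        if dupes:
            problems[directory] = dupes
        # Descend into each distinct subdirectory once, skipping symlinks.
        for name in set(names):
            path = os.path.join(directory, name)
            if os.path.isdir(path) and not os.path.islink(path):
                pending.append(path)
    return problems


if __name__ == "__main__":
    import sys

    root = sys.argv[1] if len(sys.argv) > 1 else ".git/refs"
    for directory, dupes in sorted(scan_for_duplicates(root).items()):
        print(f"{directory}: listed more than once: {', '.join(dupes)}")
```

Run from the top of the affected work tree, it prints nothing when every listing is clean. A clean run does not clear the filesystem, because the failure may be intermittent, so it is worth repeating the scan around the time a fetch fails.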
on Amazon EFS (NFS): "Reference directory conflict: refs/heads/" with status code 128
Elastic File System (EFS) is Amazon's scalable filesystem product that
is exposed to the OS as an NFS mount. We're using EFS to host the
filesystem used by a Jenkins CI server. Sometimes when Jenkins tries
to git fetch, we get this error:

$ git -c core.askpass=true fetch --tags --progress g...@github.com:mediasilo/dodo.git +refs/pull/*:refs/remotes/origin/pr/*
fatal: Reference directory conflict: refs/heads/
$ echo $?
128

Has anyone seen anything like this before? Any tips on how to
troubleshoot it?

Related Jenkins issue: https://issues.jenkins-ci.org/browse/JENKINS-37653