Re: Git gc removes all packs
On Fri, Feb 27, 2015 at 11:16:09AM +0100, Dmitry Neverov wrote:

> I followed your advice and removed a symlink ref from my repository. But it didn't help: automatic GC has just removed all packs again. Can alternates cause such behavior? Is there any way to make gc log somewhere why it removes packs?

If you have two repositories, A and B, and A points to B via alternates, then you cannot safely run git gc in B unless it knows about all of the refs in A. As we discussed before, symlinking the refs is not enough, because those symlinks get stale. But nor is removing the symlinks and just not knowing about the refs. :)

The only safe thing to do is to fetch all of the refs from A into B just before running the gc (and consequently, you probably want to disable gc.auto in B).

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
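A minimal sketch of the fetch-then-gc sequence described above, assuming A borrows objects from B via alternates and that both paths are absolute. The function name gc_with_borrower_refs and the refs/borrowers/A/ namespace are illustrative assumptions, not something proposed in this thread.

```shell
#!/bin/sh
# gc_with_borrower_refs <B> <A>
# B owns the objects; A points at B via alternates.
# Both arguments should be absolute paths, because "git -C" changes
# directory before the fetch source is resolved.
# refs/borrowers/A/ is an arbitrary (hypothetical) namespace.
gc_with_borrower_refs() {
    b=$1
    a=$2
    # Stop automatic gc from running in B between the guarded runs.
    git -C "$b" config gc.auto 0 || return 1
    # Copy every ref of A into B so B's reachability walk sees them.
    # The leading "+" allows non-fast-forward updates.
    git -C "$b" fetch --quiet "$a" '+refs/*:refs/borrowers/A/*' || return 1
    # Only now repack and prune in B.
    git -C "$b" gc --quiet
}
```

It would be called as, for example, gc_with_borrower_refs /srv/git/B /srv/git/A (hypothetical paths). Refs that were deleted in A are deliberately not pruned from B here, which errs on the side of keeping objects.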
Re: Git gc removes all packs
I followed your advice and removed a symlink ref from my repository. But it didn't help: automatic GC has just removed all packs again. Can alternates cause such behavior? Is there any way to make gc log somewhere why it removes packs?

On Thu, Feb 5, 2015 at 9:03 PM, Jeff King p...@peff.net wrote:

> On Thu, Feb 05, 2015 at 04:13:03PM +0100, Dmitry Neverov wrote:
>
>> I'm using git p4 for synchronization with Perforce. Sometimes after 'git p4 rebase' git starts a garbage collection. When gc finishes, the local repository contains no pack files, only loose objects, so I have to re-import the repository from Perforce. It also doesn't contain the temporary pack git gc was creating.
>
> It sounds like git didn't find any refs; it will pack only objects which are reachable. Unreachable objects are either:
>
>   1. Exploded into loose objects if the mtime on the pack that contains them is less than 2 weeks old (and will eventually expire when they become 2 weeks old).
>
>   2. Dropped completely if older than 2 weeks.
>
>> One more thing about my setup: since git p4 promotes the use of a linear history, I use a separate repository for another branch in Perforce. To be able to cherry-pick between repositories, I added the other repo's objects dir as an alternate and also added a ref which is a symbolic link to a branch in the other repo (so I don't have to do any fetches).
>
> You can't symlink refs like this. The loose refs in the filesystem may be migrated into the packed-refs file, at which point your symlink will be broken. That is a likely reason why git would not find any refs.
>
> So your setup will not ever work reliably. But IMHO, it is a bug that git does not notice the broken symlink and abort an operation which is computing reachability in order to drop objects. As you noticed, it means a misconfiguration or filesystem error results in data loss.
>
> -Peff
Re: Git gc removes all packs
Michael Haggerty mhag...@alum.mit.edu writes:

> On 02/17/2015 10:57 PM, Junio C Hamano wrote:
>> ...
>> Do you mean that we would end up reading refs/heads/hold if the user did this:
>>
>>     git rev-parse --verify HEAD -- >precious
>>     ln -s ../../../precious .git/refs/heads/hold
>>
>> because that symbolic link does not begin with refs/,
>
> Correct, you can do exactly that. The hold reference is resolvable and listable using for-each-ref. But if I try to update it, the contents of the precious file are overwritten. On the other hand, if I run pack-refs, then the current value of the hold reference is moved to packed-refs and the symlink is removed. This behavior is not sane.
>
>> and is an accident waiting to happen so we should forbid it in the longer term and warning when we see it would be the first step?
>
> Yes, I am proposing that approach, though if somebody can suggest a use case I'm willing to be convinced otherwise.

Thanks. I agree the proposed tightening is probably harmless, but I too would want to see if somebody comes up with a valid use case. I cannot think of anything offhand.
Re: Git gc removes all packs
On Tue, Feb 17, 2015 at 05:39:27PM +0100, Michael Haggerty wrote:

>> You can't symlink refs like this. The loose refs in the filesystem may be migrated into the packed-refs file, at which point your symlink will be broken. That is a likely reason why git would not find any refs.
>>
>> So your setup will not ever work reliably. But IMHO, it is a bug that git does not notice the broken symlink and abort an operation which is computing reachability in order to drop objects. As you noticed, it means a misconfiguration or filesystem error results in data loss.
>
> There's a bunch of code in refs.c that is there explicitly for reading loose references that are symlinks. If the link contents literally start with refs/, then they are read and treated as a symbolic ref. Otherwise, the symlink is just followed.

Right, but we should be able to notice that:

  1. We found a symlink.

  2. We couldn't read its ref value (because it's a broken link).

I think we _do_ notice that at the lowest level, and set REF_ISBROKEN. But the problem is that the reachability code in prune and in pack-objects (triggered by repack -ad) uses for_each_ref, and not for_each_rawref. So they ignore broken refs rather than complaining, even though failing to read a ref may mean we could drop objects which were only mentioned by that ref.

> It is still possible to write symbolic refs that are represented as symlinks (see core.preferSymlinkRefs), but that backwards-compatibility code was added in 2006(!) Maybe it's time to deprecate it. And maybe we should start working towards a future where any symlinks under refs cause git to complain.

I wouldn't mind seeing all of the symlink code go away, but I think it is orthogonal to the problem I mentioned.

-Peff
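The two conditions listed above (a symlink exists under refs, and it cannot be read) can also be checked from outside git before a gc is started. A minimal sketch, assuming loose refs stored as files under the git directory; list_symlinked_refs is a hypothetical helper name, and it only inspects the filesystem rather than touching git internals such as REF_ISBROKEN.

```shell
#!/bin/sh
# list_symlinked_refs <git-dir>
# Prints one line per symlink found under <git-dir>/refs, prefixed with
# "intact" if the link target exists or "broken" if it dangles.
list_symlinked_refs() {
    git_dir=$1
    find "$git_dir/refs" -type l | while IFS= read -r link; do
        # "test -e" follows the link, so it is false for a dangling one.
        if [ -e "$link" ]; then
            echo "intact $link"
        else
            echo "broken $link"
        fi
    done
}
```

Any "broken" line in the output means a reachability walk may silently skip that ref, so a wrapper script could refuse to run gc until the list is empty.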
Re: Git gc removes all packs
On 02/05/2015 09:03 PM, Jeff King wrote:

> On Thu, Feb 05, 2015 at 04:13:03PM +0100, Dmitry Neverov wrote:
>> [...]
>> One more thing about my setup: since git p4 promotes the use of a linear history, I use a separate repository for another branch in Perforce. To be able to cherry-pick between repositories, I added the other repo's objects dir as an alternate and also added a ref which is a symbolic link to a branch in the other repo (so I don't have to do any fetches).
>
> You can't symlink refs like this. The loose refs in the filesystem may be migrated into the packed-refs file, at which point your symlink will be broken. That is a likely reason why git would not find any refs.
>
> So your setup will not ever work reliably. But IMHO, it is a bug that git does not notice the broken symlink and abort an operation which is computing reachability in order to drop objects. As you noticed, it means a misconfiguration or filesystem error results in data loss.

There's a bunch of code in refs.c that is there explicitly for reading loose references that are symlinks. If the link contents literally start with refs/, then they are read and treated as a symbolic ref. Otherwise, the symlink is just followed.

It is still possible to write symbolic refs that are represented as symlinks (see core.preferSymlinkRefs), but that backwards-compatibility code was added in 2006(!) Maybe it's time to deprecate it. And maybe we should start working towards a future where any symlinks under refs cause git to complain.

Michael

--
Michael Haggerty
mhag...@alum.mit.edu
Re: Git gc removes all packs
On 02/17/2015 05:55 PM, Jeff King wrote:

> On Tue, Feb 17, 2015 at 05:39:27PM +0100, Michael Haggerty wrote:
>>> You can't symlink refs like this. The loose refs in the filesystem may be migrated into the packed-refs file, at which point your symlink will be broken. That is a likely reason why git would not find any refs.
>>>
>>> So your setup will not ever work reliably. But IMHO, it is a bug that git does not notice the broken symlink and abort an operation which is computing reachability in order to drop objects. As you noticed, it means a misconfiguration or filesystem error results in data loss.
>>
>> There's a bunch of code in refs.c that is there explicitly for reading loose references that are symlinks. If the link contents literally start with refs/, then they are read and treated as a symbolic ref. Otherwise, the symlink is just followed.
>
> Right, but we should be able to notice that:
>
>   1. We found a symlink.
>
>   2. We couldn't read its ref value (because it's a broken link).
>
> I think we _do_ notice that at the lowest level, and set REF_ISBROKEN. But the problem is that the reachability code in prune and in pack-objects (triggered by repack -ad) uses for_each_ref, and not for_each_rawref. So they ignore broken refs rather than complaining, even though failing to read a ref may mean we could drop objects which were only mentioned by that ref.

Yes, this makes sense too. But my point was that sticking symlinks to random files in your refs hierarchy is pretty questionable even *before* the symlink gets broken. If we would warn the user as soon as we saw such a thing, then the user's problem would never have advanced as far as it did. Do you think that emitting warnings on *intact* symlinks is too draconian?

[...]

Michael

--
Michael Haggerty
mhag...@alum.mit.edu
Re: Git gc removes all packs
Michael Haggerty mhag...@alum.mit.edu writes:

> On 02/17/2015 05:55 PM, Jeff King wrote:
>> On Tue, Feb 17, 2015 at 05:39:27PM +0100, Michael Haggerty wrote:
>>> There's a bunch of code in refs.c that is there explicitly for reading loose references that are symlinks. If the link contents literally start with refs/, then they are read and treated as a symbolic ref. Otherwise, the symlink is just followed.
>> ...
> Yes, this makes sense too. But my point was that sticking symlinks to random files in your refs hierarchy is pretty questionable even *before* the symlink gets broken. If we would warn the user as soon as we saw such a thing, then the user's problem would never have advanced as far as it did. Do you think that emitting warnings on *intact* symlinks is too draconian?

Do you mean that we would end up reading refs/heads/hold if the user did this:

    git rev-parse --verify HEAD -- >precious
    ln -s ../../../precious .git/refs/heads/hold

because that symbolic link does not begin with refs/, and is an accident waiting to happen so we should forbid it in the longer term and warning when we see it would be the first step?
Re: Git gc removes all packs
On 02/17/2015 10:57 PM, Junio C Hamano wrote:

> Michael Haggerty mhag...@alum.mit.edu writes:
>> On 02/17/2015 05:55 PM, Jeff King wrote:
>>> On Tue, Feb 17, 2015 at 05:39:27PM +0100, Michael Haggerty wrote:
>>>> There's a bunch of code in refs.c that is there explicitly for reading loose references that are symlinks. If the link contents literally start with refs/, then they are read and treated as a symbolic ref. Otherwise, the symlink is just followed.
>>> ...
>> Yes, this makes sense too. But my point was that sticking symlinks to random files in your refs hierarchy is pretty questionable even *before* the symlink gets broken. If we would warn the user as soon as we saw such a thing, then the user's problem would never have advanced as far as it did. Do you think that emitting warnings on *intact* symlinks is too draconian?
>
> Do you mean that we would end up reading refs/heads/hold if the user did this:
>
>     git rev-parse --verify HEAD -- >precious
>     ln -s ../../../precious .git/refs/heads/hold
>
> because that symbolic link does not begin with refs/,

Correct, you can do exactly that. The hold reference is resolvable and listable using for-each-ref. But if I try to update it, the contents of the precious file are overwritten. On the other hand, if I run pack-refs, then the current value of the hold reference is moved to packed-refs and the symlink is removed. This behavior is not sane.

> and is an accident waiting to happen so we should forbid it in the longer term and warning when we see it would be the first step?

Yes, I am proposing that approach, though if somebody can suggest a use case I'm willing to be convinced otherwise. The only thing I can imagine symlinks being useful for might be to temporarily create a fake repo, run one or two specific known-safe commands, then delete the repo again.

Michael

--
Michael Haggerty
mhag...@alum.mit.edu
Re: Git gc removes all packs
On Thu, Feb 05, 2015 at 04:13:03PM +0100, Dmitry Neverov wrote:

> I'm using git p4 for synchronization with Perforce. Sometimes after 'git p4 rebase' git starts a garbage collection. When gc finishes, the local repository contains no pack files, only loose objects, so I have to re-import the repository from Perforce. It also doesn't contain the temporary pack git gc was creating.

It sounds like git didn't find any refs; it will pack only objects which are reachable. Unreachable objects are either:

  1. Exploded into loose objects if the mtime on the pack that contains them is less than 2 weeks old (and will eventually expire when they become 2 weeks old).

  2. Dropped completely if older than 2 weeks.

> One more thing about my setup: since git p4 promotes the use of a linear history, I use a separate repository for another branch in Perforce. To be able to cherry-pick between repositories, I added the other repo's objects dir as an alternate and also added a ref which is a symbolic link to a branch in the other repo (so I don't have to do any fetches).

You can't symlink refs like this. The loose refs in the filesystem may be migrated into the packed-refs file, at which point your symlink will be broken. That is a likely reason why git would not find any refs.

So your setup will not ever work reliably. But IMHO, it is a bug that git does not notice the broken symlink and abort an operation which is computing reachability in order to drop objects. As you noticed, it means a misconfiguration or filesystem error results in data loss.

-Peff
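Since the failure described above starts with git finding no refs at all, a wrapper can at least refuse to gc in that case. A minimal sketch, assuming gc is run by hand or from a script instead of automatically; gc_if_refs_exist is a hypothetical name. It only catches the "zero refs" case and would not notice a single missing or broken ref among many.

```shell
#!/bin/sh
# gc_if_refs_exist <repo>
# Runs "git gc" only if the repository reports at least one ref.
gc_if_refs_exist() {
    repo=$1
    # Ask for at most one ref name; empty output means no refs were found
    # (or that git could not read the repository at all).
    first_ref=$(git -C "$repo" for-each-ref --count=1 --format='%(refname)')
    if [ -z "$first_ref" ]; then
        echo "refusing to gc $repo: no refs found" >&2
        return 1
    fi
    git -C "$repo" gc --quiet
}
```

With no refs, every object is unreachable, so skipping gc here avoids exploding and later dropping the whole object store.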