Re: [git-users] “Ghost” commit
Thank you guys. I was aware that only the child knows about the parent, even though gitk shows the child. In terms of implementation it is only the child that knows about its parent. It is like time: there is only the number of seconds since 01/01/1970, but we can still talk about day, month and year.

I have just run the suggested command, but with `--expire-unreachable` instead of `--expire`:

    git reflog expire --all --expire-unreachable=now

The next time I ran `git prune -n -v` it showed me the commit, the tree and the blob. What I wanted! :)

I'm not yet sure what the difference would be between

    git reflog expire --all --expire-unreachable=now

and

    git reflog expire --expire-unreachable=now

but that helped a lot already. Thank you guys!

On Saturday, 29 September 2012 15:57:28 UTC+1, Konstantin Khomoutov wrote:

On Fri, 28 Sep 2012 17:55:41 -0700 (PDT) Thiago Rossi thiagor...@gmail.com wrote:

[...] What I don't understand (or see the reasons) is why the blob for the amended commit is not “purge-able”… If it's not listed there, I guess something is referencing it, and I wonder what, as it seems only reflog is capable of showing it.

It's not purge-able precisely because the reflog is enabled in your repository (this is the default), and the reflog records all non-linear movement of branches. You could also use the terms drastic or catastrophic for such movements -- they are those which would otherwise leave the commit a branch pointed at before it moved truly dangling. When you run `git commit --amend`, you throw away the tip commit of your current branch and replace it with another one -- this is such a drastic movement of a branch. The reflog keeps a reference to the previous state of a branch, precisely for the reasons of easy recovery, if needed. By default, the reflog keeps entries for such unreachable commits for 30 days (entries still reachable from the branch tip are kept for 90 days).
Again, please note that you should not really care about whether Git killed that replaced commit or not, or whether that loose commit and the objects it references are dangling and need to be garbage-collected -- this stuff is only to be considered when for some reason you're facing a serious disk space/process memory issue. Otherwise a Git repository is self-maintaining: Git will periodically garbage-collect and pack it, so there's nothing to worry about.

A disk space/memory issue may occur if, say, someone commits a huge file by mistake, you undo this action and you really do want that file to go away *physically* so that it does not waste disk space (which might be an issue when the repo is hosted on a VDS, for instance). But this is a special case, and you can combat it with special tools like `git reflog expire --all --expire=now` followed by `git gc --prune=now` or something like this.

If I have A --- B and amend B, now I have A --- C, and B, as far as I know, is no longer listed as a child of A and doesn't have A as a parent. reflog shows B, but who is using B? I am able to checkout B if I know its SHA1 code, and I think it would be a bad idea… I don't know why B can't be purged then.

The reflog is using B by referencing it. The rest is as Philip already pointed out: the parent/child relations are reversed in a DVCS -- it's children who reference their parents, not the other way around. This will become logical once you recall that commits in a DVCS are immutable, so once a commit has been recorded you can't retrofit a reference to a child into it. Hence the implemented scheme is way more flexible: you can add/remove any number of children at any time without touching any existing commits at all.

--
You received this message because you are subscribed to the Google Groups Git for human beings group. To view this discussion on the web visit https://groups.google.com/d/msg/git-users/-/OvWkFOYGqMUJ. To post to this group, send email to git-users@googlegroups.com.
To unsubscribe from this group, send email to git-users+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/git-users?hl=en.
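For anyone who wants to reproduce the sequence described in this message, here is a minimal sketch. It assumes a throwaway repository created with `mktemp`, a placeholder identity (`demo`) and made-up file contents; none of these come from the thread. It amends a commit, checks that `git prune -n -v` reports nothing while the reflog still references the old commit, expires the unreachable reflog entries, and then checks again.

```shell
#!/bin/sh
set -e

# Placeholder identity and a scratch repository (assumptions for this demo).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d)
cd "$repo"
git init -q .

echo one > file
git add file
git commit -q -m "A"

echo two > file
git add file
git commit -q -m "B"
old=$(git rev-parse HEAD)        # B, the commit about to be replaced

echo three > file
git add file
git commit -q --amend -m "C"     # the branch now points at C instead of B

# B is still referenced by the reflog, so a dry-run prune lists nothing.
before=$(git prune -n -v)

# Drop reflog entries that are not reachable from the current tips.
git reflog expire --all --expire-unreachable=now

# Nothing references B any more: its commit, tree and blob are listed.
after=$(git prune -n -v)
printf '%s\n' "$after"
```

Because `-n` makes `git prune` a dry run, nothing is actually deleted here; dropping `-n` (or running `git gc --prune=now`) would remove the three objects.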
Re: [git-users] “Ghost” commit
Thank you very much for the links. Actually I've read a few pages of the book recently… And I think I understand the part about blobs, trees, commits… But what I don't understand is…

If I stage a file, edit it, and stage it again, the blob created during the first staging will be purged if I run `git prune`. That happens because no commit was made between the two “adds”/stagings. Nothing refers to the first one (any more). I also get that if a commit is amended, we are not supposed to use the replaced one… or reference it, as the intention of amending is replacing it… pretending the replaced commit never happened.

What I don't understand (or see the reasons) is why the blob for the amended commit is not “purge-able”… If it's not listed there, I guess something is referencing it, and I wonder what, as it seems only reflog is capable of showing it.

If I have A --- B and amend B, now I have A --- C, and B, as far as I know, is no longer listed as a child of A and doesn't have A as a parent. reflog shows B, but who is using B? I am able to checkout B if I know its SHA1 code, and I think it would be a bad idea… I don't know why B can't be purged then.

On Friday, 28 September 2012 09:16:04 UTC+1, Konstantin Khomoutov wrote:

On Thu, Sep 27, 2012 at 05:28:23AM -0700, Thiago Rossi wrote:

[...] Not sure what dangling means. I mean, how it differs from orphans/not being referenced… In my current repository, b95cad5 has been replaced by b219846. Both have the same parent. But b95cad5 became “invisible” in most interfaces, including git log, gitk and GitX.

This commit should be reachable from the reflog of your repository, as amending the (tip) commit moved the HEAD in a non-linear way, so this has been recorded to the reflog. Read the `git reflog` manual for more info.
In general, you should just absorb the fact that reaching for the commit replaced by `git commit --amend` is an *unusual* case, and providing a way to routinely expose it in various parts of Git's user interface would be odd. I mean, when you do `git commit --amend`, provided you read and understood its manual page, you expect that command to completely replace the commit you're amending, as if the original commit just did not exist in the first place. The fact the original commit is still there by the time `git commit --amend` completed is just an implementation detail of the Git storage backend, which uses garbage collection. The reflog I mentioned above, while not being a recent addition, has not always been there (and it's usually disabled in bare repositories).

I understand you might have complications understanding why amending a commit works like it works (that is, why the commit changes, I mean, its SHA-1 changes). This is because the commit object consists of metadata and a reference to a tree object, representing the state of files associated with that commit. The metadata, among other things, records the date and time the commit was made, and the commit message. Observe that even if you did not change anything in the commit while amending it (I did not bother to check whether Git compares the new commit message with the original one), the committer date/time changes and hence changes the commit object and hence changes its SHA-1. Changing the tree referenced by the commit (say, adding a forgotten file) changes that tree's SHA-1 hash, which is recorded in the commit object, and hence changes the commit object itself.
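The point about the hash can be checked with a small sketch like the one below. It assumes a scratch repository, a placeholder identity and two arbitrary committer timestamps pinned through `GIT_COMMITTER_DATE`; all of these are made up for illustration. The commit is amended without touching the files or the message, so the tree stays the same while the commit's own SHA-1 changes.

```shell
#!/bin/sh
set -e

# Placeholder identity and a scratch repository (assumptions for this demo).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d)
cd "$repo"
git init -q .

echo hello > file
git add file
# Pin the committer date so the difference between the commits is explicit.
GIT_COMMITTER_DATE="2012-09-27 12:00:00 +0000" git commit -q -m "Bar"
first=$(git rev-parse HEAD)
first_tree=$(git rev-parse 'HEAD^{tree}')

# Raw commit object: tree, author and committer lines, then the message.
git cat-file -p HEAD

# Amend without changing files or message; only the committer date moves.
GIT_COMMITTER_DATE="2012-09-27 12:00:05 +0000" git commit -q --amend --no-edit
second=$(git rev-parse HEAD)
second_tree=$(git rev-parse 'HEAD^{tree}')

echo "commit: $first -> $second"
echo "tree:   $first_tree -> $second_tree"
```

Comparing the two `git cat-file -p` dumps by hand should show that only the `committer` line differs, which is enough to give the amended commit a new name.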
If you like to dabble in such technical subtleties to better understand your tools, I highly recommend reading Git from the bottom up [1], and if you do not feel quite confident about those SHA-1 hashes and why commit objects reference tree objects and so on, you could start from The Git Parable [2] -- it's probably the simplest introduction to the concepts and is very fun to read.

Another aspect is that different people have different tastes and different ideas about how a VCS tool should work, and you might just dislike the (frivolous) ways in which Git treats unpushed history (and pushed, too, as you'll discover sometime). You can then just try to refrain from using `git commit --amend`. This won't help if you want to fix the commit message up [*], but if you forgot to add a file or want to remove a file, you might just record another commit doing exactly that. This is not considered to be a best practice, but you know better what works for your workflow/approach.

[*] For instance, Fossil, while not allowing rewriting history at all, would allow you to change the commit message, but it does this by recording a special artifact in the repository, and the original commit message is preserved for later inspection. As you can see, Git's developers have a radically different idea about how to treat the unpublished history.

1. http://newartisans.com/2008/04/git
Re: [git-users] “Ghost” commit
Hello Adam. It's exactly what I was trying to say. And I think maybe the time has to do with the fact it doesn't show. I remember seeing something about 2 weeks old… but I thought that when something was not referenced by anyone, let's say, an orphan, the time would be ignored. I will try to run commands specifying a different time (1 second).

Well, I tried running prune with expire 0 and expire 1 (I am guessing it means seconds?) and it didn't work… But I tried this and…

    git fsck --lost-found
    Checking object directories: 100% (256/256), done.
    dangling commit b95cad521864494280d3609b295ccc0aa9e135e2

Not sure what dangling means. I mean, how it differs from orphans/not being referenced… In my current repository, b95cad5 has been replaced by b219846. Both have the same parent. But b95cad5 became “invisible” in most interfaces, including git log, gitk and GitX.

On Thursday, 27 September 2012 13:17:45 UTC+1, Adam Prescott wrote:

When you amend the second commit replaces the results of the first. It's for the occasion when you commit too early and possibly forget to add some files, or you mess up your commit message.

I don't think you're answering on the same level of abstraction as Thiago's question. At a high level, the commit has indeed been replaced, but the commits D and E are both separate entities as far as Git is concerned, it's just that E's parent has been set to C and D isn't reachable from any ref; D is in a dangling state.
Try this:

    cd /tmp
    git init foo
    cd foo
    touch foo
    git add foo
    git commit -a -m "Initial commit"
    touch bar
    git add bar
    git commit -a -m "Bar"
    git log --oneline
    # output:
    # 2e1c949 Bar
    # 499724c Initial commit
    git commit --amend   # amend the message to be "Bar (edited)"
    git log --oneline
    # output:
    # 94e441b Bar (edited)
    # 499724c Initial commit

Now look at the log starting from the commit that was amended (2e1c949):

    git log --oneline 2e1c949
    # output:
    # 2e1c949 Bar
    # 499724c Initial commit

2e1c949 still exists and still has its parent set to 499724c, it's just that 2e1c949 is not reachable from, in this case, master. You can see this more explicitly in the log when specifying master and the dangling 2e1c949:

    $ git log --graph --oneline --decorate master 2e1c949   # two explicit places to start from
    * 94e441b (HEAD, master) Bar (edited)
    | * 2e1c949 Bar
    |/
    * 499724c Initial commit

The two are separate, as you can see.

As far as an actual answer concerning `prune` and `fsck`, I'm not quite sure, but it may be something to do with the default length of time before an object will actually be pruned. I'd be interested in reading a fuller answer by someone who knows.
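On the open question about `fsck`: a hedged sketch of how the reflog affects what `git fsck` reports. It assumes a scratch repository and a placeholder identity, neither of which comes from the thread. By default `git fsck` treats commits referenced from a reflog as reachable; the documented `--no-reflogs` option turns that off, and the replaced commit then shows up as dangling. As far as I can tell `--lost-found` also disregards reflogs, which would explain the output quoted earlier, but treat that part as an assumption to verify against your Git version.

```shell
#!/bin/sh
set -e

# Placeholder identity and a scratch repository (assumptions for this demo).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d)
cd "$repo"
git init -q .

touch foo
git add foo
git commit -q -m "Initial commit"
touch bar
git add bar
git commit -q -m "Bar"
old=$(git rev-parse HEAD)                  # the commit that gets replaced
git commit -q --amend -m "Bar (edited)"

# Default: reflog entries count as references, so $old is not reported.
with_reflog=$(git fsck 2>/dev/null)

# Ignoring reflogs: $old is reported as a dangling commit.
without_reflog=$(git fsck --no-reflogs 2>/dev/null)
printf '%s\n' "$without_reflog"
```

In this sketch “dangling” simply means that no ref and no other object points at the commit once reflogs are left out of the picture.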
Re: [git-users] “Ghost” commit
That was a good hint, thank you! However I still find the behaviour weird:

    git config gc.reflogexpireUnreachable = now
    git fsck --unreachable
    Checking object directories: 100% (256/256), done.
    git prune -n -v
    [empty output]

If I understood git, it should list that commit in here… :(

On Thursday, 27 September 2012 13:31:43 UTC+1, maxhodges wrote:

Just peeked at the source files and noticed this text, which may relate to some questions you had about pruning:

The optional configuration variable 'gc.reflogExpireUnreachable' can be set to indicate how long historical reflog entries which are not part of the current branch should remain available in this repository. These types of entries are generally created as a result of using `git commit --amend` or `git rebase` and are the commits prior to the amend or rebase occurring. Since these changes are not part of the current project most users will want to expire them sooner. This option defaults to '30 days'.

Search for “amend” (587 hits in 95 files) if you want to dig in! :)

On Thu, Sep 27, 2012 at 9:22 PM, Max Hodges m...@whiterabbitpress.com wrote:

Hum, well it might be currently implemented in a way that leaves traces, but it seems amend is documented to replace the previous commit with the new one. Future implementations and garbage collection may vary. What's the purpose of trying to see if traces of this previous commit exist anyway? Just playing around to reverse-engineer the implementation of amend? Git is open-source, so you could dig into the code too, I suppose.

On Thu, Sep 27, 2012 at 9:17 PM, Adam Prescott ad...@aprescott.com wrote:

When you amend the second commit replaces the results of the first. It's for the occasion when you commit too early and possibly forget to add some files, or you mess up your commit message.

I don't think you're answering on the same level of abstraction as Thiago's question.
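A possible explanation for the empty `git prune -n -v` output, sketched under stated assumptions (scratch repository, placeholder identity): `git config` takes the key and the value as separate arguments, without an `=`, and setting `gc.reflogExpireUnreachable` does not by itself expire anything. According to the `git reflog` documentation the setting is only consulted when reflogs are actually expired, for example by `git reflog expire --all` run without an explicit `--expire-unreachable`.

```shell
#!/bin/sh
set -e

# Placeholder identity and a scratch repository (assumptions for this demo).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d)
cd "$repo"
git init -q .

touch foo
git add foo
git commit -q -m "Initial commit"
touch bar
git add bar
git commit -q -m "Bar"
old=$(git rev-parse HEAD)                  # the commit that gets replaced
git commit -q --amend -m "Bar (edited)"

# Key and value are separate arguments; there is no "=" between them.
git config gc.reflogExpireUnreachable now
value=$(git config --get gc.reflogExpireUnreachable)

# The setting alone changes nothing: the reflog still references $old.
still_hidden=$(git prune -n -v)

# Expiring the reflogs picks the expiry time up from the configuration.
git reflog expire --all

# Only now is the replaced commit reported as prunable.
listed=$(git prune -n -v)
printf '%s\n' "$listed"
```

Only the commit is listed in this variant, because amending just the message leaves the tree and blobs shared with the new commit.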