Re: Storing state in $GIT_DIR
Martin Langhoff [EMAIL PROTECTED] writes:

> On 8/26/05, Eric W. Biederman [EMAIL PROTECTED] wrote:
>> Thinking about it going from arch to git should be just a matter of checking sha1 hashes, possibly back to the beginning of the arch tree.
>
> Yup, though actually replaying the tree to compute the hashes is something I just _won't_ do ;)

I guess if you have the tla branch names it won't be necessary. If you are careful how you do the import you can have two parallel imports of the same data and produce exactly the same git tree. That is largely why I care about a stable algorithm for the hashes.

>> Going from git to arch is the trickier mapping, because you need to know the full repo--category--branch--version--patch mapping.
>
> My plan doesn't include git-arch support... yet...

One of my interests, if I get the time to worry about it, is to get an scm that is a sufficient superset of what other scms do so it can serve as a bidirectional gateway. git is fairly close to what is needed to implement that.

>> Hmm. I wonder if a git metadata branch in general is sufficient to store information that does not map to git natively?
>>
>> Hmm. Thinking about arch from a git perspective arch tags every commit. So the really sane thing to do (I think) is to create a git tag object for every arch commit.
>
> Now I like that interesting idea. It doesn't solve all my problems, but is a reasonable mapping point. Will probably do it.
>
>> With patch trading (Martin I think I know what you are referring to) arch does seem to have a concept that does not map very well to git, and this I think is a failing in git.
>
> I won't get into _that_ flamewar ;)

<pouts> No flamewar </pouts>

> My plan for merges is to detect up until what point two branches are fully merged, and mark that in git -- because that is what git considers a merge. The rest will be known to the importer, but nothing else.

I looked at least back to the StGit announcement and it helped to clarify my thinking.
A patch is equivalent to a branch with just one change. This makes cherry picking a single patch roughly equivalent to describing that patch as a single-commit branch at the fork point from the common ancestor of the two branches, and then having the single commit merged. The original branch that was cherry picked from can really only be represented as a graft, like the original Linux kernel history.

The shortcoming I see in git-applypatch is that it doesn't attempt to find the original base of a patch and instead simply assumes it is against the current tree. There is a similar shortcoming in git-diff-tree, where it reports the commit that you are on when you take the diff, but it does not report the commit the diff is against.

Thinking a little more, there is also a connection with reverting patches. Cherry picking changes from a branch may also be thought of as reverting all of the other changes from a branch and then merging the branch.

The practical question behind all of these things is whether there is a form that will allow future merges to realize the same change has already been applied, so they can skip it the second time. Inter-operating with darcs, tla, quilt, and raw diff/patch brings up these issues.

So my practical questions are:

- What information can the current git merge algorithms and more sophisticated merge algorithms use to avoid having conflicts when the same changes are merged into the same branch multiple times?
- Is the git meta data sufficient to represent the history sophisticated merge algorithms can use?
- Is the git meta data sufficient to represent the result of sufficient meta data operations?
- Is the current representation of a reverted change sufficient for the merge algorithms, or could they do a better job if they knew a change was a revert of a previous change?

I'm just trying to think through the issues that working with patch based systems bring up.
Eric

-
To unsubscribe from this list: send the line unsubscribe git in the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
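
The point above about two careful parallel imports producing exactly the same git tree follows from how git names objects: an object's name is the SHA-1 of a short header plus its content. Below is a minimal Python sketch of that, not the importer discussed in this thread. The helper names (`git_object_id`, `commit_body`) and the example identity string are assumptions made up for illustration.

```python
import hashlib


def git_object_id(obj_type, body):
    """SHA-1 name git gives an object: hash of '<type> <size>' + NUL + body."""
    header = f"{obj_type} {len(body)}\0".encode("ascii")
    return hashlib.sha1(header + body).hexdigest()


def commit_body(tree, parents, ident, message):
    """Serialize a commit. `ident` is 'Name <email> <unix-time> <tz>'."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]
    lines += [f"author {ident}", f"committer {ident}", "", message]
    return "\n".join(lines).encode("utf-8")


# The empty tree has a fixed, well-known name.
EMPTY_TREE = git_object_id("tree", b"")

# Hypothetical identity; an importer would derive this from the arch log.
ident = "Importer <importer@example.invalid> 1125014400 +0000"

first = git_object_id("commit", commit_body(EMPTY_TREE, [], ident, "patch-1\n"))
second = git_object_id("commit", commit_body(EMPTY_TREE, [], ident, "patch-1\n"))

# Same inputs except the timestamp: the commit name changes.
later = ident.replace("1125014400", "1125014401")
third = git_object_id("commit", commit_body(EMPTY_TREE, [], later, "patch-1\n"))

print(first == second, first == third)  # prints: True False
```

The practical consequence for an importer is that every field that goes into a commit (tree, parents, author, committer, dates, message) has to be derived from the source data rather than from the time or place of the import, otherwise two imports will diverge at the first commit.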
Re: Storing state in $GIT_DIR
Linus, I like the solution you are suggesting, but I suspect it will create more problems than it will solve, and while the coolness factor is drawing me in, we ain't gonna need it, as the xp people say. More below...

On 8/26/05, Linus Torvalds [EMAIL PROTECTED] wrote:
> Git won't care, so it will work, but things like clone/pull etc also won't actually ever look there, so it will only work for that one repo.

Storing things there _works_ in the sense that it will be ignored, and that is fine with me. So I could just be lazy and have it strictly tied to the repo. In practice, if you are tracking an external Arch repo, you really have it scripted, and use a dedicated git repo for that.

Not using a dedicated repo is quite... messy. If you do other things in that particular repo, the import script may find it dirty, and mess things up on import. And after the import, you'll probably run git-push-script --all because it's bringing a dynamically growing forest of heads from the arch repo. That's another reason why your private branches should be elsewhere.

OTOH, storing the metadata in a branch will allow us to run the import in alternating repositories. But as Junio points out, unless I can guarantee that the metadata and the tree are in sync, I cannot trivially resume the import cycle from a new repo.

> The git solution to this (which nobody has ever _used_, but which technically is wonderful) is to have a side branch that does not share any commits (or files, for that matter) in common with the real branch, and which is used to track any metadata. In fact, you can obviously have any number of side branches.

A couple of days ago, playing with the import, I realised that the git repo can hold unrelated projects, too, if you just commit orphan trees as new heads. I mean - it was a bug in my script but I thought it was cool.
;)

> The way to maintain a metadata branch is to have not only a different branch name (obviously), but also use a totally different index file, so that you can index both branches in parallel, and you don't actually need to check out one or the other.

Hmmm. Now that's voodoo magic! I was thinking of reading the file by asking directly for the object by its sha, or doing a checkout in a tmpdir. Interesting.

cheers,

martin
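
The "totally different index file" trick quoted above can be sketched with the `GIT_INDEX_FILE` environment variable, which tells git which index file to use. The Python below only builds the environment and the command list, so it can be inspected without touching a repository. The function name, the `arch-meta.index` file name, and the example ref and path are hypothetical, and the command names are those of present-day git rather than the 2005 `git-*` scripts.

```python
import os


def metadata_index_plan(git_dir, meta_ref, blob_id, path):
    """Return (env, commands) for staging one metadata file in a private index."""
    env = dict(os.environ)
    env["GIT_DIR"] = git_dir
    # A separate index, so the working branch's own index is never touched.
    env["GIT_INDEX_FILE"] = os.path.join(git_dir, "arch-meta.index")
    commands = [
        # Load the metadata branch's tree into the private index
        # (skip this step on the very first run, before the ref exists).
        ["git", "read-tree", meta_ref],
        # Register an already-written blob under `path` in that index.
        ["git", "update-index", "--add", "--cacheinfo", "100644", blob_id, path],
        # Write the private index out as a tree object.
        ["git", "write-tree"],
    ]
    return env, commands


env, commands = metadata_index_plan(
    "/tmp/import-repo/.git",   # hypothetical repository location
    "refs/heads/arch-meta",    # hypothetical metadata branch
    "0" * 40,                  # placeholder blob id
    "last-imported",           # hypothetical metadata file name
)
for cmd in commands:
    print(" ".join(cmd))
```

A caller would run each command with `subprocess.run(cmd, env=env, check=True)`, then pass the tree id printed by `git write-tree` to `git commit-tree` and move the metadata ref with `git update-ref`. Nothing is checked out at any point, which is what makes it possible to maintain both branches side by side.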
Re: Storing state in $GIT_DIR
On Fri, 26 Aug 2005, Martin Langhoff wrote:
> OTOH, storing the metadata in a branch will allow us to run the import in alternating repositories. But as Junio points out, unless I can guarantee that the metadata and the tree are in sync, I cannot trivially resume the import cycle from a new repo.

But you can. Remember: the metadata is the pointers to the original git conversion, and objects are immutable. In other words, if you just have a last commit pointer in your meta-data, then git is _by_definition_ in sync. There's never anything to get out of sync, because objects aren't going to change.

So you can think of your meta-data as a strange kind of head ref. Or rather, a _collection_ of these strange refs.

And it doesn't matter if somebody ends up committing on top of an arch import. The metadata by definition doesn't know about it, so the import head doesn't move anywhere (if you do git and arch work in parallel, you can then merge the two heads with git, of course).

Linus
Re: Storing state in $GIT_DIR
On 8/26/05, Linus Torvalds [EMAIL PROTECTED] wrote:
>> OTOH, storing the metadata in a branch will allow us to run the import in alternating repositories. But as Junio points out, unless I can guarantee that the metadata and the tree are in sync, I cannot trivially resume the import cycle from a new repo.
>
> But you can. Remember: the metadata is the pointers to the original git conversion, and objects are immutable. In other words, if you just have a last commit pointer in your meta-data, then git is _by_definition_ in sync. There's never anything to get out of sync, because objects aren't going to change.

Hmmm. That repo is in sync, but there are no guarantees that they will travel together to a different repo. In fact, the push/pull infrastructure wants to push/pull one head at a time.

And if they are not in sync, I have no way of knowing. Hmpf. I lie: the arch metadata could keep track of what it expects the last head commits to be, and complain bitterly if something smells rotten.

let me think about it ;)

martin
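
The "keep track of the expected head and complain" idea above can be sketched as a small check. This is a toy model under stated assumptions, not code from the importer: the commit graph is a plain dict mapping each commit name to its parents, and the function names and short commit names are invented for illustration.

```python
def is_ancestor(parents, ancestor, descendant):
    """True if `ancestor` is reachable from `descendant` via parent links."""
    stack, seen = [descendant], set()
    while stack:
        commit = stack.pop()
        if commit == ancestor:
            return True
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, ()))
    return False


def check_import_head(parents, recorded_head, branch_head):
    """Compare the head recorded in the metadata with the branch as found."""
    if recorded_head not in parents:
        # The metadata travelled to a repo that lacks the imported commits.
        raise LookupError("recorded head commit is missing from this repository")
    if branch_head == recorded_head:
        return "in sync"
    if is_ancestor(parents, recorded_head, branch_head):
        # Somebody committed on top of the import; the import head is unmoved.
        return "branch has extra commits on top of the import"
    raise ValueError("branch head does not descend from the recorded import head")


# Toy history: c1 <- c2 <- c3, plus an unrelated root x1.
history = {"c1": [], "c2": ["c1"], "c3": ["c2"], "x1": []}

print(check_import_head(history, "c2", "c2"))
print(check_import_head(history, "c2", "c3"))
```

The first two outcomes match the cases described in the thread: either nothing has happened since the last import, or someone has committed on top of it and the recorded pointer is still valid. Only a missing commit or an unrelated branch head is worth complaining about.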
Re: Storing state in $GIT_DIR
Martin Langhoff [EMAIL PROTECTED] writes:

>> In other words, if you just have a last commit pointer in your meta-data, then git is _by_definition_ in sync. There's never anything to get out of sync, because objects aren't going to change.
>
> Hmmm. That repo is in sync, but there are no guarantees that they will travel together to a different repo. In fact, the push/pull infrastructure wants to push/pull one head at a time.

Wrong as of last week ;-), and definitely wrong since this morning.

> And if they are not in sync, I have no way of knowing. Hmpf. I lie: the arch metadata could keep track of what it expects the last head commits to be, and complain bitterly if something smells rotten.

What Linus suggests is doable by using an object that can hold a pointer to at least one commit---you use that to record the head commit of the corresponding git branch that the arch metainfo represents. You only pull the arch metainfo branch; the objects associated with the corresponding git branch head will be pulled together when you pull it. You do not have to tell git to pull the git part of the commit chain. There is no need to worry about version skew when you use git this way.

Now, among the existing object types, there are only two kinds of objects you can use for this. If the only thing you need to record is some textual information with one pointer to the git branch head, then you can use a tag that points at the git head, and store everything else as the tag comment. This is doable but unwieldy. You could abuse a commit object as well; you store commit objects (such as the corresponding git branch head) as parent commits, and put everything else in a tree that is associated with that commit.
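
The first option above, a tag that points at the git head with everything else in the tag comment, can be sketched by serializing a tag object by hand. This is a minimal illustration using the tag layout of present-day git (`object`, `type`, `tag`, `tagger`, blank line, message). The helper names, the tag name, the tagger identity, the all-`a` commit id, and the `arch-revision:` line are all hypothetical.

```python
import hashlib


def tag_object(target_commit, tag_name, tagger, metadata_text):
    """Build a tag object body pointing at a commit; return (object id, body)."""
    body = (
        f"object {target_commit}\n"
        "type commit\n"
        f"tag {tag_name}\n"
        f"tagger {tagger}\n"
        "\n"
        f"{metadata_text}"
    ).encode("utf-8")
    header = f"tag {len(body)}\0".encode("ascii")
    return hashlib.sha1(header + body).hexdigest(), body


def pointed_commit(tag_body):
    """Read back the single commit pointer from a tag body."""
    first_line = tag_body.split(b"\n", 1)[0]
    kind, _, object_id = first_line.partition(b" ")
    if kind != b"object":
        raise ValueError("not a tag body: missing 'object' line")
    return object_id.decode("ascii")


head = "a" * 40  # placeholder for the imported branch's head commit id
tag_id, body = tag_object(
    head,
    "arch-import-state",                                    # hypothetical tag name
    "Importer <importer@example.invalid> 1125014400 +0000",  # hypothetical tagger
    "arch-revision: patch-42\n",                            # hypothetical metadata
)
print(pointed_commit(body) == head)  # prints: True
```

Because the tag names the commit it points at, fetching the tag brings the commit chain along with it, which is the "no version skew" property described above. The sketch also shows why the approach is called unwieldy: all importer state has to be packed into, and parsed back out of, free-form comment text.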
Re: Storing state in $GIT_DIR
Linus Torvalds [EMAIL PROTECTED] writes:

> That kind of extension shouldn't be too hard, and might make tags much more generally usable (i.e. you could say "I sign these n official releases" or something).

Well, I admit that I once advocated changing "tag" to "bag", but one problem is how you would dereference something like that. v0.99.5^0 means "look at the named object v0.99.5, dereference it repeatedly until you get a non-tag, and take the result, which had better be a commit". If a tag can contain more than one pointer, I do not know what it means.
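
The dereferencing rule described above, and the reason a multi-pointer "bag" breaks it, can be shown with a toy object store. This is a sketch over an invented in-memory model, not git's implementation: each entry maps a name to a kind and a list of targets, and the function and object names are hypothetical.

```python
def peel_to_commit(objects, name):
    """Follow tag pointers until a non-tag is reached; insist on a commit."""
    kind, targets = objects[name]
    while kind == "tag":
        if len(targets) != 1:
            # A "bag" with several pointers has no single thing to peel to.
            raise ValueError(f"{name} points at {len(targets)} objects")
        name = targets[0]
        kind, targets = objects[name]
    if kind != "commit":
        raise TypeError(f"{name} is a {kind}, not a commit")
    return name


objects = {
    "c1": ("commit", []),
    "c2": ("commit", []),
    "v0.99.5": ("tag", ["c1"]),
    "signed-v0.99.5": ("tag", ["v0.99.5"]),  # a tag of a tag
    "releases": ("tag", ["c1", "c2"]),       # the hypothetical "bag"
}

print(peel_to_commit(objects, "signed-v0.99.5"))  # prints: c1
```

With one pointer per tag the loop always has exactly one next step, so `v0.99.5^0` has a single answer. Calling `peel_to_commit(objects, "releases")` raises `ValueError`, which is the ambiguity the message above is pointing at.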