Re: [RFC/PATCH 0/7] Rework git core for native submodules
On 18.04.2013 01:17, Philip Oakley wrote:
> Would it be possible to summarise the key points and proposals of where the subject is now?

Here you go, time to post our third iteration of the comparison list, containing two updates:

- "easier coding" was removed from the advantages
- "git submodule foreach" was retired from the disadvantages

As in the first two versions, the issues in parentheses had been brought up but were dismissed and are only kept for reference together with the reason why they aren't relevant anymore. Only those preceded by a '*' are still considered valid.

Advantages:

* Information is stored in one place, no need to look up stuff in another file/blob.

* No need to cd-to-toplevel to change configuration in the .gitmodules file, the special tools to edit link information will work in any subdirectory.

(It is anything but clear that this approach will lead to easier coding, some parts of the code - like rm and mv - will profit from that while others won't, e.g. we have to implement the link object manipulation tools that are not needed for .gitmodules and we get another indirection retrieving the submodule commit from the link object. And then there is the fact that the new code would have to catch up with functionality already coded using .gitmodules, like the status/diff ignore and the fetch flags).

(We currently need a checked out work tree to access the .gitmodules file, but there is ongoing work to read the configuration directly from the database)

(While it is easier to merge the link object, a .gitmodules aware merge driver would work just as well)

Disadvantages:

* Changes in user visible behavior, compatibility problems when Git versions are mixed.

* Special tools are needed to edit submodule information where currently a plain editor is sufficient and a standard format is used.

* Merge conflicts are harder to resolve and require special git commands, solving them in .gitmodules is way more intuitive as users are already used to conflict markers.

* With .gitmodules gone, we lose a central spot where configuration concerning many submodules can be stored.

("git submodule foreach becomes harder to implement" is not the case, as that command currently also walks all tree objects and does not read the list of submodules from the .gitmodules file)

(When we also put the submodule name in the link object we could also retain the ability to repopulate moved submodules from their old repo, which is found by that name)

(That a link object can have no unstaged counterpart that a file easily has can be fixed by special casing this, e.g. by using a file in .git/link-specs/)

As no new arguments have been brought up, it all boils down to a change that'll hurt users badly and won't fix any issue relevant to them. It'll bring them a flag day after which the .gitmodules is gone and they'll have to learn new tools to update and merge the submodule metadata (and not only the users, GUIs have to follow and implement support for something which currently is a perfectly normal merge conflict in a file). You'd have to smoke really weird stuff to even consider such a change under these circumstances (or you don't care one bit about your users).

> Submodules do need 'fixing', as does agreeing on the problem and abuse cases.

Sure, but almost all problems I know about are work tree related, so changing the internal representation buys us nothing here. It will not magically do a bisect over submodules or recursively update submodule work trees, and all that stuff won't be easier to code either just because we have to get the information from a new object instead of a gitlink/.gitmodules combo.

Let's just close this case and get back to working on things that users will actually profit from.

--
To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC/PATCH 0/7] Rework git core for native submodules
Philip Oakley wrote:
> Would it be possible to summarise the key points and proposals of where the subject is now?

Sure. If you want an update from the current approach, wait for a v2; I've been cooking it for some time, and getting some resulting ideas merged into upstream early (look for clone.submoduleGitDir on the list, for instance). When upstream is in better shape to ease in a better fundamental design, I'll post my v2 to the list. I'll refrain from posting any updates now, because I don't think the resulting discussion will generate any value.

If you want to know what this thread was about, I think [1] and [2] summarize my arguments quite well.

[1]: http://thread.gmane.org/gmane.comp.version-control.git/220047/focus=220436
[2]: http://thread.gmane.org/gmane.comp.version-control.git/220047/focus=220495
Re: [RFC/PATCH 0/7] Rework git core for native submodules
On Mon, Apr 8, 2013 at 7:23 AM, Jonathan Nieder jrnie...@gmail.com wrote:
> Ramkumar Ramachandra wrote:
>> It's about the core object code of git parsing links, as opposed to a fringe submodule.c/submodule.sh parsing .gitmodules.
>
> What's stopping the core object code of git parsing .gitmodules? What is the "core object code"? How does this compare to other metadata files like .gitattributes and .gitignore?

Somewhat related to the topic. Why can't .gitattributes be used for storing what's currently in .gitmodules?
--
Duy
Re: [RFC/PATCH 0/7] Rework git core for native submodules
Duy Nguyen wrote:
> Somewhat related to the topic. Why can't .gitattributes be used for storing what's currently in .gitmodules?

It can. It's just a small syntax change from "key = value" attributes inside a toplevel [submodule "name"] section separated by newlines, to a path marked with multiple "key=value" attributes separated by whitespace. However, we don't want to make this change because these submodule attributes are somewhat different from .gitattributes attributes.

Roughly speaking, the current .gitmodules design treats submodule directories as directories with special attributes, with two differences: these directories have a special mode in the index, and a commit object is created in the database to represent the partial state of this submodule.

If you think about it, the information stored in the commit object is no less/no more important than the path-attribute mapping in .gitmodules. I was arguing for using a special OBJ_LINK to represent the full state of the submodule, and doing away with the attributes altogether, but not everyone agrees.
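The syntax change described in the message above can be sketched in Python. This is only an illustration under stated assumptions: the sample section, the `to_attribute_lines` helper, and above all the path-keyed output format are hypothetical, and git does not read submodule settings from .gitattributes. The sketch relies on Python's `configparser` being able to read the simple, consistently indented subset of the git config format used here.

```python
import configparser

# Hypothetical sample in the .gitmodules format (config syntax).
SAMPLE_GITMODULES = (
    '[submodule "clayoven"]\n'
    "\tpath = vendor/clayoven\n"
    "\turl = git://example.com/clayoven.git\n"
    "\tupdate = rebase\n"
)


def to_attribute_lines(gitmodules_text):
    """Turn each [submodule "name"] section into one path-keyed line.

    The output format is hypothetical: a path followed by
    whitespace-separated key=value pairs, as discussed in the thread.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(gitmodules_text)
    lines = []
    for section in parser.sections():
        options = dict(parser[section])
        path = options.pop("path")
        # The section header looks like: submodule "clayoven"
        name = section.split('"')[1]
        attrs = [f"name={name}"] + [f"{k}={v}" for k, v in options.items()]
        lines.append(path + " " + " ".join(attrs))
    return lines


print("\n".join(to_attribute_lines(SAMPLE_GITMODULES)))
```

Note that the submodule name, which is the section key in .gitmodules, has to be carried as an ordinary attribute once the path becomes the key.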
Re: [RFC/PATCH 0/7] Rework git core for native submodules
On Wed, Apr 17, 2013 at 9:06 PM, Ramkumar Ramachandra artag...@gmail.com wrote:
> Duy Nguyen wrote:
>> Somewhat related to the topic. Why can't .gitattributes be used for storing what's currently in .gitmodules?
>
> It can. It's just a small syntax change from "key = value" attributes inside a toplevel [submodule "name"] section separated by newlines, to a path marked with multiple "key=value" attributes separated by whitespace. However, we don't want to make this change because these submodule attributes are somewhat different from .gitattributes attributes.
>
> Roughly speaking, the current .gitmodules design treats submodule directories as directories with special attributes, with two differences: these directories have a special mode in the index, and a commit object is created in the database to represent the partial state of this submodule.

That was my thinking. .gitmodules would break if a user moves the submodule manually (or even if .gitattributes is used)

> If you think about it, the information stored in the commit object is no less/no more important than the path-attribute mapping in .gitmodules. I was arguing for using a special OBJ_LINK to represent the full state of the submodule, and doing away with the attributes altogether, but not everyone agrees.

Include me to those "everyone". url feels like a local thing that should not stay in object database (another way of looking at it is like an email address: the primary one fixed in stone in commits with .mailmap for future substitution). Other attributes like .update, .fetchRecursiveSubmodules... definitely should not be stored in object database. I think if they are stored in the submodule's config file, then the manual move problem above will go away.

And if you're dead set on storing some submodule state in object database, why not reuse tag object with some nea header lines?
--
Duy
Re: [RFC/PATCH 0/7] Rework git core for native submodules
Duy Nguyen wrote:
> Include me to those "everyone". url feels like a local thing that should not stay in object database (another way of looking at it is like an email address: the primary one fixed in stone in commits with .mailmap for future substitution).

We've been over this several times in earlier emails. That's like saying that a blob should not be stored in the object database, because it is not fixed in stone (my OBJ_LINK is just a special kind of blob, as I've repeated many times already). I don't rely on what I feel, which is why I started out by posting an implementation: the implementation seems to indicate that getting an OBJ_LINK will simplify a lot of things. And that is my primary criterion for deciding: if the implementation is simple and elegant, it must clearly be doing something right.

Again, I'm not saying that my approach is Correct and Final. What I'm saying is: "Here's what I've done. Something interesting is going on. It's probably worth a look?"

> Other attributes like .update, .fetchRecursiveSubmodules... definitely should not be stored in object database.

Coffee and other beverages definitely should be served cold. All very nice to say, but I don't see any rationale.

> I think if they are stored in the submodule's config file, then the manual move problem above will go away.

What? The submodule's .git/config? Why should a submodule repository know that it is being used as a submodule? What inherent properties of a git repository change if it is being used as a submodule?

> And if you're dead set on storing some submodule state in object database,

I'm not. I'm just saying that it seems to be an interesting alternative approach. Considering that nobody else brought up a real alternative approach, and chose to just keep defending .gitmodules to the death, it's the only other approach we have.

> why not reuse tag object with some nea header lines?

Or a unified blob, which is currently what we have. The point is to have structured parseable information that the object-parsing code of git can easily slurp and give to the rest of git-core.

Please clear your reading backlog to avoid bringing up the same points over and over again.
Re: [RFC/PATCH 0/7] Rework git core for native submodules
On Wed, Apr 17, 2013 at 9:56 PM, Ramkumar Ramachandra artag...@gmail.com wrote:
>> why not reuse tag object with some nea header lines?
>
> Or a unified blob, which is currently what we have. The point is to have structured parseable information that the object-parsing code of git can easily slurp and give to the rest of git-core.

I think you misunderstood. I meant instead of introducing new object type OBJ_LINK, you can reuse tag object and add new header lines for your purposes.

> Please clear your reading backlog to avoid bringing up the same points over and over again.

Yep. I'll shut up until it's cleared.
--
Duy
Re: [RFC/PATCH 0/7] Rework git core for native submodules
Duy Nguyen wrote:
> On Wed, Apr 17, 2013 at 9:56 PM, Ramkumar Ramachandra artag...@gmail.com wrote:
>>> why not reuse tag object with some nea header lines?
>>
>> Or a unified blob, which is currently what we have. The point is to have structured parseable information that the object-parsing code of git can easily slurp and give to the rest of git-core.
>
> I think you misunderstood. I meant instead of introducing new object type OBJ_LINK, you can reuse tag object and add new header lines for your purposes.

Oh, I interpreted your typo "nea" as "neat", when you meant "new". Yeah, it's worth exploring: I don't know what backward compatibility benefits it will yield yet.
Re: [RFC/PATCH 0/7] Rework git core for native submodules
Duy Nguyen pclo...@gmail.com writes:
> Somewhat related to the topic. Why can't .gitattributes be used for storing what's currently in .gitmodules?

You _could_ use gitattributes to encode, but it goes against what a gitattributes file does or is for. It is a mechanism to associate groups of paths (that may not even exist) to a set of attributes. You could list a single pattern that happens to match a single path and at the implementation level you may be able to make it work, but at the design/philosophical level, it is wrong.

We need info on each submodule and we need to key it with the name of the submodule, not with its path. At any given time, a single submodule lives at (at most) one path, so you could still use path as a key in the .gitattributes, but when you need to move the submodule path, you would need to update the entry for the submodule in the .gitattributes file by finding a pattern that matches the old path and making it a pattern that matches the new path.

We have a much more suitable file format that we use to associate various values to keys: the config format. Also having a file that is only about submodules and nothing else means we could write a content-aware smart ll-merge driver that can take advantage of the knowledge that it is written in the config format and it talks about submodules.

The answer to the "why can't" question is no. No, there is no reason why you can't use it. We don't do it, because it just does not make sense.
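The difference between keying by name and keying by path can be illustrated with two small Python dictionaries. This is a deliberately simplified sketch, not git's data model: the `move_by_name` and `move_by_path` helpers and the sample entries are hypothetical, and real .gitattributes keys are patterns rather than literal paths, which makes finding the right entry harder than the plain lookup shown here.

```python
def move_by_name(config, name, new_path):
    """Name-keyed table (the .gitmodules model): a move updates one value."""
    config[name]["path"] = new_path


def move_by_path(attributes, old_path, new_path):
    """Path-keyed table (a hypothetical .gitattributes model): the key
    itself has to be located and replaced."""
    attributes[new_path] = attributes.pop(old_path)


# Hypothetical submodule called "clayoven", initially checked out at ./clayoven.
by_name = {"clayoven": {"path": "clayoven", "url": "git://example.com/clayoven.git"}}
by_path = {"clayoven": {"url": "git://example.com/clayoven.git"}}

move_by_name(by_name, "clayoven", "vendor/clayoven")
move_by_path(by_path, "clayoven", "vendor/clayoven")

print(by_name)
print(by_path)
```

In the name-keyed table the identity of the submodule survives the move unchanged, while in the path-keyed table nothing stable is left to refer to it by.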
Re: [RFC/PATCH 0/7] Rework git core for native submodules
From: Ramkumar Ramachandra artag...@gmail.com
Sent: Wednesday, April 17, 2013 12:56 PM
> We've been over this several times in earlier emails.
[...]
> Again, I'm not saying that my approach is Correct and Final. What I'm saying is: "Here's what I've done. Something interesting is going on. It's probably worth a look?"
[...]
> The point is to have structured parseable information that the object-parsing code of git can easily slurp and give to the rest of git-core.
>
> Please clear your reading backlog to avoid bringing up the same points over and over again.
> --

Ram,

The email thread is pretty long with a lot of to and fro, which would be difficult to catch up on (too much $dayjob+$family vs $sparetime).

Would it be possible to summarise the key points and proposals of where the subject is now?

Submodules do need 'fixing', as does agreeing on the problem and abuse cases.

Philip
Re: [RFC/PATCH 0/7] Rework git core for native submodules
From: Ramkumar Ramachandra artag...@gmail.com
Sent: Monday, April 08, 2013 10:03 PM
> This is going nowhere. You're stuck at making the current submodule system work, not answering my questions, diverting conversation, repeatedly asking the same stupid questions, labelling everything that I say "subjective", and refusing to look at the objective counterpart (aka, the code). It's clear to me that no matter how many more emails I write, you're not going to concede.
>
> I'm not interested in wasting any more of my time with this nonsense. I give up.
> --

Please don't give up. It is a bit of a 'wicked' problem [1].

Yes to taking a rest, stepping back and trying to summarise/review what was discussed. I couldn't keep up with all the discussion, and I doubt many others kept up, especially those who have been frustrated in their (mis-)use of submodules.

Do remember that Junio has multiple roles which belie the softness of the word 'maintainer'. It includes Defender of the Heritage in the same way that keepers of ancient monuments will want visitors to enjoy the site, but rail against a garish new stainless steel and glass entrance to the Colosseum (choose your local heritage site) (see [1] again).

I get confused (about sub-modules) with msysgit where git.git is a sub-module, and is the fastest moving (an inversion of control issue), and when hacking at (just) the msys level when the git sub-module isn't in sync.

In many ways sub-module tracking is like file renames and empty directories (both of which come up a lot). The submodule meta information issue has great similarity to the empty directories issue. It's about meta information, not about content (which is certified/verified by sha1), and about how users know what is going on and get a (natural) feeling of control (without upsetting other users/controllers).

regards
Philip
[now to schedule some time to do the catch up reading. $dayjob beckons]

[1] www.poppendieck.com/wicked.htm
Re: [RFC/PATCH 0/7] Rework git core for native submodules
Junio C Hamano wrote:
> Ramkumar Ramachandra wrote:
>> 2. If we want to make git-submodule a part of git-core (which I think everyone agrees with), we will need to make the information in .gitmodules available more easily to the rest of git-core.
>
> Care to define "more easily" which is another subjective word?
>
> The .gitmodules file uses the bog-standard configuration format that can be easily read with the config.c infrastructure. It is a separate matter that git_config() API is cumbersome to use, but improving it would help not just .gitmodules but also the regular non-submodule users of Git. There is a topic in the works to read data in that format from core Heiko is working on.

BTW. this is something that I was missing to implement better submodule support in gitweb (and thus git-instaweb) than just marking it as submodule in 'tree' view.

--
Jakub Narębski
Re: [RFC/PATCH 0/7] Rework git core for native submodules
On 07.04.2013 23:30, Ramkumar Ramachandra wrote:
> Jonathan Nieder wrote:
>> What's stopping the core object code of git parsing .gitmodules?

Just to clarify that: git core already does that. A "git grep gitmodules_config" shows it is parsed by some git core commands: checkout, commit, the diff family and fetch. Others will follow in the recursive update series. And git mv support will teach that command to manipulate the .gitmodules file (and I hope that a patch teaching git rm to remove the section from .gitmodules will be accepted in the near future).

> Nothing, except that it's perversely unnatural for object parsing code to parse something outside the object store.

Hmm, at least the unstaged .gitmodules file has to be parsed from the file system. And Heiko's current work on parsing .gitmodules directly from the object store will help here too, right?

>> How does this compare to other metadata files like .gitattributes and .gitignore?
>
> .gitignore and .gitattributes are parsed in dir.c, where git treats worktree paths. It's quite nicely integrated.

And .gitmodules is parsed in submodule.c where Git treats .gitmodules entries. So I don't see a problem here, what am I missing?
Re: [RFC/PATCH 0/7] Rework git core for native submodules
Jens Lehmann wrote:
> Hmm, at least the unstaged .gitmodules file has to be parsed from the file system.

You seem to be touting it as a distinct advantage. In my opinion, .gitmodules is a wart that needs to be done away with: it should _not_ be on the filesystem, just like a commit object isn't on the filesystem. Getting links to unstage is two hours of work, tops. And I'm the one writing the whole thing, so I don't see what everyone else is complaining about.

> And Heiko's current work on parsing .gitmodules directly from the object store will help here too, right?

Of course, you _can_ parse a blob into a struct. It's just extremely gross to treat a blob located in a certain tree path differently from other blobs. It's a perverse violation of git's fundamental design, and I'm strongly against such a change.

What I still fail to understand is why you keep mentioning work-in-progress. You've had five years in which you haven't been able to do things that I did in two days. Yes, you _can_ keep .gitmodules and hack around everything, but why do you _want_ to do that? Preserving backward compatibility is not *that* important, in my opinion.
Re: [RFC/PATCH 0/7] Rework git core for native submodules
Hi Ram,

Ramkumar Ramachandra wrote:
> In my opinion, .gitmodules is a wart that needs to be done away with: it should _not_ be on the filesystem, just like a commit object isn't on the filesystem.

What do you think of .gitignore and .gitattributes? Should they be somewhere other than the filesystem as well?

[...]
> What I still fail to understand is why you keep mentioning work-in-progress. You've had five years in which you haven't been able to do things that I did in two days.

I don't think Jens had any obligation to work on submodules and nothing else for the last five years. ;-)

If you end up convincing others that your tools are worth working on and those tools pleasantly take care of the same workflows that submodules do, then I imagine people will be happy to migrate.

Speaking only for myself, I actually prefer the submodule UI, despite not being thrilled with the single-.gitmodules-file-at-the-root-of-the-worktree feature. So I will not be working on your proposed redesign, unless it evolves enough to be as pleasant a UI as (the long proposed UI of) submodules.

Hope that helps,
Jonathan
Re: [RFC/PATCH 0/7] Rework git core for native submodules
Ramkumar Ramachandra wrote:
> Jens Lehmann wrote:
>> Hmm, at least the unstaged .gitmodules file has to be parsed from the file system.
>
> You seem to be touting it as a distinct advantage.

To clarify what I said in a side thread: yes, as long as the submodule metadata includes the hostname I am downloading a library from, having it in an ordinary file is an advantage.

The problem with URLs (and especially hostnames) is that they change. When my project's previous domain name is lost because the hosting company lost interest, I want to be able to grep for all instances of that domain name in my project's documentation and metadata and change them all at once with a simple command like the following:

	git grep -l -F -e oldhost.example.com |
	xargs sed -i -e 's/oldhost.example.com/newhost.example.com/g'

When I clone a project with --no-recurse-submodules, I want to be able to see what other servers will be contacted when I run "git checkout --recurse-submodules". The current .gitmodules file lets me find that out with a simple, intuitive command:

	cat .gitmodules

I might change some URLs locally, because I know that some project's upstream has moved.

	git submodule init
	git config --edit

On the other hand, the single .gitmodules file will be a pain to merge if multiple branches modify it. So I do look forward to a merge strategy that deals more intelligently with its content, and wouldn't have minded a design that split this information into multiple files if we were starting over.

Jonathan
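Because .gitmodules is an ordinary text file in the config format, the two tasks described in the message above (seeing which servers would be contacted, and renaming a host everywhere) need nothing beyond generic text tools. The following Python sketch illustrates that point; it is not a replacement for the shell commands quoted above. The sample content and the helper names `submodule_hosts` and `replace_host` are hypothetical, and the host extraction assumes scheme-style URLs (scp-like `user@host:path` addresses are simply skipped).

```python
import configparser
from urllib.parse import urlsplit

# Hypothetical .gitmodules content with two submodules.
SAMPLE = (
    '[submodule "libfoo"]\n'
    "\tpath = libfoo\n"
    "\turl = git://oldhost.example.com/libfoo.git\n"
    '[submodule "libbar"]\n'
    "\tpath = libbar\n"
    "\turl = https://git.example.org/libbar.git\n"
)


def submodule_hosts(gitmodules_text):
    """Return the sorted set of hostnames named in submodule URLs."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(gitmodules_text)
    hosts = set()
    for section in parser.sections():
        host = urlsplit(parser[section].get("url", "")).hostname
        if host:  # None for URLs without a scheme://host part
            hosts.add(host)
    return sorted(hosts)


def replace_host(gitmodules_text, old, new):
    """Fixed-string replacement, in the spirit of grep -F plus sed."""
    return gitmodules_text.replace(old, new)


print(submodule_hosts(SAMPLE))
updated = replace_host(SAMPLE, "oldhost.example.com", "newhost.example.com")
print(submodule_hosts(updated))
```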
Re: [RFC/PATCH 0/7] Rework git core for native submodules
Jonathan Nieder wrote:
> What do you think of .gitignore and .gitattributes? Should they be somewhere other than the filesystem as well?

I would argue that .gitignore and .gitattributes are done right. They are integrated into a very mature part of git-core very well, and their nature is fundamentally different from that of .gitmodules.

.gitignore and .gitattributes specify extended globs (see: wildmatch) rules to apply on the worktree, and can be in multiple places in the worktree. They apply strictly on the current worktree; they have nothing to do with the index, and have no interaction with other objects in the repository. Now, you might argue that they should be part of the tree object, but I will disagree because they don't operate on concrete entries in the tree but rather extended globs that match worktree paths.

.gitmodules, on the other hand, specifies fundamental repository composition: it should be a special object in the tree precisely because it changes the fundamental meaning of one concrete tree entry. It has nothing to do with path treatment in the worktree, and hence has nothing to do with .gitattributes or .gitignore.

> I don't think Jens had any obligation to work on submodules and nothing else for the last five years. ;-)

I know. What I'm saying is that his current approach is just filled with tons of unnecessary complexity, inelegance, and pain. This is evidenced by the fact that the current submodule system is pathetic after five years of work (and I don't think the developers working on it were particularly incompetent or lazy).

> If you end up convincing others that your tools are worth working on and those tools pleasantly take care of the same workflows that submodules do, then I imagine people will be happy to migrate.

Yes, I'm planning a strict superset of the current submodule system features. After some thought, I've decided not to have any feature regressions in my first version for merge (although that means a lot of work for me).

> Speaking only for myself, I actually prefer the submodule UI, despite not being thrilled with the single-.gitmodules-file-at-the-root-of-the-worktree feature. So I will not be working on your proposed redesign, unless it evolves enough to be as pleasant a UI as (the long proposed UI of) submodules.

I'm very interested in building a pleasant UI. I've always been a person who cares deeply about UI: this is evidenced by my recent remote.pushdefault patch, and my pull.autostash WIP.
Re: [RFC/PATCH 0/7] Rework git core for native submodules
Jonathan Nieder wrote:
>	git grep -l -F -e oldhost.example.com |
>	xargs sed -i -e 's/oldhost.example.com/newhost.example.com/g'

Yes, I've had to do this too: in a proxied environment I had to s/git:\/\//https:\/\//.

So yes, we will have features to operate on multiple links at the same time. I'm thinking something fine-grained that allows you to pick which links to operate on. It's currently a vague thought, and I'm not sure what the implementation will look like.
Re: [RFC/PATCH 0/7] Rework git core for native submodules
On Fri, Apr 5, 2013 at 5:55 AM, Jonathan Nieder jrnie...@gmail.com wrote:
> Ramkumar Ramachandra wrote:
>> 1. 'git add' should not go past submodule boundaries. I should not be able to 'git add clayoven/' or 'git add clayoven/LICENSE'. In addition, the shell completion also needs to be fixed.
>
> Yep. This is a bug.

I notice that this case is handled by git-add, but there is probably a bug somewhere. Ram, can you make a test case for this?
Re: [RFC/PATCH 0/7] Rework git core for native submodules
On Mon, Apr 8, 2013 at 7:08 PM, Ramkumar Ramachandra artag...@gmail.com wrote:
> Jonathan Nieder wrote:
>> What do you think of .gitignore and .gitattributes? Should they be somewhere other than the filesystem as well?
>
> I would argue that .gitignore and .gitattributes are done right. They are integrated into a very mature part of git-core very well, and their nature is fundamentally different from that of .gitmodules.

Probably off-topic, but I'm starting to find ".gitignore can be found in every directory" a burden to day-to-day git operations. So imo it's not done right entirely ;-)

> .gitignore and .gitattributes specify extended globs (see: wildmatch) rules to apply on the worktree, and can be in multiple places in the worktree. They apply strictly on the current worktree; they have nothing to do with the index, and have no interaction with other objects in the repository.

Index operations sometimes read these .git{ignore,attributes}. I believe git-archive reads worktree's .gitattributes, so it's not really just about worktree.

>> I don't think Jens had any obligation to work on submodules and nothing else for the last five years. ;-)
>
> I know. What I'm saying is that his current approach is just filled with tons of unnecessary complexity, inelegance, and pain. This is evidenced by the fact that the current submodule system is pathetic after five years of work (and I don't think the developers working on it were particularly incompetent or lazy).

I don't follow this thread closely, but I think there's a common ground where improvements can benefit both approaches. There are a lot of problems for deep integration and erasing submodule's boundaries from UI perspective. I think maybe you can work on that first, gain experience along the way, and maintain the link-object changes separately. Maybe someday you will manage to switch .gitmodules with it. Or maybe I'm wrong (partly because I did not read the whole thread)
--
Duy
Re: [RFC/PATCH 0/7] Rework git core for native submodules
Duy Nguyen wrote: Probably off-topic, but I'm starting to find .gitignore can be found in every directory a burden to day-to-day git operations. So imo it's not done right entirely ;-) Why is it a burden? I would argue that the tooling support is not yet there, but git check-ignore is a step in the right direction. What alternate design would you propose, just out of curiosity? Index operations sometimes read these .git{ignore,attributes}. I believe git-archive reads worktree's .gitattributes, so it's not really just about worktree. I should've said largely, only affects the current worktree. I don't follow this thread closely, but I think there's a common ground where improvements can benefit both approaches. There are a lot of problems for deep integration and erasing submodule's boundaries from UI perspective. I think maybe you can work on that first, gain experience along the way, and maintain the link-object changes separately. Maybe someday you will manage to switch .gitmodules with it. Or maybe I'm wrong (partly because I did not read the whole thread) Yes, there is some common ground. But: 1. The inspiration for fixing fundamental design problems comes from my redesign. For instance, I would've never discovered the git add bug if I'd not attempted to git add (as opposed to the unnatural abstraction that git submodule add presents). 2. I think it is absolutely imperative that we do the redesign now, before we've descended too far into the madness that the current design is. I think I'm capable of doing the redesign now, with some help and support from the list. My attitude doesn't align with the I'm feeling lazy; why don't we postpone it? argument. Let's finish what I started now: I'm more than willing to dedicate the next few months full-time towards finishing this and getting it merged. 
Re: [RFC/PATCH 0/7] Rework git core for native submodules
Duy Nguyen wrote: Probably off-topic, but I'm starting to find .gitignore can be found in every directory a burden to day-to-day git operations. So imo it's not done right entirely ;-) Or are you saying it's hard to implement elegantly and efficiently in git-core? If so, I agree wholeheartedly. I'm not yet sure what to do about the situation.
Re: [RFC/PATCH 0/7] Rework git core for native submodules
On Mon, Apr 8, 2013 at 9:06 PM, Ramkumar Ramachandra artag...@gmail.com wrote: Duy Nguyen wrote: Probably off-topic, but I'm starting to find .gitignore can be found in every directory a burden to day-to-day git operations. So imo it's not done right entirely ;-) Why is it a burden? I would argue that the tooling support is not yet there, but git check-ignore is a step in the right direction. What alternate design would you propose, just out of curiosity? You don't know if .gitignore is there, so you need to check for it in every directory. If we fixed its location (e.g. worktree's top) we would not have to look in every directory. Then again it may be a bit inconvenient that way. If you remove a directory, you also remove .gitignore rules inside when you distribute .gitignore files. Otherwise you need to clean up top .gitignore once in a while. 1. The inspiration for fixing fundamental design problems comes from my redesign. For instance, I would've never discovered the git add bug if I'd not attempted to git add (as opposed to the unnatural abstraction that git submodule add presents). I actually spotted a similar use of git-add in the test suite [1]. You see, it's a bug that should be fixed but in that particular case, it's valid to add something inside a submodule. I wanted to fix that with my read_directory rewrite (part of the pathspec stuff) but never got around to finishing it and eventually gave up, which leads to your next point.. 2. I think it is absolutely imperative that we do the redesign now, before we've descended too far into the madness that the current design is. I think I'm capable of doing the redesign now, with some help and support from the list. My attitude doesn't align with the "I'm feeling lazy; why don't we postpone it?" argument. Let's finish what I started now: I'm more than willing to dedicate the next few months full-time towards finishing this and getting it merged. Good luck. But such a big work usually requires more than one volunteer.
If you haven't convinced (*) the community it's right, maybe you should take a few days thinking about it again before implementing. [1] http://thread.gmane.org/gmane.comp.version-control.git/177454 (*) just a feeling after a quick glance, I may be terribly wrong again -- Duy
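Duy's point about per-directory lookups can be made concrete with a small sketch. It only lists where a .gitignore file could live for a given path; the function name `candidate_ignore_files` is hypothetical and this is not how git-core implements it.

```python
from pathlib import PurePosixPath


def candidate_ignore_files(path):
    """Return every .gitignore location that could affect `path`.

    `path` is relative to the worktree top, e.g. "a/b/c.txt".
    (Hypothetical helper for illustration; not part of git.)
    """
    directories = PurePosixPath(path).parts[:-1]  # drop the file name itself
    candidates = [".gitignore"]                   # the top-level file always applies
    current = PurePosixPath()
    for directory in directories:
        current = current / directory
        candidates.append(str(current / ".gitignore"))
    return candidates


if __name__ == "__main__":
    for location in candidate_ignore_files("a/b/c.txt"):
        print(location)
```

With distributed .gitignore files the number of locations to probe grows with the directory depth; with a single fixed location at the worktree top the list would always have one entry, at the cost of the cleanup Duy describes.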
Re: [RFC/PATCH 0/7] Rework git core for native submodules
Duy Nguyen wrote: Good luck. But such a big work usually requires more than one volunteer. If you haven't convinced (*) the community it's right, maybe you should take a few days thinking about it again before implementing. Yes, I'm thinking about it before rushing in to implement it. There will always be resistance to change, especially when it involves a change that breaks a working implementation. If anything, the resistance is only going to get worse with time, as people pile more and more hacks on top of the current submodule implementation. I say: do it now, before we lose steam. As far as I can tell, I'm completely unbiased: I have no vested interests in either implementation, and I just want to see the best implementation win. My conviction in the new approach has only strengthened after discussions on this thread: there must be some reason for that, no? Frankly, I was hoping that at least one or two people on the thread would take my side of the argument (or at least tell me that I'm not deranged), but that hasn't happened. Nevertheless, I hope to convince more people by doing more work and posting a beautifully working implementation. I'm already prepared for the worst case: I'll be forced to dump all my work and be disappointed with the git community.
Re: [RFC/PATCH 0/7] Rework git core for native submodules
Jonathan Nieder jrnie...@gmail.com writes: [snipped everything I agree with...] On the other hand, the single .gitmodules file will be a pain to merge if multiple branches modify it. So I do look forward to a merge strategy that deals more intelligently with its content, and wouldn't have minded a design that split this information into multiple files if we were starting over.

I find it a sensible suggestion to have a content-aware merge driver. Such a custom merge driver to help merging a structured datafile in the config format will have other uses when we need to do more than the current system (outside submodules there will be other things frotz that need information about frotz in the future, and a .gitfrotz file would be one possible way to do so).

I do not think it needs to be split per-submodule. When a submodule in the common ancestor was at path dirA/, and you are merging with another branch that moved it to path dirB/, the contents of .gitmodules file for that module (that is identified by its name) will need a three-way merge of its .path element:

	common ancestor: submodule.name.path = dirA/
	ours:            submodule.name.path = dirA/
	theirs:          submodule.name.path = dirB/

And your content-aware merge driver should be able to do the resolving by following the usual three-way merge rules. We started from the same dirA/ and only they changed, so the result is dirB/.

By the way, that's a merge driver (which deals with per-path content merge), not a strategy (which deals with the entire tree level merge).
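The three-way rule Junio describes for submodule.name.path can be sketched per key as follows. This is a minimal illustration that assumes the settings have already been parsed into dictionaries; `MergeConflict`, `merge_value` and `merge_settings` are hypothetical names, and no such driver exists in git as far as this thread says.

```python
class MergeConflict(Exception):
    """Raised when both sides changed a value in different ways."""


def merge_value(base, ours, theirs):
    """Apply the usual three-way rule to one value (None means 'absent')."""
    if ours == theirs:
        return ours        # both sides agree (or neither changed)
    if ours == base:
        return theirs      # only they changed
    if theirs == base:
        return ours        # only we changed
    raise MergeConflict(f"base={base!r} ours={ours!r} theirs={theirs!r}")


def merge_settings(base, ours, theirs):
    """Merge flat {key: value} settings key by key."""
    merged = {}
    for key in sorted(set(base) | set(ours) | set(theirs)):
        value = merge_value(base.get(key), ours.get(key), theirs.get(key))
        if value is not None:
            merged[key] = value
    return merged


if __name__ == "__main__":
    base = {"submodule.name.path": "dirA/"}
    ours = {"submodule.name.path": "dirA/"}
    theirs = {"submodule.name.path": "dirB/"}
    print(merge_settings(base, ours, theirs))
```

Because the comparison is done on parsed keys rather than on lines of text, unrelated edits to different keys never touch each other; a conflict is raised only when the same key was changed differently on both sides.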
Re: [RFC/PATCH 0/7] Rework git core for native submodules
Ramkumar Ramachandra artag...@gmail.com writes: As far as I can tell, I'm completely unbiased: I have no vested interests in either implementation,... ... Frankly, I was hoping that at least one or two people on the thread would take my side of the argument (or at least tell me that I'm not deranged), but that hasn't happened. Aren't these two quite contradictory? After listening to what others tell one with an unbiased mind and finding that nobody agrees with what one initially proposed, an unbiased person would step back, take a deep breath and think again, before insulting the intelligence of others with a "disappointed", like this: I'm already prepared for the worst case: I'll be forced to dump all my work and be disappointed with the git community. Would it be possible that (at least some part of, or possibly all of) your ideas had some merit, but with all your hostility against the current system and the work that went behind it, you did not communicate well enough to make others understand you? What I found very hard to read in this thread was that your messages all went like this: 1. In the current system, I have to be at the top level of a submodule to work in it (or some other problems). 2. I will fix it in a more elegant way. 3. I have to have a new object at the submodule path, not the current "submodule is a commit bound at the submodule path, and information about the submodule is in .gitmodules". There was very little concrete explanation on how #3 leads to #2, i.e. the overall design of your new system and how it will work, other than that you would read what we currently write in .gitmodules from a new kind of object. When an alternative solution was suggested, all your responses were full of subjective "inelegant" and "ugly", and at least I couldn't read much substance in it (here, the usual me might add "maybe others differ", but after reading this thread, I strongly suspect that others share this problem).
Re: [RFC/PATCH 0/7] Rework git core for native submodules
Junio C Hamano wrote: Would it be possible that (at least some part of, or possibly all of) your ideas had some merit, but with all your hostility against the current system and the work that went behind it, you did not communicate well enough to make others understand you? Agreed. My annoyance with the current system did go a little overboard, and I've been having a splitting headache for the last few days. What I found very hard to read in this thread was that your messages all went like this: 1. In the current system, I have to be at the top level of a submodule to work in it (or some other problems). 2. I will fix it in a more elegant way. 3. I have to have a new object at the submodule path, not the current submodule is a commit bound at the submodule path, and information about the submodule is in .gitmodules. There was very little concrete explanation on how #3 leads to #2, i.e. the overall design of your new system and how it will work, other than you would read what we currently write in .gitmodules from a new kind of object. I had no way of expressing what I wanted to do except by writing code when I started off this thread, but am in much better shape now. Let me try to explain my fundamental assumptions and code in a concise way now. 1. Having a toplevel .gitmodules means that any git-core command like add/ rm/ mv will be burdened with looking for the .gitmodules at the toplevel of the worktree and editing it appropriately along with whatever it was built to do (ie. writing to the index and committing it). This is highly unnatural. Putting the information in link objects means that we get a more natural UI + warts like cd-to-toplevel disappear with no extra code. 2. If we want to make git-submodule a part of git-core (which I think everyone agrees with), we will need to make the information in .gitmodules available more easily to the rest of git-core. 
One way to do it without breaking anything is to unpack the root tree, look for an entry with the path .gitmodules and handle it different from other blobs: ie. parse it into structured data that the rest of git-core can consume. However, I think it is very gross as the blob is not inherently special in any way: it's just incidentally stored at a specific tree path. The alternative is to have an inherently special kind of blob (ie. link object). In the git-core code, I can simply match for a link object and operate on it accordingly. As opposed to matching a blob object, and its tree path. Moreover, this means that the user can simply git edit-link link from anywhere in the worktree instead of having to refer to the appropriate section in the toplevel .gitmodules. 3. Currently diffing/ merging one huge .gitmodules file is a mess, as it doesn't have to conform to a strict format. This means that I can get conflicts between these two: url = gh:artagnon/clayoven url =gh:artagnon/clayoven Moreover, since the fields are not ordered, a simple reordering of the fields will cause a merge conflict. The correct way to fix this is to split up .gitmodules into many logical files, have a git edit-gitmodules which reduces user input to a strict format, and then write custom diff/merge drivers. My proposal involves having a git edit-link, and teaching git-core to diff/merge appropriately. The information is already in logical bits. 4. The only seeming disadvantage of not having a file accessible via the filesystem is that it doesn't behave like a full blob. But it does; the code to unstage a link object (emulation) is actually very simple: I'm currently writing it. 5. Having a first-class link object comes with functional advantages. It means that I can have a ref pointing to link objects and easily initialize a nested repository without having to initialize the containing repository (ie. essentially replacing repo). 
We can have true floating submodules, which is really nice in my opinion: you can fix a library at v3.1 and switch it to v3.2 at some point in the future without using ugly SHA-1 hexes anywhere. 6. While it is possible to work top-down from the current system, that approach is clearly taking too long and is too painful. This explains why submodules haven't come a long way in the last five years. With my approach, I'm trying to make life simpler for everyone: it will suddenly become much easier to hack on submodules, and it can improve more rapidly over the next five years. I'm not thinking about short-term fixes precisely for this reason: the long-term goal is worth a little bit of short-term inconvenience. 7. I estimate that replacing the current submodule system without feature regressions will not take a lot of effort and can be done with minimal breakages. It's not a lot of code or anything very complex. We just have to follow along the lines of how git-core handles blobs, and write a little bit of code to make links behave like blobs (I'm halfway done with this already).
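Point 3 above (whitespace and field-order differences causing textual conflicts) can be illustrated with a short sketch. It assumes a deliberately simplified "key = value" format, with no sections, quoting or line continuations, so it is not a parser for git's config syntax; `parse_fields` is a hypothetical name and the `path = clayoven` value is made up for the example.

```python
def parse_fields(text):
    """Parse simplified 'key = value' lines into a dict.

    Whitespace around '=' and the order of lines are not significant.
    (Illustration only; git's config format has more rules than this.)
    """
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blank lines and comments
        key, _, value = line.partition("=")
        fields[key.strip()] = value.strip()
    return fields


if __name__ == "__main__":
    a = parse_fields("url = gh:artagnon/clayoven")
    b = parse_fields("url =gh:artagnon/clayoven")
    print(a == b)  # the two spellings carry the same information
```

A line-based merge sees the two `url` spellings as different text, while a comparison on the parsed fields treats them as equal. The same applies to reordered fields, which is the case for a content-aware driver on either storage design.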
Re: [RFC/PATCH 0/7] Rework git core for native submodules
Christian Couder wrote: What if instead of a git submodule I want to have an hg, or, God/Linus/deity forbid, an SVN submodule, inside my git worktree? What if I just want a very big movie or .tgz downloaded from somewhere else? Since the link object is rooted to the tree, it's impossible to have anything but a working copy in the link directory. How can I have a non-git-worktree link directory without breaking checkout? I think that making it too generic will make the entire submodule experience suffer, because the implementation must be coded according to the lowest-common-denominator. This is the mistake that the tool mr makes: since it's so generic, it can't provide very powerful functionality specifically for git repositories. I'll try to think of something else.
Re: [RFC/PATCH 0/7] Rework git core for native submodules
Junio C Hamano gits...@pobox.com writes: Jonathan Nieder jrnie...@gmail.com writes: [snipped everything I agree with...] On the other hand, the single .gitmodules file will be a pain to merge if multiple branches modify it. So I do look forward to a merge strategy that deals more intelligently with its content, and wouldn't have minded a design that split this information into multiple files if we were starting over. I find it a sensible suggestion to have a content-aware merge driver. Such a custom merge driver to help merging a structured datafile in the config format will have other uses when we need to do more than the current system (outside submodules there will be other things frotz that need information about frotz in the future, and a .gitfrotz file would be one possible way to do so). I do not think it needs to be split per-submodule.

Another thing to think about is what to do when/if we want to express "this is the default that applies to all submodules". For example, a superproject that binds multiple submodules may want to say "When on this branch, make all submodules also on 'next'". With a unified single place that holds information about all submodules, it is trivial to add a default section, perhaps like this:

	[default]
		branch = next

	[submodule "framework"]
		url = ...
		path = framework

	[submodule "common"]
		url = ...
		path = common
		branch = master ;# regardless of other modules...

on top of the submodule.name.branch mechanism for floating checkout (the default is of course not limited to branch but applies in general).

It is not obvious where such a default piece should go once you start splitting these into per-submodule files, be it a separate but still in-tree file that is different from the submodule it describes, or a blob-like object that sits at the path for the submodule in the tree and in the index as Ram wants to do (as I kept saying, the storage mechanism is not fundamental).

This is similar to why .gitattributes is easy to work with, I think. You can describe the information about paths in that file (which lives at a place different from the paths that are described), and you can have a catch-all rule in it.

This is a tangent, but you could build a system that attaches attributes to individual paths and hide the attributes from the working tree filesystem (think: svn:blah) and have a set of special commands (think: svn propset, proplist, etc.) to work with them, and that is an equally valid way to implement attributes (it does not make .gitattributes a less valid way to do so, though).
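The fallback Junio has in mind, where a per-submodule setting wins and a shared default applies otherwise, might be modelled like this. The `[default]` section is only a proposal in this message, and the nested-dict layout and the name `effective_setting` are assumptions made for the sketch.

```python
def effective_setting(config, submodule, key):
    """Return `key` for `submodule`, falling back to the shared default.

    `config` is a hypothetical parsed form of the file:
    {"default": {...}, "submodule": {name: {...}}}
    """
    specific = config.get("submodule", {}).get(submodule, {})
    if key in specific:
        return specific[key]                  # per-submodule value wins
    return config.get("default", {}).get(key)  # otherwise use the default, if any


CONFIG = {
    "default": {"branch": "next"},
    "submodule": {
        "framework": {"path": "framework"},
        "common": {"path": "common", "branch": "master"},
    },
}

if __name__ == "__main__":
    for name in ("framework", "common"):
        print(name, effective_setting(CONFIG, name, "branch"))
```

The lookup needs one place that every submodule can consult, which is the point being made: with per-submodule files or per-path objects, it is less obvious where the shared `default` entry would live.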
Re: [RFC/PATCH 0/7] Rework git core for native submodules
Ramkumar Ramachandra artag...@gmail.com writes: 1. Having a toplevel .gitmodules means that any git-core command like add/ rm/ mv will be burdened with looking for the .gitmodules at the toplevel of the worktree and editing it appropriately along with whatever it was built to do (ie. writing to the index and committing it). Burdened is a subjective word. What's bad about having a single place you know you can read and find out information about things? You have to learn about them to do anything specific to them anyway. This is highly unnatural. Unnatural is a subjective word, and there is no justification I see here in your message. Putting the information in link objects means that we get a more natural UI + warts like cd-to-toplevel disappear with no extra code. I do not see how link objects _means_ natural UI, yet, without an explanation how one leads to the other. What does cd-to-toplevel have anything to do with it? In case you did not notice, all the core commands internally cd-to-toplevel and carry the prefix information while doing so, and prepend the prefix to user-supplied paths to find which path the user is talking about. So cd to toplevel before starting to carry the operation out is a natural pattern inside Git. As many people already told you, the user has to run 'git submodule' from the top-level of the submodule working tree is a simple oversight of the implementation. 2. If we want to make git-submodule a part of git-core (which I think everyone agrees with), we will need to make the information in .gitmodules available more easily to the rest of git-core. Care to define more easily which is another subjective word? The .gitmodules file uses the bog-standard configuration format that can be easily read with the config.c infrastructure. It is a separate matter that git_config() API is cumbersome to use, but improving it would help not just .gitmodules but also the regular non-submodule users of Git. 
There is a topic in the works to read data in that format from core Heiko is working on.
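The "cd-to-toplevel and carry the prefix" pattern Junio describes can be modelled loosely in a few lines. This is only a sketch of the idea using POSIX-style paths; the function names are hypothetical and git's real logic lives in its C setup code.

```python
import posixpath


def split_prefix(toplevel, cwd):
    """Return the directory the user was in, relative to the worktree top.

    The result ends with "/" or is "" when the user is already at the top.
    """
    relative = posixpath.relpath(cwd, toplevel)
    return "" if relative == "." else relative + "/"


def resolve_user_path(prefix, user_path):
    """Prepend the prefix to a user-supplied path and normalise it."""
    return posixpath.normpath(prefix + user_path)


if __name__ == "__main__":
    prefix = split_prefix("/repo", "/repo/lib/sub")
    print(prefix)
    print(resolve_user_path(prefix, "../file.c"))
```

Once the prefix is known, the command can operate from the top level while still interpreting the paths the user typed relative to where they were standing.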
Re: [RFC/PATCH 0/7] Rework git core for native submodules
Junio C Hamano wrote: Ramkumar Ramachandra artag...@gmail.com writes: 1. Having a toplevel .gitmodules means that any git-core command like add/ rm/ mv will be burdened with looking for the .gitmodules at the toplevel of the worktree and editing it appropriately along with whatever it was built to do (ie. writing to the index and committing it). Burdened is a subjective word. What's bad about having a single place you know you can read and find out information about things? You have to learn about them to do anything specific to them anyway. Burden refers to the extra work of looking for a file in the worktree, when this is completely unnecessary if you use a link object. This is highly unnatural. Unnatural is a subjective word, and there is no justification I see here in your message. git's design follows along the lines of the UNIX philosophy: do one thing, do it well. git add/ rm/ mv have a very sharply defined task: they first lock the index file, read the_index using read_cache(), build a cache_entry struct using user-supplied data (this might involve worktree code from dir.c to recurse subdirectories, for example), write that cache_entry to the_index (removing existing entries with cache_tree_invalidate_path() if necessary), and finally write the_index to the index file, releasing the lock. Would you agree that any operation that doesn't follow along this line is unnatural? What then, does writing a special file in the worktree (aka .gitmodules) have to do with this entire process? Does git diff/ commit/ add/ rm or any other command you can think of rely on a special file in the worktree (aka .gitmodules) to be checked out? Then why does git submodule require it? Isn't this a requirement that is inconsistent with the rest of git-core? Putting the information in link objects means that we get a more natural UI + warts like cd-to-toplevel disappear with no extra code. 
I do not see how link objects _means_ natural UI, yet, without an explanation how one leads to the other. I should've said "means an easy route to get the existing UI to work with little or no additional code". Making the submodule information available to git-core is precisely what leads to this. In index_path(), you can inject a case for S_IFDIR to write a link object to the database, writing the sha1 to the supplied argument. This is not unnatural in any way, because we're just following along the lines of the S_IFLNK codepath, which writes a blob object to the database. Now index_path() is called by add_to_index(), which is the master function for adding anything to the index. Therefore, git add just works. git rm is much simpler: it calls remove_file_from_index() which in turn calls cache_tree_invalidate() and remove_index_entry_at(). Once the entry is removed from the cache, our job is done. The link object will be cleaned up at gc-time. git mv is just a combination of git rm and git add: it invalidates an existing entry and adds a new one with a different name. There is no special .gitmodules to take care of. What does cd-to-toplevel have anything to do with it? In case you did not notice, all the core commands internally cd-to-toplevel and carry the prefix information while doing so, and prepend the prefix to user-supplied paths to find which path the user is talking about. So cd to toplevel before starting to carry the operation out is a natural pattern inside Git. As many people already told you, the user has to run 'git submodule' from the top-level of the submodule working tree is a simple oversight of the implementation. Yes, I am aware. I'm piggy-backing on the mature parts of git-core to get functionality that I would otherwise have to write by hand. The current implementation needs to hand-code this, and hasn't done it yet (presumably because it's non-trivial). 2.
If we want to make git-submodule a part of git-core (which I think everyone agrees with), we will need to make the information in .gitmodules available more easily to the rest of git-core. Care to define more easily which is another subjective word? The .gitmodules file uses the bog-standard configuration format that can be easily read with the config.c infrastructure. It is a separate matter that git_config() API is cumbersome to use, but improving it would help not just .gitmodules but also the regular non-submodule users of Git. There is a topic in the works to read data in that format from core Heiko is working on. This goes both ways: the information is both easier to read and write. I can easily create a link object from anywhere: index_path() or cmd_edit_link(). To do this, I just have to call write_sha1_file() with the buffer filled out and with the parameter link_type (which is already defined). To access the data in a link, I have to fill out a tree_desc with HEAD, an unpack_tree_opts with a custom callback, and pass it to unpack_trees(). An example of a custom callback:
Re: [RFC/PATCH 0/7] Rework git core for native submodules
Junio C Hamano wrote: Ramkumar Ramachandra artag...@gmail.com writes: Does git diff/ commit/ add/ rm or any other command you can think of rely on a special file in the worktree (aka .gitmodules) to be checked out? Try git add foo~ with usual suspect in .gitignore ;-) First, it's not a hard requirement: in the worst case, git add will add the file even without a -f. Second, I've already argued about how I think this is the right design: What part of that do you disagree with? What alternate design do you propose?
Re: [RFC/PATCH 0/7] Rework git core for native submodules
Ramkumar Ramachandra artag...@gmail.com writes: Junio C Hamano wrote: Ramkumar Ramachandra artag...@gmail.com writes: Does git diff/ commit/ add/ rm or any other command you can think of rely on a special file in the worktree (aka .gitmodules) to be checked out? Try git add foo~ with usual suspect in .gitignore ;-) First, it's not a hard requirement: in the worst case, git add will add the file even without a -f. In the same sense .gitmodules is not a hard requirement, either. I use a submodule without .gitmodules in one of my repositories (the top-level houses the source to generate my dotfiles and is cloned to my environment at work, but the submodule houses my private files that live only at home). The gitlink entry in the index and the tree and presence of the .git repository in the submodule checkout (where it exists) is sufficient to make the layout work. If your complaints were "I cannot make X work with the current system, even with changes to git-submodule and some core part of the system, and I think the reason is because the way module information is stored is in a separate file .gitmodules", with a concrete X, people who are more versed with the submodule subsystem may be able to help you come up with a cleaner solution without throwing the baby out with the bathwater, but I do not think we saw any concrete X mentioned. The same sentence followed by "... and with an object of a new type stored at the path of the submodule, I can make it work by doing A, B and C", with concrete A, B and C, some people may be interested in pursuing that avenue with you, but I do not think we saw such combinations of X, A, B, C either. If all of your argument starts from "I think .gitmodules is ugly because it is not an object of a separate type stored at the path of the submodule, and here are the reasons why I think it is ugly", I have nothing more to say to you. That "ugly" is at best skewed aesthetics, and each and every example that comes up in this discussion, like this "'git add' works with .gitignore", and the one I sent on .gitattributes vs .gitmodules on the default in the nearby subthread to Jonathan, makes me realize that .gitmodules is _more_ in line with the rest of the system, not less.
Re: [RFC/PATCH 0/7] Rework git core for native submodules
Ok, here comes an updated version of our comparison list which I updated with what I read in recent discussions. As I said earlier, please speak up if I missed anything (or forgot to add anyone to the CC). I picked up one advantage (no need to cd-to-toplevel to edit .gitmodules) and two new disadvantages (foreach and default submodule config), and retired one that Ram showed a solution for (the unstaged gitlink).

Advantages:

* Information is stored in one place, no need to lookup stuff in another file/blob.
* Easier coding, as we find all information in a single object.
* No need to cd-to-toplevel to change configuration in the .gitmodules file, the special tools to edit link information will work in any subdirectory.

(We currently need a checked out work tree to access the .gitmodules file, but there is ongoing work to read the configuration directly from the database)

(While it is easier to merge the link object, a .gitmodules aware merge driver would work just as well)

Disadvantages:

* Changes in user visible behavior, possible compatibility problems when Git versions are mixed.
* Special tools are needed to edit submodule information where currently a plain editor is sufficient and a standard format is used.
* merge conflicts are harder to resolve and require special git commands, solving them in .gitmodules is way more intuitive as users are already used to conflict markers.
* git submodule foreach becomes harder to implement
* With .gitmodules we lose a central spot where configuration concerning many submodules can be stored

(I think when we also put the submodule name in the object we could also retain the ability to repopulate moved submodules from their old repo, which is found by that name)

(That a link object can have no unstaged counterpart that a file easily has can be fixed by special casing this, e.g. in using a file in .git/link-specs/)

Hmm, while it is still too early to close the polls, it looks to me as if most advantages are about easier coding while most disadvantages hit the user. That makes it more understandable for me why Ram is so convinced of his approach and why on the other hand submodule users like myself are rather sceptical. I think we need some more advantages that users will directly profit from, the cd-to-toplevel for .gitmodules is definitely not enough to support the change Ram is proposing. What other advantages are missing here?
Re: [RFC/PATCH 0/7] Rework git core for native submodules
This is going nowhere. You're stuck at making the current submodule system work, not answering my questions, diverting conversation, repeatedly asking the same stupid questions, labelling everything that I say subjective, and refusing to look at the objective counterpart (aka, the code). It's clear to me that no matter how many more emails I write, you're not going to concede. I'm not interested in wasting any more of my time with this nonsense. I give up.
Re: [RFC/PATCH 0/7] Rework git core for native submodules
On Mon, Apr 08, 2013 at 10:41:57PM +0200, Jens Lehmann wrote: (While it is easier to merge the link object, a .gitmodules aware merge driver would work just as well) I have not been following this thread that closely, so apologies if I missed it, but one thing I have not seen mention of is how the extra information inside the gitlink object will require extra merge effort. Imagine I have two branches; one updates the submodule's commit pointer, and the other updates some meta-information about the submodule (e.g., it points the URL to a new host). In the current system, one change goes into .gitmodules, and the other goes into the gitlink path. In a new combined object, there is a conflict and we must do content-level merging on it (which presumably would be done with a specialized merge driver). So I think that in some cases .gitmodules creates more conflicts (submodule A and submodule B are touched and have a textual conflict), and sometimes the combined object would create more conflicts (you touch two parts of the combined object). The solution in both cases is a smarter merge driver that understands which parts semantically conflict and which parts do not. -Peff
Re: [RFC/PATCH 0/7] Rework git core for native submodules
Junio C Hamano wrote:

> I think it is too premature to discuss _your_ code. The patches do not even tell us anything about how much more work is needed to merely make Git with your patches work properly again. For one thing, I suspect that you won't even be able to repack a repository that has OBJ_LINK only with the patches you posted.

Let me try to rephrase my original request: I'm an inexperienced contributor trying to do something very ambitious. Having authored a huge part of it, Linus and you understand git-core much better than I can ever hope to understand. These are things that you need to tell me after reading the patches. I only have a rough idea of how to make Git work properly with my patches again: I can't know for sure until I write all the code. What's more? Your guess will probably be better than mine after you read the code.

You're asking me to submit a perfect 40 or 50 part series that's a potential candidate for merging. I'm sorry to say this, but I'm incapable of doing that without posting intermediate work and getting help. Frankly, it's a very unreasonable expectation; I don't think anyone except you or Linus can even get close after making such a fundamental change. I might end up writing all the code (which I'm perfectly okay with), and all I'm asking from you is to constantly keep picking my brain (by reviewing my code and posting good critiques). Am I being unreasonable?

> At this point the only thing that we can gain from reading your patch is that you can write C to do _something_, but that something is so fuzzily explained that we do not know what to make of that knowledge that you write good (or bad, we don't know) C.

C is irrelevant here: I'm not asking you for style/structuring tips; I'm asking you for a critique on the implementation of this idea. Linus raised some good points after reading [0/7] that I countered, and there isn't anything else anyone can raise without reading more. Code speaks more clearly than the English in [0/7]: you should be able to deduce a lot more intent and direction.

> It would be much more productive to learn what these specific issues X, Y and Z are, and if the problems you are having with existing solutions are really fundamental that need changes to object layer to solve.

To reiterate: link does not make possible something that is not fundamentally _possible_ with a .githack and a 100k-line Perl script. At its core, every variant of submodules does this. What I'm essentially proposing: break up the information in .githack into smaller bits and create a new object type so it can be parsed by git-core easily. If you agree that my proposal doesn't make impossible what was possible earlier, and that it makes life easier for everyone, we should be good to go.

When the series matures, we can investigate the other implementations in greater detail so we can pick out more optional fields to add to the link object before getting it merged. This is not the right time to do that: we're currently trying to get git-core working with the mandatory fields.

> I do not think we have heard anything concrete and usable about what you are trying to achieve yet.

I'll try to rephrase your concerns here:

0. We don't know if this approach will yield a mergeable series at all, because it breaks so many things and is so difficult to complete.

1. We don't know how much work is needed to bring the series to a point where it is in a mergeable state. There is no timeline specified.

2. We can't build an exhaustive list of the problems that this new approach will solve (i.e. we haven't finalized the optional fields).

3. We don't have anything usable yet.

And your non-concerns should be:

1. We know that this approach won't fundamentally prevent us from solving the specific problems that the .githack approach solves.

2. We know that this approach makes life easier for everyone, and there are significant concrete benefits to teaching git-core about links.

I agree with all this fully. I don't have a concrete roadmap; we'll just have to dive in based on what we've seen so far, and hope that we're able to finish what we started.

So, my final question is: are you still not convinced that this approach shows a lot of potential, and is worth exploring now?
Re: [RFC/PATCH 0/7] Rework git core for native submodules
Ramkumar Ramachandra artag...@gmail.com writes:

> Junio C Hamano wrote:
>> I think it is too premature to discuss _your_ code. The patches do not even tell us anything about how much more work is needed to merely make Git with your patches work properly again. For one thing, I suspect that you won't even be able to repack a repository that has OBJ_LINK only with the patches you posted.
> ...
> You're asking me to submit a perfect 40 or 50 part series...

Not at all. And please do not start _coding_.

When the design is not clear enough that a 7-patch series is not ready to be reviewed, certainly a 50-patch series will not be. Not until you can explain what you are trying to solve and convince others why other less disruptive approaches are fundamentally unworkable, and why we need to change the object layer.

> To reiterate: link does not make possible something that is not fundamentally _possible_ with a .githack and a 100k-line Perl script. At its core, every variant of submodules does this. What I'm essentially proposing: break up the information in .githack into smaller bits and create a new object type so it can be parsed by git-core easily.

The .gitmodules file is designed to be easily parsable by the config infrastructure and implemented as such already, thank-you-very-much.

Why do you keep calling an already working solution with a derogatory misspelling? That only gives others an impression that you do not understand how the current system works, and persuades them not to waste time responding to you. Stop it.

> When the series matures, we can investigate the other implementations in greater detail so we can pick out more optional fields to add to the link object before getting it merged.

Sorry, but that is not how open source works in general, and certainly not how this project works. We do not add disruptive change just for the sake of changing it to break a working system, make extra work to clean up the fallout for ourselves (i.e. your 40 to 50 patch series, but honestly speaking I expect it would be more like 4 months of work for a full time engineer or two), for unproven design (that has yet to be illustrated) to solve problems (that have yet to be explained), without knowing that (1) the problems are worth solving; (2) the design will solve the problems; and (3) solving the same problems without such a disruptive change is impossible, or so cumbersome that it will be far larger than the work needed to clean up the fallout of the disruptive change.

So what are your X, Y, Z? You still haven't answered that question. For that matter, you didn't answer the same question that was more tersely phrased by Linus in the very first response in the thread:

> Linus Torvalds wrote:
>> I don't dispute that a new link object might be a good idea, but there's no explanation of the actual format of this thing anywhere, and what the real advantages would be. A clearer "this is the design, this is the format of the link object, and this is what it buys us" would be a good idea.
>
> Yeah, I need help with that. I've just stuffed in whatever fields popped into my mind first. The current ones are:

And what you listed were your back-then-current thinking on actual format. What are the real advantages? How are they used? What do they allow us to do that we cannot do with .gitmodules (or repo or gitslave for that matter)? What do they buy us? What problem are you trying to solve?

I have this suspicion that you do not have to change anything in the object layer to make Git behave very differently from the current submodule implementation.

For example, if your gripe were (I am just speculating without any input from you in this thread) that each submodule working tree has .git at their top and there is no unified view from the top-level [*1*], we certainly can solve it without any change to the object layer.

We currently add a cache entry that has the commit object name to the index from the tree object when we check out the superproject, and create a separate repository with a working tree when we instantiate a submodule. This arrangement does not have to be fundamental. It is a design choice of one particular working tree layout, which is totally local to individual superproject working tree.

You could arrange a single index (the one in the superproject) to hold the tree contents from the commit in the submodule, while noting the original commit object name in a new mandatory extension section in the index. The index will have a unified view of the whole tree, and we do not have to have a .git at the root of each submodule working tree (be it a directory or a gitfile).

I think the message where I talked about the bind idea in the list archive URLs I gave you earlier would give you such a layout, and you should go read it again to understand how the flow from object database to index to working tree back to index back to object database was envisioned to
Re: [RFC/PATCH 0/7] Rework git core for native submodules
Your overall hostility is unappreciated. The burden of proof is on me, while you calmly sit back and criticize anything that breaks the current working state, and refuse to look at the implementation. Anyway, here we go again. Junio C Hamano wrote: Not at all. And please do not start _coding_. You've successfully killed all my enthusiasm. Congratulations. When the design is not clear enough that a 7-patch series is not ready to be reviewed, certainly 50-patch series will not be. Not until you can explain what you are trying to solve and convince others why other less disruptive approaches are fundamentally unworkable, and why we need to change the object layer. So, my final question is: are you still not convinced that this approach shows a lot of potential, and is worth exploring now? No. I don't know how many times to repeat this: No, Junio. A less disruptive approach is _not_ fundamentally unworkable. You can spend the next five years fixing submodule.c/ git-submodule.sh, or can take a step back and think about why it's in such pathetic shape right now. To reiterate: link does not make possible something that is not fundamentally _possible_ with a .githack and a 100k-line Perl script. At its core, every variant of submodules does this. What I'm essentially proposing: break up the information in .githack into smaller bits and create a new object type so it can be parsed by git-core easily. The .gitmodules file is designed to be easily parsable by the config infrastructure and implemented as such already, thank-you-very-much. You're missing the point. Who parses .gitmodules? submodule.c and git-submodule.sh, as opposed to a link being parsed by git-core. How is it any different? That's what my series is trying to answer. Why do you keep calling an already working solution with derogatory misspelling? That only gives others an impression that you do not understand how the current system works, and pursuade them not to waste time responding to you. Stop it. 
I don't see why you have to get offended by my deliberate misspelling: we're not emotionally attached to software, and I'm merely criticizing what I think is a bad hack. I'm not pointing out the concrete limitations of git-submodule precisely because they can be fixed without any changes to the object layer: this thread will become a discussion about how to fix submodule.c/git-submodule.sh. You want floating submodules? Fine, we'll write a helper script that auto-commits to superproject every time the SHA-1 changes. Everything _can_ be done.

What exactly don't I understand about the current system, apart from the fact that everybody is super-rigid and defensive about what already works?

Let us take a moment to look at the current state of git-submodule (note that this is after many years of hard work). This is just off the top of my head:

1. To add a submodule, you can't git add. You need to git submodule add. And only from the toplevel directory. You can't first clone and then add either: a git submodule add clones, adds lines to .gitmodules, AND stages everything.

2. There is currently no way to remove a submodule. You have to git rm it, remove the lines in .gitmodules, and remove the GITDIR from .git/modules.

3. It is currently impossible to git mv a submodule, because of the amount of gymnastics required to relocate the object store, rewrite the .gitmodules and stage the correct changes.

4. It is currently impossible to do true floating submodules, because we're using a commit object in-tree.

5. You have to execute all submodule commands from the toplevel of the worktree.

6. It is currently impossible to initialize a nested submodule without initializing the container submodule. If I really want this, I have to trade off composability and use repo.

What is going on? Either the people working on git-submodule are horribly incompetent, or there's some fundamental problem.
I believe the problem is the latter and have tried to show that the above quirks can be fixed in a much simpler way with two days of work. What part of this didn't you understand? Sorry, but that is not how open source works in general, and certainly not how this project works. We do not add disruptive change just for the sake of changing it to break a working system, make an extra work to clean up the fallout for ourselves (i.e. your 40 to 50 patch series, but honestly speaking I expect it would be more like a 4 months work for a full time engineer or two), for unproven design (that has not yet to be illustrated) to solve problems (that has not yet to be explained), without knowing that (1) the problems are worth solving; (2) the design will solve the problems; and (3) solving the same problems without such a disruptive change is impossible, or so cumbersome that it will be far larger than the work needed to clean-up the fallout of the disruptive change. So what are your X, Y, Z?
Re: [RFC/PATCH 0/7] Rework git core for native submodules
I suspect you're overly worried about the fallout of such a disruptive change. If so, you could've just said: "Ram, I like the idea. But what breakages do you estimate we'll have to deal with?" instead of attacking the idea and repeatedly questioning its purpose.

So, I'll make a rough guess based on the first iteration I intend to get merged:

- Not all the git submodule subcommands will work. add/status/init/deinit are easy to rewrite, but stuff like --recursive and foreach might be slightly problematic as I already pointed out earlier. We'll have to code depending on how far you think the first iteration should go. After a few iterations, we can make 'git submodule' just print "This command is deprecated. Please read `man gitsubmodules`."

- All existing repositories with submodules will not be supported. My plan to deal with this: Have git-core code detect commit objects in-tree and disable things like diff. As soon as the user executes the first 'git submodule' command, remove all existing submodules, along with .gitmodules and re-add them as link objects. Then print a message saying: "We've just migrated your submodules to the new format. Please commit this."

That's really it. It's certainly not earth-shattering breakage; and I think the inconvenience it causes is more than compensated by its beautiful design and UI/UX.
Re: [RFC/PATCH 0/7] Rework git core for native submodules
John Keeping wrote:

> Meaning that every repository using submodules need to have a flag day when all of the people using it switch to the new Git version at once?

No, I would be totally against a migration that involves a flag-day. What I meant is that having old-style submodules side-by-side with new-style submodules is confusing (think about people using an older version and getting confused), and that we should disallow it. Users will still be able to use existing repositories with new versions of git with a few caveats:

1. They won't be able to add new submodules without migrating all existing submodules.

2. git ls-tree will show the in-tree object incorrectly as a link (i.e. not commit).

That's about it, I think. Obviously, everyone working on the repository has to upgrade to a new version of git before they can use new-style submodules.

> I think you need a much better argument than "it makes the implementation more beautiful" to convince users that a flag day is necessary.

There is no flag day necessary, and that is not my argument at all: new-style submodules bring lots of new functionality to the table.
Re: [RFC/PATCH 0/7] Rework git core for native submodules
John Keeping wrote:

> So not a flag day, but still some point at which the repository transitions to "will not work with Git older than version X". And if you need to add a new submodule then you cannot delay that transition any longer.

Yes, that is true. I don't see any way out of this.

> I haven't seen anywhere a concise list of what functionality this is. Do you have a simple bulleted list of what new features this would allow?

Sure, I'll write it out for you from an end-user perspective:

0. Great UI/UX. No more cd-to-toplevel, and a beautiful set of native commands that are consistent with the overall design of git-core. Which means: clone (to put something in an unstaged place), add (to stage), and commit (to commit the change). There's now exactly one place in your worktree (which is represented as one file in git; think of it as a sort of symlink) to look in for all the information. git cat-link link to figure out its parameters, git edit-link to edit its parameters: no more "find the matching pwd in .gitmodules in toplevel". To remove a submodule, just git rm. And git mv works!

1. True floating submodules. You can have a submodule checked out at `master` or `v3.1`: no more detached HEADs in submodules unless you want fixed submodules. No additional cruft required to do the floating: the information is native, in a link object.

2. Initializing a nested submodule without having to initialize the outer one: no more repo XML nonsense. And it's composable: you don't need to put the information about all submodules in one central place.

3. Ability to have very many large submodule repositories without the performance hit. It makes sense to block stat() from going through when you have floating submodules. This means that many levels of nesting are very easily possible.

4. It's suddenly much easier to add new features to this implementation. You don't need to do the kind of gymnastics you'd have to do if you were hacking on submodule.c/git-submodule.sh.

This is basically how great design plays out.
Re: [RFC/PATCH 0/7] Rework git core for native submodules
On Sun, Apr 07, 2013 at 10:52:50PM +0530, Ramkumar Ramachandra wrote:

> Sure, I'll write it out for you from an end-user perspective:

To play Devil's Advocate for a bit...

> 0. Great UI/UX. No more cd-to-toplevel, and a beautiful set of native commands that are consistent with the overall design of git-core. Which means: clone (to put something in an unstaged place), add (to stage), and commit (to commit the change). There's now exactly one place in your worktree (which is represented as one file in git; think of it as a sort of symlink) to look in for all the information. git cat-link link to figure out its parameters, git edit-link to edit its parameters: no more "find the matching pwd in .gitmodules in toplevel". To remove a submodule, just git rm. And git mv works!

Presumably now without .git/config support, so I can't override the checked-in settings without my own custom branch. Even carrying a dirty working tree seems problematic here since a checked-out link object is a directory, which can't have information like the remote URL in it.

> 1. True floating submodules. You can have a submodule checked out at `master` or `v3.1`: no more detached HEADs in submodules unless you want fixed submodules. No additional cruft required to do the floating: the information is native, in a link object.

Can't I do that now with submodule.<name>.branch and "git submodule update --remote --rebase" and friends?

> 2. Initializing a nested submodule without having to initialize the outer one: no more repo XML nonsense. And it's composable: you don't need to put the information about all submodules in one central place.

How does this interact when there is the following structure:

    super
    `-- sub
        `-- subsub (specified by sub)

and subsub is specified as a submodule in *both* super and sub but with different settings. Do I get different behaviour depending on $PWD?

> 3. Ability to have very many large submodule repositories without the performance hit. It makes sense to block stat() from going through when you have floating submodules. This means that many levels of nesting are very easily possible.

Can't I already control this to some degree? Certainly the following commands take different amounts of time to run:

    git status
    git -c status.submodulesummary=true status

> 4. It's suddenly much easier to add new features to this implementation. You don't need to do the kind of gymnastics you'd have to do if you were hacking on submodule.c/git-submodule.sh.
>
> This is basically how great design plays out.
Re: [RFC/PATCH 0/7] Rework git core for native submodules
John Keeping wrote:

> With the clarifications Ram's provided in this thread, I think there are also some important regressions in functionality in his proposal (at least as it currently stands), particularly losing the .gitconfig overrides.

If we want the entire feature list in the very first iteration, it's going to be huge.

> The only proposed change that seems to me to be impossible with the current .gitmodules approach is the "submodule in a non-initialized submodule" feature, but I've never seen anyone ask for that and it seems likely to open a whole can of worms where the behaviour is likely to vary with $PWD. The current hierarchical approach provides sensible encapsulation of repositories and is simple to understand: once you're in a repository nothing above its root directory affects you.

That can be implemented in the current submodule system too, fwiw.

> It doesn't seem to me that "it's harder than I'd like to add a feature I want" is a good reason to subject all users of submodules to a lot of pain migrating to some new implementation that doesn't work the way they're used to and which will mean they have to deal with complaints when people using an older version of Git can't clone their repository (and I doubt we want this mailing list to be flooded with such complaints either).

Like I've said before: there is nothing that _cannot_ be done with the current submodule system. To see the real advantages of this new submodule system, you have to think like a developer, not an end-user. Focusing just on end-user happiness is a very myopic way to develop software, and I think the git community is better than that.
Re: [RFC/PATCH 0/7] Rework git core for native submodules
This reminds me of the commit generation numbers thread. But how can we determine ancestry? Use the commit timestamp. But what if there are clock skews? Put in a slop. It breaks existing stuff, and it's hard to show any end-user benefit. I fear this proposal will meet with the same fate.
Re: [RFC/PATCH 0/7] Rework git core for native submodules
On 07.04.2013 20:44, Ramkumar Ramachandra wrote:

> Jens Lehmann wrote:
>> The whole feature list is full of red herrings like this which have nothing to do with the advantages of a new object, but talk about UI issues which are easy to solve in both worlds.
>
> Really? git-submodule.sh was written in 2007, and does not have git mv or cd-to-toplevel restriction removed to date. What does that say about git-submodule?

That there is still some work to do, which I never denied and am actively working on (see git mv support in pu, which tackles one of the UI issues you mentioned).

> I specifically said end-user's perspective. Why exactly would I be talking about the advantages of the link object?

Because they are all that matters when it comes to deciding if a link object should be introduced to replace the current model. We should discuss the differences in the UI that result from introducing such an object, not the stuff that is still missing from our current implementation (as that has to be coded either way and cannot be taken in favor of either solution). And we can additionally also talk about the differences in hacking on git, where I concede that putting everything into a single object could lead to shorter code than having to consult a .gitmodules file for that (even though I believe these arguments are much less important than UI changes).

Just to be sure: I think we agree that both approaches are capable of allowing all relevant use cases, because they store the same information?

Disclaimer: I am not opposed to the link object per se, but after all we are talking about severely changing user visible behavior. So I want to see striking evidence that we gain something from it, discussed separately from UI deficiencies of the current code (no cd-to-toplevel please ;-).

So I started putting together a list of advantages and one of disadvantages of the new link object compared to the current model. We can extend and refine that to see what your proposal would mean for us. After all we are talking about severely changing user visible behavior, so we need convincing reasons to do that.

Advantages:

* Information is stored in one place, no need to lookup stuff in another file/blob.

* Easier coding, as we find all information in a single object.

(I did not forget to add the point that you currently need a checked out work tree to access the .gitmodules file, as there is ongoing work to read the configuration directly from the database)

(Another advantage would be that it is easier to merge the link object, but a - still to be coded - .gitmodules aware merge driver would work just as well)

Disadvantages:

* Changes in user visible behavior, possible compatibility problems when Git versions are mixed.

* Special tools are needed to edit submodule information where currently a plain editor is sufficient.

* merge conflicts are harder to resolve and require special git commands, solving them in .gitmodules is way more intuitive as users are already used to conflict markers.

* A link object has no unstaged counterpart that a file easily has. What would that mean for adding a submodule and then unstaging it (or how could we add a submodule unstaged, like you proposed in another email)?

(I think when we also put the submodule name in the object we could also retain the ability to repopulate moved submodules from their old repo, which is found by that name)

I'm not saying that this list is complete, I just wrote down what came to mind. When we e.g. find workable solutions to the Disadvantages we can remove them from the list and append them in parentheses for later reference like I did here. Does that sound like a plan?
Re: [RFC/PATCH 0/7] Rework git core for native submodules
Jens Lehmann wrote:

> Just to be sure: I think we agree that both approaches are capable of allowing all relevant use cases, because they store the same information?

Yes.

> Disclaimer: I am not opposed to the link object per se, but after all we are talking about severely changing user visible behavior. So I want to see striking evidence that we gain something from it, discussed separately from UI deficiencies of the current code (no cd-to-toplevel please ;-).

The only mandatory user-visible behavior change is the absence of .gitmodules. The git submodule subcommand will have to be present and made to work, whether we like it or not.

> (I did not forget to add the point that you currently need a checked out work tree to access the .gitmodules file, as there is ongoing work to read the configuration directly from the database)

Read the configuration from the database? How? Also, I want refs quite badly: I really can't stand repo.

> (Another advantage would be that it is easier to merge the link object, but a - still to be coded - .gitmodules aware merge driver would work just as well)

It's very simple to implement: if you turn it into a blob, you can diff and merge as usual.

> Disadvantages:
> * Changes in user visible behavior, possible compatibility problems when Git versions are mixed.

Agreed.

> * Special tools are needed to edit submodule information where currently a plain editor is sufficient.

Um, I actually really like this. I don't want to cd-to-toplevel, open up my .gitmodules and look for the relevant section. And it's a very simple tool: see the git cat-file that I posted earlier.

> * merge conflicts are harder to resolve and require special git commands, solving them in .gitmodules is way more intuitive as users are already used to conflict markers.

There shouldn't be that many merge conflicts to begin with! It happens because you've stuffed all the information into one gigantic .gitmodules. With links, life is *much* easier: you already have a tight buffer format and a predefined order in which the key/value pairs will appear. But yes, we will need to grow git-core to merge links seamlessly.

> * A link object has no unstaged counterpart that a file easily has. What would that mean for adding a submodule and then unstaging it (or how could we add a submodule unstaged, like you proposed in another email)?

Adding a submodule untracked (not unstaged) is possible, and is default: git clone gets the submodules, and you have to use git add to stage it. I agree that you can't edit-link and have an unstaged change, but I really don't care about that.

> (I think when we also put the submodule name in the object we could also retain the ability to repopulated moved submodules from their old repo, which is found by that name)

Hm, considering that the information is not present anywhere (certainly not in the tree), this is probably a good idea. We'd have the history of the submodule's name too.

> I'm not saying that this list is complete, I just wrote down what came to mind. When we e.g. find workable solutions to the Disadvantages we can remove them from the list and append them in parentheses for later reference like I did here. Does that sound like a plan?

Yes, good plan.
Re: [RFC/PATCH 0/7] Rework git core for native submodules
Jens Lehmann wrote:
> * Easier coding, as we find all information in a single object.

It's not just the difference between a single location versus multiple locations. It's about the core object code of git parsing links, as opposed to a fringe submodule.c/submodule.sh parsing .gitmodules. When you push git-submodule.sh into core, you'll have to constantly call functions to parse .gitmodules and get the information. With links, all that information is free, provided you've parsed the object.
Re: [RFC/PATCH 0/7] Rework git core for native submodules
On Mon, Apr 08, 2013 at 02:19:10AM +0530, Ramkumar Ramachandra wrote:
> Jens Lehmann wrote:
>> * A link object has no unstaged counterpart that a file easily has. What would that mean for adding a submodule and then unstaging it (or how could we add a submodule unstaged, like you proposed in another email)?
>
> Adding a submodule untracked (not unstaged) is possible, and is default: git clone gets the submodules, and you have to use git add to stage it. I agree that you can't edit-link and have an unstaged change, but I really don't care about that.

I do. I quite often use git add -p to sort things out and submodules currently fit into that seamlessly: I can add the submodule and then wait until later to commit it, without needing to either clone and remember to submodule add later or commit and play with rebase.

Losing the ability to do that is a major usability regression as far as I'm concerned.
Re: [RFC/PATCH 0/7] Rework git core for native submodules
Ramkumar Ramachandra wrote:
> It's about the core object code of git parsing links, as opposed to a fringe submodule.c/submodule.sh parsing .gitmodules.

What's stopping the core object code of git from parsing .gitmodules? What is the core object code? How does this compare to other metadata files like .gitattributes and .gitignore?
Re: [RFC/PATCH 0/7] Rework git core for native submodules
Hi again,

So we've thought about it for some time, and I really need you to start reviewing the code now. I'll just summarize what we've discussed so far:

1. The malleability argument doesn't hold, because we're proposing a link object with optional fields.

2. The local-fork argument doesn't hold, because users will be rebasing changes to the link object in exactly the same way as they currently do with the blob object .gitmodules.

3. The worktree argument doesn't hold, because we're proposing to treat the link object as nothing more than a blob object that can be parsed by git-core. It will stage and unstage just like a blob. Sure, it's not accessible directly by the filesystem: so what? What difference does `emacsclient .gitmodules` versus `git edit-link clayoven` make to the end-user?

4. The diff-confusion argument is just another by-the-way, but it doesn't really hold either. Currently, we see:

   - Subproject commit b83492
   + Subproject commit 39ab2f

   (with diff.submodule set to log, we can actually see the log of the submodule between these two commits). With links, we will see:

   - checkout_rev = b83492
   + checkout_rev = 39ab2f

   There's nothing that prevents us from respecting diff.submodule (some minor glue code will have to be written; that's all).

*. There is actually one thing that .gitmodules does better than links: foreach. It's trivial to implement with .gitmodules and hard to implement with links: with .gitmodules, the paths of all the submodules are in one place. But with links, we'll have to unpack_trees() every tree in the entire repository, and dig through it to find all the link objects to initialize. Basically, inefficient and inelegant. However, I don't think this is a big problem in practice, since this is not exactly a common operation: I'd probably want to recurse-submodules once at clone time.
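To make point 4 of the summary above concrete, here is a small Python sketch that diffs two link buffers as plain text. It only illustrates that a textual link buffer can be compared like any other blob; the buffer layout is assumed, `diff_link` is a hypothetical helper, and this is not how git's diff machinery is implemented.

```python
import difflib

# Two versions of a hypothetical link buffer; only checkout_rev differs.
old = "upstream_url = gh:artagnon/clayoven\ncheckout_rev = b83492\n"
new = "upstream_url = gh:artagnon/clayoven\ncheckout_rev = 39ab2f\n"


def diff_link(old_buf, new_buf):
    """Return only the added/removed lines between two link buffers."""
    lines = list(difflib.unified_diff(
        old_buf.splitlines(), new_buf.splitlines(), lineterm="", n=0
    ))
    body = lines[2:]  # drop the '---' / '+++' file header pair
    return [line for line in body if not line.startswith("@@")]


changed = diff_link(old, new)
```

With these inputs `changed` holds one removed and one added `checkout_rev` line, which mirrors the display proposed in the message.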
Re: [RFC/PATCH 0/7] Rework git core for native submodules
Ramkumar Ramachandra artag...@gmail.com writes:
> So we've thought about it for some time, and I really need you to start reviewing the code now. I'll just summarize what we've discussed so far:
> ...

I do not think we have heard anything concrete and usable about what you are trying to achieve yet. You may be proposing to discard the baby with the bathwater.

We haven't seen evidence that the change is really worth having. We do not even know what you are trying to change, other than "I want to add a new object type to largely replicate what is recorded in .gitmodules file". What are you trying to solve?

    I want to have a project for an appliance, that binds two projects, the kernel and the appliance's userspace. The usual suspects to use to implement such a project would be Git submodule, repo, or Gitslave.

    I want to be able to do X and Y and Z in managing such a project. If I try to use submodule, I cannot see how I could make it do X for _this thing_, and it is not a bug in the implementation but is fundamental because of _this and that_. If I try to use repo, .. the same, and the same for Gitslave. ..

    I propose to add a new gitlink object recorded in the tree and in the index, and the said cases X, Y and Z can be solved in _such and such way_. We cannot solve it without having a new gitlink object recorded in the tree object because of _this and that reason_.

I think it is too premature to discuss _your_ code. The patches do not even tell us anything about how much more work is needed to merely make Git with your patches work properly again. For one thing, I suspect that you won't even be able to repack a repository that has OBJ_LINK only with the patches you posted.

At this point the only thing that we can gain from reading your patch is that you can write C to do _something_, but that something is so fuzzily explained that we do not know what to make of that knowledge that you write good (or bad, we don't know) C.

It would be much more productive to learn what these specific issues X, Y and Z are, and if the problems you are having with existing solutions are really fundamental that need changes to object layer to solve.
Re: [RFC/PATCH 0/7] Rework git core for native submodules
Ramkumar Ramachandra wrote:
> After some discussion, I hope to be able to finalize a list of fields that will suffice for (nearly) everything.

The task is actually much easier than this. All we have to do is finalize the list of fields that will mandatorily be written to the link object. As I might have indicated in my series, this is: upstream_url, checkout_rev, and ref_name. Really, the user only needs to supply a valid upstream_url: after a clone, everything else can be inferred (with the exception of a ref_name conflict; I don't like auto-mangling).

Other fields are like .git/config fields. We can add new key/value pairs in the future, without worrying about migration. A problem arises only if we want to add a new mandatory field, change the default value of a key, or deprecate an existing key/value pair.
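The split between mandatory and optional fields described above could look roughly like this in Python. The three mandatory names come from the message; everything else (the `check_link` helper, treating every other key as an opaque optional pair) is an assumption for illustration.

```python
# Field names taken from the message above; the checking logic is a sketch.
MANDATORY = ("upstream_url", "checkout_rev", "ref_name")


def check_link(fields):
    """Return (missing mandatory keys, optional key/value pairs)."""
    missing = [key for key in MANDATORY if key not in fields]
    # Unknown keys are kept as-is, so new optional pairs need no migration.
    optional = {k: v for k, v in fields.items() if k not in MANDATORY}
    return missing, optional


missing, optional = check_link(
    {"upstream_url": "gh:artagnon/clayoven", "floating": "1"}
)
```

In this example `missing` lists the two fields that would still have to be inferred after a clone, and `optional` carries the extra `floating` pair through untouched.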
Re: [RFC/PATCH 0/7] Rework git core for native submodules
On Thu, Apr 4, 2013 at 1:04 PM, Ramkumar Ramachandra artag...@gmail.com wrote:
> Linus Torvalds wrote:
>> Or you could also just edit and carry a dirty .gitmodules around for your personal use-case.
>
> I'm sorry, but a dirty worktree is unnecessarily painful to work with.

Bzzt. Wrong.

A dirty worktree is not only easy to work with (I do it all the time, having random test-patches in my tree that I never even intend to commit), it's a *requirement*.

One thing that git does really really well is merging. And one of the reasons why git does merging well (apart from the obvious meta-issue: it's what I care about) is that it not only has the stable information in the object database, it also has the staging information in the index, *and* it has dirty data in the working tree.

You absolutely need all three. Having an edit command to edit stable data (or staging data) is broken. Trust me, I've been there, done that, got the T-shirt and know it is wrong.

The whole stable objects + index + dirty worktree is FUNDAMENTALLY the right way to work, and it *has* to work that way for merges to work well. The only things that we don't have dirty data for in the worktree are creating commits and tags, but those aren't relevant for the merging process anyway, in the sense that you never change them for merging, you create them *after* merging (and this is fundamental, and not just a git implementation issue).

So you absolutely need a dirty worktree. You need it for testing, and you need it for merging. Having a model where you don't have an in-progress entity that works as a temporary is absolutely and entirely wrong.

Linus
Re: [RFC/PATCH 0/7] Rework git core for native submodules
Linus Torvalds wrote:
> So you absolutely need a dirty worktree. You need it for testing, and you need it for merging. Having a model where you don't have a in-progress entity that works as a temporary is absolutely and entirely wrong.

I agree entirely. My comment was just a by the way, and specific to how people work with .gitmodules: I didn't imply any strong notions of Right or Wrong with respect to dirty worktrees in general.

So, yes: links stage and unstage, just like blobs do. Oh, and I'm currently writing infrastructure to work with links like blobs. Here's a WIP: git cat-link <link> is exactly the same as cat <file>, to the end user.

-- 8< --
From d8a1de6f9075771dde6f1fde9ffa193dce386a17 Mon Sep 17 00:00:00 2001
From: Ramkumar Ramachandra <artag...@gmail.com>
Date: Fri, 5 Apr 2013 19:42:56 +0530
Subject: [PATCH] builtin/cat-link: implement new builtin

This is a simple program that calls unpack_trees() with a custom
callback that just prints the contents of whatever objects were
matched using revs.prune_data. Blobs can be cat'ed directly from the
filesystem, so this program is primarily useful for links; git
cat-link <link> shows it up like a blob. We will use this program to
build edit-link.

Signed-off-by: Ramkumar Ramachandra <artag...@gmail.com>
---
 Makefile           |  3 +-
 builtin.h          |  1 +
 builtin/cat-link.c | 83 ++
 diff-lib.c         | 10 +++
 diff.h             |  6
 git.c              |  1 +
 6 files changed, 98 insertions(+), 6 deletions(-)
 create mode 100644 builtin/cat-link.c

diff --git a/Makefile b/Makefile
index cd4b6f9..28194d7 100644
--- a/Makefile
+++ b/Makefile
@@ -349,7 +349,7 @@ GIT-VERSION-FILE: FORCE
 # CFLAGS and LDFLAGS are for the users to override from the command line.
 
-CFLAGS = -g -O2 -Wall
+CFLAGS = -g -O0 -Wall
 LDFLAGS =
 ALL_CFLAGS = $(CPPFLAGS) $(CFLAGS)
 ALL_LDFLAGS = $(LDFLAGS)
@@ -893,6 +893,7 @@ BUILTIN_OBJS += builtin/blame.o
 BUILTIN_OBJS += builtin/branch.o
 BUILTIN_OBJS += builtin/bundle.o
 BUILTIN_OBJS += builtin/cat-file.o
+BUILTIN_OBJS += builtin/cat-link.o
 BUILTIN_OBJS += builtin/check-attr.o
 BUILTIN_OBJS += builtin/check-ignore.o
 BUILTIN_OBJS += builtin/check-ref-format.o
diff --git a/builtin.h b/builtin.h
index faef559..be0160d 100644
--- a/builtin.h
+++ b/builtin.h
@@ -49,6 +49,7 @@ extern int cmd_blame(int argc, const char **argv, const char *prefix);
 extern int cmd_branch(int argc, const char **argv, const char *prefix);
 extern int cmd_bundle(int argc, const char **argv, const char *prefix);
 extern int cmd_cat_file(int argc, const char **argv, const char *prefix);
+extern int cmd_cat_link(int argc, const char **argv, const char *prefix);
 extern int cmd_checkout(int argc, const char **argv, const char *prefix);
 extern int cmd_checkout_index(int argc, const char **argv, const char *prefix);
 extern int cmd_check_attr(int argc, const char **argv, const char *prefix);
diff --git a/builtin/cat-link.c b/builtin/cat-link.c
new file mode 100644
index 000..14dd92b
--- /dev/null
+++ b/builtin/cat-link.c
@@ -0,0 +1,83 @@
+/*
+ * Copyright (c) 2013 Ramkumar Ramachandra
+ */
+#include "cache.h"
+#include "tree.h"
+#include "cache-tree.h"
+#include "unpack-trees.h"
+#include "commit.h"
+#include "diff.h"
+#include "revision.h"
+
+static int cat_file(struct cache_entry **src, struct unpack_trees_options *o) {
+	int cached, match_missing = 1;
+	unsigned dirty_submodule = 0;
+	unsigned int mode;
+	const unsigned char *sha1;
+	struct cache_entry *idx = src[0];
+	struct cache_entry *tree = src[1];
+	struct rev_info *revs = o->unpack_data;
+	enum object_type type;
+	unsigned long size;
+	char *buf;
+
+	cached = o->index_only;
+	if (ce_path_match(idx ? idx : tree, &revs->prune_data)) {
+		if (get_stat_data(idx, &sha1, &mode, cached, match_missing,
+					&dirty_submodule, NULL) < 0)
+			die("Something went wrong!");
+		buf = read_sha1_file(sha1, &type, &size);
+		printf("%s", buf);
+	}
+	return 0;
+}
+
+int cmd_cat_link(int argc, const char **argv, const char *prefix)
+{
+	struct unpack_trees_options opts;
+	int cached = 1;
+	struct rev_info revs;
+	struct tree *tree;
+	struct tree_desc tree_desc;
+	struct object_array_entry *ent;
+
+	if (argc < 2)
+		die("Usage: git cat-link <link>");
+
+	init_revisions(&revs, prefix);
+	setup_revisions(argc, argv, &revs, NULL); /* For revs.prune_data */
+	add_head_to_pending(&revs);
+
+	/* Hack to diff against index; we create a dummy tree for the
+	   index information */
+	if (!revs.pending.nr) {
+		struct tree *tree;
+		tree = lookup_tree(EMPTY_TREE_SHA1_BIN);
+		add_pending_object(&revs, &tree->object, "HEAD");
+	}
+
+	if (read_cache() < 0)
+
Re: [RFC/PATCH 0/7] Rework git core for native submodules
Ramkumar Ramachandra artag...@gmail.com writes:
> Junio C Hamano wrote:
>> git log -p .gitmodules would be a way to review what changed in the information about submodules. Don't you need git log-link for exactly the same reason why you need git diff-link in the first place? So you may not have suggested it, but I suspect that was only because you haven't had enough time to think things through.
>
> What is this git log -p .gitmodules doing? It's walking down the commit history, and picking out the commits in which that blob changed. Then it's diffing the blobs in those commits with each other. Why is git log -p <link> any different? We already know how to diff blobs, and we just need a way to diff links.

You already forget what you invented git diff-link as a solution for, perhaps?

By recording the submodules themselves and information _about_ the submodules separately (the latter is in .gitmodules), git diff A can show the difference in submodule A, while git diff .gitmodules can show a change, which is a possibly in-working-tree-only proposed change, in information about submodules.

Once you start recording the latter also at path A, it becomes unclear what git diff A should show. That is what I said in the message, to which you invented diff-link as a solution to the unclear-ness.

Am I misremembering the flow of discussion in this thread?
Re: [RFC/PATCH 0/7] Rework git core for native submodules
Junio C Hamano wrote:
> Once you start recording the latter also at path A, it becomes unclear what git diff A should show. That is what I said in the message, to which you invented diff-link as a solution to the unclear-ness.

I just thought it would be a stopgap until we get diff to support links natively. Obviously, when we get native diff support, 'log -p' will be able to show differences as well. As it turns out from my little experiment with cat-link, it's really easy to get native diff support, and I'm targeting that directly instead of a scripted solution.

As for the unclearness issue, it's a little more complicated than that: a non-floating submodule could've previously been a floating one, or vice-versa. As of this moment, I'm only planning to show differences between link buffers. In the case when two consecutive commits change a link checkout_rev (where floating is set to 0), we can come up with something like the current diff.submodule = log. I see no cause to worry about the interface of that now.
Re: [RFC/PATCH 0/7] Rework git core for native submodules
On Thu, Apr 4, 2013 at 11:30 AM, Ramkumar Ramachandra artag...@gmail.com wrote:
> The purpose of this series is to convince you that we've made a lot of fundamental mistakes while designing submodules, and that we should fix them now. [1/7] argues for a new object type, and this is the core of the idea.

I don't dispute that a new link object might be a good idea, but there's no explanation of the actual format of this thing anywhere, and what the real advantages would be. A clearer "this is the design, this is the format of the link object, and this is what it buys us" would be a good idea.

Also, one of the arguments against using link objects originally was that the format wasn't stable, and in particular the address of the actual submodule repository might differ for different people. So when adding a new object type, explaining *why* the format of such an object is globally stable (since it will be part of the SHA1 of the object) is a big deal.

Linus
Re: [RFC/PATCH 0/7] Rework git core for native submodules
Ramkumar Ramachandra wrote:
> The purpose of this series is to convince you that we've made a lot of fundamental mistakes while designing submodules, and that we should fix them now. [1/7] argues for a new object type, and this is the core of the idea.

Oh, dear. Shouldn't it be possible to explain the same thing using a test script illustrating intended UI?

[...]
> $ git clone gh:artagnon/varlog
> $ cd varlog
> $ git clone gh:artagnon/clayoven
> # Notice how it puts clayoven.git in ~/bare

I really would like to be able to continue doing something like

	git clone --recurse-submodules git://repo.or.cz/cgit.git
	# never mind!
	rm -fr cgit

without leaving any clutter behind. I have used systems that kept state in my home directory before and found them a pain in the neck to debug. Others may disagree, though.

[...]
> # Again, just works! No cd-to-toplevel nonsense

Didn't Jens mention that git-submodule requiring that one work at the toplevel is just a (presumably easily fixable) bug?

[...]
> If you think this is all a big waste of time, and that we should focus on improving git-submodule.sh, you're probably deranged. Because it's

I don't think that *you* should focus on improving git-submodule, as long as you are not using it and dislike its design. But I do think it's strange to at the same time

 1) tell me I'm deranged for liking submodules
 2) dismiss other experiments that have been created as alternatives

I like experimentation, which means sometimes having tools whose purposes overlap, and I like when it's possible to help something evolve to be better, even far enough to interoperate with or replace uses of another tool. I also believe in live and let live. That means that even if someone is a little crazy, if they are not actively harmful, I do not destroy their tools. That probably marks me as deranged.

Hope that helps,
Jonathan
Re: [RFC/PATCH 0/7] Rework git core for native submodules
Linus Torvalds wrote:
> I don't dispute that a new link object might be a good idea, but there's no explanation of the actual format of this thing anywhere, and what the real advantages would be. A clearer "this is the design, this is the format of the link object, and this is what it buys us" would be a good idea.

Yeah, I need help with that. I've just stuffed in whatever fields popped into my mind first. The current ones are:

1. upstream_url: this records the upstream URL. No need to keep a .gitmodules.

2. checkout_rev: this records the ref to check out the submodule to. As opposed to a concrete SHA-1, this allows for more flexibility; you can put refs/heads/master and have truly floating submodules.

3. ref_name: this specifies what name the ref under refs/modules/branch/ should use.

4. floating: this bit specifies whether to record a concrete SHA-1 in checkout_rev.

5. statthrough: this bit specifies whether git should stat() through the worktree. We can turn it off on big repositories for performance reasons.

> Also, one of the arguments against using link objects originally was that the format wasn't stable, and in particular the address of the actual submodule repository might differ for different people. So when adding a new object type, explaining *why* the format of such an object is globally stable (since it will be part of the SHA1 of the object) is a big deal.

After some discussion, I hope to be able to finalize a list of fields that will suffice for (nearly) everything.
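For readers who want to picture the five fields listed above, here is a minimal Python sketch that parses a textual link buffer into a typed record. The field names are the ones from the message; the `key = value` syntax, the "0"/"1" encoding of the two bits, their default values, and the `Link`/`parse_link` names are all assumptions, since the thread never pins down the on-disk format.

```python
from dataclasses import dataclass


@dataclass
class Link:
    upstream_url: str          # where to clone the submodule from
    checkout_rev: str          # a ref name or a concrete SHA-1
    ref_name: str              # name used for the ref under refs/modules/
    floating: bool = False     # assumed default: not floating
    statthrough: bool = True   # assumed default: stat() through the worktree


def parse_link(buf):
    """Parse 'key = value' lines (an assumed syntax) into a Link record."""
    raw = {}
    for line in buf.splitlines():
        if "=" not in line:
            continue  # skip blank or malformed lines
        key, value = line.split("=", 1)
        raw[key.strip()] = value.strip()
    return Link(
        upstream_url=raw["upstream_url"],
        checkout_rev=raw["checkout_rev"],
        ref_name=raw["ref_name"],
        floating=raw.get("floating", "0") == "1",
        statthrough=raw.get("statthrough", "1") == "1",
    )


link = parse_link(
    "upstream_url = gh:artagnon/clayoven\n"
    "checkout_rev = refs/heads/master\n"
    "ref_name = clayoven\n"
    "floating = 1\n"
)
```

A missing mandatory key raises `KeyError` here, which is one simple way to surface an incomplete link.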
Re: [RFC/PATCH 0/7] Rework git core for native submodules
Ramkumar Ramachandra wrote:
> 1. 'git add' should not go past submodule boundaries. I should not be able to 'git add clayoven/' or 'git add clayoven/LICENSE'. In addition, the shell completion also needs to be fixed.

Yep. This is a bug.

> 2. An empty directory containing a .git file is a perfectly valid worktree, but does not show up in the superproject's 'git status' output. How can it be treated like an empty directory?

Stated like that, it doesn't sound like a bug. Git since very early has deliberately not tracked files or directories named .git. Do you need this as a way of importing from a foreign VCS when someone has accidentally checked in a .git directory along with everything else?

Thanks,
Jonathan
Re: [RFC/PATCH 0/7] Rework git core for native submodules
Jonathan Nieder wrote:
> Ramkumar Ramachandra wrote:
>> The purpose of this series is to convince you that we've made a lot of fundamental mistakes while designing submodules, and
> [...]
> Shouldn't it be possible to explain the same thing using a test script illustrating intended UI?

Sorry, I sent this reply too quickly. Your explanation to Linus clarified the idea.

Regards,
Jonathan
Re: [RFC/PATCH 0/7] Rework git core for native submodules
On Thu, Apr 4, 2013 at 11:52 AM, Ramkumar Ramachandra artag...@gmail.com wrote:
> 1. upstream_url: this records the upstream URL. No need to keep a .gitmodules.
>
> 2. checkout_rev: this records the ref to check out the submodule to. As opposed to a concrete SHA-1, this allows for more flexibility; you can put refs/heads/master and have truly floating submodules.
>
> 3. ref_name: this specifies what name the ref under refs/modules/branch/ should use.
>
> 4. floating: this bit specifies whether to record a concrete SHA-1 in checkout_rev.
>
> 5. statthrough: this bit specifies whether git should stat() through the worktree. We can turn it off on big repositories for performance reasons.

So the thing is (and this was pretty much the original basis for .gitmodules) that pretty much *all* of the above fields are quite possibly site-specific, rather than globally stable.

So I actually conceptually like (and liked) the notion of a link object, but I just don't think it is necessarily practically useful, exactly because different installations of the *same* supermodule might well want to have different setups wrt these submodule fields.

My gut feel is that yes, .gitmodules was always a bit of a hack, but it's a *working* hack, and it does have advantages exactly because it's more fluid than an actual git object (which by definition has to be set 100% in stone).

If there are things you feel it does wrong (like the git add bug that is being discussed elsewhere), I wonder if it's not best to at least try to fix/extend them in the current model. The features you seem to be after (ie that whole floating/refname thing) don't seem fundamentally antithetical to the current model (a commit SHA1 of all zeroes for floating, with a new refname field in .submodules? I dunno)..

Linus
Re: [RFC/PATCH 0/7] Rework git core for native submodules
Linus Torvalds wrote:
> So the thing is (and this was pretty much the original basis for .gitmodules) that pretty much *all* of the above fields are quite possibly site-specific, rather than globally stable.
>
> So I actually conceptually like (and liked) the notion of a link object, but I just don't think it is necessarily practically useful, exactly because different installations of the *same* supermodule might well want to have different setups wrt these submodule fields.
>
> My gut feel is that yes, .gitmodules was always a bit of a hack, but it's a *working* hack, and it does have advantages exactly because it's more fluid than an actual git object (which by definition has to be set 100% in stone).
>
> If there are things you feel it does wrong (like the git add bug that is being discussed elsewhere), I wonder if it's not best to at least try to fix/extend them in the current model. The features you seem to be after (ie that whole floating/refname thing) don't seem fundamentally antithetical to the current model (a commit SHA1 of all zeroes for floating, with a new refname field in .submodules? I dunno)..

Let's compare the two alternatives: .gitmodules versus link object. If I want my fork of .gitmodules, I create a commit on top. If I want my fork of the link object, I create a link object, plus tree object, plus commit object on top of that. But the commit still rebases fine.

On malleability, have you looked at [5/7], where I create edit-link (dead code; half done)? The buffer looks just like a .gitmodules buffer. Fundamentally, what is the difference between this and a blob? git-core can parse it into structured data that it can slurp easily.

I don't want full float or nothing. I want in-betweens too, and refs are great.
Re: [RFC/PATCH 0/7] Rework git core for native submodules
Linus Torvalds wrote:
> don't seem fundamentally antithetical to the current model

I don't think it's fundamentally antithetical either. This basically makes the life of git-submodule much simpler, and will eventually make it obsolete altogether.
Re: [RFC/PATCH 0/7] Rework git core for native submodules
On Thu, Apr 4, 2013 at 12:36 PM, Ramkumar Ramachandra artag...@gmail.com wrote:
> Let's compare the two alternatives: .gitmodules versus link object. If I want my fork of .gitmodules, I create a commit on top.

Or you could also just edit and carry a dirty .gitmodules around for your personal use-case. I don't know if anybody does that, but it should work fine.

And I don't see what you can do with the link objects that you cannot do with .gitmodules. That's what it really boils down to. .gitmodules do actually work. Your extensions would work with them too.

Linus
Re: [RFC/PATCH 0/7] Rework git core for native submodules
Linus Torvalds wrote:
> On Thu, Apr 4, 2013 at 12:36 PM, Ramkumar Ramachandra artag...@gmail.com wrote:
>> Let's compare the two alternatives: .gitmodules versus link object. If I want my fork of .gitmodules, I create a commit on top.
>
> Or you could also just edit and carry a dirty .gitmodules around for your personal use-case.

Just take the link's buffer with you everywhere. All you have to do is git edit-link <name> and paste the file's contents there, instead of opening .gitmodules directly in your editor.

> And I don't see what you can do with the link objects that you cannot do with .gitmodules. That's what it really boils down to. .gitmodules do actually work. Your extensions would work with them too.

If it came to that, you could write a huge Perl script to solve everything with a .githack. It breaks the internal symmetry of the repository, which is why git-submodule is having such a field day. I'm trying to prove, in my series, that making fundamental changes lets us get rid of a huge amount of complexity.
Re: [RFC/PATCH 0/7] Rework git core for native submodules
Junio C Hamano wrote:
> I think Heiko and Jens's (by the way, why aren't they on the Cc: list when this topic is clearly discussing submodules? Don't we want to learn how the current submodule subsystem is used to solve what real-world problems?) .gitmodules updates is exactly going in that direction.

Because it's pointless. We're not discussing a git-submodule alternative. We're discussing how to fix git-core so that git-submodule becomes much simpler; to the extent that it will be unnecessary soon.

git-submodule is years of hard work and it can do a limited version of floating with great difficulty. Mine is two days of work, and can already do true floating submodules. What is going on?
Re: [RFC/PATCH 0/7] Rework git core for native submodules
Linus Torvalds wrote:
> Or you could also just edit and carry a dirty .gitmodules around for your personal use-case.

I'm sorry, but a dirty worktree is unnecessarily painful to work with. I don't think anyone objects to committing, if they can understand basic rebase.
Re: [RFC/PATCH 0/7] Rework git core for native submodules
Ramkumar Ramachandra wrote:
> Just take the link's buffer with you everywhere. All you have to do is git edit-link <name> and paste the file's contents there, instead of opening .gitmodules directly in your editor.

On this. The buffer doesn't have to conform to a tight spec: we can just expose a .gitconfig-like buffer and reduce it to a tight spec before writing out.
Re: [RFC/PATCH 0/7] Rework git core for native submodules
On 04.04.2013 21:17, Junio C Hamano wrote:

Linus Torvalds torva...@linux-foundation.org writes:

... The features you seem to be after (ie that whole floating/refname thing) don't seem fundamentally antithetical to the current model (a commit SHA1 of all zeroes for floating, with a new refname field in .submodules? I dunno)..

Just on this part. I think Heiko and Jens's (by the way, why aren't they on the Cc: list when this topic is clearly discussing submodules? Don't we want to learn how the current submodule subsystem is used to solve what real-world problems?) .gitmodules updates is exactly going in that direction.

 - A submodule can be marked as floating in .gitmodules, together with how it floats (typically, use the tip of this branch in the submodule);

 - Running submodule update on a floating submodule does not detach the submodule working tree to the commit in the index of the superproject; instead it will use the specified branch tip;

 - A floating submodule records a concrete commit object name in the index of the superproject (no need to stuff an unusual SHA-1 there to signal that the submodule is floating---it is recorded in the .gitmodules). Thanks to this, a release out of the top-level can still describe the state of the entire tree;

 - It would be normal for the commit recorded in the index of the superproject not to match what is checked out in the submodule working tree (i.e. the tip of the branch in the submodule may have advanced). A traditional non-floating submodule has many mechanisms to be noisy about this situation to prevent users from making incomplete commits, but they may have to be toned down or squelched for floating submodules.

Anything I missed, Jens, Heiko?

Nope, that perfectly sums it up.
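The four points above can be read as a small decision rule. The following is only a sketch under stated assumptions: the Submodule class, its field names, and both helper functions are invented for illustration and are not git's actual data structures or code.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class Submodule:
    name: str
    recorded_commit: str                   # commit recorded in the superproject index
    floating_branch: Optional[str] = None  # set when marked floating in .gitmodules


def update_target(sub: Submodule, branch_tips: Dict[str, str]) -> Tuple[str, bool]:
    """Return (commit to check out, whether HEAD ends up detached)."""
    if sub.floating_branch is not None:
        # Floating: follow the tip of the configured branch, stay on the branch.
        return branch_tips[sub.floating_branch], False
    # Traditional: detach to exactly what the superproject recorded.
    return sub.recorded_commit, True


def should_warn_on_mismatch(sub: Submodule, checked_out: str) -> bool:
    """A floating submodule is expected to drift, so the warning is squelched."""
    if sub.floating_branch is not None:
        return False
    return checked_out != sub.recorded_commit


# Hypothetical state: the branch tip has advanced past the recorded commit.
tips = {"master": "bbbb"}
pinned = Submodule("lib", "aaaa")
floating = Submodule("lib", "aaaa", floating_branch="master")

print(update_target(pinned, tips))
print(update_target(floating, tips))
```

Note that recorded_commit is still a concrete commit in both cases, which is what lets a release of the top-level describe the whole tree.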
Re: [RFC/PATCH 0/7] Rework git core for native submodules
On 04.04.2013 21:04, Linus Torvalds wrote:

My gut feel is that yes, .gitmodules was always a bit of a hack, but it's a *working* hack, and it does have advantages exactly because it's more fluid than an actual git object (which by definition has to be set 100% in stone).

Exactly. The flexibility of the .gitmodules file will really help us when it comes to the next feature that submodules are going to learn after recursive update: automatically initialize and then populate certain submodules during the clone of the superproject. You have to be able to configure that per submodule, which needs a new config option in .gitmodules. Others may follow for different use cases.

While starting to grok submodules I was wondering myself if the data stored in .gitmodules would better be stored in an extended gitlink object, but I learned soon that the scope of the data that has to be stored there was not clear at that time (and still isn't). So I'm not opposed per se to adding a special object containing all that information, but I strongly believe we are not even close to considering such a step (and won't be for quite some time and maybe never will).
Re: [RFC/PATCH 0/7] Rework git core for native submodules
Jens Lehmann wrote:

Exactly. The flexibility of the .gitmodules file will really help us when it comes to the next feature that submodules are going to learn after recursive update:

That's like saying that the flexibility of a blob is invaluable: let's throw out all the other objects, and make do with blobs. Of course we make mistakes: we didn't put a generation number in the commit object, for instance (I'm not arguing about whether it's right or wrong: just that some people think it's a mistake).

While starting to grok submodules I was wondering myself if the data stored in .gitmodules would better be stored in an extended gitlink object, but I learned soon that the scope of the data that has to be stored there was not clear at that time (and still isn't). So I'm not opposed per se to adding a special object containing all that information, but I strongly believe we are not even close to considering such a step (and won't be for quite some time and maybe never will).

Nonsense. We will think through it before freezing the format, like we did with the other objects.
Re: [RFC/PATCH 0/7] Rework git core for native submodules
Jens Lehmann jens.lehm...@web.de writes:

While starting to grok submodules I was wondering myself if the data stored in .gitmodules would better be stored in an extended gitlink object, but I learned soon that the scope of the data that has to be stored there was not clear at that time (and still isn't). So I'm not opposed per se to adding a special object containing all that information, but I strongly believe we are not even close to considering such a step (and won't be for quite some time and maybe never will).

I actually think the storage is more or less an orthogonal issue.

The format must be defined to be extensible (nobody is perfect and if you wait for an exhaustive list of attributes that cover all use cases including the ones that haven't even been invented yet, you will get nowhere), and designed carefully to reduce the chance of allowing the extended/optional bit to express the same thing in two different ways to make sure the object name will not become unnecessarily unstable, but you can start small, keep adding optional fields, and be prepared to design an upgrade path when you need to add new mandatory fields---that cannot be helped whether you record the information about submodules in .gitmodules or a new blob-ish object at the location where the submodule tree should reside in the index and the tree.

However, the current .gitmodules design, even though it originally was invented as a way to carry information other than what a single commit object name from an otherwise unrelated project can express without having to change anything in-core, has a few practical merits.

The information _about_ submodules is stored separately (i.e. in the .gitmodules file) from submodules themselves, and it may be a good thing. When you are changing information _about_ submodules (e.g. you may be updating the recommended URL to fetch it from), you can use the usual tools like git diff to see how it changed, just like changes to any other file.

If the information _about_ a submodule A is stored at path A, and at the same time you have a working tree that corresponds to the root of the submodule A at that path, it gets unclear what git diff A should report. Should it report the change in the submodule itself, or should it report the change in the information _about_ the submodule? By separating these two concepts to two different places, .gitmodules design solves the issue nicely.
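The concern about optional fields making the object name unstable can be illustrated with a toy hash. This is a sketch only: the object type link, the field names, the default value, and the serialization are all assumptions made for the example, not the format proposed in the series. The header-plus-body SHA-1 merely mimics the general style in which git names objects.

```python
import hashlib

# Hypothetical optional field and its default value.
DEFAULTS = {"checkout": "pinned"}


def canonical_bytes(fields: dict) -> bytes:
    """Serialize fields so that equivalent inputs give identical bytes.

    Fields set to their default are dropped and the rest are sorted by key,
    which leaves only one way to express any given meaning.
    """
    kept = {k: v for k, v in fields.items() if DEFAULTS.get(k) != v}
    body = "".join(f"{key} {kept[key]}\n" for key in sorted(kept))
    return body.encode("utf-8")


def object_name(fields: dict) -> str:
    """Hash a type/length header plus the canonical body."""
    body = canonical_bytes(fields)
    header = f"link {len(body)}\0".encode("utf-8")
    return hashlib.sha1(header + body).hexdigest()


explicit = {"url": "git://example.org/sub.git", "checkout": "pinned"}
implicit = {"url": "git://example.org/sub.git"}

# Same meaning, same name, even though one spells out the default.
print(object_name(explicit) == object_name(implicit))
```

Without the canonicalization step, spelling out a default or reordering fields would yield a different name for the same information, which is the instability the message warns about.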
Re: [RFC/PATCH 0/7] Rework git core for native submodules
Junio C Hamano wrote:

When you are changing information _about_ submodules (e.g. you may be updating the recommended URL to fetch it from), you can use the usual tools like git diff to see how it changed, just like changes to any other file. If the information _about_ a submodule A is stored at path A, and at the same time you have a working tree that corresponds to the root of the submodule A at that path, it gets unclear what git diff A should report. Should it report the change in the submodule itself, or should it report the change in the information _about_ the submodule? By separating these two concepts to two different places, .gitmodules design solves the issue nicely.

git diff-link. Just turn it into a buffer and diff as usual.
Re: [RFC/PATCH 0/7] Rework git core for native submodules
Ramkumar Ramachandra artag...@gmail.com writes:

Junio C Hamano wrote:

When you are changing information _about_ submodules (e.g. you may be updating the recommended URL to fetch it from), you can use the usual tools like git diff to see how it changed, just like changes to any other file. If the information _about_ a submodule A is stored at path A, and at the same time you have a working tree that corresponds to the root of the submodule A at that path, it gets unclear what git diff A should report. Should it report the change in the submodule itself, or should it report the change in the information _about_ the submodule? By separating these two concepts to two different places, .gitmodules design solves the issue nicely.

git diff-link. Just turn it into a buffer and diff as usual.

Sounds like you are saying that you can pile a new command on top of new command to solve what the existing tools people are familiar with can already solve in a consistent way without adding anything new.

Are you going to duplicate various options to git diff and git log in git diff-link? Will you then next need git log-link?
Re: [RFC/PATCH 0/7] Rework git core for native submodules
Junio C Hamano wrote:

Sounds like you are saying that you can pile a new command on top of new command to solve what the existing tools people are familiar with can already solve in a consistent way without adding anything new. Are you going to duplicate various options to git diff and git log in git diff-link? Will you then next need git log-link?

What I'm saying is: As always, we start with plumbing and work our way up to porcelain. We do have git diff-files, diff-index, diff-tree, so I don't see what the problem with diff-link is. The point is that we can get an initial scripted version out quickly. And no, I never suggested a git log-link.
Re: [RFC/PATCH 0/7] Rework git core for native submodules
Ramkumar Ramachandra artag...@gmail.com writes:

Junio C Hamano wrote:

Sounds like you are saying that you can pile a new command on top of new command to solve what the existing tools people are familiar with can already solve in a consistent way without adding anything new. Are you going to duplicate various options to git diff and git log in git diff-link? Will you then next need git log-link?

What I'm saying is: As always, we start with plumbing and work our way up to porcelain. We do have git diff-files, diff-index, diff-tree, so I don't see what the problem with diff-link is. The point is that we can get an initial scripted version out quickly. And no, I never suggested a git log-link.

git log -p .gitmodules would be a way to review what changed in the information about submodules. Don't you need git log-link for exactly the same reason why you need git diff-link in the first place?

So you may not have suggested it, but I suspect that was only because you haven't had enough time to think things through.
Re: [RFC/PATCH 0/7] Rework git core for native submodules
Junio C Hamano wrote:

git log -p .gitmodules would be a way to review what changed in the information about submodules. Don't you need git log-link for exactly the same reason why you need git diff-link in the first place? So you may not have suggested it, but I suspect that was only because you haven't had enough time to think things through.

What is this git log -p .gitmodules doing? It's walking down the commit history, and picking out the commits in which that blob changed. Then it's diffing the blobs in those commits with each other. Why is git log -p link any different? We already know how to diff blobs, and we just need a way to diff links.
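The walk described above can be sketched on a toy, strictly linear history. This is not how git implements it: real history has merges, and git compares object names rather than file text. The commit ids, file contents, and the log_patch helper are made up for illustration.

```python
import difflib


def log_patch(history, path):
    """Walk a newest-first list of (commit_id, {path: content}) snapshots.

    Keep only the commits in which the content at `path` differs from the
    parent's, and attach a unified diff of the old and new content.
    """
    patches = []
    for i, (commit_id, tree) in enumerate(history):
        # The parent is the next entry; the oldest commit has an empty parent.
        parent_tree = history[i + 1][1] if i + 1 < len(history) else {}
        old = parent_tree.get(path, "")
        new = tree.get(path, "")
        if old == new:
            continue  # this commit did not touch the path
        diff = "".join(difflib.unified_diff(
            old.splitlines(keepends=True),
            new.splitlines(keepends=True),
            fromfile=f"a/{path}",
            tofile=f"b/{path}",
        ))
        patches.append((commit_id, diff))
    return patches


# Toy history, newest first; commit ids and contents are hypothetical.
OLD = '[submodule "lib"]\nurl = git://example.org/old.git\n'
NEW = '[submodule "lib"]\nurl = git://example.org/new.git\n'
history = [
    ("c3", {".gitmodules": NEW, "README": "hello\n"}),
    ("c2", {".gitmodules": OLD, "README": "hello\n"}),
    ("c1", {".gitmodules": OLD}),
]

for commit_id, patch in log_patch(history, ".gitmodules"):
    print(f"commit {commit_id}")
    print(patch)
```

The only blob-specific step is the text diff; the argument in the message is that a link would need its own equivalent of that one step while the history walk stays the same.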