On Wed, 2008-09-10 at 18:58 -0600, Jeff Anderson wrote:
> Malcolm Tredinnick wrote:
[...]
> > You don't even begin to approach why this might be a good idea for
> > Django. So, what does it gain?
> >
> > Right now, you can already use your distributed VCS of choice with
> > Django and subversion. Some of us have been doing that for literally
> > years. The only time I ever use "svn" is on the very rare times I want
> > to alter subversion metadata properties. However, subversion is a very
> > good lowest common denominator for everybody to use as the central
> > repository and it makes a lot of sense to continue to have a central
> > repo.
> >
> The problem with going with the lowest common denominator is just that--
> the lowest common denominator also means less features.

*shrug* That doesn't mean it's less features than we need.

> In this case, it
> means I'm stuck with subversion's linear development. non-linear
> development is a requirement for the distributed model of software
> development.

At some level. It still linearises eventually, since changelogs are an
ordered file of changes and only one thing at a time lands in the final
release block of code.

> If I want to start a branch in my own repo, I can do that. The problem
> happens when merge conflicts start happening. I'm forced to do things
> "the subversion way" when I'm stuck with a subversion backend. I
> **must** rebase my work rather than merge.

You can work out the merge conflicts and fix them up that way.

> This isn't really a good
> thing in the distributed environment. It also breaks my ability to
> directly check in things from my branches and repos-- when people are
> constantly rebasing their work, I lose any ability to really track their
> branches, and almost all advantages of using a distributed RCS.

Keep in mind that the work tracking from the central repository is only
one component of development work. As will come up again below, far more
work actually goes into preparing a final feature than the code change
that eventually lands.

Having to linearise on your side for one branch, rather than having it
automatically done by the tool is a concession. But it's a useful
concession since it enables a much larger audience to also participate.
Using distributed tools and understanding how to use them well is hard.
You've apparently done a bit of research and use here. I know I have,
too. And, yet, we have some different opinions about workflow and
capabilities. And we're two people. Now multiply that by 10,000. Factor
in those who haven't used any version control system before. Subversion
itself is tricky enough. Low barrier to entry and contribution is a
requirement.

Those of us wanting to use a more distributed model can do so (and are
doing so), but some accommodation of the others is necessary.

Short version: there are some trade-offs. They're all possible to work
around if you choose to. The advantages usually outweigh the trade-offs
for those of us wanting to use that model and, for those that are more
comfortable doing things other ways, we aren't forcing them away.

> Continuing to have a single, central repo isn't exactly moving to the
> distributed model of development. I didn't realize that I needed to
> explain the differences between the central repository model and the
> distributed model, but I'll try.

Yeah, thanks. I was wondering where I'd left those instructions about
how to suck eggs. :-)

Yes, I'm joking. Maybe you thought I was clueless, so you tried to fill
in the blanks. Fair enough. That's being helpful.

Seriously, I've been using distributed version control systems for a few
years now. I track a number of projects that use them. I use them
personally for both open source and client work a lot. Some are truly
decentralised, others are merely distributed with a more obvious central
node a la Django. All are distributed.

You're still talking about how this affects your workflow, not about why
it's better for Django (you listed a bunch of possibilities, but not how
they're advantageous to anything beyond the fact that it will mostly
eliminate the periodic merge conflict; and they won't be that common).

You *can* still work on branches and exchange with other people using
distributed systems. You'll have to have a branch that tracks Django and
periodically merges from that to your particular published development
branches. That's fine.

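
To make that tracking-branch arrangement a little more concrete, here is
a toy Python model. It is only a sketch: it is not git or git-svn, and
every name, author and message in it is invented for illustration. The
assumption it encodes is that a commit id is derived from the commit's
content and parents, so two people who import the same upstream
revisions independently end up with the same ids on their tracking
branches, and a development branch then merges from that.

```python
import hashlib


def commit_id(parents, author, message):
    """Derive an id from a commit's content (toy stand-in for a real DVCS hash)."""
    payload = "\n".join(["parents:" + ",".join(parents), "author:" + author, message])
    return hashlib.sha1(payload.encode("utf-8")).hexdigest()[:12]


def import_upstream(revisions):
    """Build a tracking branch: one commit per upstream revision, in order."""
    branch = []
    for author, message in revisions:
        parents = [branch[-1]] if branch else []
        branch.append(commit_id(parents, author, message))
    return branch


def merge(dev_tip, tracking_tip, author):
    """Record a merge of the tracking branch into a development branch."""
    return commit_id([dev_tip, tracking_tip], author, "Merge upstream into feature")


# Hypothetical upstream history (authors and messages are made up).
upstream = [("alice", "r1: initial import"), ("bob", "r2: fix forms bug")]

# Two developers import the same upstream revisions independently.
tracking_a = import_upstream(upstream)
tracking_b = import_upstream(upstream)

# Developer A works on a separate development branch, never on the tracking branch.
feature_tip = commit_id([tracking_a[-1]], "dev_a", "WIP: new feature")

# Upstream moves on; refresh the tracking branch, then merge it into the feature branch.
upstream.append(("carol", "r3: update docs"))
tracking_a = import_upstream(upstream)
merged_tip = merge(feature_tip, tracking_a[-1], "dev_a")

# The shared part of history has identical ids for both developers.
print(tracking_a[:2] == tracking_b)
```

The design point is that the tracking branch is only ever written by the
import step, which is what keeps it identical for everyone; local work
and merges happen on other branches.
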
Commit ids are stable in, for example, git-svn, so merges will be the
same for everybody who merges from a subversion-tracking branch to their
development branch (in the sense that everybody pulling from subversion
will get the same commit id for the same upstream commit; it just won't
necessarily match the one they were using on their development branch if
it wasn't pulled from subversion. It's the standard rebase issue). I
would like to think that other DVCS do things similarly stably.

Yes, there are a few little oddities with merging things you already had
that were then passed upstream and come back as a merge with a different
commit, but that's relatively minor in the grand scheme of things. Most
development doesn't actually result in a commit to djangoproject.com
upstream, when you sit down and think about it (there's more
back-and-forth in the development phase than in the final patch).

Distributed systems allow creating branches very easily, so after a big
block of work that is accepted upstream, it's not particularly hard to,
for example, stop using the branch you were developing on for that and
use a different one for the next feature you're working on. That's not
abnormal practice even in highly distributed projects like the Linux
kernel, since it keeps new features isolated from each other as much as
possible. At that point, you can publish your branches and happily work
back and forth with anybody else using the same workflow.

What will still happen, though, is that the central version of Django,
the thing that is called "Django" and is released, is based off a
particular branch, which is the one synced from our central subversion
repository. This actually happens even in distributed development. When
something is released it is released from *somewhere*. There is a
particular commit on a particular branch in the entire universe of
versions of the code that is called the release. We choose for the
location of that to be in the subversion repository.
This isn't contrary to distributed development at all. It's saying ahead
of time that there is a "master" version that things feed up to (there's
nothing about distributed development that says a hierarchy of
checked-out versions isn't possible; it's just not a requirement or a
non-requirement). Built on that basis, the rest still comes down to
workflow. At some point, necessary changes have to filter back to the
main place from which the release will be done.

> They are very different philosophies.
> I'm suggesting that Django considers this philosophy of developing
> things in a distributed fashion. I'm not suggesting that Django continue
> using a centralized repo model, and simply switch from svn to another
> tool. I'm sorry if that's what it sounded like.
>
> A distributed model would mean abandoning this notion of committers and
> non-committers, and thus also the concept of a central repository.

That's not necessarily something that follows from the definition. It's
one way it can work, but it's only the *only* way if you choose a
restricted definition of a phrase that is new enough not to have a
canonically obvious meaning yet.

> There
> are plenty of blog posts and documents about this approach to software
> development, their benefits, and weaknesses. I highly suggest doing
> research on this approach if you aren't terribly familiar with it.
>
> One way that it *might* work for the Django is each component would have
> someone that "watches over" it.

That won't really work for us, since we rely heavily on many eyes making
things work. Commits to the "final resting place" for things that will
ultimately make it into the release give us one checkpoint through which
everything passes. Anybody and everybody can watch that and review the
code. Many bugs are caught that way. Given the size of our developer and
contributor base, the abilities of both and the relatively small size of
our code, this is a pretty good model.

> Someone would be over the translations,
> someone would be over forms, brosner would probably be over the admin
> app, etc. Translations I believe is a good example. A translator for a
> particular language or locale would update their working copy and
> commit. Their changes would get merged into the translation manager's
> repo. Generally, a release maintainer would be the one that merges in
> stable/completed features into their git repo, so they'd merge in
> anything when the translation maintainer says he has more stuff ready.

You've just described a hierarchical system of merges that is the same
as what we have now. Everything filters up to the subversion repository.
You can still use whatever system you like down below and trade back and
forth between people using similar systems. The only concession to
having something that needs to be rebased (the svn -> git conversion,
say) is that you don't do your development on the branch that gets
updated from upstream, but, rather, merge that into your branches.

Remember we're a relatively small project. There doesn't need to be more
than a single layer of "formal" hierarchy for merges going into the
thing targeted for release. And as each new layer is added, it really
does get harder and harder to track what's going on in places you're
interested in.

> This is very different from the way that things currently work. There
> wouldn't really need to be any formal decisions about "who is in" and
> "who is out" for commit access.

There is nothing at all stopping you from publishing your own repository
of Django changes. And pulling changes from whoever you want. So
everybody's already a committer on some level. Again, it's a difference
in workflow, not capabilities. Right now, the "commit bit" for the
central subversion repo just controls who can do the final push to what
we use as the basis for a release.
It doesn't have any influence over who can develop work, how they do so
and who can ultimately propose them for inclusion.

I'm personally far from convinced that the features you've outlined add
significant extra advantages or remove any of the larger problems we
have in our workflow to justify the retraining, community upset and
*much* higher barrier to entry that it would require.

Don't think of this as "either/or": you can still use DVCS for
development of new stuff and the only interaction with subversion is at
the interface to the final Django version.

Regards,
Malcolm