Re: how to checkout proper submodules
Hi all,

I just reread this thread. Is this one of those situations where *almost everyone agrees, but the fix just didn't happen*?

In particular, there is still no formal relationship between versions of the compiler and versions of the testsuite that tests it -- that seems odd! Can we please make testsuite *at least* a sub-module?

If we count this long email thread as rough consensus, is it just waiting on someone of sufficient authority typing a git submodule add command (and tweaking sync-all accordingly)?

Also, Jan's suggestion sounded good -- that once all child repos are git submodules, sync-all can be replaced with something that helps out with git submodule branching, as it helps out with multi-repo branching now (a little bit).

Best,
-Ryan

On Wed, Jun 5, 2013 at 2:02 PM, Jan Stolarek jan.stola...@p.lodz.pl wrote:
> I think that testsuite should be included in the main GHC repo. I don't recall any other project that has its tests placed in a separate repository. The nhc argument doesn't convince me - after all, most tests that are added nowadays are GHC specific.
>
> Janek

___
ghc-devs mailing list
ghc-devs@haskell.org
http://www.haskell.org/mailman/listinfo/ghc-devs
Re: how to checkout proper submodules
Simon and I discussed this a little today. I think there are several legitimate points made throughout the threads here, but the problem is clear: consistent builds are difficult, if not legitimately impossible. That's a very big problem.

Right now, it is far too late in the release cycle to do anything drastic, I'm afraid. Once we branch, we can feasibly start making good changes in this direction.

One problem, however, is that we don't even have a clear writeup of what all the relevant points are (aside from this + all the ranting I did elsewhere, which is loosely in my head still). Earlier today, I preemptively created this page, but have not jotted down any of my notes:

http://ghc.haskell.org/trac/ghc/wiki/GitSubmoduleProblem

For a short recap, here is what I think:

1) Several repositories should really just become part of GHC's repository. I'd argue that includes testsuite, nofib, and several others (integer-gmp/integer-simple, hpc, etc.). They don't need to be submodules and making them so is unnecessary complexity, when they can realistically never be used with anything else. This cuts down on something like 10 repositories, IIRC.

2) Several more should become submodules, where 'more' = the libraries under the new Core Libraries Committee. They will be taking over several of the other free floating repositories that are not currently submodules. We no longer will 'own' them, as it is.

3) 'base' and 'ghc-prim' are up for more debate, it seems. Roman wants them in particular for haskell-suite, but really he only wants a repository to work with, from what I remember. I'm not sure what to do here. Making them a submodule is realistic, but I'm honestly a little afraid of submodules for a package which is so highly traffic'd by developers (another reason I don't want e.g. testsuite as a submodule, either).

The first two points alone should help a lot in making builds more reliable and reproducible, but they will require changes in the development workflow. In particular, it's much easier to lose work with submodules - especially for those among us who are not Git masters. So we should take the time to clearly explain all of this. But 1 & 2 should cover a large part of the current setup, and most repos are very low traffic.

Also, I'd like to take the time to have a discussion with Edward Kmett (who I have CC'd) about point 2 to make sure we're on the same page here. But I haven't done this yet.

Point 3 seems to really be the most contentious, since a few other things come with it. Should we give up on 'base' being usable by other compilers? Historically that's why it's separate. But really it's easy to write code against 'base' that will never work with another compiler anyway. But maybe that can be fixed. And will the base split - also slated for post 7.8 - also change the ownership of significant parts of the library, based on how it is implemented? There were several things floating around this.

Regardless of point 3 and all that, something should and will be done soon. I'll put this up on the wiki later when I have time. We just need a directly spelled out plan of attack.

On Thu, Aug 22, 2013 at 2:04 PM, Ryan Newton rrnew...@gmail.com wrote:
> Can we please make testsuite *at least* a sub-module?

--
Regards,
Austin - PGP: 4096R/0x91384671
RE: how to checkout proper submodules
There was a long discussion about this a couple of months ago. It did not reach a conclusion, but it is merely parked, not abandoned. I hope that you can all pick it up again after the release.

Simon

From: ghc-devs [mailto:ghc-devs-boun...@haskell.org] On Behalf Of Austin Seipp
Sent: 22 August 2013 20:31
To: Ryan Newton
Cc: ghc-devs@haskell.org; Edward Kmett
Subject: Re: how to checkout proper submodules

Simon and I discussed this a little today. I think there are several legitimate points made throughout the threads here, but the problem is clear: consistent builds are difficult, if not legitimately impossible. That's a very big problem.
Re: how to checkout proper submodules
Ok, resuming after release makes sense.

Regarding whether it reached a conclusion: what struck me about this particular discussion was the *lack* of disagreement (relative to, say, the records debate). It seemed like no one was arguing for the status quo and just about everyone agreed that moving to all-submodules is better than the current mix.

Still, one could argue that making an improvement is premature if (1) there is significant transition cost to make the change, or (2) it puts you on some kind of local optimum that makes it harder to get to a higher peak. Yet in the case of all-submodules vs. ugly-mix, the transition cost is very low, and it doesn't preclude any future improvements. (For example, it is completely reasonable to later decide to copy certain modules into the tree rather than using submodules.)

But maybe I'm under-estimating the severity of the anti-submodule grumbling... that is, I may not be accurately distinguishing the "submodules have their annoyances but they are the lesser evil" opinion from "I will adamantly oppose adding any more submodules".

Best,
-Ryan

On Thu, Aug 22, 2013 at 4:14 PM, Simon Peyton-Jones simo...@microsoft.com wrote:
> There was a long discussion about this a couple of months ago. It did not reach a conclusion, but it is merely parked, not abandoned. I hope that you can all pick it up again after the release.
>
> Simon
Re: Proposal: better library management ideas (was: how to checkout proper submodules)
On 09/06/13 17:51, Ian Lynagh wrote:
> On Sun, Jun 09, 2013 at 11:15:37AM -0500, Austin Seipp wrote:
>>> I'm referring to Joachim Breitner's work on splitting the base.
>>
>> So what's the timeline here?
>
> As soon as possible after 7.8 is branched.

Has there been a decision somewhere on what to do? The wiki page sets out the parameters of the design, but doesn't have any conclusions that I could see.

Splitting base has the potential to be extremely destabilising; I want to make sure that we're getting appreciable benefits in exchange.

Cheers,
Simon
Re: how to checkout proper submodules
Hi,

We mistakenly thought that the new IO manager was not working properly. This is our fault. We have confirmed that it is working well. Sorry for bothering you, guys.

Anyway, I believe we need a way to check out proper submodules, as many others have said.

--Kazu

> Hi,
>
> Andreas and I found that the new IO manager is not working properly in the current GHC head. I'm sure that it worked well at least on May 7. We need to narrow the range of commits, so I did:
>
>     % git checkout bb2795db36b36966697c228315ae20767c4a8753
>     % git submodule update
>
> But this does not check out the proper submodules. For instance, libraries/base has newer commits. And of course, building fails.
>
> Please tell us how to check out the proper submodules for a specific GHC tree.
>
> --Kazu
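For repositories that are real git submodules, the superproject commit records the exact submodule commit, so `git checkout` followed by `git submodule update` does restore a matching pair. The sketch below illustrates that on throwaway repositories; the names `lib` and `tree` are hypothetical stand-ins, and it says nothing about repositories that sync-all tracks without recording them, which is the case the message above ran into.

```shell
#!/bin/sh
# Sketch: check out an old superproject commit together with the
# submodule commit it recorded. Repository names are made up.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

work=$(mktemp -d)
cd "$work"

# Stand-in for a library repository such as libraries/base.
git init -q lib
(cd lib && echo v1 > file && git add file && git commit -q -m "lib v1")
lib_v1=$(git -C lib rev-parse HEAD)

# Stand-in for the GHC tree, with the library registered as a submodule.
# protocol.file.allow is only needed because the demo clones a local path.
git init -q tree
cd tree
git -c protocol.file.allow=always submodule --quiet add "$work/lib" libraries/lib
git commit -q -m "add lib at v1"
old_tree=$(git rev-parse HEAD)

# The library moves on and the superproject records the newer commit.
(cd ../lib && echo v2 > file && git commit -q -am "lib v2")
lib_v2=$(git -C ../lib rev-parse HEAD)
git -C libraries/lib fetch -q origin
git -C libraries/lib checkout -q --detach "$lib_v2"
git commit -q -am "bump lib to v2"

# Going back in time: checkout moves the superproject, and
# submodule update moves the submodule to the recorded commit.
git checkout -q "$old_tree"
git -c protocol.file.allow=always submodule --quiet update --init --recursive
recorded=$(git rev-parse HEAD:libraries/lib)
actual=$(git -C libraries/lib rev-parse HEAD)
echo "recorded: $recorded"
echo "actual:   $actual"
```

The expression `HEAD:libraries/lib` resolves to the commit id stored in the superproject tree, which is what makes the pairing reproducible.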
Re: Proposal: better library management ideas (was: how to checkout proper submodules)
On 06/12/2013 12:37 PM, Ian Lynagh wrote:
> On Wed, Jun 12, 2013 at 12:54:38AM +0200, Daniel Trstenjak wrote:
>>> I guess [the merge commits] may not cause any actual problems, but it's certainly nicer not having them (which is what using submodules gives us).
>
> Just to clarify, my problem isn't so much that there are merge commits (although it would still be nicer if there weren't), but that it is hard to see whether we are in the same state as upstream, or to see what the differences between us and upstream are.
>
>> I don't quite understand how you should get rid of these merge commits by using submodules,
>
> With submodules we can do
>
>     cd libraries/Cabal
>     git reset --hard <an upstream commit id>
>     cd ..
>     git commit -a
>
> and we will jump to that commit, without needing to merge it with the commit that we were at before.
>
>> You can get rid of these merge commit by using the '--rebase' option of git-pull.
>
> We can't rebase, as these patches are in everyone else's GHC tree.

Only if you have pushed the ghc tree. If it is only local, then rebasing is just fine. And, I would argue, desirable.

For the record, I am in favor of moving everything to submodules.

Geoff
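The "jump to an upstream commit" workflow quoted above can be tried on throwaway repositories. This is a minimal sketch under assumptions: `upstream` and `ghc` are hypothetical local repositories, and the commit messages "123" and "456" only echo the labels used elsewhere in the thread. Note that `git reset --hard` drops the local fix from the submodule branch, which is the "easy to lose work" hazard mentioned in this discussion.

```shell
#!/bin/sh
# Sketch: move a submodule straight to an upstream commit, no merge.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

work=$(mktemp -d)
cd "$work"

git init -q upstream
(cd upstream && echo content > file && git add file && git commit -q -m initial)

git init -q ghc
cd ghc
# protocol.file.allow is only needed because the demo clones a local path.
git -c protocol.file.allow=always submodule --quiet add "$work/upstream" libraries/Cabal
git commit -q -m "add Cabal submodule"

# Local fix inside the submodule, recorded by the superproject.
(cd libraries/Cabal && echo fix1 > file && git commit -q -am "local fix 123")
git commit -q -am "use local Cabal fix"

# Upstream fixes the same bug differently.
(cd ../upstream && echo fix2 > file && git commit -q -am "upstream fix 456")
upstream_id=$(git -C ../upstream rev-parse HEAD)

# Jump to the upstream commit and record it in the superproject.
cd libraries/Cabal
git fetch -q origin
git reset -q --hard "$upstream_id"
cd ../..
git commit -q -am "jump Cabal to upstream fix"

recorded=$(git rev-parse HEAD:libraries/Cabal)
merges=$(git -C libraries/Cabal rev-list --merges --count HEAD)
echo "recorded Cabal commit: $recorded"
echo "merge commits in Cabal history: $merges"
```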
Re: Proposal: better library management ideas (was: how to checkout proper submodules)
On Mon, Jun 10, 2013 at 01:13:37PM +0200, Daniel Trstenjak wrote:
> On Mon, Jun 10, 2013 at 11:45:22AM +0100, Ian Lynagh wrote:
>> Is this possible with subtrees?:
>>   * Initially ghc's Cabal repo is at the same commit as upstream
>>   * We make a local commit 123 in Cabal to fix some bug
>>   * Cabal upstream makes a commit 456 to fix the same bug differently
>>   * We jump to commit 456, in such a way that we don't end up merging with our 123 commit every time we pull from Cabal in the future
>
> Yes.
>
> Every repository that's added by git-subtree to your repository is represented as a separate branch. So everything that applies to the merging of branches also applies to the merging by git-subtree.

I didn't follow that. Here's an example of what happens with just a plain git repo, with no branches, submodules or subrepos involved:

-----8<-----8<-----8<-----8<-----
upstream$ git init
upstream$ echo content > file
upstream$ git add file
upstream$ git commit -a -m initial

$ git clone upstream ghc
$ cd ghc

ghc$ echo fix1 > file
ghc$ git commit -a -m fix1

upstream$ echo fix2 > file
upstream$ git commit -a -m fix2

ghc$ git pull --no-edit -X theirs

upstream$ echo feature1 > file
upstream$ git commit -a -m feature1

ghc$ git pull --no-edit -X theirs

upstream$ echo feature2 > file
upstream$ git commit -a -m feature2

ghc$ git pull --no-edit -X theirs
-----8<-----8<-----8<-----8<-----

At the end of this, you'll see that the ghc repo has a number of merge commits.

I guess they may not cause any actual problems, but it's certainly nicer not having them (which is what using submodules gives us).

Thanks
Ian

--
Ian Lynagh, Haskell Consultant
Well-Typed LLP, http://www.well-typed.com/
Re: Proposal: better library management ideas (was: how to checkout proper submodules)
Hi Ian,

> I guess they may not cause any actual problems, but it's certainly nicer not having them (which is what using submodules gives us).

I don't quite understand how you should get rid of these merge commits by using submodules, because in the end every submodule is just a git repository and behaves in the same way as every other git repository for merges.

You can get rid of these merge commits by using the '--rebase' option of git-pull.

I put your git command lines into the attached script 'ghc_git_test'. Now you can get your version and the version using '--rebase' by calling:

    mkdir your_version rebase_version
    cd your_version
    ghc_git_test -X theirs
    cd ../rebase_version
    ghc_git_test --rebase -X ours

You will certainly ask why it's 'ours' instead of 'theirs' for the rebase case. Well, that's one of the quite counterintuitive things in the git user interface.

Greetings,
Daniel

    #!/usr/bin/env bash

    mkdir upstream ghc

    cd upstream
    git init
    echo content > file
    git add file
    git commit -a -m initial

    cd ..
    git clone upstream ghc
    cd ghc
    echo fix1 > file
    git commit -a -m fix1

    cd ../upstream
    echo fix2 > file
    git commit -a -m fix2

    cd ../ghc
    git pull --no-edit "$@"

    cd ../upstream
    echo feature1 > file
    git commit -a -m feature1

    cd ../ghc
    git pull --no-edit "$@"

    cd ../upstream
    echo feature2 > file
    git commit -a -m feature2

    cd ../ghc
    git pull --no-edit "$@"
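The difference between the two invocations can be measured by counting merge commits in the resulting clone. The following is a rough sketch rather than the attached script: `count_merges` is a hypothetical helper, `--no-rebase` is added because recent git versions refuse to pull divergent branches without an explicit choice, and the rebase run assumes a git version whose rebase drops commits that become empty.

```shell
#!/bin/sh
# Sketch: replay the upstream/ghc scenario with different pull options
# and report how many merge commits the ghc clone ends up with.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

# count_merges <git pull options...> prints the merge-commit count.
# The function body is a subshell, so the cd does not leak out.
count_merges() (
  cd "$(mktemp -d)"
  {
    git init -q upstream
    (cd upstream && echo content > file && git add file && git commit -q -m initial)
    git clone -q upstream ghc
    (cd ghc && echo fix1 > file && git commit -q -am fix1)
    for change in fix2 feature1 feature2; do
      (cd upstream && echo "$change" > file && git commit -q -am "$change")
      (cd ghc && git pull -q --no-edit "$@")
    done
  } >/dev/null 2>&1
  git -C ghc rev-list --merges --count HEAD
)

merge_count=$(count_merges --no-rebase -X theirs)
rebase_count=$(count_merges --rebase -X ours)
echo "merge commits with plain pull:    $merge_count"
echo "merge commits with pull --rebase: $rebase_count"
```

In the rebase run, `-X ours` favours the upstream side because during a rebase the branch being rebased onto plays the role of "ours", which is the counterintuitive naming the message above points out.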
Re: Proposal: better library management ideas (was: how to checkout proper submodules)
On Mon, Jun 10, 2013 at 1:32 PM, Roman Cheplyaka r...@ro-che.info wrote:
> * John Lato jwl...@gmail.com [2013-06-10 07:59:55+0800]
>> On Mon, Jun 10, 2013 at 1:32 AM, Roman Cheplyaka r...@ro-che.info wrote:
>>> What I'm trying to say here is that there's hope for a portable base. Maybe not in the form of split base -- I don't know. But it's the direction we should be moving anyways. And usurping base by GHC is a move in the opposite direction.
>>
>> Maybe that's a good thing? The current situation doesn't really seem to be working. Keeping base separate negatively impacts workflow of GHC devs (as evidenced by these threads), just to support something that other compilers don't use anyway. Maybe it would be easier to fold base back into ghc and try again, perhaps after some code cleanup? Having base in ghc may provide more motivation to separate it properly.
>
> After base is in GHC, separating it again will be only harder, not easier. Or do you have a specific plan in mind?

It's more about motivation. It seems to me right now base is in a halfway state. People think that moving it further away from ghc is The Right Thing To Do, but nobody is feeling enough pain to be sufficiently motivated to do it. If we apply pain, then someone will be motivated to do it properly. And if nobody steps up, maybe having a platform-agnostic base isn't really very important.
Re: how to checkout proper submodules
Hi Geoffrey,

> I am of the opinion that major feature branches should be rebased *and* that they should then be merged with --no-ff.

I totally agree with you. :-)

--Kazu
Re: how to checkout proper submodules
Hi,

> I think you've to differentiate the case of merging a feature branch into the master branch and the case of merging a local with a remote branch, like just calling git pull/push on the master branch.

I just wanted to say that a fast-forward merge loses the information about which sequence of commits was a topic branch.

As far as I'm concerned, I rebase my topic branch by myself before I send a pull request.

> Therefore I'm using 'git pull --rebase' to prevent the creation of these merge commits.

I think this is a good practice for the puller side. :-)

--Kazu
Re: Proposal: better library management ideas (was: how to checkout proper submodules)
On Mon, 2013-06-10 at 11:45 +0100, Ian Lynagh wrote:
>> Side note: the fingerprint script *didn't even work* for almost a year after it was introduced; see commit 73ce2e70.
>
> Which implies that wanting to go back in time is rare, so making it easy should be given low weight when considering the options?

If 'git bisect' would work (out of the box) on the GHC repo, going back in time would certainly be a more common operation.

Nicolas
Re: Proposal: better library management ideas (was: how to checkout proper submodules)
On 06/10/2013 11:49 AM, Nicolas Trangez wrote:
> On Mon, 2013-06-10 at 11:45 +0100, Ian Lynagh wrote:
>>> Side note: the fingerprint script *didn't even work* for almost a year after it was introduced; see commit 73ce2e70.
>>
>> Which implies that wanting to go back in time is rare, so making it easy should be given low weight when considering the options?
>
> If 'git bisect' would work (out of the box) on the GHC repo, going back in time would certainly be a more common operation.

I agree. Going back in time is really, really hard with fingerprints because you have to get the fingerprint files somewhere, and they don't always exist. Also, it could be the case that people used the fingerprint files to bisect but didn't notice they weren't quite right because the fingerprints were close enough. OK for bug-finding, terrible for reproducible builds.

Many people on the list have been quite vocal about wanting to be able to bisect. *I* have wanted to be able to bisect many, many times, but I don't because it's such a pain.

I also want to be able to tell people how to build branches of ghc that I am working on, e.g., the simd and th-new branches. That means having to store a fingerprint file somewhere public and keep it in sync with my tree. I would much rather just tell them to check out the foo branch of ghc and be done with it.

Geoff
Re: how to checkout proper submodules
On 08/06/13 08:38, Geoffrey Mainland wrote:
> On 06/06/2013 09:44 PM, Simon Marlow wrote:
>> On 05/06/13 16:59, Ian Lynagh wrote:
>>> On Tue, Jun 04, 2013 at 09:05:58PM -0500, Austin Seipp wrote:
>>>> I know we had this discussion sometime recently I think, but can someone *please* explain why we are in this situation of half submodules, half random-floating-git-repository-checkouts?
>>>
>>> Submodules are very handy for libraries that someone else maintains: We can make a local change to the library when we need something fixed, and then, when upstream has a fix too, we can jump straight to their fix without having to do any merging.
>>>
>>> However, submodules have various disadvantages, e.g.
>>> http://codingkilledthecat.wordpress.com/2012/04/28/why-your-company-shouldnt-use-git-submodules/
>>> The main one for me is that it's fairly easy to lose local changes when using submodules. This is relatively unimportant for the libraries that someone else maintains, as we don't often make any local changes to lose. Even so, I've lost changes on a couple of occasions.
>>
>> Drive-by-comment: 'sync-all new' doesn't work since we switched to submodules. If someone could fix that I'd be very grateful (or alternatively tell me what workflow you use to figure out what patches you have in your local repos that aren't upstream).
>>
>> Another thing that annoys me about submodules is that I like to keep a local mirror of the GHC repos on my computer. When I clone from it, the submodules all come from darcs.haskell.org instead of my local mirror. I know how to fix this by hand, but it's sync-all's job to get this right (it does for the other repos).
>>
>> Cheers,
>> Simon
>
> Yes, I have hit this problem too. It's the cause of many of the nightly build failures at GHC HQ.
>
> Does anyone know how to get git-submodule to use a mirror? There is the --reference option to 'git submodule update', but I think it still needs a network connection.

IIRC, you have to manually edit the .git/config file at the correct time (after git submodule init, but before the pull). But sync-all doesn't stop between these two steps, so it's a bit more fiddly.

Cheers,
Simon

> Geoff
>
>>> So the reason we entered this state is that we didn't think the advantages outweighed the disadvantages for the other repositories.
>>>
>>> Thanks
>>> Ian
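The "edit .git/config between init and update" step described above can be done with `git config` instead of a text editor. This is a minimal sketch on throwaway repositories; `canonical-lib`, `mirror-lib.git`, and `checkout` are hypothetical names, and it assumes the submodule was added without `--name`, so its name equals its path `libraries/lib`.

```shell
#!/bin/sh
# Sketch: make a submodule clone from a local mirror instead of the
# URL recorded in .gitmodules.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

work=$(mktemp -d)
cd "$work"

# The "official" library repository and a local mirror of it.
git init -q canonical-lib
(cd canonical-lib && echo v1 > file && git add file && git commit -q -m v1)
git clone -q --bare canonical-lib mirror-lib.git

# A superproject whose .gitmodules points at the official repository.
git init -q ghc
(
  cd ghc
  git -c protocol.file.allow=always submodule --quiet add "$work/canonical-lib" libraries/lib
  git commit -q -m "add lib"
)

git clone -q ghc checkout
cd checkout
git submodule --quiet init                                     # copies the URL into .git/config
git config submodule.libraries/lib.url "$work/mirror-lib.git"  # override it locally
git -c protocol.file.allow=always submodule --quiet update     # clones from the mirror

origin_url=$(git -C libraries/lib remote get-url origin)
echo "submodule origin: $origin_url"
```

Because the override lives in `.git/config`, it affects only this checkout and leaves the committed `.gitmodules` untouched. Git's `url.<base>.insteadOf` setting is another way to redirect a whole family of URLs, though it is not exercised here.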
Re: Proposal: better library management ideas (was: how to checkout proper submodules)
Hi Austin,

I apologize for not having read the full email yet (I'm in a hurry right now), but...

* Austin Seipp ase...@pobox.com [2013-06-09 00:23:22-0500]
> -- Let's just put base and testsuite inside the GHC repository directly. No submodules, no floating repos. Just put it directly inside and make a super commit, I guess. GHC becomes the de facto repository. And hey, why not nofib?
>
> I know, I know. People really want to split the maintenance burdens I guess, and ideologically the Haskell community is all about clean separation but, please? All of GHC HQ are the de facto maintainers of this stuff anyway. And as Jan mentioned, testsuite is really *so* crucial GHC should have it inline. The testsuite is perhaps the most important of all.
>
> There are other candidates for this treatment too, really. For example, why are template-haskell, ghc-prim, and hpc split out? GHC is the only thing that supports them. template-haskell is especially super-intrusive of an extension to support, and arguably hpc as well. integer-simple and integer-gmp follow the exact same story. Same with hoopl and dph. They're all ours. We own them. Just put them all inside GHC and be done with it. Having active fragmentation in the VCS is not necessary when there need be none. These packages de-facto ship with GHC and are very tied to it.

I'm a strong -1 on this. As one example, we have forks of base and ghc-prim for Haskell suite:

https://github.com/haskell-suite/base
https://github.com/haskell-suite/ghc-prim

which would be much more complicated if these were not independent repositories.

But more generally, I think there's still hope that the core packages will be made portable -- I'm referring to Joachim Breitner's work on splitting the base.

Roman
Re: Proposal: better library management ideas (was: how to checkout proper submodules)
Hi Roman,

On Sun, Jun 9, 2013 at 1:44 AM, Roman Cheplyaka r...@ro-che.info wrote:
> I'm a strong -1 on this. As one example, we have forks of base and ghc-prim for Haskell suite:
>
> https://github.com/haskell-suite/base
> https://github.com/haskell-suite/ghc-prim
>
> which would be much more complicated if these were not independent repositories.

I hate being that person but, if the purpose of these forks is to work around specific bugs in HSE and/or fix problems with name resolution of GHC-specific terms, which sort of seems to be the case from the log, I don't think hacking base & co. is a long term solution. It could potentially need infinite ongoing maintenance. I went down this road with LHC too.

And my gut feeling is that hacking ghc-prim out-of-band feels so amazingly wrong I'm frankly not sure if "I need to fork it" can actually warrant a huge amount of sympathy, to the point of keeping the repository separate for that 1 fork in existence (granted, ghc-prim is still pretty low traffic. But base is not.)

If you DO need help from GHC, is there really nothing we could easily and reasonably do to further assist you? I think asking for specific, principled solutions on our part is not out of the question here.

Are there any other forks of base people have for any particular reason? What reasons are those?

> But more generally, I think there's still hope that the core packages will be made portable -- I'm referring to Joachim Breitner's work on splitting the base.

To be clear, packages and their numbers aren't *really* the problem. It's repositories. The numbers just make this slightly worse.

Adding packages and adding repositories both add overhead. Adding repositories adds a significantly *larger* amount of complexity, all things considered. The only honest, legitimate way to reduce that complexity is to fold in repositories. But this means that we have to give something up, too.

If base were to get split into 5 packages or 8 packages, that's potentially fine by me, even welcomed. What I don't want is 5 more repositories that are all intimately tied to GHC's build and features, which a majority of GHC-specific work will be driven towards, and that over time we then must manage and synchronize heavily. That's just a massive amount of work.

Just looking at Joachim's fork of base on github, I already have some reservations about its current implementation. Like, base-float still exports GHC-specific namespaces. Every package still has a lot of GHC-specific code, as opposed to some isolated substrate that we provide and base-* packages interface with. So we're going to maintain all of that, it's the sad truth.

And if Joachim's patch were merged tomorrow somehow, I think that frankly so much of it would still be under GHC control, my argument would still stand. It would still be one repository. We would still own it. It makes base more granular, but this has almost nothing to do with our real problems. Fixing all of that where we're not *actually* in control of it is a ton of work. The current patches just don't solve that, I think.

And this was last discussed in February? So what's the timeline here? Clearly we're not even done with the API discussion at all. So, 6 months? A year? Who knows? When it's done? I'm not sure most of us want to wait that long, especially considering the need to track down bugs and have accurate historical logs is a fairly frequent occurrence.

> Roman

--
Regards,
Austin - PGP: 4096R/0x91384671
Re: Proposal: better library management ideas (was: how to checkout proper submodules)
On Sun, Jun 9, 2013 at 3:47 AM, Jan Stolarek jan.stola...@p.lodz.pl wrote:
> I admire your talent for writing emails ;-)

You can be honest and just call them what they are: horribly written novellas.

> As you wrote in your email I'm totally for including testsuite into GHC, because it is essentially part of GHC and it doesn't make sense to have a version of testsuite not corresponding to a version of GHC. As you pointed out the same argument can be used for other packages, but still there is one thing I don't like about that idea. What if an average haskeller wants to improve one of the libraries e.g. by adding comments or fixing a minor bug? If we have a super-repo that person would need to check out everything, which is discouraging.

This is a good point I hadn't considered, but it's less of a worry for some packages than others. For example, base, ghc-prim and template-haskell are so intimately tied into GHC that reinstalling them is either impossible or a bad idea. To change them, you must build your own GHC anyway (either from source, or HEAD.) And if you're using a Haskell Platform compiler, clearly you'd have no luck with the git repository anyway (due to their strong interdependence.)

But again, I'm totally OK with a lot of these other repositories being submodules. For example, process, unix, deepseq, filepath, directory. Those don't need to be folded in. Lots of them could have their own maintainers with separate upstreams. They're touched infrequently enough that traffic concerns aren't as big a deal. I just want the most high-traffic'd repositories dealt with, because in practice these are the *most* critical and the most interdependent. That in turn leads to the most problems.

> Another, separate issue here is that such a person needs to either register to ghc-devs or trac to send a patch. Using github would be helpful here, though I agree with Geoffrey about merge commits - we'd have to think of something here. Also, the fact that GHC HQ is maintaining all of the mentioned packages doesn't mean that they need to be stored in one repo, at least not in git (this would make more sense to me with SVN where you can checkout a subdirectory).

Not necessarily, the 'owners' of the packages are still the libraries committee. People can propose changes there as they have always done. It just so happens most of the 'libraries' maintained packages are de-facto maintained by GHC people.

You're right not all of them need to be folded in. But I think several of them should be, and these are the ones that hurt the most. (Plus, my radical proposal can't be considered totally, completely radical unless I propose something which would - of course - be shot down.)

> Still, I strongly agree that something should be done about current setup. I'm not a git guru so I cannot fully foresee what would be the consequences of turning everything into submodules, but I think that it cannot be worse than it is now, right?

For some submodules, it could certainly be worse. Please see Ian's link in the prior discussion concerning submodules - for high-traffic repositories, some of the concerns are disconcerting.
Re: Proposal: better library management ideas (was: how to checkout proper submodules)
On Sun, Jun 09, 2013 at 11:15:37AM -0500, Austin Seipp wrote:
> > I'm referring to Joachim Breitner's work on splitting the base.
>
> So what's the timeline here?

As soon as possible after 7.8 is branched.

Thanks
Ian
Re: Proposal: better library management ideas (was: how to checkout proper submodules)
> You can be honest and just call them what they are: horribly written novellas.

Actually, I was thinking that instead of posting to the list you might consider publishing your emails as papers at workshops or symposia ;)

> for high-traffic repositories, some of the concerns are disconcerting.

But the high-traffic repositories (base, testsuite) are already submodules, right? For me the major problem of the current setup is that we cannot use one of the most important features of a VCS, i.e. going back in time. The only solutions to this problem that I am aware of are folding in, or turning into submodules, all libraries that GHC depends on.

I just had this moment of enlightenment that the question of including a repo as a submodule (or folding it into the GHC tree) is not a matter of traffic, but a matter of that library's implementation. If it uses a GHC-specific API then it goes in, because it is tightly coupled. If it is implemented in standard Haskell then it can stay out, because changes to the compiler should not affect it. This is a pretty simple criterion to identify libraries that we should be concerned with (perhaps this is obvious, but it only occurred to me now). So a high-traffic repo that does not depend on non-standard features of GHC could still be kept as an in-tree repo, without affecting the ability to go back in time.

Jan
Re: Proposal: better library management ideas (was: how to checkout proper submodules)
Oh, and I've been made aware that git 1.7 and later can check out a subdirectory of a repo - this partially invalidates my previous argument. I say partially because it is a bit more difficult than dealing with a library that has its own repo, and it seems that some potential contributors might not be aware of this feature (like me this morning).

Janek
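The feature Janek refers to is presumably git's sparse checkout; that identification is an assumption, since the message does not name it. A minimal sketch in a throwaway directory, where the repository names `big` and `small` and the file layout are made up for illustration:

```shell
#!/bin/sh
# Sketch: narrow a working tree to one subdirectory with sparse checkout.
# Repository names and file layout are hypothetical.
set -eu

# Throwaway identity so commits succeed in a clean environment.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

work=$(mktemp -d)
cd "$work"

# A stand-in for a large repository with a library folded into it.
git init -q big
mkdir -p big/compiler big/libraries/base
echo 'main = return ()' > big/compiler/Main.hs
echo 'module Prelude where' > big/libraries/base/Prelude.hs
git -C big add .
git -C big commit -q -m "initial import"

# Clone it, then restrict the working tree to libraries/base only.
git clone -q big small
git -C small config core.sparseCheckout true
mkdir -p small/.git/info
echo 'libraries/base/' > small/.git/info/sparse-checkout
git -C small read-tree -mu HEAD   # re-apply HEAD to the working tree under the new rules

ls -R small
```

This only narrows the working tree: the clone still transfers the whole history, which may be part of why the feature only partially answers the concern about contributors having to fetch everything.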
Re: Proposal: better library management ideas (was: how to checkout proper submodules)
* Austin Seipp ase...@pobox.com [2013-06-09 11:15:37-0500]
> Hi Roman,
>
> On Sun, Jun 9, 2013 at 1:44 AM, Roman Cheplyaka r...@ro-che.info wrote:
> > I'm a strong -1 on this. As one example, we have forks of base and ghc-prim for Haskell suite: https://github.com/haskell-suite/base and https://github.com/haskell-suite/ghc-prim, which would be much more complicated if these were not independent repositories.
>
> I hate being that person but, if the purpose of these forks is to work around specific bugs in HSE and/or fix problems with name resolution of GHC-specific terms, which sort of seems to be the case from the log, I don't think hacking base & co. is a long term solution. It could potentially need infinite ongoing maintenance. I went down this road with LHC too.

It is only partly to work around bugs in HSE. The second part is to work around bugs and quirks in base itself. There are places where CPP wouldn't produce meaningful code unless __GLASGOW_HASKELL__ is defined, for example.

Even ignoring those obvious bugs for a minute, currently a large part of base is defined under the GHC.* hierarchy and isn't available unless __GLASGOW_HASKELL__ is defined.

But okay, let's suppose that at some point everything is fixed and we don't have to *fork* base. We still would like to use it! Should we fetch the whole GHC tree in order to get its development version?

> And my gut feeling is that hacking ghc-prim out-of-band feels so amazingly wrong that I'm frankly not sure "I need to fork it" can actually warrant a huge amount of sympathy, to the point of keeping the repository separate for the one fork in existence (granted, ghc-prim is still pretty low traffic. But base is not.)

It *is* wrong, but who is to blame that a big part of Prelude comes from there, including all logical operations and the classes Eq and Ord?

> If you DO need help from GHC, is there really nothing we could easily and reasonably do to further assist you? I think asking for specific, principled solutions on our part is not out of the question here.

The best help would be to make and keep base relatively portable and not to introduce superfluous conditional compilation. (I realise that a lot of that has just accumulated historically, but now is a good time to get rid of it.)

It is a ton of work, and I'm very happy when I see people like Joachim trying to do something in that direction. Right now I'm only asking not to make their work even harder by moving base under the ghc repository.

> > But more generally, I think there's still hope that the core packages will be made portable - I'm referring to Joachim Breitner's work on splitting the base.
>
> To be clear, packages and their numbers aren't *really* the problem.

What I'm trying to say here is that there's hope for a portable base. Maybe not in the form of split base - I don't know. But it's the direction we should be moving anyway. And usurping base by GHC is a move in the opposite direction.

Roman
Re: Proposal: better library management ideas (was: how to checkout proper submodules)
On Mon, Jun 10, 2013 at 1:32 AM, Roman Cheplyaka r...@ro-che.info wrote:
> What I'm trying to say here is that there's hope for a portable base. Maybe not in the form of split base - I don't know. But it's the direction we should be moving anyway. And usurping base by GHC is a move in the opposite direction.

Maybe that's a good thing? The current situation doesn't really seem to be working. Keeping base separate negatively impacts the workflow of GHC devs (as evidenced by these threads), just to support something that other compilers don't use anyway. Maybe it would be easier to fold base back into ghc and try again, perhaps after some code cleanup? Having base in ghc may provide more motivation to separate it properly.
Re: Proposal: better library management ideas (was: how to checkout proper submodules)
* John Lato jwl...@gmail.com [2013-06-10 07:59:55+0800]
> On Mon, Jun 10, 2013 at 1:32 AM, Roman Cheplyaka r...@ro-che.info wrote:
> > What I'm trying to say here is that there's hope for a portable base. Maybe not in the form of split base - I don't know. But it's the direction we should be moving anyway. And usurping base by GHC is a move in the opposite direction.
>
> Maybe that's a good thing? The current situation doesn't really seem to be working. Keeping base separate negatively impacts the workflow of GHC devs (as evidenced by these threads), just to support something that other compilers don't use anyway. Maybe it would be easier to fold base back into ghc and try again, perhaps after some code cleanup? Having base in ghc may provide more motivation to separate it properly.

After base is in GHC, separating it again will only be harder, not easier. Or do you have a specific plan in mind?

Roman
Re: how to checkout proper submodules
On 06/06/2013 09:44 PM, Simon Marlow wrote:
> On 05/06/13 16:59, Ian Lynagh wrote:
> > On Tue, Jun 04, 2013 at 09:05:58PM -0500, Austin Seipp wrote:
> > > I know we had this discussion sometime recently I think, but can someone *please* explain why we are in this situation of half submodules, half random-floating-git-repository-checkouts?
> >
> > Submodules are very handy for libraries that someone else maintains: We can make a local change to the library when we need something fixed, and then, when upstream has a fix too, we can jump straight to their fix without having to do any merging.
> >
> > However, submodules have various disadvantages, e.g. http://codingkilledthecat.wordpress.com/2012/04/28/why-your-company-shouldnt-use-git-submodules/
> >
> > The main one for me is that it's fairly easy to lose local changes when using submodules. This is relatively unimportant for the libraries that someone else maintains, as we don't often make any local changes to lose. Even so, I've lost changes on a couple of occasions.
>
> Drive-by-comment: 'sync-all new' doesn't work since we switched to submodules. If someone could fix that I'd be very grateful (or alternatively tell me what workflow you use to figure out what patches you have in your local repos that aren't upstream).
>
> Another thing that annoys me about submodules is that I like to keep a local mirror of the GHC repos on my computer. When I clone from it, the submodules all come from darcs.haskell.org instead of my local mirror. I know how to fix this by hand, but it's sync-all's job to get this right (it does for the other repos).
>
> Cheers, Simon

Yes, I have hit this problem too. It's the cause of many of the nightly build failures at GHC HQ. Does anyone know how to get git-submodule to use a mirror? There is the --reference option to 'git submodule update', but I think it still needs a network connection.

Geoff

> > So the reason we entered this state is that we didn't think the advantages outweighed the disadvantages for the other repositories.
> >
> > Thanks
> > Ian
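One way the "fix by hand" might look, sketched with throwaway local repositories: register the submodule, override its URL in the clone's own configuration, and only then fetch it. The repository names (`lib-upstream`, `lib-mirror`, `super`, `checkout`) are hypothetical, and this is not necessarily what Simon does or what sync-all would need; the `protocol.file.allow` setting is only there because recent git versions restrict cloning submodules from local paths.

```shell
#!/bin/sh
# Sketch: make a submodule clone from a mirror instead of the URL in .gitmodules.
# All repository names are hypothetical.
set -eu

export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

work=$(mktemp -d)
cd "$work"

# An "upstream" library and a local mirror of it.
git init -q lib-upstream
echo 'v1' > lib-upstream/README
git -C lib-upstream add README
git -C lib-upstream commit -q -m "library: initial commit"
git clone -q lib-upstream lib-mirror

# A superproject whose .gitmodules records the upstream URL.
git init -q super
echo 'superproject' > super/README
git -C super add README
git -C super commit -q -m "superproject: initial commit"
git -C super -c protocol.file.allow=always submodule --quiet add "$work/lib-upstream" lib
git -C super commit -q -m "add lib as a submodule"

# Fresh clone: register the submodule, repoint it at the mirror, then fetch it.
git clone -q super checkout
git -C checkout submodule --quiet init
git -C checkout config submodule.lib.url "$work/lib-mirror"
git -C checkout -c protocol.file.allow=always submodule --quiet update

origin_url=$(git -C checkout/lib config remote.origin.url)
echo "submodule was cloned from: $origin_url"
```

The override lives in the clone's `.git/config`, so `.gitmodules` and the recorded submodule commit are left alone. Git's `url.<base>.insteadOf` setting is another way to rewrite URL prefixes for a whole machine; how either approach interacts with sync-all is not something this sketch checks.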
Re: how to checkout proper submodules
On 5 Jun 2013, at 16:47, Austin Seipp wrote:
> testsuite and base are also useful for other compilers, such as nhc98 (and indeed, nhc uses base itself.)

Useful, perhaps, but not actually used in practice. Since the base library repo moved from darcs to git, I think that ghc is the only compiler that uses it. (Maybe the jhc, uhc, or Helium people could refute that though.)

For a long, long time, the close coupling between ghc and the base library has been obvious. I have long since given up trying to pretend that base is portable - it is not. It is ghc-specific. I don't think it should be. That is a crazy architecture. But it is the way it is. Maybe it is time for everyone else to stop pretending too.

Regards,
Malcolm
Re: how to checkout proper submodules
Daniel,

On Wed, 2013-06-05 at 15:49 +0200, Daniel Trstenjak wrote:
> Hi Nicolas,
>
> On Wed, Jun 05, 2013 at 03:27:09PM +0200, Nicolas Trangez wrote:
> > As my experience with submodules is positive (though limited), could you elaborate on the difficulties/hassle here?
>
> If you would like to develop some kind of feature which involves changes on multiple repositories/submodules and you would like to do it in a branch, then you have to create a branch in each repository, commit separately in each repository and then merge back each repository into its master branch.

Right, thanks for the explanation. This might indeed be somewhat inconvenient. On the other hand, the current situation (with sync-all etc) doesn't seem very different from a workflow perspective, except for being unable to easily run bisect :-)

Nicolas
Re: how to checkout proper submodules
Hi Kazu,

On Thu, Jun 06, 2013 at 10:42:03AM +0900, Kazu Yamamoto wrote:
> Please read "A successful Git branching model" to know why fast-forward is not used recently.

I think you have to differentiate between the case of merging a feature branch into the master branch and the case of merging a local branch with a remote branch, like just calling git pull/push on the master branch.

A fast-forward in the case of merging a feature branch loses information, because you can no longer see which commits were involved in developing the feature.

In the second case, merging a local branch with a remote branch, you gain no information from the merge commits, but just mess up your history. Therefore I'm using 'git pull --rebase' to prevent the creation of these merge commits.

Greetings,
Daniel
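The two cases Daniel distinguishes can be compared by counting merge commits in a throwaway setup. This is only a sketch: the repository names (`upstream`, `merged`, `rebased`), the `commit_file` helper and the file names are all hypothetical.

```shell
#!/bin/sh
# Sketch: merge commits left behind by a merge-style pull, by 'git pull --rebase',
# and by an explicit --no-ff merge of a feature branch. All names are hypothetical.
set -eu

export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

work=$(mktemp -d)
cd "$work"

# Hypothetical helper: commit a new file named $2 in repository $1.
commit_file() {
    echo "$2" > "$1/$2"
    git -C "$1" add "$2"
    git -C "$1" commit -q -m "add $2"
}

git init -q upstream
commit_file upstream base.txt
git clone -q upstream merged
git clone -q upstream rebased

# Upstream and both clones now diverge by one commit each.
commit_file upstream upstream.txt
commit_file merged local.txt
commit_file rebased local.txt

# Local vs. remote: a merge-style pull records a merge commit, a rebase does not.
git -C merged pull -q --no-rebase --no-edit
git -C rebased pull -q --rebase
merged_merges=$(git -C merged rev-list --merges --count HEAD)
rebased_merges=$(git -C rebased rev-list --merges --count HEAD)

# Feature branch: --no-ff keeps an explicit merge commit marking the feature.
git -C rebased checkout -q -b feature
commit_file rebased feature.txt
git -C rebased checkout -q -
git -C rebased merge -q --no-ff --no-edit feature
feature_merges=$(git -C rebased rev-list --merges --count HEAD)

echo "merge commits after merge-style pull: $merged_merges"
echo "merge commits after pull --rebase:    $rebased_merges"
echo "merge commits after --no-ff merge:    $feature_merges"
```

The only merge commit in the `rebased` clone should be the deliberate one for the feature branch, which is the history Daniel is arguing for.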
Re: how to checkout proper submodules
On 05/06/13 16:59, Ian Lynagh wrote:
> On Tue, Jun 04, 2013 at 09:05:58PM -0500, Austin Seipp wrote:
> > I know we had this discussion sometime recently I think, but can someone *please* explain why we are in this situation of half submodules, half random-floating-git-repository-checkouts?
>
> Submodules are very handy for libraries that someone else maintains: We can make a local change to the library when we need something fixed, and then, when upstream has a fix too, we can jump straight to their fix without having to do any merging.
>
> However, submodules have various disadvantages, e.g. http://codingkilledthecat.wordpress.com/2012/04/28/why-your-company-shouldnt-use-git-submodules/
>
> The main one for me is that it's fairly easy to lose local changes when using submodules. This is relatively unimportant for the libraries that someone else maintains, as we don't often make any local changes to lose. Even so, I've lost changes on a couple of occasions.

Drive-by-comment: 'sync-all new' doesn't work since we switched to submodules. If someone could fix that I'd be very grateful (or alternatively tell me what workflow you use to figure out what patches you have in your local repos that aren't upstream).

Another thing that annoys me about submodules is that I like to keep a local mirror of the GHC repos on my computer. When I clone from it, the submodules all come from darcs.haskell.org instead of my local mirror. I know how to fix this by hand, but it's sync-all's job to get this right (it does for the other repos).

Cheers, Simon

> So the reason we entered this state is that we didn't think the advantages outweighed the disadvantages for the other repositories.
>
> Thanks
> Ian
RE: how to checkout proper submodules
For the avoidance of doubt, I totally support what Austin and Johan are saying:

I find the current setup confusing too.

I'm totally persuaded of the merits of git bisect etc.

I am the opposite of a git power-user (a git weedy-user?). I will be content to do whatever I'm told workflow-wise, provided I am told clearly in words of one syllable.

I *very strongly* want to reduce barriers to entry for would-be contributors, and this is clearly a barrier we could lower. Making Kazu, Austin, Johan, etc more productive is massively valuable.

There may be some history to how we arrived at this point, but that should not constrain us in the future. We can change our workflow.

I would want Ian and Simon to be thoroughly on board, but I regard the current setup as totally open to improvement. Please!

BTW, Ian has written it up quite carefully here: http://hackage.haskell.org/trac/ghc/wiki/Repositories, and the linked page http://hackage.haskell.org/trac/ghc/wiki/Repositories/Upstream.

Simon

| -----Original Message-----
| From: ghc-devs-boun...@haskell.org [mailto:ghc-devs-boun...@haskell.org] On Behalf Of Austin Seipp
| Sent: 05 June 2013 07:35
| To: Johan Tibell
| Cc: ghc-devs@haskell.org
| Subject: Re: how to checkout proper submodules
|
| I absolutely agree here, FWIW. We should only do this if there is a clear consensus on doing so and everyone doing active development is comfortable with it. And it's entirely possible submodules are inadequate for some reason that I'm not aware of which is a show-stopper.
|
| However, the notion of impact-on-contributors cuts both ways. GHC has an extremely small team of hackers as it stands, and we are lucky to have *amazing* contributors like Kazu, Andreas, yourself, Simon & Simon, and numerous others help make GHC what it is. Much of this is volunteer work. But as the Haskell community grows, and we are at a loss of other full-time contributors like Simon Marlow, I think we are beginning to see the strain on GHC and its current contributors. So, it's important to evaluate what we're doing right and wrong. This feedback loop is always present even if seasoned contributors can live with it - but new contributors will definitely be impacted.
|
| In this instance, I honestly find it disheartening that the answer to things like getting older revisions of the source code in HEAD, or techniques like bisection, is basically "that doesn't work". The former is unfortunate, but the latter is pretty legitimately worrying. It would be one thing if this was a one-off occurrence of some odd developer-workflow. But I have answered the fundamental question here (submodules vs free-floating clones) a handful of times myself at least, experienced the pain of the decision myself when doing rollbacks, and I'm sure other contributors can say the same.
|
| GHC is already a large, industry-strength software project with years of work put behind it. The barrier to entry and contribution is not exactly small, but I think we've all done a good job. I'd love to see more people contributing. But I cannot help but find these discussions a bit sad, where contributors are impaired because regular, traditional development workflows like rollbacks are rendered useless - due to some odd source control discrepancy that nobody else on the planet seems to suffer from.
|
| I guess the short version is basically that you're absolutely right: the time of Simon, Ian, and other high-profile contributors is *extremely* important. But I'd also rather not have people like Kazu potentially spend hours or even days doing what simple automation can achieve in what is literally a few keystrokes, and what is, moreover, par for the course for other projects. This ultimately impacts the development cycles of *everybody*. And even if Kazu deals with it - what about the next person?
|
| On Wed, Jun 5, 2013 at 12:12 AM, Johan Tibell johan.tib...@gmail.com wrote:
| > The latest git release has improved submodule support somewhat, so if we now think the benefits of submodules outweigh the costs we can discuss if we want to change the policy. I don't want to make that decision for other GHC developers that spend much more time on GHC than I (e.g. SPJ). Their productivity is more important than any inconveniences the lack of consistent use of submodules might cause me.
|
| --
| Regards,
| Austin - PGP: 4096R/0x91384671
Re: how to checkout proper submodules
I agree with Austin and Johan. It's a bizarre setup. Submodules have their pain points (which we already have to deal with), but the ability to properly snapshot and branch the whole tree would be a serious benefit IMO.

Manuel

PS: While we are at it, why don't we just have the main repos on GitHub and use forks and pull requests like the rest of the world? (Using Git, but not GitHub's superb infrastructure, seems like a terrible waste to me.)
Re: how to checkout proper submodules
On 5 June 2013 01:43, Manuel M T Chakravarty c...@cse.unsw.edu.au wrote:
> I agree with Austin and Johan. It's a bizarre setup. Submodules have their pain points (which we already have to deal with), but the ability to properly snapshot and branch the whole tree would be a serious benefit IMO.
>
> Manuel
>
> PS: While we are at it, why don't we just have the main repos on GitHub and use forks and pull requests like the rest of the world? (Using Git, but not GitHub's superb infrastructure, seems like a terrible waste to me.)

I'd be all for this. We partially use the GitHub infrastructure since trac broke and I changed the emails to point to GitHub instead. I also often do code reviews with other devs on a personal GHC fork on github before merging in. I believe it would also help encourage more contributors (especially for libraries), but others have expressed disagreement with this point of view in the past and I don't have data to back it up.

Either way, I'm glad git bisect may soon work. We'll finally be able to use the whole feature set of a version control tool :) (the other piece was the move from darcs to git, which gave us a working annotate).
Re: how to checkout proper submodules
David Terei wrote: Either way, I'm glad git bisect may soon work. Having git bisect work on the GHC tree would be a plus! Erik -- -- Erik de Castro Lopo http://www.mega-nerd.com/ ___ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs
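For readers less familiar with it, the idea behind git bisect is a binary search over history: given a known-good and a known-bad commit, test the midpoint and halve the range until the first bad commit is found. The sketch below is only an illustration of that idea, not git's implementation (git also handles non-linear history and skipped commits). The names first_bad and is_bad are hypothetical, and the sketch assumes every commit in the range can be checked out and built consistently, which is exactly what this thread is asking for.

```python
from typing import Callable, Sequence


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest bad commit in a linear history.

    Assumes commits[0] is known good, commits[-1] is known bad, and that
    every commit after the first bad one is also bad.
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[lo] good, commits[hi] bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # the first bad commit is at mid or earlier
        else:
            lo = mid  # the first bad commit is after mid
    return commits[hi]


if __name__ == "__main__":
    # Hypothetical history of 100 commits where the regression landed in c73.
    history = [f"c{i}" for i in range(1, 101)]
    print(first_bad(history, lambda c: int(c[1:]) >= 73))
```

Each probe stands for a full checkout, build and test of the tree at that commit, so the search is only meaningful if an old commit of the main repository also pins the matching versions of every library.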
Re: how to checkout proper submodules
BTW, this could also be a basis for solving another common pain point that seems to afflict everyone: "Validate fails. Was it me?" Having the buildbots push only validating version combinations (using submodules to make this precise) into the repository that newcomers clone from could help solve that. If the buildbot also bisects each validation failure and sends an email pinpointing the offending commit, keeping the validated repository recent might be reasonably easy. Daniel On 06/05/2013 11:43 AM, Manuel M T Chakravarty wrote: I agree with Austin and Johan. It's a bizarre setup. Submodules have their pain points (which we already have to deal with), but the ability to properly snapshot and branch the whole tree would be a serious benefit IMO. Manuel PS: While we are at it, why don't we just have the main repos on GitHub and use forks and pull requests like the rest of the world? (Using Git, but not GitHub's superb infrastructure, seems like a terrible waste to me.) Simon Peyton-Jones simo...@microsoft.com: For the avoidance of doubt, I totally support what Austin and Johan are saying: I find the current setup confusing too. I'm totally persuaded of the merits of git bisect etc. I am the opposite of a git power-user (a git weedy-user?). I will be content to do whatever I'm told workflow-wise, provided I am told clearly in words of one syllable. I *very strongly* want to reduce barriers to entry for would-be contributors, and this is clearly a barrier we could lower. Making Kazu, Austin, Johan, etc more productive is massively valuable. There may be some history to how we arrived at this point, but that should not constrain for the future. We can change our workflow. I would want Ian and Simon to be thoroughly on board, but I regard the current setup as totally open to improvement. Please! BTW, Ian has written it up quite carefully here: http://hackage.haskell.org/trac/ghc/wiki/Repositories, and the linked page http://hackage.haskell.org/trac/ghc/wiki/Repositories/Upstream.
Simon | -Original Message- | From: ghc-devs-boun...@haskell.org [mailto:ghc-devs-boun...@haskell.org] | On Behalf Of Austin Seipp | Sent: 05 June 2013 07:35 | To: Johan Tibell | Cc: ghc-devs@haskell.org | Subject: Re: how to checkout proper submodules | | I absolutely agree here, FWIW. We should only do this if there is a | clear consensus on doing so and everyone doing active development is | comfortable with it. And it's entirely possible submodules are | inadequate for some reason that I'm not aware of which is a | show-stopper. | | However, the notion of impact-on-contributors cuts both ways. GHC has | an extremely small team of hackers as it stands, and we are lucky to | have *amazing* contributors like Kazu, Andreas, yourself, Simon | Simon, and numerous others help make GHC what it is. Much of this is | volunteer work. But as the Haskell community grows, and we are at a | loss of other full-time contributors like Simon Marlow, I think we are | beginning to see the strain on GHC and its current contributors. So, | it's important to evaluate what we're doing right and wrong. This | feedback loop is always present even if seasoned contributors can live | with it - but new contributors will definitely be impacted. | | In this instance, I honestly find it disheartening that the answer to | things like "getting older revisions of the source code in HEAD," or | techniques like bisection is basically "that doesn't work." The second | is unfortunate, but the latter is pretty legitimately worrying. It | would be one thing if this was a one-off occurrence of some odd | developer-workflow. But I have answered the fundamental question here | (submodules vs free-floating clones) a handful of times myself at | least, experienced the pain of the decision myself when doing | rollbacks, and I'm sure other contributors can say the same. | | GHC is already a large, industry-strength software project with years | of work put behind it. 
The barrier to entry and contribution is not | exactly small, but I think we've all done a good job. I'd love to see | more people contributing. But I cannot help but find these discussions | a bit sad, where contributors are impaired due to regular/traditional | development workflows like rollbacks are rendered useless - due to | some odd source control discrepancy that nobody else on the planet | seems to suffer from. | | I guess the short version is basically that that you're absolutely | right: the time of Simon, Ian, and other high-profile contributors is | *extremely* important. But I'd also rather not have people like Kazu | potentially spend hours or even days doing what simple automation can | achieve in what is literally a few keystrokes, and not only that - par | for the course for other projects. This ultimately impacts the | development c
Re: how to checkout proper submodules
For me the biggest plus of switching to submodules would be keeping GHC and testsuite in sync. If there are any reasons not to change in-tree library repos to submodules, then I would at least want testsuite to be changed to a submodule. I also use GitHub for my daily work on GHC, and being able to send patches via pull requests would make things easier. On the other hand, it might be more difficult to attach files to a ticket (no such feature on GitHub AFAIK). Speaking of GitHub, perhaps we should put more pressure on the GitHub folks to fix this: https://github.com/github/markup/issues/196 ? Jan
Re: how to checkout proper submodules
On 06/05/2013 10:10 AM, David Terei wrote: On 5 June 2013 01:43, Manuel M T Chakravarty c...@cse.unsw.edu.au wrote: I agree with Austin and Johan. It's a bizarre setup. Submodules have their pain points (which we already have to deal with), but the ability to properly snapshot and branch the whole tree would be a serious benefit IMO. Manuel PS: While we are at it, why don't we just have the main repos on GitHub and use forks and pull requests like the rest of the world? (Using Git, but not GitHub's superb infrastructure, seems like a terrible waste to me.) I'd be all for this. We partially use the GitHub infrastructure since trac broke and I changed the emails to point to GitHub instead. I also often do code reviews with other devs on a personal GHC fork on github before merging in. I believe it would also help encourage more contributors (especially for libraries) but others have expressed disagreement with this point of view in the past and I'm not in hold of data. As a very recent (would-be) contributor, I'd like to weigh in in favor of this. IMHO, having to create a Trac account and submit patches as attachments (with the confusing Trac UI), instead of just pushing to a repository and issuing pull requests, is quite suboptimal. I don't think it would scare anyone enough that they wouldn't contribute, but lowering the entry cost is always useful. -- Vincent
Re: how to checkout proper submodules
David Terei davidte...@gmail.com: On 5 June 2013 01:43, Manuel M T Chakravarty c...@cse.unsw.edu.au wrote: I agree with Austin and Johan. It's a bizarre setup. Submodules have their pain points (which we already have to deal with), but the ability to properly snapshot and branch the whole tree would be a serious benefit IMO. Manuel PS: While we are at it, why don't we just have the main repos on GitHub and use forks and pull requests like the rest of the world? (Using Git, but not GitHub's superb infrastructure, seems like a terrible waste to me.) I'd be all for this. We partially use the GitHub infrastructure since trac broke and I changed the emails to point to GitHub instead. I also often do code reviews with other devs on a personal GHC fork on github before merging in. I believe it would also help encourage more contributors (especially for libraries) but others have expressed disagreement with this point of view in the past and I'm not in hold of data. For the compiler, the barriers to contribution are probably elsewhere, but for the libraries, I'm sure, it would lower the barrier to entry. For example, to fix some documentation, I personally would never bother to create a patch file and attach it to some Trac ticket (where I first have to create an account). In contrast, a pull request on GitHub is a matter of a few clicks. Manuel PS: Anybody who doubts this needs to post their GitHub account name, so we can check that they actually ever used GitHub properly ;)
Re: how to checkout proper submodules
bisection. Wouldn't it be *great* to have something like that Just Work? A tool like this could potentially boil down Kazu's bug almost automatically for example, with little-to-no frustrating intervention. And even now, looking at the repository listing of what is in libraries/, that are not submodules, I really see no reason why more - or even all - of them cannot be submodules. Is it a workflow issue of some sort? That's what I'm thinking at this point, but I also don't think it could be any worse than it is now. Realistically, very few libraries GHC needs for bootstrapping seem to change that much. unix, integer-simple, haskeline and filepath for example change *extremely* infrequently, but all are free-standing. Why? In the event they were submodules, would anything actually be lost? The maintainer - that is, not GHC HQ - would still 'own' the official repository. They can make changes to it. But if there is a necessity to pull that in for GHC (feature request, bug fix, random thing) it can be done by updating the submodule pointer to the new commit. But this must happen explicitly by a GHC committer. In the event they update the submodule pointer, they should also obviously make sure the build still works. That means we have to update the submodule pointers ourselves if things change. That sucks I guess, but really, aside from base and testsuite, the two most frequently changing repositories, is that *actually* going to cost us a lot of work? And even if it does cost us work, I'll speak for myself: I will gladly pay for that work and do it all myself if it means I can actually bisect and actually roll back my tree to some point to fix things - without needing to prepare for it months in advance using hacks. Like creating thousands of fingerprints, using fingerprint.py every day when people make commits (no, I haven't done this, but it could be done, and I really don't want to do it.) Long-term reproducible builds are, IMO, a must for any project. 
*Especially* a project of our size. *Especially* a compiler of all things. But as it stands, when you build GHC, you can probably reproduce *today's* results and *today's* bugs. Last month's results? Last year's? Finding the difference between months ago and today? Good luck - you will need it. On Tue, Jun 4, 2013 at 8:07 PM, Kazu Yamamoto k...@iij.ad.jp wrote: Hi, Andreas and I found that the new IO manager is not working properly in the current GHC head. I'm sure that it worked well at least on May 7. We need to narrow the range of commits, so I did:
% git checkout bb2795db36b36966697c228315ae20767c4a8753
% git submodule update
But this does not check out the proper submodules. For instance, libraries/base has newer commits. And of course, building fails. Please tell us how to check out the proper submodules against a specific GHC tree. --Kazu -- Regards, Austin - PGP: 4096R/0x91384671
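Absent recorded fingerprints, one rough workaround for the situation Kazu describes is to roll each free-floating repository back to its last commit made before the date of the chosen GHC commit. The sketch below is a heuristic under stated assumptions, not an established GHC procedure: the helper names, the list of free-floating paths and the "master" branch name are all illustrative, and it relies on a reasonably recent git (for the -C option and the %cI date format). Commit dates are not guaranteed to reflect the order in which changes were pushed, so the result only approximates the original tree.

```python
import subprocess
from typing import List, Sequence


def commit_date_args(commit: str) -> List[str]:
    """git arguments that print the committer date of `commit` (ISO 8601)."""
    return ["log", "-1", "--format=%cI", commit]


def last_commit_before_args(date: str, branch: str) -> List[str]:
    """git arguments that print the newest commit on `branch` before `date`."""
    return ["rev-list", "-n", "1", f"--before={date}", branch]


def git(repo: str, args: Sequence[str]) -> str:
    """Run git inside `repo` and return its trimmed standard output."""
    result = subprocess.run(
        ["git", "-C", repo, *args],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def roll_back(top: str, commit: str, floating: Sequence[str],
              branch: str = "master") -> None:
    """Check out `commit` in `top`, then approximate the matching state of
    each free-floating repository (paths relative to `top`) by date."""
    git(top, ["checkout", commit])
    git(top, ["submodule", "update", "--init"])  # real submodules are pinned
    date = git(top, commit_date_args(commit))
    for path in floating:
        repo = f"{top}/{path}"
        sha = git(repo, last_commit_before_args(date, branch))
        if sha:  # empty when the repository has no commit that old
            git(repo, ["checkout", sha])


# Hypothetical usage, with an illustrative list of free-floating repos:
# roll_back(".", "bb2795db36b36966697c228315ae20767c4a8753",
#           ["libraries/base", "testsuite"])
```

With every repository converted to a submodule, the date guessing disappears: the main repository's commit records the exact commit of each library, and the first two git calls are enough.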
Re: how to checkout proper submodules
I don't know much about subtrees, but that might be another possibility? There are a lot of things to recommend moving to github. I do hate (non-empty) merge commits, though, so I'm not a fan of github's pull request mechanism. Geoff On 06/05/2013 09:43 AM, Manuel M T Chakravarty wrote: I agree with Austin and Johan. It's a bizarre setup. Submodules have their pain points (which we already have to deal with), but the ability to properly snapshot and branch the whole tree would be a serious benefit IMO. Manuel PS: While we are at it, why don't we just have the main repos on GitHub and use forks and pull requests like the rest of the world? (Using Git, but not GitHub's superb infrastructure, seems like a terrible waste to me.) Simon Peyton-Jones simo...@microsoft.com: For the avoidance of doubt, I totally support what Austin and Johan are saying: I find the current setup confusing too. I'm totally persuaded of the merits of git bisect etc. I am the opposite of a git power-user (a git weedy-user?). I will be content to do whatever I'm told workflow-wise, provided I am told clearly in words of one syllable. I *very strongly* want to reduce barriers to entry for would-be contributors, and this is clearly a barrier we could lower. Making Kazu, Austin, Johan, etc more productive is massively valuable. There may be some history to how we arrived at this point, but that should not constrain for the future. We can change our workflow. I would want Ian and Simon to be thoroughly on board, but I regard the current setup as totally open to improvement. Please! BTW, Ian has written it up quite carefully here: http://hackage.haskell.org/trac/ghc/wiki/Repositories, and the linked page http://hackage.haskell.org/trac/ghc/wiki/Repositories/Upstream. 
Simon | -Original Message- | From: ghc-devs-boun...@haskell.org [mailto:ghc-devs-boun...@haskell.org] | On Behalf Of Austin Seipp | Sent: 05 June 2013 07:35 | To: Johan Tibell | Cc: ghc-devs@haskell.org | Subject: Re: how to checkout proper submodules | | I absolutely agree here, FWIW. We should only do this if there is a | clear consensus on doing so and everyone doing active development is | comfortable with it. And it's entirely possible submodules are | inadequate for some reason that I'm not aware of which is a | show-stopper. | | However, the notion of impact-on-contributors cuts both ways. GHC has | an extremely small team of hackers as it stands, and we are lucky to | have *amazing* contributors like Kazu, Andreas, yourself, Simon | Simon, and numerous others help make GHC what it is. Much of this is | volunteer work. But as the Haskell community grows, and we are at a | loss of other full-time contributors like Simon Marlow, I think we are | beginning to see the strain on GHC and its current contributors. So, | it's important to evaluate what we're doing right and wrong. This | feedback loop is always present even if seasoned contributors can live | with it - but new contributors will definitely be impacted. | | In this instance, I honestly find it disheartening that the answer to | things like getting older revisions of the source code in HEAD, or | techniques like bisection is basically that doesn't work. The second | is unfortunate, but the latter is pretty legitimately worrying. It | would be one thing if this was a one-off occurrence of some odd | developer-workflow. But I have answered the fundamental question here | (submodules vs free-floating clones) a handful of times myself at | least, experienced the pain of the decision myself when doing | rollbacks, and I'm sure other contributors can say the same. | | GHC is already a large, industry-strength software project with years | of work put behind it. 
The barrier to entry and contribution is not | exactly small, but I think we've all done a good job. I'd love to see | more people contributing. But I cannot help but find these discussions | a bit sad, where contributors are impaired due to regular/traditional | development workflows like rollbacks are rendered useless - due to | some odd source control discrepancy that nobody else on the planet | seems to suffer from. | | I guess the short version is basically that that you're absolutely | right: the time of Simon, Ian, and other high-profile contributors is | *extremely* important. But I'd also rather not have people like Kazu | potentially spend hours or even days doing what simple automation can | achieve in what is literally a few keystrokes, and not only that - par | for the course for other projects. This ultimately impacts the | development cycles of *everybody*. And even if Kazu deals with it - | what about the next person? | | On Wed, Jun 5, 2013 at 12:12 AM, Johan Tibell johan.tib...@gmail.com | wrote: | The latest git release has improved submodules support some so if we now | thing the benefits of submodules outweigh the costs we can
Re: how to checkout proper submodules
Hi Geoffrey, I don't know much about subtrees, but that might be another possibility? The main point about subtrees is that you have just one repository, and you use 'git subtree' to merge a directory of that repository with some other git repository. Subtrees and submodules both try to handle the case where you want to incorporate a third-party repository into your own repository and merge changes in both directions. I think that subtrees are easier for the developer working on the repository, because there is only one repository, but merging with the third-party repository is a bit more hassle. Submodules are harder for the developer, because there are multiple repositories, but merging with the third-party repository might be a bit easier. GHC devs might have other reasons for using submodules, because they want to separate things or they're afraid that the resulting single repository might get too big, but I think there should be good reasons for using submodules, because a lot of workflows (like branching) are such a hassle with them. Greetings, Daniel
Re: how to checkout proper submodules
On Wed, 2013-06-05 at 15:24 +0200, Daniel Trstenjak wrote: because a lot of workflows (like branching) are such a hassle with submodules. As my experience with submodules is positive (though limited), could you elaborate on the difficulties/hassle here? Thanks, Nicolas
Re: how to checkout proper submodules
Hi Nicolas, On Wed, Jun 05, 2013 at 03:27:09PM +0200, Nicolas Trangez wrote: As my experience with submodules is positive (though limited), could you elaborate on the difficulties/hassle here? If you would like to develop a feature which involves changes to multiple repositories/submodules, and you would like to do it in a branch, then you have to create a branch in each repository, commit separately in each repository and then merge each repository back into its master branch. Greetings, Daniel
Re: how to checkout proper submodules
I'm back after sleep. A few points: 1) Subtree is - in my opinion - basically not an option. It has a nice workflow from the small amount of time I spent with it. But it's not installed by default with git, it's unclear if it ever will be. Although subtree gives the appearance of a unified repository from my understanding, in practice all developers will probably need to touch multiple repositories for several reasons anyway (like testsuite and base.) That means the third-party merge is pretty much always going to happen for any non-sizeable work, the person who *did* the work will be the one doing it, essentially amounting to basically everyone needing subtree in the long term. I may be wrong about this. If I am please call me out on it. And there may be alternative workflows for patch-submitters to help this. But in general, I'd rather not have to tell GHC developers they probably need a special git build in the long haul. 2) I agree with John Lato. I think the immediate problem of fixing the submodule situation is a core issue, and GitHub can come later. Or at the very least, we should discuss GitHub in its own email thread. That's because while I see the problem of our current setup is bad as rather obvious and with a clear mitigation/fix, there *are* some legitimate complaints about GitHub that won't be resolved so easily. We should tackle each separately (remember: we have thousands of existing tickets, wiki pages, historical existing links, etc. All of these are pretty important in a lot of ways. It's not clear what the movement-strategy here is and it is definitely not going to be free, or painless.) This is definitely a more touchy issue, but I can see both sides. 3) Regarding Daniel Trstenjak's complaint: submodules from a workflow perspective may suck a little, but realistically we use their *exact* workflow anyway as it stands. 
We just don't get any of the benefits: in practice developers will make branches in each affected repo and push them and maintain them concurrently. Eventually they will be merged into master for each respective repository. This process will not change if we move entirely to submodules as you said. Some extra food for thought: 1) We could now delete ./sync-all if this happened. It's almost 1000 lines of code dedicated to managing this stuff. Instead, we merely tell all hackers to clone with 'git clone --recursive' and voilà! After a git clone, you can immediately start building. That'd be great. 2) One thing this *does* complicate is that currently, some repositories are optional. Submodules effectively make them 100% non-optional. Now, normally, I would say all developers should have every relevant library anyway. In this case however, it is a tad bit annoying. On my ARM machines for example, DPH regularly fails late-in-build due to a bug in the (custom) linker, because dph requires stage2+ghci. But it also takes a long time to build DPH, so in practice I just remove it to save myself that time. Some others do the same. That said, I'm potentially in a small minority here, and I'd be willing to just deal with it in the meantime if we can do this (this is the *exception* and certainly not the rule.) Not that big a deal, and it can also be fixed later. There are probably other things that I can't think of, but I'm sure you can all think of other stuff too. :) On Wed, Jun 5, 2013 at 8:49 AM, Daniel Trstenjak daniel.trsten...@gmail.com wrote: Hi Nicolas, On Wed, Jun 05, 2013 at 03:27:09PM +0200, Nicolas Trangez wrote: As my experience with submodules is positive (though limited), could you elaborate on the difficulties/hassle here?
If you would like to develop a feature which involves changes to multiple repositories/submodules, and you would like to do it in a branch, then you have to create a branch in each repository, commit separately in each repository and then merge each repository back into its master branch. Greetings, Daniel -- Regards, Austin - PGP: 4096R/0x91384671
Re: how to checkout proper submodules
1) We could now delete ./sync-all if this happened. In that case I would vote for replacing sync-all with a script that aids in managing branches in multiple subrepos. I implemented such a script for myself in a very ad hoc way. Having something more robust would be great. 2) One thing this *does* complicate is that currently, some repositories are optional. (...) I believe this could be solved by changes in the build system, so that some components can be optional (yes, I also delete DPH to speed up building). Janek
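A branch helper of the kind suggested here could start out very small. The following is a minimal sketch, assuming every child repository has already become a submodule; it is not Janek's script or anything in sync-all, and the function names and the example branch name are made up for illustration. It leans on 'git submodule foreach', which runs a shell command in each checked-out submodule.

```python
import shlex
import subprocess
from typing import List


def branch_everywhere(branch: str) -> List[List[str]]:
    """Commands that create `branch` in the top-level repository and in
    every checked-out submodule."""
    # `git submodule foreach` hands its argument to a shell, so quote the name.
    inner = f"git checkout -b {shlex.quote(branch)}"
    return [
        ["git", "checkout", "-b", branch],
        ["git", "submodule", "foreach", "--recursive", inner],
    ]


def run_all(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print the commands, or execute them in the current directory."""
    for argv in commands:
        if dry_run:
            print(" ".join(shlex.quote(arg) for arg in argv))
        else:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Dry run only: shows what would be executed for a hypothetical branch.
    run_all(branch_everywhere("wip/io-manager"))
```

Creating the branch in every submodule is the simplest policy; a more careful tool would probably branch only the repositories that a feature actually touches, and would also help with pushing and merging them together.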
Re: how to checkout proper submodules
Hi Austin, On Wed, Jun 05, 2013 at 09:41:56AM -0500, Austin Seipp wrote: But it's not installed by default with git, it's unclear if it ever will be. I think subtree has been part of git since 1.7.x. I have just installed the default git package (git 1.8.1.2) of Ubuntu 13.04 and the subtree command is just there. Although subtree gives the appearance of a unified repository from my understanding, in practice all developers will probably need to touch multiple repositories for several reasons anyway (like testsuite and base.) That means the third-party merge is pretty much always going to happen for any non-sizeable work, the person who *did* the work will be the one doing it, essentially amounting to basically everyone needing subtree in the long term. Sorry that I'm not aware of the GHC development process, but why are the testsuite and base in separate repositories? Submodules are fine for tracking repositories, but if you're changing multiple submodules all the time, then it's a sign that you have a strong dependency between the repositories, so why not just have one repository? 2) One thing this *does* complicate is that currently, some repositories are optional. Submodules effectively make them 100% non-optional. Now, normally, I would say all developers should have every relevant library anyway. In this case however, it is a tad bit annoying. On my ARM machines for example, DPH regularly fails late-in-build due to a bug in the (custom) linker, because dph requires stage2+ghci. But it also takes a long time to build DPH, so in practice I just remove it to save myself that time. Some others do the same. Isn't this more a build system issue (being able to specify what should and shouldn't be built) than a repository issue? Greetings, Daniel
Re: how to checkout proper submodules
On Wed, Jun 5, 2013 at 10:20 AM, Daniel Trstenjak daniel.trsten...@gmail.com wrote: I think subtree has been part of git since 1.7.x . I have just installed the default git package (git 1.8.1.2) of Ubuntu 13.04 and the subtree command is just there. It's *part* of mainline git, but it is not installed with git. It's part of git's contrib functionality package which requires that your package maintainer be gracious enough to include it and install it by default, which requires extra intervention at build-time. As a counter-example, my 'git' from Ubuntu 12.04 LTS machine has no subtree and there are no existing instances of it in any 'precise' repositories. I'm hesitant to require developers en masse to use it for reasons like this. (Frankly I also don't know how this would work out on windows. Like, I don't know how to get a git-build-with-subtree-for-windows, much less if it works on windows at all.) Sorry that I'm not aware of the GHC development process, but why are the testsuite and base in separate repositories? Because GHC does not technically 'own' them by the most strict definition. testsuite and base are also useful for other compilers, such as nhc98 (and indeed, nhc uses base itself.) The same can be said of nofib. As a result, there is a separation. Now, in practice everybody working on base is a GHC hacker pretty much, and ditto with testsuite/nofib. Regardless of all that, to change *this* part of the equation is a much, much bigger argument. One I don't intend to wage at the moment. submodules are fine for tracking repositories, but if you're all the time changing multiple submodules, than it's a sign that you've a strong dependency between the repositories, so why not just having one repository? I would agree. In practice many of the submodules are touched extremely rarely - one change every several months. Sometimes, no changes at all between entire releases spanning a year. 
testsuite and base are definitely the exception, but they are also what most people spend their time with in terms of hacking (Pareto in action: 80% of people's work, 20% of the code.) But again, to change this is a far larger argument with historical implications, and implications beyond GHC. Malcolm would certainly have input as he maintains nhc. (In the past, from my understanding, nhc etc. were more prevalent. But over time we've moved more and more to GHC, and 'cruft' has arguably lingered.) I think folding base and testsuite into GHC 'for good' is a separate discussion entirely. Isn't this more a build system issue, that you're able to specify what should/shouldn't be built, than a repository issue? Yes. It is not insurmountable; my point is more that it's an immediate loss for some small reasons, but really nothing more than a minor annoyance. It's just something to remind people of, should we make the change. Greetings, Daniel -- Regards, Austin - PGP: 4096R/0x91384671
Re: how to checkout proper submodules
On Tue, Jun 04, 2013 at 09:05:58PM -0500, Austin Seipp wrote:
> I know we had this discussion sometime recently I think, but can someone *please* explain why we are in this situation of half submodules, half random-floating-git-repository-checkouts?

Submodules are very handy for libraries that someone else maintains: We can make a local change to the library when we need something fixed, and then, when upstream has a fix too, we can jump straight to their fix without having to do any merging.

However, submodules have various disadvantages, e.g. http://codingkilledthecat.wordpress.com/2012/04/28/why-your-company-shouldnt-use-git-submodules/

The main one for me is that it's fairly easy to lose local changes when using submodules. This is relatively unimportant for the libraries that someone else maintains, as we don't often make any local changes to lose. Even so, I've lost changes on a couple of occasions.

So the reason we entered this state is that we didn't think the advantages outweighed the disadvantages for the other repositories.

Thanks
Ian

--
Ian Lynagh, Haskell Consultant
Well-Typed LLP, http://www.well-typed.com/
Re: how to checkout proper submodules
I think that testsuite should be included in the main GHC repo. I don't recall any other project that has its tests placed in a separate repository. The nhc argument doesn't convince me - after all, most tests that are added nowadays are GHC specific.

Janek
Re: how to checkout proper submodules
There are a lot of things to recommend moving to github. I do hate (non-empty) merge commits, though, so I'm not a fan of github's pull request mechanism.

Please read "A successful Git branching model" to see why fast-forward merges are often avoided these days.

Git flow: http://nvie.com/posts/a-successful-git-branching-model/

Another related article is here:

Github flow: http://scottchacon.com/2011/08/31/github-flow.html

--Kazu
Re: how to checkout proper submodules
Unfortunately, we don't use submodules for all repos, e.g. base. This makes it very hard to accurately check out a previous state and bisect errors.

On Tue, Jun 4, 2013 at 6:07 PM, Kazu Yamamoto k...@iij.ad.jp wrote:
> Hi,
>
> Andreas and I found that the new IO manager is not working properly in the current GHC head. I'm sure that it worked well at least on May 7. We need to narrow the range of commits, so I did:
>
>   % git checkout bb2795db36b36966697c228315ae20767c4a8753
>   % git submodule update
>
> But this does not check out the proper submodules. For instance, libraries/base has newer commits. And of course, building fails.
>
> Please tell us how to check out the proper submodules for a specific GHC tree.
>
> --Kazu
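The gap described above can be seen by listing which paths a given GHC commit actually pins. A minimal sketch, assuming the text output of `git ls-tree -r <commit>` is available: entries with mode 160000 are submodule links (gitlinks) and carry an exact commit hash, while a free-floating checkout such as libraries/base does not appear at all, so `git submodule update` has nothing to restore for it. The function name and the sample hashes are hypothetical placeholders, not real GHC data.

```python
def pinned_submodules(ls_tree_output):
    """Return {path: sha} for gitlink entries in `git ls-tree -r <commit>` output.

    Each line has the form "<mode> <type> <sha>\t<path>"; submodules are
    recorded with mode 160000 and type "commit".
    """
    pinned = {}
    for line in ls_tree_output.splitlines():
        if not line.strip():
            continue
        meta, _, path = line.partition("\t")
        fields = meta.split()
        if len(fields) != 3:
            continue  # not a well-formed ls-tree line; skip it
        mode, obj_type, sha = fields
        if mode == "160000" and obj_type == "commit":
            pinned[path] = sha
    return pinned


# Made-up sample: the hashes are placeholders, not real GHC commits.
SAMPLE = (
    "100644 blob 1111111111111111111111111111111111111111\tREADME\n"
    "160000 commit 2222222222222222222222222222222222222222\tlibraries/Cabal\n"
    "100644 blob 3333333333333333333333333333333333333333\tcompiler/ghc.mk\n"
)

if __name__ == "__main__":
    print(pinned_submodules(SAMPLE))
```

Any repository that is absent from this mapping has to be moved to a matching state by some other means.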
Re: how to checkout proper submodules
Is the way forward then to manually bisect by timestamp? Perhaps there are scripts out there to assist with such a task.

On Jun 4, 2013 8:47 PM, Johan Tibell johan.tib...@gmail.com wrote:
> Unfortunately, we don't use submodules for all repos, e.g. base. This makes it very hard to accurately check out a previous state and bisect errors.
>
> On Tue, Jun 4, 2013 at 6:07 PM, Kazu Yamamoto k...@iij.ad.jp wrote:
>> Hi,
>>
>> Andreas and I found that the new IO manager is not working properly in the current GHC head. I'm sure that it worked well at least on May 7. We need to narrow the range of commits, so I did:
>>
>>   % git checkout bb2795db36b36966697c228315ae20767c4a8753
>>   % git submodule update
>>
>> But this does not check out the proper submodules. For instance, libraries/base has newer commits. And of course, building fails.
>>
>> Please tell us how to check out the proper submodules for a specific GHC tree.
>>
>> --Kazu
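Bisecting by timestamp could look roughly like this: read the committer date of the GHC commit under test, then move each free-floating repository to the newest commit on its branch that is not later than that date. The sketch below is only an illustration under stated assumptions: the helper names, the list of repositories, and the `master` branch are hypothetical, and commit dates are a heuristic (clock skew or late pushes can give a state that was never actually built together).

```python
import subprocess


def commit_date_cmd(commit):
    """Command that prints the committer date of `commit` in ISO-like format."""
    return ["git", "show", "-s", "--format=%ci", commit]


def last_commit_before_cmd(date, branch="master"):
    """Command that prints the newest commit on `branch` not later than `date`."""
    return ["git", "rev-list", "-n", "1", "--before=" + date, branch]


def run_git(cmd, repo):
    """Run a git command inside `repo` and return its trimmed stdout."""
    result = subprocess.run(cmd, cwd=repo, check=True,
                            capture_output=True, text=True)
    return result.stdout.strip()


def checkout_by_date(ghc_dir, commit, floating_repos, branch="master"):
    """Move each free-floating repo to its state as of `commit` in the GHC repo."""
    date = run_git(commit_date_cmd(commit), ghc_dir)
    for repo in floating_repos:
        sha = run_git(last_commit_before_cmd(date, branch), repo)
        if sha:  # empty output means the branch has no commit that old
            run_git(["git", "checkout", sha], repo)
    return date
```

A call such as `checkout_by_date(".", "bb2795db36b36966697c228315ae20767c4a8753", ["libraries/base", "testsuite"])` would be run after `git checkout` and `git submodule update` in the GHC tree; the list of repositories here is illustrative and would need to match the actual checkout.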
Re: how to checkout proper submodules
On 05/06/13 02:46, Johan Tibell wrote:
> Unfortunately, we don't use submodules for all repos, e.g. base. This makes it very hard to accurately check out a previous state and bisect errors.
>
> On Tue, Jun 4, 2013 at 6:07 PM, Kazu Yamamoto k...@iij.ad.jp wrote:
>> Hi,
>>
>> Andreas and I found that the new IO manager is not working properly in the current GHC head. I'm sure that it worked well at least on May 7. We need to narrow the range of commits, so I did:
>>
>>   % git checkout bb2795db36b36966697c228315ae20767c4a8753
>>   % git submodule update
>>
>> But this does not check out the proper submodules. For instance, libraries/base has newer commits. And of course, building fails.
>>
>> Please tell us how to check out the proper submodules for a specific GHC tree.
>>
>> --Kazu

Is there a reason why some submodules are proper git repos and some aren't? The benefits of having git repos as submodules are hopefully clear, so I'm interested in why this isn't the case here.

--
Mateusz K.
Re: how to checkout proper submodules
- without needing to prepare for it months in advance using hacks. Like creating thousands of fingerprints, using fingerprint.py every day when people make commits (no, I haven't done this, but it could be done, and I really don't want to do it.)

Long-term reproducible builds are, IMO, a must for any project. *Especially* a project of our size. *Especially* a compiler of all things. But as it stands, when you build GHC, you can probably reproduce *today's* results and *today's* bugs. Last month's results? Last year's? Finding the difference between those months ago and today? Good luck - you will need it.

On Tue, Jun 4, 2013 at 8:07 PM, Kazu Yamamoto k...@iij.ad.jp wrote:
> Hi,
>
> Andreas and I found that the new IO manager is not working properly in the current GHC head. I'm sure that it worked well at least on May 7. We need to narrow the range of commits, so I did:
>
>   % git checkout bb2795db36b36966697c228315ae20767c4a8753
>   % git submodule update
>
> But this does not check out the proper submodules. For instance, libraries/base has newer commits. And of course, building fails.
>
> Please tell us how to check out the proper submodules for a specific GHC tree.
>
> --Kazu

--
Regards, Austin - PGP: 4096R/0x91384671
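The fingerprint idea mentioned above amounts to recording, for every repository in the tree, the commit it was at when a build was made, so that the same state can be restored later. The following is a rough sketch of that idea only; it is not the format or interface of fingerprint.py, and the `path|sha` line format and function names are assumptions made for illustration. The hashes would come from something like `git rev-parse HEAD` run in each repository.

```python
def format_fingerprint(heads):
    """Serialize {repo_path: sha} as sorted "path|sha" lines."""
    return "\n".join(f"{path}|{sha}" for path, sha in sorted(heads.items()))


def parse_fingerprint(text):
    """Inverse of format_fingerprint: rebuild the {repo_path: sha} mapping."""
    heads = {}
    for line in text.splitlines():
        if line.strip():
            # Split on the last "|" so the sha is always the final field.
            path, _, sha = line.rpartition("|")
            heads[path] = sha
    return heads


def restore_cmds(heads):
    """git commands (argv lists) that put each repo back at its recorded commit."""
    return [["git", "-C", path, "checkout", sha]
            for path, sha in sorted(heads.items())]
```

The cost Austin points out remains: a fingerprint only exists for states that somebody recorded at the time, which is why it has to be prepared in advance.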
Re: how to checkout proper submodules
On Tue, Jun 4, 2013 at 7:05 PM, Austin Seipp ase...@pobox.com wrote:
> I know we had this discussion sometime recently I think, but can someone *please* explain why we are in this situation of half submodules, half random-floating-git-repository-checkouts? It's terrible. I'm frankly surprised we've even been doing it this long, over a year or more? It is literally the worst of submodules, and free-standing-repositories put together, with none of the advantages of either.

This is my understanding of what happened: we started out with only plain repos. This avoids some of the pitfalls of submodules, and we believed it was the least disruptive workflow (when switching from darcs) for the core contributors.

Eventually we needed GHC to track upstream releases of libraries (e.g. Cabal) instead of just tracking HEAD, which it did before. To achieve that, we switched the libraries that GHC just tracks (e.g. Cabal) to submodules. The libraries maintained by GHC HQ (e.g. base) were still kept as plain repos to avoid disrupting anyone's workflow.

The latest git release has improved submodule support somewhat, so if we now think the benefits of submodules outweigh the costs, we can discuss whether we want to change the policy. I don't want to make that decision for other GHC developers who spend much more time on GHC than I do (e.g. SPJ). Their productivity is more important than any inconveniences the lack of consistent use of submodules might cause me.