Re: Proposal: better library management ideas (was: how to checkout proper submodules)
On 09/06/13 17:51, Ian Lynagh wrote:
> On Sun, Jun 09, 2013 at 11:15:37AM -0500, Austin Seipp wrote:
>>> I'm referring to Joachim Breitner's work on splitting the base.
>>
>> So what's the timeline here?
>
> As soon as possible after 7.8 is branched.

Has there been a decision somewhere on what to do? The wiki page sets out the parameters of the design, but doesn't have any conclusions that I could see.

Splitting base has the potential to be extremely destabilising, I want to make sure that we're getting appreciable benefits in exchange.

Cheers,
Simon
___
ghc-devs mailing list
ghc-devs@haskell.org
http://www.haskell.org/mailman/listinfo/ghc-devs
Re: Proposal: better library management ideas (was: how to checkout proper submodules)
On 06/12/2013 12:37 PM, Ian Lynagh wrote:
> On Wed, Jun 12, 2013 at 12:54:38AM +0200, Daniel Trstenjak wrote:
>>> I guess [the merge commits] may not cause any actual problems, but it's certainly nicer not having them (which is what using submodules gives us).
>
> Just to clarify, my problem isn't so much that there are merge commits (although it would still be nicer if there weren't), but that it is hard to see whether we are in the same state as upstream, or to see what the differences between us and upstream are.
>
>> I don't quite understand how you should get rid of these merge commits by using submodules,
>
> With submodules we can do
>
>     cd libraries/Cabal
>     git reset --hard <an upstream commit id>
>     cd ..
>     git commit -a
>
> and we will jump to that commit, without needing to merge it with the commit that we were at before.
>
>> You can get rid of these merge commit by using the '--rebase' option of git-pull.
>
> We can't rebase, as these patches are in everyone else's GHC tree.

Only if you have pushed the ghc tree. If it is only local, then rebasing is just fine. And, I would argue, desirable.

For the record, I am in favor of moving everything to submodules.

Geoff
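The submodule workflow Ian describes can be tried out in a throwaway directory. The following is only a sketch under assumptions: the repositories `cabal-upstream` and `ghc`, the file contents, and the commit messages are made up for illustration and are not the real GHC or Cabal trees. It pins the submodule to an upstream commit after a conflicting local fix and then counts merge commits in both histories.

```shell
#!/bin/sh
# Sketch only: repository names and file contents are hypothetical.
set -eu
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

# Stand-in for upstream Cabal.
git init -q cabal-upstream
(cd cabal-upstream && echo content > file && git add file && git commit -q -m initial)

# Stand-in for the GHC tree, with Cabal added as a submodule.
git init -q ghc
(cd ghc && echo ghc > README && git add README && git commit -q -m "initial ghc commit")
# protocol.file.allow is set only because newer git versions refuse
# local-path submodule clones by default.
(cd ghc && git -c protocol.file.allow=always submodule --quiet add \
        "$work/cabal-upstream" libraries/Cabal \
        && git commit -q -m "add Cabal submodule")

# A local fix inside the submodule, recorded in the GHC tree.
(cd ghc/libraries/Cabal && echo fix1 > file && git commit -q -a -m "local fix")
(cd ghc && git commit -q -a -m "use local Cabal fix")

# Upstream fixes the same bug differently.
(cd cabal-upstream && echo fix2 > file && git commit -q -a -m "upstream fix")
upstream_id=$(git -C cabal-upstream rev-parse HEAD)

# Jump the submodule to the upstream commit; the local fix is not merged.
(cd ghc/libraries/Cabal && git fetch -q origin && git reset -q --hard "$upstream_id")
(cd ghc && git commit -q -a -m "jump Cabal to upstream")

pinned_id=$(git -C ghc/libraries/Cabal rev-parse HEAD)
sub_merges=$(git -C ghc/libraries/Cabal rev-list --merges --count HEAD)
ghc_merges=$(git -C ghc rev-list --merges --count HEAD)
echo "submodule at $pinned_id; merges: submodule=$sub_merges ghc=$ghc_merges"
```

In this sketch the submodule ends up at exactly the upstream commit, so comparing the two trees is a matter of comparing commit ids. The abandoned local fix is simply no longer referenced by the outer repository.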
Re: Proposal: better library management ideas (was: how to checkout proper submodules)
On Mon, Jun 10, 2013 at 01:13:37PM +0200, Daniel Trstenjak wrote:
> On Mon, Jun 10, 2013 at 11:45:22AM +0100, Ian Lynagh wrote:
>> Is this possible with subtrees?:
>> * Initially ghc's Cabal repo is at the same commit as upstream
>> * We make a local commit 123 in Cabal to fix some bug
>> * Cabal upstream makes a commit 456 to fix the same bug differently
>> * We jump to commit 456, in such a way that we don't end up merging with our 123 commit every time we pull from Cabal in the future
>
> Yes. Every repository that's added by git-subtree to your repository is represented as a separate branch. So everything that applies to the merging of branches also applies to the merging by git-subtree.

I didn't follow that. Here's an example of what happens with just a plain git repo, with no branches, submodules or subrepos involved:

--8<----8<----8<----8<--
upstream$ git init
upstream$ echo content > file
upstream$ git add file
upstream$ git commit -a -m initial
$ git clone upstream ghc
$ cd ghc
ghc$ echo fix1 > file
ghc$ git commit -a -m fix1
upstream$ echo fix2 > file
upstream$ git commit -a -m fix2
ghc$ git pull --no-edit -X theirs
upstream$ echo feature1 > file
upstream$ git commit -a -m feature1
ghc$ git pull --no-edit -X theirs
upstream$ echo feature2 > file
upstream$ git commit -a -m feature2
ghc$ git pull --no-edit -X theirs
--8<----8<----8<----8<--

At the end of this, you'll see that the ghc repo has a number of merge commits.

I guess they may not cause any actual problems, but it's certainly nicer not having them (which is what using submodules gives us).

Thanks
Ian

--
Ian Lynagh, Haskell Consultant
Well-Typed LLP, http://www.well-typed.com/
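To see how many merge commits the scenario above leaves behind, it can be scripted and the merges counted. This is a sketch under assumptions rather than Ian's exact session: it runs in a temporary directory, uses a throwaway committer identity, and adds `--no-rebase` because recent git versions refuse to pull divergent branches without being told how to reconcile them.

```shell
#!/bin/sh
# Sketch only: reproduces the plain-repo scenario and counts merge commits.
set -eu
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

git init -q upstream
(cd upstream && echo content > file && git add file && git commit -q -m initial)

git clone -q upstream ghc
# Local fix that upstream will later fix differently.
(cd ghc && echo fix1 > file && git commit -q -a -m fix1)

for change in fix2 feature1 feature2; do
  (cd upstream && echo "$change" > file && git commit -q -a -m "$change")
  # Merge upstream, resolving conflicts in favour of upstream's side.
  (cd ghc && git pull -q --no-rebase --no-edit -X theirs)
done

merges=$(cd ghc && git rev-list --merges --count HEAD)
echo "merge commits in ghc: $merges"
```

Each pull has to create a merge commit because the local history never becomes an ancestor of upstream again, even though the file contents are identical after the first merge.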
Re: Proposal: better library management ideas (was: how to checkout proper submodules)
Hi Ian,

> I guess they may not cause any actual problems, but it's certainly nicer not having them (which is what using submodules gives us).

I don't quite understand how you would get rid of these merge commits by using submodules, because in the end every submodule is just a git repository and behaves in the same way as every other git repository for merges.

You can get rid of these merge commits by using the '--rebase' option of git-pull.

I put your git command lines into the attached script 'ghc_git_test'. Now you can get your version and the version using '--rebase' by calling:

    mkdir your_version rebase_version
    cd your_version
    ghc_git_test -X theirs
    cd ../rebase_version
    ghc_git_test --rebase -X ours

You will certainly ask why it's 'ours' instead of 'theirs' for the rebase case. Well, that's one of the quite counterintuitive things in the git user interface: during a rebase, 'ours' refers to the upstream branch being rebased onto.

Greetings,
Daniel

    #!/usr/bin/env bash
    # Arguments are passed through to every 'git pull'.
    mkdir upstream ghc

    # Create the upstream repository with one commit.
    cd upstream
    git init
    echo content > file
    git add file
    git commit -a -m initial
    cd ..

    # Clone it and make a local fix.
    git clone upstream ghc
    cd ghc
    echo fix1 > file
    git commit -a -m fix1

    # Upstream fixes the same thing differently.
    cd ../upstream
    echo fix2 > file
    git commit -a -m fix2
    cd ../ghc
    git pull --no-edit "$@"

    cd ../upstream
    echo feature1 > file
    git commit -a -m feature1
    cd ../ghc
    git pull --no-edit "$@"

    cd ../upstream
    echo feature2 > file
    git commit -a -m feature2
    cd ../ghc
    git pull --no-edit "$@"
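The effect of the '--rebase' variant can be checked by counting merge commits afterwards. The following is a sketch under assumptions, not the attached script itself: it runs in a temporary directory with a throwaway committer identity, and it relies on git dropping the local fix once the rebase has resolved it entirely in favour of upstream.

```shell
#!/bin/sh
# Sketch only: the same scenario, pulled with --rebase -X ours.
set -eu
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

git init -q upstream
(cd upstream && echo content > file && git add file && git commit -q -m initial)

git clone -q upstream ghc
(cd ghc && echo fix1 > file && git commit -q -a -m fix1)

for change in fix2 feature1 feature2; do
  (cd upstream && echo "$change" > file && git commit -q -a -m "$change")
  # During a rebase, "ours" is the upstream side being rebased onto.
  (cd ghc && git pull -q --rebase -X ours)
done

merges=$(cd ghc && git rev-list --merges --count HEAD)
echo "merge commits in ghc: $merges"
```

This only works while the local commits have not been shared, which is the objection raised elsewhere in the thread: rebasing rewrites commits that other people's GHC trees may already contain.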
Re: Proposal: better library management ideas (was: how to checkout proper submodules)
On Mon, Jun 10, 2013 at 1:32 PM, Roman Cheplyaka r...@ro-che.info wrote:
> * John Lato jwl...@gmail.com [2013-06-10 07:59:55+0800]
>> On Mon, Jun 10, 2013 at 1:32 AM, Roman Cheplyaka r...@ro-che.info wrote:
>>> What I'm trying to say here is that there's hope for a portable base. Maybe not in the form of split base - I don't know. But it's the direction we should be moving anyways. And usurping base by GHC is a move in the opposite direction.
>>
>> Maybe that's a good thing? The current situation doesn't really seem to be working. Keeping base separate negatively impacts workflow of GHC devs (as evidenced by these threads), just to support something that other compilers don't use anyway. Maybe it would be easier to fold base back into ghc and try again, perhaps after some code cleanup? Having base in ghc may provide more motivation to separate it properly.
>
> After base is in GHC, separating it again will be only harder, not easier. Or do you have a specific plan in mind?

It's more about motivation. It seems to me right now base is in a halfway state. People think that moving it further away from ghc is The Right Thing To Do, but nobody is feeling enough pain to be sufficiently motivated to do it. If we apply pain, then someone will be motivated to do it properly. And if nobody steps up, maybe having a platform-agnostic base isn't really very important.
Re: Proposal: better library management ideas (was: how to checkout proper submodules)
On Mon, 2013-06-10 at 11:45 +0100, Ian Lynagh wrote:
>> Side note: the fingerprint script *didn't even work* for almost a year after it was introduced; see commit 73ce2e70.
>
> Which implies that wanting to go back in time is rare, so making it easy should be given low weight when considering the options?

If 'git bisect' would work (out of the box) on the GHC repo, going back in time would certainly be a more common operation.

Nicolas
Re: Proposal: better library management ideas (was: how to checkout proper submodules)
On 06/10/2013 11:49 AM, Nicolas Trangez wrote:
> On Mon, 2013-06-10 at 11:45 +0100, Ian Lynagh wrote:
>>> Side note: the fingerprint script *didn't even work* for almost a year after it was introduced; see commit 73ce2e70.
>>
>> Which implies that wanting to go back in time is rare, so making it easy should be given low weight when considering the options?
>
> If 'git bisect' would work (out of the box) on the GHC repo, going back in time would certainly be a more common operation.

I agree. Going back in time is really, really hard with fingerprints because you have to get the fingerprint files somewhere, and they don't always exist. Also, it could be the case that people used the fingerprint files to bisect but didn't notice they weren't quite right because the fingerprints were close enough. OK for bug-finding, terrible for reproducible builds.

Many people on the list have been quite vocal about wanting to be able to bisect. *I* have wanted to be able to bisect many, many times, but I don't because it's such a pain.

I also want to be able to tell people how to build branches of ghc that I am working on, e.g., the simd and th-new branches. That means having to store a fingerprint file somewhere public and keep it in sync with my tree. I would much rather just tell them to check out the foo branch of ghc and be done with it.

Geoff
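For readers who have not used it, this is roughly what an out-of-the-box bisect looks like when every commit of a single repository describes a complete, buildable state. It is a sketch under assumptions: the repository, its eight commits, and the "bug" line are made up, and the test command (a `grep`) stands in for building GHC and running a failing test.

```shell
#!/bin/sh
# Sketch only: a toy repository in which commit 5 introduces a "bug".
set -eu
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

git init -q repo
cd repo
for n in 1 2 3 4 5 6 7 8; do
  if [ "$n" -eq 5 ]; then echo "bug" >> file; else echo "change $n" >> file; fi
  git add file
  git commit -q -m "commit $n"
  if [ "$n" -eq 1 ]; then good=$(git rev-parse HEAD); fi
  if [ "$n" -eq 5 ]; then culprit=$(git rev-parse HEAD); fi
done

# Bad is the current HEAD, good is the first commit.
git bisect start HEAD "$good" >/dev/null
# Exit status 0 means "good", 1 means "bad" for the commit under test.
git bisect run sh -c '! grep -q bug file' >/dev/null
first_bad=$(git rev-parse refs/bisect/bad)
git bisect reset >/dev/null 2>&1
echo "first bad commit: $first_bad"
```

With fingerprint files, each step of such a search would first need the matching library commits to be located and checked out by hand, which is the pain described above.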
Re: Proposal: better library management ideas (was: how to checkout proper submodules)
Hi Austin,

I apologize for not having read the full email yet (I'm in a hurry right now), but...

* Austin Seipp ase...@pobox.com [2013-06-09 00:23:22-0500]
> -- Let's just put base and testsuite inside the GHC repository directly. No submodules, no floating repos. Just put it directly inside and make a super commit, I guess. GHC becomes the de facto repository. And hey, why not nofib?
>
> I know, I know. People really want to split the maintenance burdens I guess, and ideologically the Haskell community is all about clean separation but, please? All of GHC HQ are the de facto maintainers of this stuff anyway. And as Jan mentioned, testsuite is really *so* crucial GHC should have it inline. The testsuite is perhaps the most important of all.
>
> There are other candidates for this treatment too, really. For example, why is template-haskell, ghc-prim, and hpc split out? GHC is the only thing that supports them. template-haskell is especially super-intrusive of an extension to support, and arguably hpc as well. integer-simple and integer-gmp follow the exact same story. Same with hoopl and dph. They're all ours. We own them. Just put them all inside GHC and be done with it. Having active fragmentation in the VCS is not necessary when there need be none. These packages de-facto ship with GHC and are very tied to it.

I'm a strong -1 on this. As one example, we have forks of base and ghc-prim for Haskell suite:

  https://github.com/haskell-suite/base
  https://github.com/haskell-suite/ghc-prim

which would be much more complicated if these were not independent repositories.

But more generally, I think there's still hope that the core packages will be made portable - I'm referring to Joachim Breitner's work on splitting the base.

Roman
Re: Proposal: better library management ideas (was: how to checkout proper submodules)
Hi Roman,

On Sun, Jun 9, 2013 at 1:44 AM, Roman Cheplyaka r...@ro-che.info wrote:
> I'm a strong -1 on this. As one example, we have forks of base and ghc-prim for Haskell suite: https://github.com/haskell-suite/base https://github.com/haskell-suite/ghc-prim which would be much more complicated if these were not independent repositories.

I hate being that person but, if the purpose of these forks is to work around specific bugs in HSE and/or fix problems with name resolution of GHC-specific terms, which sort of seems to be the case from the log, I don't think hacking base & co. is a long-term solution. It could potentially need infinite ongoing maintenance. I went down this road with LHC too.

And my gut feeling is that hacking ghc-prim out-of-band feels so amazingly wrong that I'm frankly not sure "I need to fork it" can actually warrant a huge amount of sympathy, to the point of keeping the repository separate for the one fork in existence (granted, ghc-prim is still pretty low traffic. But base is not.)

If you DO need help from GHC, is there really nothing we could easily and reasonably do to further assist you? I think asking for specific, principled solutions on our part is not out of the question here.

Are there any other forks of base people have for any particular reason? What reasons are those?

> But more generally, I think there's still hope that the core packages will be made portable - I'm referring to Joachim Breitner's work on splitting the base.

To be clear, packages and their numbers aren't *really* the problem. It's repositories. The numbers just make this slightly worse. Adding packages and adding repositories both add overhead. Adding repositories adds a significantly *larger* amount of complexity, all things considered. The only honest, legitimate way to reduce that complexity is to fold in repositories. But this means that we have to give something up, too.

If base were to get split into 5 packages or 8 packages, that's potentially fine by me, even welcomed. What I don't want is 5 more repositories that are all intimately tied to GHC's build and features, which a majority of GHC-specific work will be driven towards, and which we must then manage and synchronize heavily over time. That's just a massive amount of work.

Just looking at Joachim's fork of base on github, I already have some reservations about its current implementation. Like, base-float still exports GHC-specific namespaces. Every package still has a lot of GHC-specific code, as opposed to some isolated substrate that we provide and base-* packages interface with. So we're going to maintain all of that, it's the sad truth.

And if Joachim's patch were merged tomorrow somehow, I think that frankly so much of it would still be under GHC control, my argument would still stand. It would still be one repository. We would still own it. It makes base more granular, but this has almost nothing to do with our real problems.

Fixing all of that where we're not *actually* in control of it is a ton of work. The current patches just don't solve that I think. And this was last discussed in February? So what's the timeline here? Clearly we're not even done with the API discussion at all. So, 6 months? A year? Who knows? When it's done? I'm not sure most of us want to wait that long, especially considering the need to track down bugs and have accurate historical logs is a fairly frequent occurrence.

--
Regards,
Austin - PGP: 4096R/0x91384671
Re: Proposal: better library management ideas (was: how to checkout proper submodules)
On Sun, Jun 9, 2013 at 3:47 AM, Jan Stolarek jan.stola...@p.lodz.pl wrote:
> I admire your talent for writing emails ;-)

You can be honest and just call them what they are: horribly written novellas.

> As you wrote in your email I'm totally for including testsuite into GHC, because it is essentially part of GHC and it doesn't make sense to have a version of testsuite not corresponding to a version of GHC. As you pointed out the same argument can be used for other packages, but still there is one thing I don't like about that idea. What if an average haskeller wants to improve one of the libraries e.g. by adding comments or fixing a minor bug? If we have a super-repo that person would need to check out everything, which is discouraging.

This is a good point I hadn't considered, but it's less of a worry for some packages than others. For example, base, ghc-prim and template-haskell are so intimately tied into GHC that reinstalling them is either impossible or a bad idea. To change them, you must build your own GHC anyway (either from source, or HEAD.) And if you're using a Haskell Platform compiler, clearly you'd have no luck with the git repository anyway (due to their strong interdependence.)

But again, I'm totally OK with a lot of these other repositories being submodules. For example, process, unix, deepseq, filepath, directory. Those don't need to be folded in. Lots of them could have their own maintainers with separate upstreams. They're touched infrequently enough that traffic concerns aren't as big a deal. I just want the most high-traffic repositories dealt with, because in practice these are the *most* critical and the most interdependent. That in turn leads to the most problems.

> Another, separate issue here is that such a person needs to either register to ghc-devs or trac to send a patch. Using github would be helpful here, though I agree with Geoffrey about merge commits - we'd have to think of something here. Also, the fact that GHC HQ is maintaining all of the mentioned packages doesn't mean that they need to be stored in one repo, at least not in git (this would make more sense to me with SVN where you can checkout a subdirectory).

Not necessarily, the 'owners' of the packages are still the libraries committee. People can propose changes there as they have always done. It just so happens most of the 'libraries' maintained packages are de-facto maintained by GHC people.

You're right not all of them need to be folded in. But I think several of them should be, and these are the ones that hurt the most. (Plus, my radical proposal can't be considered totally, completely radical unless I propose something which would - of course - be shot down.)

> Still, I strongly agree that something should be done about current setup. I'm not a git guru so I cannot fully foresee what would be the consequences of turning everything into submodules, but I think that it cannot be worse than it is now, right?

For some submodules, it could certainly be worse. Please see Ian's link in the prior discussion concerning submodules - for high-traffic repositories, some of the concerns are disconcerting.
Re: Proposal: better library management ideas (was: how to checkout proper submodules)
On Sun, Jun 09, 2013 at 11:15:37AM -0500, Austin Seipp wrote:
>> I'm referring to Joachim Breitner's work on splitting the base.
>
> So what's the timeline here?

As soon as possible after 7.8 is branched.

Thanks
Ian
Re: Proposal: better library management ideas (was: how to checkout proper submodules)
> You can be honest and just call them what they are: horribly written novellas.

Actually, I was thinking that instead of posting to the list you might consider publishing your emails as papers on workshops or symposia ;)

> for high-traffic repositories, some of the concerns are disconcerting.

But the high-traffic repositories (base, testsuite) are already submodules, right?

For me the major problem of the current setup is that we cannot use one of the most important features of a VCS, i.e. going back in time. The only solutions to this problem that I am aware of are folding in, or turning into submodules, all libraries that GHC depends on.

I just had this moment of enlightenment that the question of including a repo as a submodule (or folding it into the GHC tree) is not a matter of traffic, but a matter of that library's implementation. If it uses GHC-specific API then it goes in, because it is tightly coupled. If it is implemented in standard Haskell then it can stay out, because changes to the compiler should not affect it. This is a pretty simple criterion to identify libraries that we should be concerned with (perhaps this is obvious, but it only occurred to me now). So a high-traffic repo that does not depend on non-standard features of GHC could still be kept as an in-tree repo, without affecting the ability to go back in time.

Jan
Re: Proposal: better library management ideas (was: how to checkout proper submodules)
Oh, and I've been made aware that git 1.7 and later can check out a subdirectory of a repo - this partially invalidates my previous argument. I'm saying partially, because it is a bit more difficult than dealing with a library that has its own repo, and it seems that some potential contributors might not be aware of this feature (like me this morning).

Janek
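The feature in question is presumably git's sparse checkout, which arrived in the 1.7 series; that reading is an assumption, as the message does not name it. The sketch below uses a made-up repository `full` with two hypothetical directories and restricts the working tree of a clone to one of them. Note that the clone still fetches the whole history; only the checked-out files are limited.

```shell
#!/bin/sh
# Sketch only: repository layout and file names are hypothetical.
set -eu
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

# A stand-in for a combined repository containing a library and a compiler.
git init -q full
(cd full && mkdir -p libraries/base compiler \
  && echo "module Prelude where" > libraries/base/Prelude.hs \
  && echo "module Main where" > compiler/Main.hs \
  && git add . && git commit -q -m initial)

git clone -q full partial
cd partial
# Enable sparse checkout and list the only directory we want on disk.
git config core.sparseCheckout true
mkdir -p .git/info
echo "libraries/base/" > .git/info/sparse-checkout
# Re-read HEAD into the index and update the working tree accordingly.
git read-tree -mu HEAD
cd "$work"
ls partial
```

This illustrates why the feature only partially addresses the concern: a contributor has to know about the configuration step, and the download is not any smaller.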
Re: Proposal: better library management ideas (was: how to checkout proper submodules)
* Austin Seipp ase...@pobox.com [2013-06-09 11:15:37-0500]
> Hi Roman,
>
> On Sun, Jun 9, 2013 at 1:44 AM, Roman Cheplyaka r...@ro-che.info wrote:
>> I'm a strong -1 on this. As one example, we have forks of base and ghc-prim for Haskell suite: https://github.com/haskell-suite/base https://github.com/haskell-suite/ghc-prim which would be much more complicated if these were not independent repositories.
>
> I hate being that person but, if the purpose of these forks is to work around specific bugs in HSE and/or fix problems with name resolution of GHC-specific terms, which sort of seems to be the case from the log, I don't think hacking base & co. is a long-term solution. It could potentially need infinite ongoing maintenance. I went down this road with LHC too.

It is only partly to work around bugs in HSE. The second part is to work around bugs and quirks in base itself. There are places where CPP wouldn't produce meaningful code unless __GLASGOW_HASKELL__ is defined, for example.

Even ignoring those obvious bugs for a minute, currently a large part of base is defined under the GHC.* hierarchy and isn't available unless __GLASGOW_HASKELL__ is defined.

But okay, let's suppose that at some point everything is fixed and we don't have to *fork* base. We still would like to use it! Should we fetch the whole GHC tree in order to get its development version?

> And my gut feeling is that hacking ghc-prim out-of-band feels so amazingly wrong that I'm frankly not sure "I need to fork it" can actually warrant a huge amount of sympathy, to the point of keeping the repository separate for the one fork in existence (granted, ghc-prim is still pretty low traffic. But base is not.)

It *is* wrong, but who is to blame that a big part of Prelude comes from there, including all logical operations and classes Eq and Ord?

> If you DO need help from GHC, is there really nothing we could easily and reasonably do to further assist you? I think asking for specific, principled solutions on our part is not out of the question here.

The best help would be to make and keep base relatively portable and not to introduce superfluous conditional compilation. (I realise that a lot of that has just accumulated historically, but now is a good time to get rid of it.)

It is a ton of work, and I'm very happy when I see people like Joachim trying to do something in that direction. Right now I'm only asking not to make their work even harder by moving base under the ghc repository.

>> But more generally, I think there's still hope that the core packages will be made portable - I'm referring to Joachim Breitner's work on splitting the base.
>
> To be clear, packages and their numbers aren't *really* the problem.

What I'm trying to say here is that there's hope for a portable base. Maybe not in the form of split base - I don't know. But it's the direction we should be moving anyways. And usurping base by GHC is a move in the opposite direction.

Roman
Re: Proposal: better library management ideas (was: how to checkout proper submodules)
On Mon, Jun 10, 2013 at 1:32 AM, Roman Cheplyaka r...@ro-che.info wrote:
> What I'm trying to say here is that there's hope for a portable base. Maybe not in the form of split base - I don't know. But it's the direction we should be moving anyways. And usurping base by GHC is a move in the opposite direction.

Maybe that's a good thing? The current situation doesn't really seem to be working. Keeping base separate negatively impacts workflow of GHC devs (as evidenced by these threads), just to support something that other compilers don't use anyway. Maybe it would be easier to fold base back into ghc and try again, perhaps after some code cleanup? Having base in ghc may provide more motivation to separate it properly.
Re: Proposal: better library management ideas (was: how to checkout proper submodules)
* John Lato jwl...@gmail.com [2013-06-10 07:59:55+0800]
> On Mon, Jun 10, 2013 at 1:32 AM, Roman Cheplyaka r...@ro-che.info wrote:
>> What I'm trying to say here is that there's hope for a portable base. Maybe not in the form of split base - I don't know. But it's the direction we should be moving anyways. And usurping base by GHC is a move in the opposite direction.
>
> Maybe that's a good thing? The current situation doesn't really seem to be working. Keeping base separate negatively impacts workflow of GHC devs (as evidenced by these threads), just to support something that other compilers don't use anyway. Maybe it would be easier to fold base back into ghc and try again, perhaps after some code cleanup? Having base in ghc may provide more motivation to separate it properly.

After base is in GHC, separating it again will be only harder, not easier. Or do you have a specific plan in mind?

Roman