Re: Advance notice that I'd like to make Cabal depend on parsec
On Thu, 2013-03-21 at 17:51 -0400, Isaac Dupree wrote:

On 03/18/2013 12:55 PM, Duncan Coutts wrote:

[...] it is not simply the outline parser for cabal-style files that we're talking about. We also need parsers/pretty printers for all the various little types that make up the info about packages, like versions, package names, package ids, version constraints, module names, licenses etc etc.

(ignorant musing that doesn't help the general difficulty of writing a Happy parser:) Can they not use multiple Happy parsers generated from the same Happy file? http://www.haskell.org/happy/doc/html/sec-multiple-parsers.html

Well the compositionality is there for the benefit of other packages, not just as an internal convenience for the Cabal lib. If we dropped that feature then yes we could use monolithic parsers for each of these types. Other packages do use the ability to build new parsers out of old however, in particular cabal-install does.

Duncan

___ cabal-devel mailing list cabal-devel@haskell.org http://www.haskell.org/mailman/listinfo/cabal-devel
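To illustrate what "building new parsers out of old" means here, the following is a minimal sketch using ReadP from base. It is not Cabal's actual code: the types `PackageName`, `Version` and `PackageId`, the parsers `packageNameP`, `versionP` and `packageIdP`, and the helper `parseAll` are all hypothetical stand-ins, and the grammar is a simplification. The point is only that a package id parser can be assembled from the two smaller parsers, which a monolithic generated parser per type would not offer to client packages.

```haskell
import Data.Char (isAlphaNum, isDigit)
import Data.List (intercalate)
import Text.ParserCombinators.ReadP

-- Hypothetical stand-ins for Cabal's types; not the real definitions.
newtype PackageName = PackageName String deriving (Eq, Show)
newtype Version = Version [Int] deriving (Eq, Show)
data PackageId = PackageId PackageName Version deriving (Eq, Show)

-- A version is a dot-separated list of natural numbers, e.g. "3.1.3".
versionP :: ReadP Version
versionP = Version <$> sepBy1 (read <$> munch1 isDigit) (char '.')

-- A name is hyphen-separated alphanumeric components, none of which
-- may consist of digits only (assumed simplification of the real rule).
packageNameP :: ReadP PackageName
packageNameP = PackageName . intercalate "-" <$> sepBy1 component (char '-')
  where
    component = do
      c <- munch1 isAlphaNum
      if all isDigit c then pfail else return c

-- The composite parser is built purely from the two smaller ones.
packageIdP :: ReadP PackageId
packageIdP = PackageId <$> packageNameP <* char '-' <*> versionP

-- Run a parser over the whole input; succeed only on a unique full parse.
parseAll :: ReadP a -> String -> Maybe a
parseAll p s = case [x | (x, "") <- readP_to_S (p <* eof) s] of
  [x] -> Just x
  _   -> Nothing
```

Loaded into GHCi, `parseAll packageIdP "parsec-3.1.3"` should give back a `PackageId` with name `parsec` and version `[3,1,3]`, while a client such as cabal-install could reuse `versionP` on its own for other syntax.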
Re: Advance notice that I'd like to make Cabal depend on parsec
On 03/18/2013 12:55 PM, Duncan Coutts wrote:

[...] it is not simply the outline parser for cabal-style files that we're talking about. We also need parsers/pretty printers for all the various little types that make up the info about packages, like versions, package names, package ids, version constraints, module names, licenses etc etc.

(ignorant musing that doesn't help the general difficulty of writing a Happy parser:) Can they not use multiple Happy parsers generated from the same Happy file? http://www.haskell.org/happy/doc/html/sec-multiple-parsers.html

-Isaac
Re: Advance notice that I'd like to make Cabal depend on parsec
On 18 March 2013 03:08, Simon Peyton-Jones simo...@microsoft.com wrote:

Is it essential, or even sensical, that the serialization format GHC needs for storing package info bear any relation to the human authored form? If not, the split out of the package types could be accomplished in a way where GHC uses simple show/read(P) style serialization for storage of package info, whereas cabal-lib would use a lovely parsec parser for humans. I'd like this approach.

Good idea -- esp if it makes the packaging story simpler. GHC already uses a binary format for interface files, so there’s no good reason to use a human-readable format for package data base stuff. For interface files you can read them with ghc --show-iface, and as Ian remarks something similar is already true for the package data base.

A bit of background here: the binary serialisation of packages is an optimisation only (though an important one), and is done independently of Cabal. To install a Cabal package you can put the package description file that Cabal generates into GHC's database directory, and it is picked up automatically. The binary cache can be updated separately with 'ghc-pkg recache'. It was done this way to make it easier for Linux distros that want to install packages by moving files into place and then running commands.

So I don't think you want Cabal to know about the binary serialization format, it's a GHC-only optimisation.

Cheers, Simon

Simon

From: ghc-devs-boun...@haskell.org [mailto:ghc-devs-boun...@haskell.org] On Behalf Of Mark Lentczner
Sent: 17 March 2013 16:57
To: dag.odenh...@gmail.com
Cc: Haskell Libraries; cabal-devel; Duncan Coutts; ghc-d...@haskell.org; Antoine Latter
Subject: Re: Advance notice that I'd like to make Cabal depend on parsec

This thread is raising all sorts of questions for me:

Is it essential, or even sensical, that the serialization format GHC needs for storing package info bear any relation to the human authored form? If not, the split out of the package types could be accomplished in a way where GHC uses simple show/read(P) style serialization for storage of package info, whereas cabal-lib would use a lovely parsec parser for humans. I'd like this approach.

The issue of putting yet one more HP package into GHC's core packages is increasing the exposure of the difficulty of the current GHC/HP relationship. See also threads in HP's mailing list for why we can't bump some packages in GHC's core set for the next HP release. The split arrangement is strange because we have two groups making up what is in the HP, but they have different processes and aims. The complex technical relationship between the moving parts only heightens the difficulty.

Perhaps the major cause is that because GHC is shipped as a library itself, it exposes all its package dependencies. And as it is a large, and growing, piece of software, the list only wants to grow. But I wonder how often GHC is used as a library itself? If not often, then perhaps GHC should be shipped as two parts: Just a compiler (plus the small number of packages that the compiler forces), and ghc-lib as an optional, even separate, package - perhaps one with even a traditional way of depending on other packages. In other words, users that wanted to incorporate the ghc-lib into their programs would depend, and download, and configure, and build, ghc-lib independent of the GHC binaries installed on their system. Perhaps then, GHC, the compiler, built from ghc-lib, would be bootstrapped not from the past compiler, but from the past HP.

Okay, perhaps that is all just fantasy. But, no other programming system operates the way we do. They all fall into one of two camps:

* The dominant implementation is maintained, built, and shipped along with a large collection of common packages. Examples: Python, Ruby, PHP, Java.
* The dominant implementation is shipped as a bare tool, and large common libraries are maintained and shipped independently. Examples: C++ (think g++ and boost), JavaScript (think browsers, and jQuery).

We are in the middle and, I think, experiencing growing pains because of it.

- Mark

On Sat, Mar 16, 2013 at 3:42 PM, dag.odenh...@gmail.com wrote:

I'd love to have a proper parser and source-location-aware AST for sake of editor/IDE tools, so +1 from me. If you don't end up doing this after all, I'd still like to see your parser in a separate package, although I understand if you don't feel like maintaining two parsers especially given the tedious process for verifying they work similarly. I guess it could still be useful in the same way we find haskell-src-exts useful despite some incompatibilities with GHC.

On Thu, Mar 14, 2013 at 3:53 PM, Duncan Coutts duncan.cou...@googlemail.com wrote:

Hi folks, I want to give you advance notice that I would like to make Cabal depend on parsec. The implication is that GHC
RE: Advance notice that I'd like to make Cabal depend on parsec
Is it essential, or even sensical, that the serialization format GHC needs for storing package info bear any relation to the human authored form? If not, the split out of the package types could be accomplished in a way where GHC uses simple show/read(P) style serialization for storage of package info, whereas cabal-lib would use a lovely parsec parser for humans. I'd like this approach.

Good idea -- esp if it makes the packaging story simpler. GHC already uses a binary format for interface files, so there’s no good reason to use a human-readable format for package data base stuff. For interface files you can read them with ghc --show-iface, and as Ian remarks something similar is already true for the package data base.

Simon

From: ghc-devs-boun...@haskell.org [mailto:ghc-devs-boun...@haskell.org] On Behalf Of Mark Lentczner
Sent: 17 March 2013 16:57
To: dag.odenh...@gmail.com
Cc: Haskell Libraries; cabal-devel; Duncan Coutts; ghc-d...@haskell.org; Antoine Latter
Subject: Re: Advance notice that I'd like to make Cabal depend on parsec

This thread is raising all sorts of questions for me:

Is it essential, or even sensical, that the serialization format GHC needs for storing package info bear any relation to the human authored form? If not, the split out of the package types could be accomplished in a way where GHC uses simple show/read(P) style serialization for storage of package info, whereas cabal-lib would use a lovely parsec parser for humans. I'd like this approach.

The issue of putting yet one more HP package into GHC's core packages is increasing the exposure of the difficulty of the current GHC/HP relationship. See also threads in HP's mailing list for why we can't bump some packages in GHC's core set for the next HP release. The split arrangement is strange because we have two groups making up what is in the HP, but they have different processes and aims. The complex technical relationship between the moving parts only heightens the difficulty.

Perhaps the major cause is that because GHC is shipped as a library itself, it exposes all its package dependencies. And as it is a large, and growing, piece of software, the list only wants to grow. But I wonder how often GHC is used as a library itself? If not often, then perhaps GHC should be shipped as two parts: Just a compiler (plus the small number of packages that the compiler forces), and ghc-lib as an optional, even separate, package - perhaps one with even a traditional way of depending on other packages. In other words, users that wanted to incorporate the ghc-lib into their programs would depend, and download, and configure, and build, ghc-lib independent of the GHC binaries installed on their system. Perhaps then, GHC, the compiler, built from ghc-lib, would be bootstrapped not from the past compiler, but from the past HP.

Okay, perhaps that is all just fantasy. But, no other programming system operates the way we do. They all fall into one of two camps:

* The dominant implementation is maintained, built, and shipped along with a large collection of common packages. Examples: Python, Ruby, PHP, Java.
* The dominant implementation is shipped as a bare tool, and large common libraries are maintained and shipped independently. Examples: C++ (think g++ and boost), JavaScript (think browsers, and jQuery).

We are in the middle and, I think, experiencing growing pains because of it.

- Mark

On Sat, Mar 16, 2013 at 3:42 PM, dag.odenh...@gmail.com wrote:

I'd love to have a proper parser and source-location-aware AST for sake of editor/IDE tools, so +1 from me. If you don't end up doing this after all, I'd still like to see your parser in a separate package, although I understand if you don't feel like maintaining two parsers especially given the tedious process for verifying they work similarly. I guess it could still be useful in the same way we find haskell-src-exts useful despite some incompatibilities with GHC.

On Thu, Mar 14, 2013 at 3:53 PM, Duncan Coutts duncan.cou...@googlemail.com wrote:

Hi folks, I want to give you advance notice that I would like to make Cabal depend on parsec. The implication is that GHC would therefore depend on parsec and thus it would become a core package, rather than just a HP package. So this would affect both GHC and the HP, though I hope not too much.

The rationale is that Cabal needs to parse things, like .cabal files and currently we do not have a decent parser in the core libraries. By decent I mean one that can produce error messages with source locations and that doesn't have unpredictable memory use. The only parser in the core libraries at the moment is Text.ParserCombinators.ReadP from the base package and that fails my decent criteria on both counts. Its idea of an error message is (), and on some largish .cabal
Re: Advance notice that I'd like to make Cabal depend on parsec
On Sun, 2013-03-17 at 21:04 +0100, Henning Thielemann wrote:

On Sun, 17 Mar 2013, Ian Lynagh wrote:

I think it would be feasible to stop GHC itself from using the human readable format. The only place I can think of it being used is in the package database, but we could use either Read/Show for that, or just exclusively use the binary format.

I already needed the human readable format in order to check what information a custom configure file generated.

Or more generally, the classic way to make the pkg info if you were not using the simple cabal build system, but were using configure + make (e.g. wrapped in the cabal make build-type) was to generate the input file using configure/m4 text substitutions. So that did/does need to be human readable.

As for the binary format, that's ghc's internal representation and not something I think we would want to standardise between Haskell implementations. Note that other Haskell impls use a package database that just uses these human readable files, with no hc-pkg style program.

Duncan
Re: Advance notice that I'd like to make Cabal depend on parsec
On Sun, 2013-03-17 at 19:27 +0000, Ian Lynagh wrote:

On Sun, Mar 17, 2013 at 09:57:25AM -0700, Mark Lentczner wrote:

Is it essential, or even sensical, that the serialization format GHC needs for storing package info bear any relation to the human authored form? If not, the split out of the package types could be accomplished in a way where GHC uses simple show/read(P) style serialization for storage of package info, whereas cabal-lib would use a lovely parsec parser for humans. I'd like this approach.

I think it would be feasible to stop GHC itself from using the human readable format. The only place I can think of it being used is in the package database, but we could use either Read/Show for that, or just exclusively use the binary format.

The change in functionality to enable that would be that the binary cache would always have to be up to date, so ghc would only ever have to read the cache and never have to read the human-readable package files.

Then you can have ghc-pkg depend on Cabal and use that for the human-readable bits, but since that's a program then it doesn't expose the Cabal lib dependency. Then ghc (and hence the ghc lib) would not depend on Cabal, but it would need a copy of the InstalledPackageInfo type and the other types that it uses.

Duncan
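The "simple show/read style serialization" suggested in this message can be sketched with derived instances alone. This is only an illustration under stated assumptions: the record below is a hypothetical, heavily cut-down stand-in for InstalledPackageInfo (its field names are invented), and `serialise`/`deserialise` are illustrative helpers, not functions from GHC or Cabal.

```haskell
import Text.Read (readMaybe)

-- Hypothetical, cut-down stand-in for InstalledPackageInfo.
-- The field names are invented for this sketch.
data InstalledPackageInfo = InstalledPackageInfo
  { pkgName        :: String
  , pkgVersion     :: [Int]
  , exposedModules :: [String]
  , dependsOn      :: [String]
  } deriving (Eq, Show, Read)

-- Machine-oriented storage: just the derived Show output.
serialise :: InstalledPackageInfo -> String
serialise = show

-- Reading it back needs nothing beyond base; Nothing signals a bad entry.
deserialise :: String -> Maybe InstalledPackageInfo
deserialise = readMaybe

-- An example entry to round-trip.
examplePkg :: InstalledPackageInfo
examplePkg = InstalledPackageInfo
  { pkgName        = "parsec"
  , pkgVersion     = [3, 1, 3]
  , exposedModules = ["Text.Parsec"]
  , dependsOn      = ["base", "mtl"]
  }
```

With this arrangement `deserialise (serialise examplePkg)` returns `Just examplePkg`, so a compiler would need only the type and base, while the human-readable parser and pretty printer could live in a separate tool such as ghc-pkg.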
Re: Advance notice that I'd like to make Cabal depend on parsec
On Mon, Mar 18, 2013 at 12:34:16PM +0000, Duncan Coutts wrote:

Then you can have ghc-pkg depend on Cabal and use that for the human-readable bits, but since that's a program then it doesn't expose the Cabal lib dependency. Then ghc (and hence the ghc lib) would not depend on Cabal, but it would need a copy of the InstalledPackageInfo type and the other types that it uses.

Right, exactly. But we don't want to have 2 copies of the types, so could we move them into a Cabal-datatypes package which can be shared by both Cabal and GHC please?

Thanks
Ian
Re: Advance notice that I'd like to make Cabal depend on parsec
On Mon, 2013-03-18 at 12:43 +0000, Ian Lynagh wrote:

On Mon, Mar 18, 2013 at 12:34:16PM +0000, Duncan Coutts wrote:

Then you can have ghc-pkg depend on Cabal and use that for the human-readable bits, but since that's a program then it doesn't expose the Cabal lib dependency. Then ghc (and hence the ghc lib) would not depend on Cabal, but it would need a copy of the InstalledPackageInfo type and the other types that it uses.

Right, exactly. But we don't want to have 2 copies of the types, so could we move them into a Cabal-datatypes package which can be shared by both Cabal and GHC please?

That would be a rather annoying split. The cabal-lib package itself is supposed to be just types + parsers + pretty printers (and related utils). It'd end up looking like:

cabal-types:
  types: InstalledPackageInfo, PackageName, Version, PackageId, InstalledPackageId, License

cabal-lib:
  parser for InstalledPackageInfo, PackageName, Version, PackageId, InstalledPackageId, License
  modules Distribution.*

cabal-build-simple:
  modules Distribution.Simple.*

It's not as if one could frame this as the aspects of the Cabal spec that compilers need, because the other impls will want the parser + printers as well.

Duncan
Re: Advance notice that I'd like to make Cabal depend on parsec
On Sun, Mar 17, 2013 at 09:57:25AM -0700, Mark Lentczner wrote:

Is it essential, or even sensical, that the serialization format GHC needs for storing package info bear any relation to the human authored form? If not, the split out of the package types could be accomplished in a way where GHC uses simple show/read(P) style serialization for storage of package info, whereas cabal-lib would use a lovely parsec parser for humans. I'd like this approach.

I think it would be feasible to stop GHC itself from using the human readable format. The only place I can think of it being used is in the package database, but we could use either Read/Show for that, or just exclusively use the binary format. It would be a little less user-friendly, but maybe worth it to remove the ghc library dependencies on most-of-Cabal, mtl and parsec.

Perhaps the major cause is that because GHC is shipped as a library itself, it exposes all its package dependencies.

Yes.

In other words, users that wanted to incorporate the ghc-lib into their programs would depend, and download, and configure, and build, ghc-lib independent of the GHC binaries

I think this would create more problems than it solves.

Okay, perhaps that is all just fantasy. But, no other programming system operates the way we do. They all fall into one of two camps:

- The dominant implementation is maintained, built, and shipped along with a large collection of common packages. Examples: Python, Ruby, PHP, Java.
- The dominant implementation is shipped as a bare tool, and large common libraries are maintained and shipped independently. Examples: C++ (think g++ and boost), JavaScript (think browsers, and jQuery).

We are in the middle and, I think, experiencing growing pains because of it.

I would say that we are doing the first option, in the form of the HP. It's just that the core gets frozen (i.e., ghc + libs gets released) earlier than the higher level libraries. I don't think that moving (back) to trying to freeze/release everything all at once would be an improvement.

You just need to remain strong, and keep saying no :-) (you're doing a great job, BTW!)

Thanks
Ian
Re: Advance notice that I'd like to make Cabal depend on parsec
On Sun, 17 Mar 2013, Ian Lynagh wrote:

I think it would be feasible to stop GHC itself from using the human readable format. The only place I can think of it being used is in the package database, but we could use either Read/Show for that, or just exclusively use the binary format.

I already needed the human readable format in order to check what information a custom configure file generated.
Re: Advance notice that I'd like to make Cabal depend on parsec
On Sun, Mar 17, 2013 at 09:04:58PM +0100, Henning Thielemann wrote:

On Sun, 17 Mar 2013, Ian Lynagh wrote:

I think it would be feasible to stop GHC itself from using the human readable format. The only place I can think of it being used is in the package database, but we could use either Read/Show for that, or just exclusively use the binary format.

I already needed the human readable format in order to check what information a custom configure file generated.

You can use ghc-pkg describe p for that. I don't think you should ever need the human readable format unless you need to alter the package database by hand.

--
Ian Lynagh, Haskell Consultant
Well-Typed LLP, http://www.well-typed.com/
Re: Advance notice that I'd like to make Cabal depend on parsec
On Sun, 17 Mar 2013, Ian Lynagh wrote:

On Sun, Mar 17, 2013 at 09:04:58PM +0100, Henning Thielemann wrote:

On Sun, 17 Mar 2013, Ian Lynagh wrote:

I think it would be feasible to stop GHC itself from using the human readable format. The only place I can think of it being used is in the package database, but we could use either Read/Show for that, or just exclusively use the binary format.

I already needed the human readable format in order to check what information a custom configure file generated.

You can use ghc-pkg describe p for that. I don't think you should ever need the human readable format unless you need to alter the package database by hand.

I think I also altered these package descriptions in order to check what the correct content should be.
Re: Advance notice that I'd like to make Cabal depend on parsec
On 03/15/2013 04:33 PM, Duncan Coutts wrote:

On Fri, 2013-03-15 at 05:19 +0100, Bardur Arantsson wrote:

On 03/14/2013 11:01 PM, Duncan Coutts wrote:

On Thu, 2013-03-14 at 11:15 -0700, Jason Dagit wrote:

On Thu, Mar 14, 2013 at 11:01 AM, Bardur Arantsson s...@scientician.net wrote:

On 03/14/2013 03:53 PM, Duncan Coutts wrote:

Hi folks, Why did I choose parsec? Practicality dictates that I can only use things in the core libraries, and the nearest thing we have to that is the parser lib that is in the HP. I tried to use happy but I could not construct a grammar/lexer combo to handle the layout (also, happy is not exactly known for its great error messages).

Just thinking out loud here, but what about ditching the current format for something that's simpler to parse/generate? Like, say, JSON?

Of course .cabal files are mainly written by humans, not machines, so we should optimise for them.

I thought we were mostly talking about InstalledPackageInfo. That could be in $EASILY_PARSEABLE_FORMAT without really breaking anything, right?

In principle it could be any format. But it is a format specified in the Cabal spec, and shared between all the Haskell implementations. Unless there's a compelling reason to change all that, I'd rather not.

Not having GHC core depend on parsec(*) sounds like a compelling reason to me...?

(*) And the potential ensuing Cabal hell when a package depends on anything in GHC.*.
Re: Advance notice that I'd like to make Cabal depend on parsec
I'd love to have a proper parser and source-location-aware AST for sake of editor/IDE tools, so +1 from me. If you don't end up doing this after all, I'd still like to see your parser in a separate package, although I understand if you don't feel like maintaining two parsers especially given the tedious process for verifying they work similarly. I guess it could still be useful in the same way we find haskell-src-exts useful despite some incompatibilities with GHC. On Thu, Mar 14, 2013 at 3:53 PM, Duncan Coutts duncan.cou...@googlemail.com wrote: Hi folks, I want to give you advance notice that I would like to make Cabal depend on parsec. The implication is that GHC would therefore depend on parsec and thus it would become a core package, rather than just a HP package. So this would affect both GHC and the HP, though I hope not too much. The rationale is that Cabal needs to parse things, like .cabal files and currently we do not have a decent parser in the core libraries. By decent I mean one that can produce error messages with source locations and that doesn't have unpredictable memory use. The only parser in the core libraries at the moment is Text.ParserCombinators.ReadP from the base package and that fails my decent criteria on both counts. Its idea of an error message is (), and on some largish .cabal files we take 100s of MB to parse (I realise that the ReadP in the base package is a cutdown version so I don't mean to malign all ReadP-style libs out there). Partly due to the performance problem, the terrible .cabal file error messages, and partly because Doaitse Swierstra keeps asking me if .cabal files have a grammar, I've been writing a new .cabal parser. It uses an alex lexer and a parsec parser. It's fast and the error messages are pretty good. I have reverse engineered a grammar that closely matches the existing parser and .cabal files in the wild, though I'm not sure Doaitse will be satisfied with the approach I've taken to handling layout. 
Why did I choose parsec? Practicality dictates that I can only use things in the core libraries, and the nearest thing we have to that is the parser lib that is in the HP. I tried to use happy but I could not construct a grammar/lexer combo to handle the layout (also, happy is not exactly known for its great error messages).

I've been doing regression testing against hackage and I'm satisfied that the new parser matches close enough. I've uncovered all kinds of horrors with .cabal files in the wild relying on quirks of the old parser. I've made adjustments for most of them but I will be breaking a half dozen old packages (most of those don't actually build correctly because though their syntax errors are not picked up by the parser, they do cause failure eventually).

So far I've just done the outline parser, not the individual field parsers. I'll be doing those next and then integrate. So this change is still a bit of a ways off, but I thought it'd be useful to warn people now.

Duncan
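The complaint about ReadP error reporting made in this announcement can be seen in a few lines. The sketch below uses only Text.ParserCombinators.ReadP from base; the names `versionP` and `runVersion` are hypothetical and chosen just for this illustration. When the input is malformed, all the caller gets back is an empty result list, with no message and no position.

```haskell
import Data.Char (isDigit)
import Text.ParserCombinators.ReadP

-- A dot-separated version number that must consume the whole input.
versionP :: ReadP [Int]
versionP = sepBy1 (read <$> munch1 isDigit) (char '.') <* eof

-- readP_to_S returns every successful parse together with the
-- unconsumed remainder of the input.
runVersion :: String -> [([Int], String)]
runVersion = readP_to_S versionP

-- runVersion "1.2.3" yields a single full parse of [1,2,3].
-- runVersion "1.2.x" yields [], with nothing to say that the
-- problem is the 'x' at the fifth character.
```

By contrast, parsec's `parse` returns `Either ParseError a`, and a `ParseError` carries a source position, which is the property the announcement asks of a "decent" parser.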
Re: Advance notice that I'd like to make Cabal depend on parsec
Duncan Coutts wrote:

Hi folks, I want to give you advance notice that I would like to make Cabal depend on parsec. The implication is that GHC would therefore depend on parsec and thus it would become a core package, rather than just a HP package. So this would affect both GHC and the HP, though I hope not too much.

[..]

Why did I choose parsec? Practicality dictates that I can only use things in the core libraries, and the nearest thing we have to that is the parser lib that is in the HP. I tried to use happy but I could not construct a grammar/lexer combo to handle the layout (also, happy is not exactly known for its great error messages).

Reuse is good, but the implication I'm worried about is this: Can I upgrade the parsec package installed on my system by doing a user install from hackage? Without an implementation of more flexible package installations (multiple versions installed simultaneously), any dependency of GHC has its version number essentially set into stone.

From this point of view, this proposal is not about making Cabal depend on parsec, but about fixing the canonical version of parsec.

Best regards,
Heinrich Apfelmus
-- http://apfelmus.nfshost.com
Re: Advance notice that I'd like to make Cabal depend on parsec
* Heinrich Apfelmus apfel...@quantentunnel.de [2013-03-15 10:38:37+0100]

Duncan Coutts wrote:

Hi folks, I want to give you advance notice that I would like to make Cabal depend on parsec. The implication is that GHC would therefore depend on parsec and thus it would become a core package, rather than just a HP package. So this would affect both GHC and the HP, though I hope not too much.

[..]

Why did I choose parsec? Practicality dictates that I can only use things in the core libraries, and the nearest thing we have to that is the parser lib that is in the HP. I tried to use happy but I could not construct a grammar/lexer combo to handle the layout (also, happy is not exactly known for its great error messages).

Reuse is good, but the implication I'm worried about is this: Can I upgrade the parsec package installed on my system by doing a user install from hackage? Without an implementation of more flexible package installations (multiple versions installed simultaneously), any dependency of GHC has its version number essentially set into stone.

We've had that working for a long time. Right now I even have multiple installed versions of Cabal-the-library itself.

It's not that Parsec would be automatically linked into each executable. It's just that ghc-the-program would have Parsec linked into it.

Roman
Re: Advance notice that I'd like to make Cabal depend on parsec
* Ivan Lazar Miljenovic ivan.miljeno...@gmail.com [2013-03-15 22:12:47+1100] On 15 March 2013 22:05, Roman Cheplyaka r...@ro-che.info wrote: * Heinrich Apfelmus apfel...@quantentunnel.de [2013-03-15 10:38:37+0100] Duncan Coutts wrote: Hi folks, I want to give you advance notice that I would like to make Cabal depend on parsec. The implication is that GHC would therefore depend on parsec and thus it would become a core package, rather than just a HP package. So this would affect both GHC and the HP, though I hope not too much. [..] Why did I choose parsec? Practicality dictates that I can only use things in the core libraries, and the nearest thing we have to that is the parser lib that is in the HP. I tried to use happy but I could not construct a grammar/lexer combo to handle the layout (also, happy is not exactly known for its great error messages). Reuse is good, but the implication I'm worried about is this: Can I upgrade the parsec package installed on my system by doing a user install from hackage ? Without an implementation of more flexible package installations (multiple versions installed simultaneously), any dependency of GHC has its version number essentially set into stone. We've had that working for a long time. Right now I even have multiple installed versions of Cabal-the-library itself. It's not that Parsec would be automatically linked into each executable. It's just that ghc-the-program would have Parsec linked into it. And ghc-the-library, which means that anything that uses ghc-as-a-library (and indeed even Cabal-as-a-library) no longer has a choice of which version of parsec they use. Right. But in this regard, GHC API and Cabal are no different from any other libraries that suffer from the same issue. (Except that it's hard to recompile GHC to use an alternative Parsec version.) And these are not exactly the most popular libraries either — so I doubt this change will have a large impact. 
Roman
Re: Advance notice that I'd like to make Cabal depend on parsec
On 14 Mar 2013, at 14:53, Duncan Coutts wrote:

Why did I choose parsec? Practicality dictates that I can only use things in the core libraries, and the nearest thing we have to that is the parser lib that is in the HP.

I fully agree that a real parser is needed for Cabal files. I implemented one myself, many years ago, using the polyparse library, and using a hand-written lexer. Feel free to reuse it (attached, together with a sample program) if you like, although I expect it has bit-rotted a little over time.

Regards,
Malcolm

cabal-parse2.hs Description: Binary data
CabalParse2.hs Description: Binary data
Re: Advance notice that I'd like to make Cabal depend on parsec
On Fri, 2013-03-15 at 10:10 +0100, Herbert Valerio Riedel wrote:

Duncan Coutts duncan.cou...@googlemail.com writes:

[...] I've been doing regression testing against hackage and I'm satisfied that the new parser matches close enough. I've uncovered all kinds of horrors with .cabal files in the wild relying on quirks of the old parser. I've made adjustments for most of them but I will be breaking a half dozen old packages (most of those don't actually build correctly because though their syntax errors are not picked up by the parser, they do cause failure eventually).

btw, why not just keep the current parser as a legacy parser in the code, for older .cabal files (or as a fallback parser, in case the new stricter parsec-parser fails)?

I'm satisfied at this point that the number of packages affected by the change is so low that it's not worth the extra maintenance. As I mentioned, most of the ones that break in the new parser are actually already broken in the sense that they will not build (because of mistakes in the .cabal file that just happen not to be caught by the old parser). So the amount of real breakage is trivial.

Also, with the new hackage server we will be able to fix .cabal files post-release so if we do care about those few older packages we can actually fix them.

Duncan
Re: Advance notice that I'd like to make Cabal depend on parsec
On Fri, 2013-03-15 at 12:37 +0800, Conrad Parker wrote:
On 14 March 2013 22:53, Duncan Coutts duncan.cou...@googlemail.com wrote:

I've been doing regression testing against hackage and I'm satisfied that the new parser matches closely enough. I've uncovered all kinds of horrors with .cabal files in the wild relying on quirks of the old parser. I've made adjustments for most of them but I will be breaking a half dozen old packages

When you say you've made adjustments for dodgy .cabal files in the wild, do you mean that you'll send those maintainers patches that make their cabal files less dodgy, or do you mean you've added hacks to your parser to reproduce the quirky behaviour?

The latter, but the egregiousness of the hacks is actually not too bad in the end. I don't find it revolting. For the worst examples I didn't make adjustments and those ones will break. I think I've made a reasonable judgement about where to draw the line between the two.

I can look into generating warnings in those cases (which is probably better than me emailing them).

Duncan
Re: Advance notice that I'd like to make Cabal depend on parsec
On Fri, 2013-03-15 at 05:19 +0100, Bardur Arantsson wrote:
On 03/14/2013 11:01 PM, Duncan Coutts wrote:
On Thu, 2013-03-14 at 11:15 -0700, Jason Dagit wrote:
On Thu, Mar 14, 2013 at 11:01 AM, Bardur Arantsson s...@scientician.net wrote:
On 03/14/2013 03:53 PM, Duncan Coutts wrote:

Hi folks, Why did I choose parsec? Practicality dictates that I can only use things in the core libraries, and the nearest thing we have to that is the parser lib that is in the HP. I tried to use happy but I could not construct a grammar/lexer combo to handle the layout (also, happy is not exactly known for its great error messages).

Just thinking out loud here, but what about ditching the current format for something that's simpler to parse/generate? Like, say, JSON?

Of course .cabal files are mainly written by humans, not machines, so we should optimise for them.

I thought we were mostly talking about InstalledPackageInfo. That could be in $EASILY_PARSEABLE_FORMAT without really breaking anything, right?

In principle it could be any format. But it is a format specified in the Cabal spec, and shared between all the Haskell implementations. Unless there's a compelling reason to change all that, I'd rather not.

Another option if GHC really also needs to parse .cabal files:

That's ok, it doesn't. GHC uses Cabal when building ghc, but at runtime it's just using the InstalledPackageInfo type, parser (and perhaps some index utils).

-- Duncan Coutts, Haskell Consultant
Well-Typed LLP, http://www.well-typed.com/
Advance notice that I'd like to make Cabal depend on parsec
Hi folks,

I want to give you advance notice that I would like to make Cabal depend on parsec. The implication is that GHC would therefore depend on parsec and thus it would become a core package, rather than just a HP package. So this would affect both GHC and the HP, though I hope not too much.

The rationale is that Cabal needs to parse things, like .cabal files, and currently we do not have a decent parser in the core libraries. By decent I mean one that can produce error messages with source locations and that doesn't have unpredictable memory use. The only parser in the core libraries at the moment is Text.ParserCombinators.ReadP from the base package and that fails my decent criteria on both counts. Its idea of an error message is (), and on some largish .cabal files we take 100s of MB to parse (I realise that the ReadP in the base package is a cutdown version so I don't mean to malign all ReadP-style libs out there).

Partly due to the performance problem and the terrible .cabal file error messages, and partly because Doaitse Swierstra keeps asking me if .cabal files have a grammar, I've been writing a new .cabal parser. It uses an alex lexer and a parsec parser. It's fast and the error messages are pretty good. I have reverse engineered a grammar that closely matches the existing parser and .cabal files in the wild, though I'm not sure Doaitse will be satisfied with the approach I've taken to handling layout.

Why did I choose parsec? Practicality dictates that I can only use things in the core libraries, and the nearest thing we have to that is the parser lib that is in the HP. I tried to use happy but I could not construct a grammar/lexer combo to handle the layout (also, happy is not exactly known for its great error messages).

I've been doing regression testing against hackage and I'm satisfied that the new parser matches closely enough. I've uncovered all kinds of horrors with .cabal files in the wild relying on quirks of the old parser. I've made adjustments for most of them but I will be breaking a half dozen old packages (most of those don't actually build correctly because, though their syntax errors are not picked up by the parser, they do cause failures eventually).

So far I've just done the outline parser, not the individual field parsers. I'll be doing those next and then integrate. So this change is still a bit of a ways off, but I thought it'd be useful to warn people now.

Duncan
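Editorial illustration of the two "decent parser" criteria above: a minimal sketch, not the Cabal code, that parses a dotted version number once with ReadP and once with parsec. The names `versionReadP`, `versionParsec`, `runReadP`, `runParsec` and `failurePos`, and the source name "example.cabal", are made up for this example. It assumes the parsec package is available alongside base.

```haskell
import Data.Char (isDigit)
import Text.ParserCombinators.ReadP (ReadP)
import qualified Text.ParserCombinators.ReadP as R
import Text.Parsec (ParseError, errorPos, sourceColumn, sourceLine)
import qualified Text.Parsec as P
import Text.Parsec.String (Parser)

-- A dotted version number such as "1.16.0", parsed with ReadP from base.
versionReadP :: ReadP [Int]
versionReadP = R.sepBy1 (read <$> R.munch1 isDigit) (R.char '.')

-- Run the ReadP parser, keeping only parses that consume all input.
-- On failure the result list is simply empty: no position, no message.
runReadP :: String -> Maybe [Int]
runReadP s =
  case [v | (v, "") <- R.readP_to_S (versionReadP <* R.eof) s] of
    (v : _) -> Just v
    []      -> Nothing

-- The same grammar written with parsec.
versionParsec :: Parser [Int]
versionParsec = P.sepBy1 (read <$> P.many1 P.digit) (P.char '.') <* P.eof

-- Run the parsec parser. A failure is a ParseError value that carries a
-- source position and a description of what was expected.
runParsec :: String -> Either ParseError [Int]
runParsec = P.parse versionParsec "example.cabal"

-- Extract (line, column) from a parsec failure, if there is one.
failurePos :: String -> Maybe (Int, Int)
failurePos s =
  case runParsec s of
    Left err -> Just (sourceLine (errorPos err), sourceColumn (errorPos err))
    Right _  -> Nothing
```

In GHCi, `runReadP "1.16.x"` gives `Nothing` with no further detail, while `runParsec "1.16.x"` gives a `Left` whose `show` output names the file, the position and the expected input. The memory-use criterion is not demonstrated here.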
Re: Advance notice that I'd like to make Cabal depend on parsec
On Thu, 2013-03-14 at 14:53 +0000, Duncan Coutts wrote:

Hi folks, I want to give you advance notice that I would like to make Cabal depend on parsec. The implication is that GHC would therefore depend on parsec and thus it would become a core package, rather than just a HP package. So this would affect both GHC and the HP, though I hope not too much.

It's already been pointed out to me that this also implies the following dependencies: text, deepseq, mtl, transformers.

deepseq is a core package already I think, though ghc doesn't actually depend on it currently.

I should also say that I want to make Cabal depend on bytestring and text too.

-- Duncan Coutts, Haskell Consultant
Well-Typed LLP, http://www.well-typed.com/
Re: Advance notice that I'd like to make Cabal depend on parsec
On Thu, Mar 14, 2013 at 3:53 PM, Duncan Coutts duncan.cou...@googlemail.com wrote:

Hi folks, I want to give you advance notice that I would like to make Cabal depend on parsec. The implication is that GHC would therefore depend on parsec and thus it would become a core package, rather than just a HP package. So this would affect both GHC and the HP, though I hope not too much.

+1 from me, although the amount of potential knock-on work might be discouraging. The current cabal-install bootstrap process (which is currently pretty easy and is necessary at times) will get a bunch more deps as a result of this change, no?

-- Gregory Collins g...@gregorycollins.net
Re: Advance notice that I'd like to make Cabal depend on parsec
On Thu, 2013-03-14 at 16:06 +0100, Gregory Collins wrote:
On Thu, Mar 14, 2013 at 3:53 PM, Duncan Coutts duncan.cou...@googlemail.com wrote:

Hi folks, I want to give you advance notice that I would like to make Cabal depend on parsec. The implication is that GHC would therefore depend on parsec and thus it would become a core package, rather than just a HP package. So this would affect both GHC and the HP, though I hope not too much.

+1 from me, although the amount of potential knock-on work might be discouraging. The current cabal-install bootstrap process (which is currently pretty easy and is necessary at times) will get a bunch more deps as a result of this change, no?

Yes it will, but given that we do have a script it's not too bad I think. And overall I think it's worth it to have the better error messages, performance and memory use. Do you have any idea how slow it is to parse all the .cabal files on hackage, and how much memory that takes? You'd be horrified :-)

Duncan
Re: Advance notice that I'd like to make Cabal depend on parsec
This GHC dependency on Cabal is putting a rather troubling constraint in Cabal's evolution, which in my opinion is a serious problem. When I first took a look at the dependencies between GHC and Cabal I found it a bit strange that GHC would depend on Cabal as I would expect GHC to be as low in the dependency tree as possible to avoid exactly these kinds of problems. These GHC dependencies on Cabal are in fact small (see http://hackage.haskell.org/trac/ghc/attachment/ticket/7740/ghc-2.png for a summary) and with a little bit of refactoring it would be possible to split these dependencies into a very small shared package with minimal or no further dependencies. This would liberate Cabal to make the necessary refactoring. IMHO, the addition of these new dependencies to Cabal should go together with splitting the GHC-Cabal shared dependencies into a separate package so that there would be no additional coordination needed from then on between these two development efforts (except when dealing with this new package). On Thu, Mar 14, 2013 at 12:12 PM, Duncan Coutts duncan.cou...@googlemail.com wrote: On Thu, 2013-03-14 at 16:06 +0100, Gregory Collins wrote: On Thu, Mar 14, 2013 at 3:53 PM, Duncan Coutts duncan.cou...@googlemail.com wrote: Hi folks, I want to give you advance notice that I would like to make Cabal depend on parsec. The implication is that GHC would therefore depend on parsec and thus it would become a core package, rather than just a HP package. So this would affect both GHC and the HP, though I hope not too much. +1 from me, although the amount of potential knock-on work might be discouraging. The current cabal-install bootstrap process (which is currently pretty easy and is necessary at times) will get a bunch more deps as a result of this change, no? Yes it will, but given that we do have a script it's not too bad I think. And overall I think its worth it to have the better error messages, performance and memory use. 
Do you have any idea how slow it is to parse all the .cabal files on hackage, and how much memory that takes? You'd be horrified :-) Duncan
Re: Advance notice that I'd like to make Cabal depend on parsec
On Thu, 2013-03-14 at 12:22 -0300, Administrator wrote: This GHC dependency on Cabal is putting a rather troubling constraint in Cabal's evolution, which in my opinion is a serious problem. When I first took a look at the dependencies between GHC and Cabal I found it a bit strange that GHC would depend on Cabal as I would expect GHC to be as low in the dependency tree as possible to avoid exactly these kinds of problems. The problem is that a compiler is a rather sophisticated application and so though you'd like it to have minimal deps, it needs to do so much stuff that it ends up needing lots of deps to support its features. Things would be easier if that were not the case, and it's made harder by the fact that ghc is not just a program, but it's exposed as a library, which exposes all of its dependencies. These GHC dependencies on Cabal are in fact small (see http://hackage.haskell.org/trac/ghc/attachment/ticket/7740/ghc-2.png for a summary) and with a little bit of refactoring it would be possible to split these dependencies into a very small shared package with minimal or no further dependencies. This would liberate Cabal to make the necessary refactoring. Except that the bits of Cabal that ghc needs are exactly the bits that will now need parsec, text etc. The shared part would be the part that defines the InstalledPackageInfo and the parser for that. Also, though the ghc library has only relatively small dependencies on Cabal, the ghc build process uses Cabal extensively, and currently the system is that libraries that ghc needs to build get included as core libraries and shipped with ghc. That itself could change but it's also more work. IMHO, the addition of these new dependencies to Cabal should go together with splitting the GHC-Cabal shared dependencies into a separate package so that there would be no additional coordination needed from then on between these two development efforts (except when dealing with this new package). 
So I would consider this if I thought it'd make a difference. In particular at some point we'll want to split the Cabal lib into the bit that just defines types and parsers etc, and the part that is a build system. But even that wouldn't save us any dependencies in this situation. Duncan
Re: Advance notice that I'd like to make Cabal depend on parsec
On Thu, Mar 14, 2013 at 7:53 AM, Duncan Coutts duncan.cou...@googlemail.com wrote: Hi folks, I want to give you advance notice that I would like to make Cabal depend on parsec. The implication is that GHC would therefore depend on parsec and thus it would become a core package, rather than just a HP package. So this would affect both GHC and the HP, though I hope not too much. The rationale is that Cabal needs to parse things, like .cabal files and currently we do not have a decent parser in the core libraries. By decent I mean one that can produce error messages with source locations and that doesn't have unpredictable memory use. The only parser in the core libraries at the moment is Text.ParserCombinators.ReadP from the base package and that fails my decent criteria on both counts. Its idea of an error message is (), and on some largish .cabal files we take 100s of MB to parse (I realise that the ReadP in the base package is a cutdown version so I don't mean to malign all ReadP-style libs out there). Partly due to the performance problem, the terrible .cabal file error messages, and partly because Doaitse Swierstra keeps asking me if .cabal files have a grammar, I've been writing a new .cabal parser. It uses an alex lexer and a parsec parser. It's fast and the error messages are pretty good. I have reverse engineered a grammar that closely matches the existing parser and .cabal files in the wild, though I'm not sure Doaitse will be satisfied with the approach I've taken to handling layout. Why did I choose parsec? Practicality dictates that I can only use things in the core libraries, and the nearest thing we have to that is the parser lib that is in the HP. I tried to use happy but I could not construct a grammar/lexer combo to handle the layout (also, happy is not exactly known for its great error messages). Failed attempt aside for a moment, I think you should reconsider happy. Can you learn how to do layout from reading the GHC source? 
The happy documentation that explains how to attach a monad (you could use it to communicate between alex and happy for layout info) is a bit misleading but I have examples I can share with you. I haven't specifically tackled the layout problem but I could try to make a parser if it would help.

One major benefit of using happy is that the productions of the grammar can be analyzed for shift/reduce and reduce/reduce conflicts. The equivalent analysis doesn't appear to be possible in parsec. In theory, applicative parsers should allow for this but my understanding is that parsec does not have this feature for its applicative subset.

Other benefits are: a) GHC can certainly use parsers generated by it, b) the generated code uses common dependencies, c) it's fast, d) it's expressive.

What is it about happy parser errors that you don't like? Do you know examples where parsec does a better job?

I have an alex + happy parser for a tiny functional language that I can share with you if you'd like to give it another go. It doesn't support layout at the moment, but I think I could add that.

Jason
RE: Advance notice that I'd like to make Cabal depend on parsec
Yes I think that'd be a great plan. It's bizarre that GHC depends on *all* of Cabal, but only uses a tiny part of it (more or less the Package data type I think). Simon | -Original Message- | From: cabal-devel-boun...@haskell.org [mailto:cabal-devel-boun...@haskell.org] | On Behalf Of Administrator | Sent: 14 March 2013 15:23 | To: Duncan Coutts | Cc: Lentczner; cabal-devel; Haskell Libraries; ghc-d...@haskell.org | Subject: Re: Advance notice that I'd like to make Cabal depend on parsec | | This GHC dependency on Cabal is putting a rather troubling constraint | in Cabal's evolution, which in my opinion is a serious problem. When I | first took a look at the dependencies between GHC and Cabal I found it | a bit strange that GHC would depend on Cabal as I would expect GHC to | be as low in the dependency tree as possible to avoid exactly these | kinds of problems. | | These GHC dependencies on Cabal are in fact small (see | http://hackage.haskell.org/trac/ghc/attachment/ticket/7740/ghc-2.png | for a summary) and with a little bit of refactoring it would be | possible to split these dependencies into a very small shared package | with minimal or no further dependencies. This would liberate Cabal to | make the necessary refactoring. | | IMHO, the addition of these new dependencies to Cabal should go | together with splitting the GHC-Cabal shared dependencies into a | separate package so that there would be no additional coordination | needed from then on between these two development efforts (except when | dealing with this new package). | | | On Thu, Mar 14, 2013 at 12:12 PM, Duncan Coutts | duncan.cou...@googlemail.com wrote: | On Thu, 2013-03-14 at 16:06 +0100, Gregory Collins wrote: | On Thu, Mar 14, 2013 at 3:53 PM, Duncan Coutts | duncan.cou...@googlemail.com |wrote: | |Hi folks, | |I want to give you advance notice that I would like to make Cabal depend |on parsec. 
The implication is that GHC would therefore depend on parsec |and thus it would become a core package, rather than just a HP package. |So this would affect both GHC and the HP, though I hope not too much. | | | +1 from me, although the amount of potential knock-on work might be | discouraging. The current cabal-install bootstrap process (which is | currently pretty easy and is necessary at times) will get a bunch more deps | as a result of this change, no? | | Yes it will, but given that we do have a script it's not too bad I | think. And overall I think its worth it to have the better error | messages, performance and memory use. Do you have any idea how slow it | is to parse all the .cabal files on hackage, and how much memory that | takes? You'd be horrified :-) | | Duncan
Re: Advance notice that I'd like to make Cabal depend on parsec
On Thu, 2013-03-14 at 09:39 -0700, Jason Dagit wrote:

Why did I choose parsec? Practicality dictates that I can only use things in the core libraries, and the nearest thing we have to that is the parser lib that is in the HP. I tried to use happy but I could not construct a grammar/lexer combo to handle the layout (also, happy is not exactly known for its great error messages).

Failed attempt aside for a moment, I think you should reconsider happy. Can you learn how to do layout from reading the GHC source?

Yes I looked at it, though Haskell's layout is a bit different.

The happy documentation that explains how to attach a monad (you could use it to communicate between alex and happy for layout info) is a bit misleading but I have examples I can share with you.

Yes, that's what I was doing. I've used happy with monadic lexers with feedback between the lexer and parser before, e.g. when I wrote the C parser now used in language-c.

I haven't specifically tackled the layout problem but I could try to make a parser if it would help. One major benefit of using happy is that the productions of the grammar can be analyzed for shift/reduce and reduce/reduce conflicts.

Right, I know and that's great. For example there's no way I could have extended the C89 grammar I started with to cover C99 and GNU C extensions without the aid of that analysis. In this case I could not for the life of me construct a grammar that didn't have conflicts. It's plausible that, now that I have worked out a grammar using parsec, I could have another go with happy and make it work, though I'd have to do the layout rather differently from how I do it with parsec. I was so pleased to finally have something work, I didn't feel like going back and trying it with happy again. I'd be happy to show you the code I've got with parsec and you can have a go with happy.

The equivalent analysis doesn't appear to be possible in parsec. In theory, applicative parsers should allow for this but my understanding is that parsec does not have this feature for its applicative subset.

Right, it doesn't.

Other benefits are: a) GHC can certainly use parsers generated by it, b) the generated code uses common dependencies, c) it's fast, d) it's expressive.

Yes, I started with happy for all those reasons. The speed isn't a problem here. I'm using a fast lexer using alex and profiling indicates that still almost all the time is spent in the lexer and very little in the parser. (And that's after I submitted a patch to alex which gets us a 30% perf improvement.)

About dependencies. So if we got it working with happy, there is still the issue that we need to parse the individual fields. The way .cabal files (and other files like ghc-pkg input files) work is that we parse the outline and then use individual parsers on the fields. For the latter we use a type class with a parser and pretty printer. That approach using a type class more or less requires that we use a parser combinator approach, rather than a monolithic happy style parser. And it's actually the field parsers that are a large part of the problem: they give us no error messages and their performance is atrocious (that's where we get the massive memory blowups). I think happy just isn't suitable there, so I'd want to use parsec (or any other decent combinator lib) for that part anyway.

What is it about happy parser errors that you don't like? Do you know examples where parsec does a better job?

Happy doesn't really give parser errors at all as such. It tells you where it failed and you can poke at the token stream and do what you like. It doesn't tell you what production you're in, what set of tokens it was expecting, nothing. Parsec tells us what tokens it was expecting and it tells us what production it was in and it has code to take that info and generate reasonable error messages from it (which I've extended to include the line in question and a visual position indicator). The reason ghc's parser error messages are so bad is exactly because happy doesn't really give us anything to work with. See frown for an example of how we can do better, while still using an LALR(1) approach.

Duncan
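Editorial illustration of the "type class with a parser and pretty printer" point in the message above: a minimal sketch of why per-type field parsers compose naturally with a combinator library. This is not Cabal's actual class. The class `Field`, its methods, the simplified types and the helpers `commaList` and `parseFieldValue` are all hypothetical names chosen for this example, and package names are restricted to letters to keep it short.

```haskell
import Data.List (intercalate)
import Text.Parsec (ParseError)
import qualified Text.Parsec as P
import Text.Parsec.String (Parser)

-- Hypothetical class: every field type brings its own parser and printer.
class Field a where
  parseField  :: Parser a
  prettyField :: a -> String

newtype PackageName = PackageName String deriving (Eq, Show)
newtype Version     = Version [Int]      deriving (Eq, Show)
data PackageId      = PackageId PackageName Version deriving (Eq, Show)

instance Field PackageName where
  -- Simplification for the sketch: a name is one or more letters.
  parseField = PackageName <$> P.many1 P.letter
  prettyField (PackageName n) = n

instance Field Version where
  parseField = Version <$> P.sepBy1 (read <$> P.many1 P.digit) (P.char '.')
  prettyField (Version ns) = intercalate "." (map show ns)

instance Field PackageId where
  -- A new parser built out of the two existing ones: name '-' version.
  parseField = PackageId <$> parseField <* P.char '-' <*> parseField
  prettyField (PackageId n v) = prettyField n ++ "-" ++ prettyField v

-- A comma-separated list of any field type, reusing that type's parser.
commaList :: Field a => Parser [a]
commaList = P.sepBy1 parseField (P.char ',' *> P.spaces)

-- Parse a complete field value, requiring all input to be consumed.
parseFieldValue :: Field a => String -> Either ParseError a
parseFieldValue = P.parse (parseField <* P.eof) "<field>"
```

For instance, `parseFieldValue "parsec-3.1.3" :: Either ParseError PackageId` reuses the name and version parsers unchanged, and other packages could define further instances or combinators such as `commaList` on top of them. A monolithic generated parser would need a separate entry point for each such combination.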
Re: Advance notice that I'd like to make Cabal depend on parsec
On Thu, 2013-03-14 at 16:44 +0000, Simon Peyton-Jones wrote:

Yes I think that'd be a great plan. It's bizarre that GHC depends on *all* of Cabal, but only uses a tiny part of it (more or less the Package data type I think).

The sensible way to split it (I think) would be like this:

cabal-lib:
    Distribution.*
    -- containing definitions of types and parsers and pretty printers
    -- including the InstalledPackageInfo

cabal-build-simple
    Distribution.Simple.*
    -- the build system for Simple packages

cabal
    -- the program, what is currently called cabal-install

And then the ghc package would only depend on the cabal-lib package. But it's that package that is going to use bytestring, text, parsec etc, for its type definitions and parser.

The InstalledPackageInfo and its parser is what ghc and ghc-pkg primarily use (though there's the opportunity to share code for handling package indexes) and that type and that parser are also going to end up using text and parsec etc.

It'd be possible to split things out further and have InstalledPackageInfo and the types it uses and a special parser just for that with fewer dependencies, but I'm not sure that's really worth it and it would duplicate things (the types and/or parsers shared by InstalledPackageInfo and the source package description).

So all in all, the split I suggest above makes sense for its own reasons but it wouldn't help ghc here, and a further split just to help ghc would be rather annoying.

Duncan

| -Original Message- | From: cabal-devel-boun...@haskell.org [mailto:cabal-devel-boun...@haskell.org] | On Behalf Of Administrator | Sent: 14 March 2013 15:23 | To: Duncan Coutts | Cc: Lentczner; cabal-devel; Haskell Libraries; ghc-d...@haskell.org | Subject: Re: Advance notice that I'd like to make Cabal depend on parsec | | This GHC dependency on Cabal is putting a rather troubling constraint | in Cabal's evolution, which in my opinion is a serious problem.
When I | first took a look at the dependencies between GHC and Cabal I found it | a bit strange that GHC would depend on Cabal as I would expect GHC to | be as low in the dependency tree as possible to avoid exactly these | kinds of problems. | | These GHC dependencies on Cabal are in fact small (see | http://hackage.haskell.org/trac/ghc/attachment/ticket/7740/ghc-2.png | for a summary) and with a little bit of refactoring it would be | possible to split these dependencies into a very small shared package | with minimal or no further dependencies. This would liberate Cabal to | make the necessary refactoring. | | IMHO, the addition of these new dependencies to Cabal should go | together with splitting the GHC-Cabal shared dependencies into a | separate package so that there would be no additional coordination | needed from then on between these two development efforts (except when | dealing with this new package). | | | On Thu, Mar 14, 2013 at 12:12 PM, Duncan Coutts | duncan.cou...@googlemail.com wrote: | On Thu, 2013-03-14 at 16:06 +0100, Gregory Collins wrote: | On Thu, Mar 14, 2013 at 3:53 PM, Duncan Coutts | duncan.cou...@googlemail.com |wrote: | |Hi folks, | |I want to give you advance notice that I would like to make Cabal depend |on parsec. The implication is that GHC would therefore depend on parsec |and thus it would become a core package, rather than just a HP package. |So this would affect both GHC and the HP, though I hope not too much. | | | +1 from me, although the amount of potential knock-on work might be | discouraging. The current cabal-install bootstrap process (which is | currently pretty easy and is necessary at times) will get a bunch more deps | as a result of this change, no? | | Yes it will, but given that we do have a script it's not too bad I | think. And overall I think its worth it to have the better error | messages, performance and memory use. 
Do you have any idea how slow it | is to parse all the .cabal files on hackage, and how much memory that | takes? You'd be horrified :-) | | Duncan
Re: Advance notice that I'd like to make Cabal depend on parsec
On 03/14/2013 03:53 PM, Duncan Coutts wrote:

Hi folks, Why did I choose parsec? Practicality dictates that I can only use things in the core libraries, and the nearest thing we have to that is the parser lib that is in the HP. I tried to use happy but I could not construct a grammar/lexer combo to handle the layout (also, happy is not exactly known for its great error messages).

Just thinking out loud here, but what about ditching the current format for something that's simpler to parse/generate? Like, say, JSON?

Regards,
Re: Advance notice that I'd like to make Cabal depend on parsec
On Thu, Mar 14, 2013 at 11:01 AM, Bardur Arantsson s...@scientician.net wrote:
On 03/14/2013 03:53 PM, Duncan Coutts wrote:

Hi folks, Why did I choose parsec? Practicality dictates that I can only use things in the core libraries, and the nearest thing we have to that is the parser lib that is in the HP. I tried to use happy but I could not construct a grammar/lexer combo to handle the layout (also, happy is not exactly known for its great error messages).

Just thinking out loud here, but what about ditching the current format for something that's simpler to parse/generate? Like, say, JSON?

I thought I heard someone say that most existing cabal files can be converted to valid yaml by adding one token at the start? If the change was that simple it might be doable. I think the trick is that we'd need to expose this by only treating the file as yaml if the minimum cabal version is >= 1.17 (or so).

In general these sorts of format changes are painful for users and I sense that now might be a bad time to change it (user morale is already a bit low with complaints of cabal hell, let's not exacerbate that by breaking existing .cabal files).

Jason
Re: Advance notice that I'd like to make Cabal depend on parsec
* Duncan Coutts duncan.cou...@googlemail.com [2013-03-14 17:12:14+0000]

The InstalledPackageInfo and its parser is what ghc and ghc-pkg primarily use (though there's the opportunity to share code for handling package indexes) and that type and that parser are also going to end up using text and parsec etc.

Correct me if I'm wrong, but isn't it just a strange coincidence that InstalledPackageInfo is serialised in the format similar to .cabal format?

InstalledPackageInfos aren't supposed to be edited by hand and do not need good error reporting. They can be serialized using any serialization library. (Then again, any serialization library like aeson would probably bring more dependencies than you're considering...)

Roman
Re: Advance notice that I'd like to make Cabal depend on parsec
On Thu, 2013-03-14 at 21:29 +0200, Roman Cheplyaka wrote:
* Duncan Coutts duncan.cou...@googlemail.com [2013-03-14 17:12:14+0000]

The InstalledPackageInfo and its parser is what ghc and ghc-pkg primarily use (though there's the opportunity to share code for handling package indexes) and that type and that parser are also going to end up using text and parsec etc.

Correct me if I'm wrong, but isn't it just a strange coincidence that InstalledPackageInfo is serialised in the format similar to .cabal format?

It's not a very strange coincidence. The type is not specific to ghc, it's defined in a compiler-neutral way by the original Cabal spec. So since both the source package and installed package info were defined in the Cabal spec, using the same kind of external syntax and sharing many of the same types, they both ended up in the Cabal lib and share the same parsers and pretty printers.

InstalledPackageInfos aren't supposed to be edited by hand and do not need good error reporting. They can be serialized using any serialization library.

Right, it doesn't need good error reporting (though it's nice if it's fast, which it isn't currently). The main advantage of the current arrangement is that the source and installed package descriptions get to share the same types and parser/pretty printer.

I think there's a slightly more general point here though. Why is it that we don't have any good parser in the core packages? It's not just Cabal that needs to parse things. We have two useless parsers in the base package, ReadS and ReadP. Haskell is famous for its parser combinators and yet our core infrastructure is stuck with only useless ones!

Duncan
Re: Advance notice that I'd like to make Cabal depend on parsec
On Thu, 2013-03-14 at 11:15 -0700, Jason Dagit wrote:
> On Thu, Mar 14, 2013 at 11:01 AM, Bardur Arantsson s...@scientician.net wrote:
> > On 03/14/2013 03:53 PM, Duncan Coutts wrote:
> > > Hi folks,
> > >
> > > Why did I choose parsec? Practicality dictates that I can only use things in the core libraries, and the nearest thing we have to that is the parser lib that is in the HP. I tried to use happy but I could not construct a grammar/lexer combo to handle the layout (also, happy is not exactly known for its great error messages).
> >
> > Just thinking out loud here, but what about ditching the current format for something that's simpler to parse/generate? Like, say, JSON?

Of course .cabal files are mainly written by humans, not machines, so we should optimise for them.

The grammar I've got now really isn't that bad. In fact if we wanted to simplify it we'd rip out the bits that are designed to make it easier to generate by programs: we'd eliminate the explicit {} syntax and just use layout. Allowing either is what makes the grammar more complex. But as I say, I'm satisfied that the grammar is ok.

> I thought I heard someone say that most existing cabal files can be converted to valid yaml by adding one token at the start? If the change was that simple it might be doable. I think the trick is that we'd need to expose this by only treating the file as yaml if the minimum cabal version is >= 1.17 (or so).

I know people have compared it to yaml and suggested we just use yaml, but I don't think it's that close syntactically. I did look into this when I started and I think there are too many differences to make it practical to switch to yaml (or a subset).

> In general these sorts of format changes are painful for users and I sense that now might be a bad time to change it (user morale is already a bit low with complaints of cabal hell, let's not exacerbate that by breaking existing .cabal files).

Right. I'm satisfied the format is basically ok, we don't need any breaking changes.
-- Duncan Coutts, Haskell Consultant, Well-Typed LLP, http://www.well-typed.com/
Re: Advance notice that I'd like to make Cabal depend on parsec
On 03/14/2013 11:01 PM, Duncan Coutts wrote:
> On Thu, 2013-03-14 at 11:15 -0700, Jason Dagit wrote:
> > On Thu, Mar 14, 2013 at 11:01 AM, Bardur Arantsson s...@scientician.net wrote:
> > > On 03/14/2013 03:53 PM, Duncan Coutts wrote:
> > > > Hi folks,
> > > >
> > > > Why did I choose parsec? Practicality dictates that I can only use things in the core libraries, and the nearest thing we have to that is the parser lib that is in the HP. I tried to use happy but I could not construct a grammar/lexer combo to handle the layout (also, happy is not exactly known for its great error messages).
> > >
> > > Just thinking out loud here, but what about ditching the current format for something that's simpler to parse/generate? Like, say, JSON?
>
> Of course .cabal files are mainly written by humans, not machines, so we should optimise for them.

I thought we were mostly talking about InstalledPackageInfo. That could be in $EASILY_PARSEABLE_FORMAT without really breaking anything, right?

Another option if GHC really also needs to parse .cabal files:

- Introduce a format for Cabal files that's trivial to hand-code a recursive descent parser for.
- Add a command in Cabal to generate that format from a .cabal file.
- Have cabal sdist automatically generate that file and put it into the uploaded archive.

Regards,
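To give a feel for how small a hand-coded recursive descent parser can be when the format is designed for machines, here is a sketch for a purely hypothetical format of nested parenthesised lists of atoms. Nothing here is an actual or proposed Cabal format; the `Node` type and the `parseNode`, `parseList` and `parseDoc` names are made up for illustration, and the sketch ignores quoting and escaping entirely.

```haskell
import Data.Char (isSpace)

-- Hypothetical machine-friendly format: nested parenthesised lists of atoms,
-- e.g. "(package (name foo) (version 1.0))".
data Node = Atom String | List [Node]
  deriving (Show, Eq)

-- Parse one node from the front of the input, returning the unconsumed rest.
parseNode :: String -> Maybe (Node, String)
parseNode input = case dropWhile isSpace input of
  '(' : rest -> parseList rest []
  ')' : _    -> Nothing            -- stray closing parenthesis
  []         -> Nothing            -- unexpected end of input
  s          ->
    let (atom, rest) = break (\c -> isSpace c || c `elem` "()") s
    in Just (Atom atom, rest)

-- Parse nodes up to the matching ')', accumulating them in reverse order.
parseList :: String -> [Node] -> Maybe (Node, String)
parseList input acc = case dropWhile isSpace input of
  ')' : rest -> Just (List (reverse acc), rest)
  []         -> Nothing            -- list was never closed
  s          -> do
    (node, rest) <- parseNode s
    parseList rest (node : acc)

-- Parse a whole document: exactly one node followed only by whitespace.
parseDoc :: String -> Maybe Node
parseDoc s = case parseNode s of
  Just (node, rest) | all isSpace rest -> Just node
  _                                    -> Nothing
```

The whole parser depends only on `Data.Char` from base, which is the property the proposal above is after. The cost is that the format is no longer pleasant for humans to write, which is why it would be generated from the .cabal file and not replace it.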
Re: Advance notice that I'd like to make Cabal depend on parsec
On 14 March 2013 22:53, Duncan Coutts duncan.cou...@googlemail.com wrote:
> I've been doing regression testing against hackage and I'm satisfied that the new parser matches close enough. I've uncovered all kinds of horrors with .cabal files in the wild relying on quirks of the old parser. I've made adjustments for most of them but I will be breaking a half dozen old packages

When you say you've made adjustments for dodgy .cabal files in the wild, do you mean that you'll send those maintainers patches that make their cabal files less dodgy, or do you mean you've added hacks to your parser to reproduce the quirky behaviour?

Conrad.