Re: Advance notice that I'd like to make Cabal depend on parsec

2013-03-22 Thread Duncan Coutts

On Thu, 2013-03-21 at 17:51 -0400, Isaac Dupree wrote:
 On 03/18/2013 12:55 PM, Duncan Coutts wrote:
  [...]  it
  is not simply the outline parser for cabal-style files that we're
  talking about. We also need parsers/pretty printers for all the various
  little types that make up the info about packages, like versions,
  package names, package ids, version constraints, module names, licenses
  etc etc.
 
 (ignorant musing that doesn't help the general difficulty of writing a
 Happy parser: )
 Can they not use multiple Happy parsers generated from the same Happy file?
 http://www.haskell.org/happy/doc/html/sec-multiple-parsers.html

Well the compositionality is there for the benefit of other packages,
not just as an internal convenience for the Cabal lib. If we dropped
that feature then yes we could use monolithic parsers for each of these
types. Other packages do use the ability to build new parsers out of old
however, in particular cabal-install does.
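
As a rough illustration of what "building new parsers out of old" means here, the sketch below composes small field parsers using ReadP from base, the combinator library the thread says Cabal currently relies on. The types and the names versionP, packageNameP, packageIdP, lowerBoundP and parseAll are hypothetical stand-ins, not Cabal's actual API, and the naming rule for packages is simplified.

```haskell
import Data.Char (isAlphaNum, isDigit)
import Data.List (intercalate)
import Text.ParserCombinators.ReadP

-- Hypothetical stand-ins for Cabal's small types, not the real definitions.
newtype Version     = Version [Int]      deriving (Eq, Show)
newtype PackageName = PackageName String deriving (Eq, Show)
data    PackageId   = PackageId PackageName Version deriving (Eq, Show)

-- A dotted version number such as 1.16.0.
versionP :: ReadP Version
versionP = Version <$> sepBy1 (read <$> munch1 isDigit) (char '.')

-- Dash-separated components, none of which may be all digits, so that
-- "foo-bar-1.0" splits into the name "foo-bar" and the version 1.0.
packageNameP :: ReadP PackageName
packageNameP = PackageName . intercalate "-" <$> sepBy1 component (char '-')
  where
    component = do
      s <- munch1 isAlphaNum
      if all isDigit s then pfail else return s

-- A package id is built purely from the two parsers above.
packageIdP :: ReadP PackageId
packageIdP = PackageId <$> packageNameP <* char '-' <*> versionP

-- A downstream tool can assemble its own parser from the same pieces,
-- here a hypothetical "name >= version" lower bound.
lowerBoundP :: ReadP (PackageName, Version)
lowerBoundP =
  (,) <$> packageNameP <* skipSpaces <* string ">=" <* skipSpaces <*> versionP

-- Run a parser over the whole input, returning the first complete parse.
parseAll :: ReadP a -> String -> Maybe a
parseAll p s = case readP_to_S (p <* eof) s of
  (x, _) : _ -> Just x
  []         -> Nothing

main :: IO ()
main = do
  print (parseAll packageIdP "foo-bar-1.0")
  print (parseAll lowerBoundP "parsec >= 3.1")
```

With one monolithic generated parser per type, a client such as cabal-install could not write something like lowerBoundP without duplicating the grammar for names and versions.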

Duncan


___
cabal-devel mailing list
cabal-devel@haskell.org
http://www.haskell.org/mailman/listinfo/cabal-devel

Re: Advance notice that I'd like to make Cabal depend on parsec

2013-03-21 Thread Isaac Dupree


On 03/18/2013 12:55 PM, Duncan Coutts wrote:
 [...]  it
 is not simply the outline parser for cabal-style files that we're
 talking about. We also need parsers/pretty printers for all the various
 little types that make up the info about packages, like versions,
 package names, package ids, version constraints, module names, licenses
 etc etc.

(ignorant musing that doesn't help the general difficulty of writing a
Happy parser: )

Can they not use multiple Happy parsers generated from the same Happy file?
http://www.haskell.org/happy/doc/html/sec-multiple-parsers.html

-Isaac



Re: Advance notice that I'd like to make Cabal depend on parsec

2013-03-21 Thread Simon Marlow

On 18 March 2013 03:08, Simon Peyton-Jones simo...@microsoft.com wrote:
  Is it essential, or even sensical, that the serialization format GHC needs
  for storing package info bear any relation to the human authored form? If
  not, the split out of the package types could be accomplished in a way where
  GHC uses simple show/read(P) style serialization for storage of package
  info, whereas cabal-lib would use a lovely parsec parser for humans. I'd
  like this approach.

 Good idea -- esp if it makes the packaging story simpler.  GHC already uses
 a binary format for interface files, so there’s no good reason to use a
 human-readable format for package database stuff.  For interface files you
 can read them with ghc --show-iface, and as Ian remarks something similar is
 already true for the package database.

A bit of background here: the binary serialisation of packages is an
optimisation only (though an important one), and is done independently
of Cabal.  To install a Cabal package you can put the package
description file that Cabal generates into GHC's database directory,
and it is picked up automatically.  The binary cache can be updated
separately with 'ghc-pkg recache'.  It was done this way to make it
easier for Linux distros that want to install packages by moving files
into place and then running commands.

So I don't think you want Cabal to know about the binary serialization
format, it's a GHC-only optimisation.

Cheers,
Simon




RE: Advance notice that I'd like to make Cabal depend on parsec

2013-03-18 Thread Simon Peyton-Jones

 Is it essential, or even sensical, that the serialization format GHC needs
 for storing package info bear any relation to the human authored form? If
 not, the split out of the package types could be accomplished in a way where
 GHC uses simple show/read(P) style serialization for storage of package
 info, whereas cabal-lib would use a lovely parsec parser for humans. I'd
 like this approach.

Good idea -- esp if it makes the packaging story simpler.  GHC already uses a
binary format for interface files, so there’s no good reason to use a
human-readable format for package database stuff.  For interface files you can
read them with ghc --show-iface, and as Ian remarks something similar is
already true for the package database.

Simon

From: ghc-devs-boun...@haskell.org [mailto:ghc-devs-boun...@haskell.org] On 
Behalf Of Mark Lentczner
Sent: 17 March 2013 16:57
To: dag.odenh...@gmail.com
Cc: Haskell Libraries; cabal-devel; Duncan Coutts; ghc-d...@haskell.org; 
Antoine Latter
Subject: Re: Advance notice that I'd like to make Cabal depend on parsec

This thread is raising all sorts of questions for me:

Is it essential, or even sensical, that the serialization format GHC needs for
storing package info bear any relation to the human authored form? If not, the
split out of the package types could be accomplished in a way where GHC uses
simple show/read(P) style serialization for storage of package info, whereas
cabal-lib would use a lovely parsec parser for humans. I'd like this approach.

The issue of putting yet one more HP package into GHC's core packages is
increasing the exposure of the difficulty of the current GHC/HP relationship.
See also the threads on HP's mailing list about why we can't bump some packages
in GHC's core set for the next HP release. The split arrangement is strange
because we have two groups making up what is in the HP, but they have different
processes and aims. The complex technical relationship between the moving parts
only heightens the difficulty.

Perhaps the major cause is that because GHC is shipped as a library itself, it
exposes all its package dependencies. And as it is a large, and growing, piece
of software, the list only wants to grow. But I wonder how often GHC is used as
a library itself? If not often, then perhaps GHC should be shipped as two
parts: just a compiler (plus the small number of packages that the compiler
forces), and ghc-lib as an optional, even separate, package - perhaps one with
even a traditional way of depending on other packages. In other words, users
that wanted to incorporate ghc-lib into their programs would depend on,
download, configure, and build ghc-lib independently of the GHC binaries
installed on their system. Perhaps then, GHC, the compiler, built from ghc-lib,
would be bootstrapped not from the past compiler, but from the past HP.

Okay, perhaps that is all just fantasy. But, no other programming system
operates the way we do. They all fall into one of two camps:

  *   The dominant implementation is maintained, built, and shipped along with
      a large collection of common packages. Examples: Python, Ruby, PHP, Java.
  *   The dominant implementation is shipped as a bare tool, and large common
      libraries are maintained and shipped independently. Examples: C++ (think
      g++ and boost), JavaScript (think browsers, and jQuery).

We are in the middle and, I think, experiencing growing pains because of it.

- Mark

On Sat, Mar 16, 2013 at 3:42 PM, dag.odenh...@gmail.com wrote:
I'd love to have a proper parser and source-location-aware AST for sake of 
editor/IDE tools, so +1 from me. If you don't end up doing this after all, I'd 
still like to see your parser in a separate package, although I understand if 
you don't feel like maintaining two parsers especially given the tedious 
process for verifying they work similarly. I guess it could still be useful in 
the same way we find haskell-src-exts useful despite some incompatibilities 
with GHC.

On Thu, Mar 14, 2013 at 3:53 PM, Duncan Coutts duncan.cou...@googlemail.com wrote:
Hi folks,

I want to give you advance notice that I would like to make Cabal depend
on parsec. The implication is that GHC would therefore depend on parsec
and thus it would become a core package, rather than just a HP package.
So this would affect both GHC and the HP, though I hope not too much.

The rationale is that Cabal needs to parse things, like .cabal files and
currently we do not have a decent parser in the core libraries. By
decent I mean one that can produce error messages with source locations
and that doesn't have unpredictable memory use. The only parser in the
core libraries at the moment is Text.ParserCombinators.ReadP from the
base package and that fails my decent criteria on both counts. Its
idea of an error message is (), and on some largish .cabal

Re: Advance notice that I'd like to make Cabal depend on parsec

2013-03-18 Thread Duncan Coutts

On Sun, 2013-03-17 at 21:04 +0100, Henning Thielemann wrote:
 
 On Sun, 17 Mar 2013, Ian Lynagh wrote:
 
  I think it would be feasible to stop GHC itself from using the human
  readable format. The only place I can think of it being used is in the
  package database, but we could use either Read/Show for that, or just
  exclusively use the binary format.
 
 I already needed the human readable format in order to check what 
 information a custom configure file generated.

Or more generally, the classic way to make the pkg info if you were not
using the simple cabal build system, but were using configure + make
(e.g. wrapped in the cabal make build-type) was to generate the input
file using configure/m4 text substitutions. So that did/does need to be
human readable.

As for the binary format, that's ghc's internal representation and not
something I think we would want to standardise between Haskell
implementations. Note that other Haskell impls use a package database
that just uses these human readable files, with no hc-pkg style program.

Duncan



Re: Advance notice that I'd like to make Cabal depend on parsec

2013-03-18 Thread Duncan Coutts

On Sun, 2013-03-17 at 19:27 +, Ian Lynagh wrote:
 On Sun, Mar 17, 2013 at 09:57:25AM -0700, Mark Lentczner wrote:
  
  Is it essential, or even sensical, that the serialization format GHC needs
  for storing package info bear any relation to the human authored form? If
  not, the split out of the package types could be accomplished in a way
  where GHC uses simple show/read(P) style serialization for storage of
  package info, whereas cabal-lib would use a lovely parsec parser for
  humans. I'd like this approach.
 
 I think it would be feasible to stop GHC itself from using the human
 readable format. The only place I can think of it being used is in the
 package database, but we could use either Read/Show for that, or just
 exclusively use the binary format.

The change in functionality to enable that would be that the binary
cache would always have to be up to date, so ghc would only ever have
to read the cache and never have to read the human-readable package
files.

Then you can have ghc-pkg depend on Cabal and use that for the
human-readable bits, but since that's a program then it doesn't expose
the Cabal lib dependency. Then ghc (and hence the ghc lib) would not
depend on Cabal, but it would need a copy of the InstalledPackageInfo
type and the other types that it uses.
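
To make the Read/Show option from the quoted message concrete, here is a minimal sketch of how GHC's side could store package info without any parser library. The record PkgInfo and the functions encodePkg and decodePkg are hypothetical, a heavily cut-down stand-in for InstalledPackageInfo, and say nothing about the binary cache format GHC actually uses.

```haskell
import Text.Read (readMaybe)

-- Hypothetical, heavily cut-down stand-in for InstalledPackageInfo.
data PkgInfo = PkgInfo
  { pkgName    :: String
  , pkgVersion :: [Int]
  , pkgDepends :: [String]
  , pkgExposed :: Bool
  } deriving (Eq, Show, Read)

-- Serialise with the derived Show instance.
encodePkg :: PkgInfo -> String
encodePkg = show

-- Deserialise with the derived Read instance; Nothing on malformed input.
decodePkg :: String -> Maybe PkgInfo
decodePkg = readMaybe

main :: IO ()
main = do
  let info = PkgInfo "parsec" [3, 1, 3] ["base", "mtl"] True
  putStrLn (encodePkg info)
  -- The round trip should give back the original value.
  print (decodePkg (encodePkg info) == Just info)
```

The derived instances need only base, which is the point of the suggestion. The cost is that the stored form is tied to the Haskell data declaration rather than to the format in the Cabal spec.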

Duncan



Re: Advance notice that I'd like to make Cabal depend on parsec

2013-03-18 Thread Ian Lynagh

On Mon, Mar 18, 2013 at 12:34:16PM +, Duncan Coutts wrote:
 
 Then you can have ghc-pkg depend on Cabal and use that for the
 human-readable bits, but since that's a program then it doesn't expose
 the Cabal lib dependency. Then ghc (and hence the ghc lib) would not
 depend on Cabal, but it would need a copy of the InstalledPackageInfo
 type and the other types that it uses.

Right, exactly. But we don't want to have 2 copies of the types, so
could we move them into a Cabal-datatypes package which can be shared by
both Cabal and GHC please?


Thanks
Ian



Re: Advance notice that I'd like to make Cabal depend on parsec

2013-03-18 Thread Duncan Coutts

On Mon, 2013-03-18 at 12:43 +, Ian Lynagh wrote:
 On Mon, Mar 18, 2013 at 12:34:16PM +, Duncan Coutts wrote:
  
  Then you can have ghc-pkg depend on Cabal and use that for the
  human-readable bits, but since that's a program then it doesn't expose
  the Cabal lib dependency. Then ghc (and hence the ghc lib) would not
  depend on Cabal, but it would need a copy of the InstalledPackageInfo
  type and the other types that it uses.
 
 Right, exactly. But we don't want to have 2 copies of the types, so
 could we move them into a Cabal-datatypes package which can be shared by
 both Cabal and GHC please?

That would be a rather annoying split. The cabal-lib package itself is
supposed to be just types + parsers + pretty printers (and related utils).
It'd end up looking like:

cabal-types:
  types: InstalledPackageInfo, PackageName, Version, PackageId,
         InstalledPackageId, License

cabal-lib:
  parser for InstalledPackageInfo, PackageName, Version, PackageId,
             InstalledPackageId, License
  modules Distribution.*

cabal-build-simple:
  modules Distribution.Simple.*

It's not as if one could frame this as the aspects of the Cabal spec
that compilers need, because the other impls will want the parser +
printers as well.

Duncan



Re: Advance notice that I'd like to make Cabal depend on parsec

2013-03-17 Thread Ian Lynagh

On Sun, Mar 17, 2013 at 09:57:25AM -0700, Mark Lentczner wrote:
 
 Is it essential, or even sensical, that the serialization format GHC needs
 for storing package info bear any relation to the human authored form? If
 not, the split out of the package types could be accomplished in a way
 where GHC uses simple show/read(P) style serialization for storage of
 package info, whereas cabal-lib would use a lovely parsec parser for
 humans. I'd like this approach.

I think it would be feasible to stop GHC itself from using the human
readable format. The only place I can think of it being used is in the
package database, but we could use either Read/Show for that, or just
exclusively use the binary format.

It would be a little less user-friendly, but maybe worth it to remove
the ghc library dependencies on most-of-Cabal, mtl and parsec.

 Perhaps the major cause is that because GHC is shipped as a library itself,
 it exposes all its package dependencies.

Yes.

 In other words, users that wanted to
 incorporate the ghc-lib into their programs would depend, and download, and
 configure, and build, ghc-lib independently of the GHC binaries

I think this would create more problems than it solves.

 Okay, perhaps that is all just fantasy. But, no other programming system
 operates the way we do. They all fall into one of two camps:
 
 - The dominant implementation is maintained, built, and shipped along
   with a large collection of common packages. Examples: Python, Ruby, PHP,
   Java.
 - The dominant implementation is shipped as a bare tool, and large
   common libraries are maintained and shipped independently. Examples: C++
   (think g++ and boost), JavaScript (think browsers, and jQuery).
 
 We are in the middle and, I think, experiencing growing pains because of it.

I would say that we are doing the first option, in the form of the HP.
It's just that the core gets frozen (i.e., ghc + libs gets released)
earlier than the higher level libraries. I don't think that moving
(back) to trying to freeze/release everything all at once would be an
improvement.

You just need to remain strong, and keep saying no  :-)
(you're doing a great job, BTW!)


Thanks
Ian



Re: Advance notice that I'd like to make Cabal depend on parsec

2013-03-17 Thread Henning Thielemann



On Sun, 17 Mar 2013, Ian Lynagh wrote:

 I think it would be feasible to stop GHC itself from using the human
 readable format. The only place I can think of it being used is in the
 package database, but we could use either Read/Show for that, or just
 exclusively use the binary format.


I already needed the human readable format in order to check what 
information a custom configure file generated.



Re: Advance notice that I'd like to make Cabal depend on parsec

2013-03-17 Thread Ian Lynagh

On Sun, Mar 17, 2013 at 09:04:58PM +0100, Henning Thielemann wrote:
 
 On Sun, 17 Mar 2013, Ian Lynagh wrote:
 
 I think it would be feasible to stop GHC itself from using the human
 readable format. The only place I can think of it being used is in the
 package database, but we could use either Read/Show for that, or just
 exclusively use the binary format.
 
 I already needed the human readable format in order to check what
 information a custom configure file generated.

You can use ghc-pkg describe p for that.

I don't think you should ever need the human readable format unless you
need to alter the package database by hand.


-- 
Ian Lynagh, Haskell Consultant
Well-Typed LLP, http://www.well-typed.com/


Re: Advance notice that I'd like to make Cabal depend on parsec

2013-03-17 Thread Henning Thielemann



On Sun, 17 Mar 2013, Ian Lynagh wrote:

 On Sun, Mar 17, 2013 at 09:04:58PM +0100, Henning Thielemann wrote:

  On Sun, 17 Mar 2013, Ian Lynagh wrote:

   I think it would be feasible to stop GHC itself from using the human
   readable format. The only place I can think of it being used is in the
   package database, but we could use either Read/Show for that, or just
   exclusively use the binary format.

  I already needed the human readable format in order to check what
  information a custom configure file generated.

 You can use ghc-pkg describe p for that.

 I don't think you should ever need the human readable format unless you
 need to alter the package database by hand.


I think I also altered these package descriptions in order to check what 
the correct content should be.



Re: Advance notice that I'd like to make Cabal depend on parsec

2013-03-16 Thread Bardur Arantsson

On 03/15/2013 04:33 PM, Duncan Coutts wrote:
 On Fri, 2013-03-15 at 05:19 +0100, Bardur Arantsson wrote:
 On 03/14/2013 11:01 PM, Duncan Coutts wrote:
 On Thu, 2013-03-14 at 11:15 -0700, Jason Dagit wrote:
 On Thu, Mar 14, 2013 at 11:01 AM, Bardur Arantsson s...@scientician.net wrote:

 On 03/14/2013 03:53 PM, Duncan Coutts wrote:
 Hi folks,

 Why did I choose parsec? Practicality dictates that I can only use
 things in the core libraries, and the nearest thing we have to that is
 the parser lib that is in the HP. I tried to use happy but I could not
 construct a grammar/lexer combo to handle the layout (also, happy is not
 exactly known for its great error messages).


 Just thinking out loud here, but what about ditching the current format
 for something that's simpler to parse/generate? Like, say, JSON?

 Of course .cabal files are mainly written by humans, not machines, so we
 should optimise for them.

 I thought we were mostly talking about InstalledPackageInfo. That could
 be in $EASILY_PARSEABLE_FORMAT without really breaking anything, right?
 
 In principle it could be any format. But it is a format specified in the
 Cabal spec, and shared between all the Haskell implementations. Unless
 there's a compelling reason to change all that, I'd rather not.
 

Not having GHC core depend on parsec(*) sounds like a compelling reason
to me...?

(*) And the potential ensuing Cabal hell when a package depends on
anything in GHC.*.




Re: Advance notice that I'd like to make Cabal depend on parsec

2013-03-16 Thread dag.odenh...@gmail.com

I'd love to have a proper parser and source-location-aware AST for sake of
editor/IDE tools, so +1 from me. If you don't end up doing this after all,
I'd still like to see your parser in a separate package, although I
understand if you don't feel like maintaining two parsers especially given
the tedious process for verifying they work similarly. I guess it could
still be useful in the same way we find haskell-src-exts useful despite
some incompatibilities with GHC.


On Thu, Mar 14, 2013 at 3:53 PM, Duncan Coutts duncan.cou...@googlemail.com
 wrote:

 Hi folks,

 I want to give you advance notice that I would like to make Cabal depend
 on parsec. The implication is that GHC would therefore depend on parsec
 and thus it would become a core package, rather than just a HP package.
 So this would affect both GHC and the HP, though I hope not too much.

 The rationale is that Cabal needs to parse things, like .cabal files and
 currently we do not have a decent parser in the core libraries. By
 decent I mean one that can produce error messages with source locations
 and that doesn't have unpredictable memory use. The only parser in the
 core libraries at the moment is Text.ParserCombinators.ReadP from the
 base package and that fails my decent criteria on both counts. Its
 idea of an error message is (), and on some largish .cabal files we take
 100s of MB to parse (I realise that the ReadP in the base package is a
 cutdown version so I don't mean to malign all ReadP-style libs out
 there).

 Partly due to the performance problem, the terrible .cabal file error
 messages, and partly because Doaitse Swierstra keeps asking me if .cabal
 files have a grammar, I've been writing a new .cabal parser. It uses an
 alex lexer and a parsec parser. It's fast and the error messages are
 pretty good. I have reverse engineered a grammar that closely matches
 the existing parser and .cabal files in the wild, though I'm not sure
 Doaitse will be satisfied with the approach I've taken to handling
 layout.

 Why did I choose parsec? Practicality dictates that I can only use
 things in the core libraries, and the nearest thing we have to that is
 the parser lib that is in the HP. I tried to use happy but I could not
 construct a grammar/lexer combo to handle the layout (also, happy is not
 exactly known for its great error messages).

 I've been doing regression testing against hackage and I'm satisfied
 that the new parser matches close enough. I've uncovered all kinds of
 horrors with .cabal files in the wild relying on quirks of the old
 parser. I've made adjustments for most of them but I will be breaking a
 half dozen old packages (most of those don't actually build correctly
 because though their syntax errors are not picked up by the parser, they
 do cause failure eventually).

 So far I've just done the outline parser, not the individual field
 parsers. I'll be doing those next and then integrate. So this change is
 still a bit of a ways off, but I thought it'd be useful to warn people
 now.

 Duncan
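
The contrast drawn in the quoted message, between ReadP failing with no information and a parser that reports source locations, can be sketched on a toy "key: value" format. This is not the real .cabal grammar and not the alex/parsec parser described above; the names fieldP, fieldsP, fieldR and fieldsR and the file name example.cabal are made up for illustration.

```haskell
import Data.Char (isAlpha)
import Text.Parsec
import Text.Parsec.String (Parser)
import qualified Text.ParserCombinators.ReadP as R

-- Parsec version of a toy "key: value" line parser.
fieldP :: Parser (String, String)
fieldP = do
  key <- many1 (letter <|> char '-')
  _   <- char ':'
  skipMany (char ' ')
  val <- many1 (noneOf "\n")
  return (key, val)

fieldsP :: Parser [(String, String)]
fieldsP = fieldP `sepEndBy` newline <* eof

-- The same toy grammar written with ReadP from base.
fieldR :: R.ReadP (String, String)
fieldR = do
  key <- R.munch1 (\c -> isAlpha c || c == '-')
  _   <- R.char ':'
  _   <- R.munch (== ' ')
  val <- R.munch1 (/= '\n')
  return (key, val)

fieldsR :: R.ReadP [(String, String)]
fieldsR = R.endBy fieldR (R.char '\n') <* R.eof

main :: IO ()
main = do
  -- The second line is missing its colon.
  let bad = "name: foo\nversion 1.0\n"
  -- ReadP: an empty list of parses, with no hint of where it went wrong.
  print (R.readP_to_S fieldsR bad)
  -- Parsec: a ParseError that carries a source position.
  case parse fieldsP "example.cabal" bad of
    Left err -> putStrLn ("parse error at " ++ show (errorPos err))
    Right fs -> print fs
```

The sketch only illustrates the error-reporting difference. It does not attempt to reproduce the memory behaviour mentioned in the message.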



Re: Advance notice that I'd like to make Cabal depend on parsec

2013-03-15 Thread Heinrich Apfelmus


Duncan Coutts wrote:

 Hi folks,

 I want to give you advance notice that I would like to make Cabal depend
 on parsec. The implication is that GHC would therefore depend on parsec
 and thus it would become a core package, rather than just a HP package.
 So this would affect both GHC and the HP, though I hope not too much.

 [..]

 Why did I choose parsec? Practicality dictates that I can only use
 things in the core libraries, and the nearest thing we have to that is
 the parser lib that is in the HP. I tried to use happy but I could not
 construct a grammar/lexer combo to handle the layout (also, happy is not
 exactly known for its great error messages).


Reuse is good, but the implication I'm worried about is this: Can I
upgrade the parsec package installed on my system by doing a user
install from hackage? Without an implementation of more flexible
package installations (multiple versions installed simultaneously), any
dependency of GHC has its version number essentially set in stone.

From this point of view, this proposal is not about making Cabal depend
on parsec, but about fixing the canonical version of parsec.



Best regards,
Heinrich Apfelmus

--
http://apfelmus.nfshost.com



Re: Advance notice that I'd like to make Cabal depend on parsec

2013-03-15 Thread Roman Cheplyaka

* Heinrich Apfelmus apfel...@quantentunnel.de [2013-03-15 10:38:37+0100]
 Duncan Coutts wrote:
 Hi folks,
 
 I want to give you advance notice that I would like to make Cabal depend
 on parsec. The implication is that GHC would therefore depend on parsec
 and thus it would become a core package, rather than just a HP package.
 So this would affect both GHC and the HP, though I hope not too much.
 
 [..]
 
 Why did I choose parsec? Practicality dictates that I can only use
 things in the core libraries, and the nearest thing we have to that is
 the parser lib that is in the HP. I tried to use happy but I could not
 construct a grammar/lexer combo to handle the layout (also, happy is not
 exactly known for its great error messages).
 
 Reuse is good, but the implication I'm worried about is this: Can I
 upgrade the parsec package installed on my system by doing a user
 install from hackage? Without an implementation of more flexible
 package installations (multiple versions installed simultaneously),
 any dependency of GHC has its version number essentially set in
 stone.

We've had that working for a long time. Right now I even have multiple
installed versions of Cabal-the-library itself.

It's not that Parsec would be automatically linked into each executable.
It's just that ghc-the-program would have Parsec linked into it.

Roman


Re: Advance notice that I'd like to make Cabal depend on parsec

2013-03-15 Thread Roman Cheplyaka

* Ivan Lazar Miljenovic ivan.miljeno...@gmail.com [2013-03-15 22:12:47+1100]
 On 15 March 2013 22:05, Roman Cheplyaka r...@ro-che.info wrote:
  * Heinrich Apfelmus apfel...@quantentunnel.de [2013-03-15 10:38:37+0100]
  Duncan Coutts wrote:
  Hi folks,
  
  I want to give you advance notice that I would like to make Cabal depend
  on parsec. The implication is that GHC would therefore depend on parsec
  and thus it would become a core package, rather than just a HP package.
  So this would affect both GHC and the HP, though I hope not too much.
  
  [..]
  
  Why did I choose parsec? Practicality dictates that I can only use
  things in the core libraries, and the nearest thing we have to that is
  the parser lib that is in the HP. I tried to use happy but I could not
  construct a grammar/lexer combo to handle the layout (also, happy is not
  exactly known for its great error messages).
 
  Reuse is good, but the implication I'm worried about is this: Can I
  upgrade the parsec package installed on my system by doing a user
  install from hackage? Without an implementation of more flexible
  package installations (multiple versions installed simultaneously),
  any dependency of GHC has its version number essentially set in
  stone.
 
  We've had that working for a long time. Right now I even have multiple
  installed versions of Cabal-the-library itself.
 
  It's not that Parsec would be automatically linked into each executable.
  It's just that ghc-the-program would have Parsec linked into it.
 
 And ghc-the-library, which means that anything that uses
 ghc-as-a-library (and indeed even Cabal-as-a-library) no longer has a
 choice of which version of parsec they use.

Right. But in this regard, GHC API and Cabal are no different from any
other libraries that suffer from the same issue. (Except that it's hard
to recompile GHC to use an alternative Parsec version.) And these are
not exactly the most popular libraries either — so I doubt this change
will have a large impact.

Roman


Re: Advance notice that I'd like to make Cabal depend on parsec

2013-03-15 Thread Malcolm Wallace


On 14 Mar 2013, at 14:53, Duncan Coutts wrote:

 Why did I choose parsec? Practicality dictates that I can only use
 things in the core libraries, and the nearest thing we have to that is
 the parser lib that is in the HP.

I fully agree that a real parser is needed for Cabal files.  I implemented one 
myself, many years ago, using the polyparse library, and using a hand-written 
lexer.  Feel free to reuse it (attached, together with a sample program) if you 
like, although I expect it has bit-rotted a little over time.

Regards,
Malcolm




cabal-parse2.hs
Description: Binary data


CabalParse2.hs
Description: Binary data

Re: Advance notice that I'd like to make Cabal depend on parsec

2013-03-15 Thread Duncan Coutts

On Fri, 2013-03-15 at 10:10 +0100, Herbert Valerio Riedel wrote:
 Duncan Coutts duncan.cou...@googlemail.com writes:
 
 
 [...]
 
  I've been doing regression testing against hackage and I'm satisfied
  that the new parser matches close enough. I've uncovered all kinds of
  horrors with .cabal files in the wild relying on quirks of the old
  parser. I've made adjustments for most of them but I will be breaking a
  half dozen old packages (most of those don't actually build correctly
  because though their syntax errors are not picked up by the parser, they
  do cause failure eventually).
 
 btw, why not just keep the current parser as a legacy parser in the
 code, for older .cabal files (or as a fallback parser, in case the new
 stricter parsec-parser fails)?

I'm satisfied at this point that the number of packages affected by the
change is so low that it's not worth the extra maintenance. As I
mentioned, most of the ones that break in the new parser are actually
already broken in the sense that they will not build (because of
mistakes in the .cabal file that just happen not to be caught by the old
parser). So the amount of real breakage is trivial.

Also, with the new hackage server we will be able to fix .cabal files
post-release so if we do care about those few older packages we can
actually fix them.

Duncan



Re: Advance notice that I'd like to make Cabal depend on parsec

2013-03-15 Thread Duncan Coutts

On Fri, 2013-03-15 at 12:37 +0800, Conrad Parker wrote:
 On 14 March 2013 22:53, Duncan Coutts duncan.cou...@googlemail.com wrote:
 
  I've been doing regression testing against hackage and I'm satisfied
  that the new parser matches close enough. I've uncovered all kinds of
  horrors with .cabal files in the wild relying on quirks of the old
  parser. I've made adjustments for most of them but I will be breaking a
  half dozen old packages
 
 When you say you've made adjustments for dodgy .cabal files in the
 wild, do you mean that you'll send those maintainers patches that make
 their cabal files less dodgy, or do you mean you've added hacks to
 your parser to reproduce the quirky behaviour?

The latter, but the egregiousness of the hacks is actually not too bad
in the end. I don't find it revolting. For the worst examples I didn't
make adjustments and those ones will break. I think I've made a
reasonable judgement about where to draw the line between the two.

I can look into generating warnings in those cases (which is probably
better than me emailing them).

Duncan



Re: Advance notice that I'd like to make Cabal depend on parsec

2013-03-15 Thread Duncan Coutts

On Fri, 2013-03-15 at 05:19 +0100, Bardur Arantsson wrote:
 On 03/14/2013 11:01 PM, Duncan Coutts wrote:
  On Thu, 2013-03-14 at 11:15 -0700, Jason Dagit wrote:
  On Thu, Mar 14, 2013 at 11:01 AM, Bardur Arantsson 
  s...@scientician.net wrote:
 
  On 03/14/2013 03:53 PM, Duncan Coutts wrote:
  Hi folks,
 
  Why did I choose parsec? Practicality dictates that I can only use
  things in the core libraries, and the nearest thing we have to that is
  the parser lib that is in the HP. I tried to use happy but I could not
  construct a grammar/lexer combo to handle the layout (also, happy is not
  exactly known for its great error messages).
 
 
  Just thinking out loud here, but what about ditching the current format
  for something that's simpler to parse/generate? Like, say, JSON?
  
  Of course .cabal files are mainly written by humans, not machines, so we
  should optimise for them.
 
 I thought we were mostly talking about InstalledPackageInfo. That could
 be in $EASILY_PARSEABLE_FORMAT without really breaking anything, right?

In principle it could be any format. But it is a format specified in the
Cabal spec, and shared between all the Haskell implementations. Unless
there's a compelling reason to change all that, I'd rather not.

 Another option if GHC really also needs to parse .cabal files:

That's ok, it doesn't. GHC uses Cabal when building ghc, but at runtime
it's just using the InstalledPackageInfo type, parser (and perhaps some
index utils).

-- 
Duncan Coutts, Haskell Consultant
Well-Typed LLP, http://www.well-typed.com/



Advance notice that I'd like to make Cabal depend on parsec

2013-03-14 Thread Duncan Coutts

Hi folks,

I want to give you advance notice that I would like to make Cabal depend
on parsec. The implication is that GHC would therefore depend on parsec
and thus it would become a core package, rather than just a HP package.
So this would affect both GHC and the HP, though I hope not too much.

The rationale is that Cabal needs to parse things, like .cabal files and
currently we do not have a decent parser in the core libraries. By
"decent" I mean one that can produce error messages with source locations
and that doesn't have unpredictable memory use. The only parser in the
core libraries at the moment is Text.ParserCombinators.ReadP from the
base package and that fails my "decent" criteria on both counts. Its
idea of an error message is (), and on some largish .cabal files we take
100s of MB to parse (I realise that the ReadP in the base package is a
cutdown version so I don't mean to malign all ReadP-style libs out
there).
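
To make the complaint about error messages concrete, here is a minimal sketch using only Text.ParserCombinators.ReadP from base. The names versionP and parseAll are made up for this illustration and are not taken from Cabal. The point is that a failed ReadP parse is just an empty result list, so the caller learns neither what went wrong nor where.

```haskell
import Data.Char (isDigit)
import Text.ParserCombinators.ReadP

-- Hypothetical parser for a dotted version number such as "1.16.0".
versionP :: ReadP [Int]
versionP = sepBy1 (read <$> munch1 isDigit) (char '.')

-- Hypothetical helper: run a parser and require it to consume all input.
-- readP_to_S returns a list of (result, remaining input) pairs. On failure
-- that list is simply empty: no message and no source position.
parseAll :: ReadP a -> String -> Maybe a
parseAll p input =
  case [x | (x, "") <- readP_to_S (p <* eof) input] of
    (x : _) -> Just x
    []      -> Nothing
```

With these definitions, parseAll versionP "1.16.0" succeeds, while parseAll versionP "1..0" yields Nothing with no indication that the problem is the second dot.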

Partly due to the performance problem, the terrible .cabal file error
messages, and partly because Doaitse Swierstra keeps asking me if .cabal
files have a grammar, I've been writing a new .cabal parser. It uses an
alex lexer and a parsec parser. It's fast and the error messages are
pretty good. I have reverse engineered a grammar that closely matches
the existing parser and .cabal files in the wild, though I'm not sure
Doaitse will be satisfied with the approach I've taken to handling
layout.

Why did I choose parsec? Practicality dictates that I can only use
things in the core libraries, and the nearest thing we have to that is
the parser lib that is in the HP. I tried to use happy but I could not
construct a grammar/lexer combo to handle the layout (also, happy is not
exactly known for its great error messages).

I've been doing regression testing against hackage and I'm satisfied
that the new parser matches close enough. I've uncovered all kinds of
horrors with .cabal files in the wild relying on quirks of the old
parser. I've made adjustments for most of them but I will be breaking a
half dozen old packages (most of those don't actually build correctly
because though their syntax errors are not picked up by the parser, they
do cause failure eventually).

So far I've just done the outline parser, not the individual field
parsers. I'll be doing those next and then integrate. So this change is
still a bit of a ways off, but I thought it'd be useful to warn people
now.
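
As a rough illustration of what an "outline" pass might look like, the sketch below splits a simplified "name: value" file into raw fields and leaves every value uninterpreted for a later field parser. It is an assumption-laden toy, not the alex/parsec parser described above: it ignores sections, braces and comments, and the names Field and outline are invented for this example.

```haskell
import Data.Char (isSpace)

-- One top-level field of a simplified "name: value" file (hypothetical type).
data Field = Field
  { fieldLine  :: Int     -- 1-based line on which the field starts
  , fieldName  :: String
  , fieldValue :: String  -- raw text, left for a field parser to interpret
  } deriving (Eq, Show)

-- Split the input into fields without interpreting any values.
-- Indented lines continue the previous field; blank lines are skipped.
outline :: String -> Either String [Field]
outline = go [] . zip [1 ..] . lines
  where
    go acc [] = Right (reverse acc)
    go acc ((n, l) : rest)
      | all isSpace l = go acc rest
      | any isSpace (take 1 l) =
          case acc of
            [] -> Left ("line " ++ show n ++ ": continuation without a field")
            (f : fs) ->
              go (f { fieldValue = joinValue (fieldValue f) (trim l) } : fs) rest
      | otherwise =
          case break (== ':') l of
            (name, ':' : value) -> go (Field n (trim name) (trim value) : acc) rest
            _ -> Left ("line " ++ show n ++ ": expected 'name: value'")

    -- Join two pieces of a value with a single space, skipping empty pieces.
    joinValue a b = unwords (filter (not . null) [a, b])

    -- Strip leading and trailing whitespace.
    trim = dropWhile isSpace . reverse . dropWhile isSpace . reverse
```

Because each Field remembers its starting line, a later field parser could report errors relative to the original file.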

Duncan



Re: Advance notice that I'd like to make Cabal depend on parsec

2013-03-14 Thread Duncan Coutts

On Thu, 2013-03-14 at 14:53 +0000, Duncan Coutts wrote:
 Hi folks,
 
 I want to give you advance notice that I would like to make Cabal depend
 on parsec. The implication is that GHC would therefore depend on parsec
 and thus it would become a core package, rather than just a HP package.
 So this would affect both GHC and the HP, though I hope not too much.

It's already been pointed out to me that this also implies the following
dependencies:

text, deepseq, mtl, transformers

deepseq is a core package already I think, though ghc doesn't actually
depend on it currently.

I should also say that I want to make Cabal depend on bytestring and
text too.

-- 
Duncan Coutts, Haskell Consultant
Well-Typed LLP, http://www.well-typed.com/



Re: Advance notice that I'd like to make Cabal depend on parsec

2013-03-14 Thread Gregory Collins

On Thu, Mar 14, 2013 at 3:53 PM, Duncan Coutts duncan.cou...@googlemail.com
 wrote:

 Hi folks,

 I want to give you advance notice that I would like to make Cabal depend
 on parsec. The implication is that GHC would therefore depend on parsec
 and thus it would become a core package, rather than just a HP package.
 So this would affect both GHC and the HP, though I hope not too much.


+1 from me, although the amount of potential knock-on work might be
discouraging. The current cabal-install bootstrap process (which is
currently pretty easy and is necessary at times) will get a bunch more deps
as a result of this change, no?

-- 
Gregory Collins g...@gregorycollins.net

Re: Advance notice that I'd like to make Cabal depend on parsec

2013-03-14 Thread Duncan Coutts

On Thu, 2013-03-14 at 16:06 +0100, Gregory Collins wrote:
 On Thu, Mar 14, 2013 at 3:53 PM, Duncan Coutts duncan.cou...@googlemail.com
  wrote:
 
  Hi folks,
 
  I want to give you advance notice that I would like to make Cabal depend
  on parsec. The implication is that GHC would therefore depend on parsec
  and thus it would become a core package, rather than just a HP package.
  So this would affect both GHC and the HP, though I hope not too much.
 
 
 +1 from me, although the amount of potential knock-on work might be
 discouraging. The current cabal-install bootstrap process (which is
 currently pretty easy and is necessary at times) will get a bunch more deps
 as a result of this change, no?

Yes it will, but given that we do have a script it's not too bad I
think. And overall I think it's worth it to have the better error
messages, performance and memory use. Do you have any idea how slow it
is to parse all the .cabal files on hackage, and how much memory that
takes? You'd be horrified :-)

Duncan



Re: Advance notice that I'd like to make Cabal depend on parsec

2013-03-14 Thread Administrator

This GHC dependency on Cabal is putting a rather troubling constraint
in Cabal's evolution, which in my opinion is a serious problem. When I
first took a look at the dependencies between GHC and Cabal I found it
a bit strange that GHC would depend on Cabal as I would expect GHC to
be as low in the dependency tree as possible to avoid exactly these
kinds of problems.

These GHC dependencies on Cabal are in fact small (see
http://hackage.haskell.org/trac/ghc/attachment/ticket/7740/ghc-2.png
for a summary) and with a little bit of refactoring it would be
possible to split these dependencies into a very small shared package
with minimal or no further dependencies. This would liberate Cabal to
make the necessary refactoring.

IMHO, the addition of these new dependencies to Cabal should go
together with splitting the GHC-Cabal shared dependencies into a
separate package so that there would be no additional coordination
needed from then on between these two development efforts (except when
dealing with this new package).



Re: Advance notice that I'd like to make Cabal depend on parsec

2013-03-14 Thread Duncan Coutts

On Thu, 2013-03-14 at 12:22 -0300, Administrator wrote:
 This GHC dependency on Cabal is putting a rather troubling constraint
 in Cabal's evolution, which in my opinion is a serious problem. When I
 first took a look at the dependencies between GHC and Cabal I found it
 a bit strange that GHC would depend on Cabal as I would expect GHC to
 be as low in the dependency tree as possible to avoid exactly these
 kinds of problems.

The problem is that a compiler is a rather sophisticated application and
so though you'd like it to have minimal deps, it needs to do so much
stuff that it ends up needing lots of deps to support its features.

Things would be easier if that were not the case, and it's made harder
by the fact that ghc is not just a program, but it's exposed as a
library, which exposes all of its dependencies.

 These GHC dependencies on Cabal are in fact small (see
 http://hackage.haskell.org/trac/ghc/attachment/ticket/7740/ghc-2.png
 for a summary) and with a little bit of refactoring it would be
 possible to split these dependencies into a very small shared package
 with minimal or no further dependencies. This would liberate Cabal to
 make the necessary refactoring.

Except that the bits of Cabal that ghc needs are exactly the bits that
will now need parsec, text etc. The shared part would be the part that
defines the InstalledPackageInfo and the parser for that.

Also, though the ghc library has only relatively small dependencies on
Cabal, the ghc build process uses Cabal extensively, and currently the
system is that libraries that ghc needs to build get included as core
libraries and shipped with ghc. That itself could change but it's also
more work.

 IMHO, the addition of these new dependencies to Cabal should go
 together with splitting the GHC-Cabal shared dependencies into a
 separate package so that there would be no additional coordination
 needed from then on between these two development efforts (except when
 dealing with this new package).

So I would consider this if I thought it'd make a difference. In
particular at some point we'll want to split the Cabal lib into the bit
that just defines types and parsers etc, and the part that is a build
system. But even that wouldn't save us any dependencies in this
situation.

Duncan



Re: Advance notice that I'd like to make Cabal depend on parsec

2013-03-14 Thread Jason Dagit

On Thu, Mar 14, 2013 at 7:53 AM, Duncan Coutts duncan.cou...@googlemail.com
 wrote:

 Hi folks,

 I want to give you advance notice that I would like to make Cabal depend
 on parsec. The implication is that GHC would therefore depend on parsec
 and thus it would become a core package, rather than just a HP package.
 So this would affect both GHC and the HP, though I hope not too much.

 The rationale is that Cabal needs to parse things, like .cabal files and
 currently we do not have a decent parser in the core libraries. By
 decent I mean one that can produce error messages with source locations
 and that doesn't have unpredictable memory use. The only parser in the
 core libraries at the moment is Text.ParserCombinators.ReadP from the
 base package and that fails my decent criteria on both counts. Its
 idea of an error message is (), and on some largish .cabal files we take
 100s of MB to parse (I realise that the ReadP in the base package is a
 cutdown version so I don't mean to malign all ReadP-style libs out
 there).

 Partly due to the performance problem, the terrible .cabal file error
 messages, and partly because Doaitse Swierstra keeps asking me if .cabal
 files have a grammar, I've been writing a new .cabal parser. It uses an
 alex lexer and a parsec parser. It's fast and the error messages are
 pretty good. I have reverse engineered a grammar that closely matches
 the existing parser and .cabal files in the wild, though I'm not sure
 Doaitse will be satisfied with the approach I've taken to handling
 layout.

 Why did I choose parsec? Practicality dictates that I can only use
 things in the core libraries, and the nearest thing we have to that is
 the parser lib that is in the HP. I tried to use happy but I could not
 construct a grammar/lexer combo to handle the layout (also, happy is not
 exactly known for its great error messages).


Failed attempt aside for a moment, I think you should reconsider happy. Can
you learn how to do layout from reading the GHC source? The happy
documentation that explains how to attach a monad (you could use it to
communicate between alex and happy for layout info) is a bit misleading but
I have examples I can share with you. I haven't specifically tackled the
layout problem but I could try to make a parser if it would help.

One major benefit of using happy is that the productions of the grammar can
be analyzed for shift/reduce and reduce/reduce conflicts. The equivalent
analysis doesn't appear to be possible in parsec. In theory, applicative
parsers should allow for this but my understanding is that parsec does not
have this feature for its applicative subset.

Other benefits are: a) GHC can certainly use parsers generated by it, b) the
generated code uses common dependencies, c) it's fast, d) it's expressive.

What is it about happy parser errors that you don't like? Do you know
examples where parsec does a better job?

I have an alex + happy parser for a tiny functional language that I can
share with you if you'd like to give it another go. It doesn't support
layout at the moment, but I think I could add that.

Jason

RE: Advance notice that I'd like to make Cabal depend on parsec

2013-03-14 Thread Simon Peyton-Jones

Yes I think that'd be a great plan.  It's bizarre that GHC depends on *all* of 
Cabal, but only uses a tiny part of it (more or less the Package data type I 
think).

Simon


Re: Advance notice that I'd like to make Cabal depend on parsec

2013-03-14 Thread Duncan Coutts

On Thu, 2013-03-14 at 09:39 -0700, Jason Dagit wrote:

  Why did I choose parsec? Practicality dictates that I can only use
  things in the core libraries, and the nearest thing we have to that is
  the parser lib that is in the HP. I tried to use happy but I could not
  construct a grammar/lexer combo to handle the layout (also, happy is not
  exactly known for its great error messages).
 
 
 Failed attempt aside for a moment, I think you should reconsider happy. Can
 you learn how to do layout from reading the GHC source?

Yes I looked at it, though Haskell's layout is a bit different.

 The happy documentation that explains how to attach a monad (you could
 use it to communicate between alex and happy for layout info) is a bit
 misleading but I have examples I can share with you.

Yes, that's what I was doing. I've used happy with monadic lexers with
feedback between the lexer and parser before, e.g. when I wrote the C
parser now used in language-c.

 I haven't specifically tackled the layout problem but I could try to
 make a parser if it would help.
 
 One major benefit of using happy is that the productions of the grammar can
 be analyzed for shift/reduce and reduce/reduce conflicts.

Right, I know and that's great. For example there's no way I could have
extended the C89 grammar I started with to cover C99 and GNU C
extensions without the aid of that analysis.

In this case I could not for the life of me construct a grammar that
didn't have conflicts.

Now it's plausible that now that I have worked out a grammar using
parsec that I could have another go with happy and make it work, though
I'd have to do the layout rather differently from how I do it with
parsec. I was so pleased to finally have something work, I didn't feel
like going back and trying it with happy again.
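
For readers unfamiliar with the layout problem, one common way to make indentation tractable for a grammar is to have a pre-pass turn it into explicit open/close tokens. The sketch below shows that general idea only. It is not the approach taken in the new Cabal parser, nor GHC's layout algorithm, and the names Token and layout are invented here.

```haskell
-- Hypothetical token type: block delimiters plus the text of each line.
data Token = Open | Close | Line String
  deriving (Eq, Show)

-- Turn indentation into explicit Open/Close tokens, keeping a stack of the
-- enclosing indentation levels. Blank lines are ignored and only spaces
-- count as indentation in this simplified version.
layout :: String -> [Token]
layout = go [0] . map measure . filter (any (/= ' ')) . lines
  where
    -- Pair each line with its indentation width.
    measure l = let (sp, rest) = span (== ' ') l in (length sp, rest)

    -- At end of input, close every block that is still open.
    go stack [] = replicate (length stack - 1) Close
    go stack@(top : outer) ((n, txt) : more)
      | n > top   = Open : Line txt : go (n : stack) more   -- deeper: open a block
      | n < top   = Close : go outer ((n, txt) : more)      -- shallower: close, retry
      | otherwise = Line txt : go stack more                -- same level
    go [] _ = []  -- unreachable: the stack always keeps its base level 0
```

A grammar written against such a token stream sees ordinary bracketing, which is the part that is awkward to express directly in an LALR(1) grammar with a context-free lexer.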

I'd be happy to show you the code I've got with parsec and you can have
a go with happy.

 The equivalent analysis doesn't appear to be possible in parsec. In
 theory, applicative parsers should allow for this but my understanding
 is that parsec does not have this feature for its applicative subset.

Right, it doesn't.

 Other benefits are: a) GHC can certainly use parsers generated by it, b) the
 generated code uses common dependencies, c) it's fast, d) it's expressive.

Yes, I started with happy for all those reasons. The speed isn't a
problem here. I'm using a fast lexer using alex and profiling indicates
that still almost all the time is spent in the lexer and very little in
the parser. (And that's after I submitted a patch to alex which gets us
a 30% perf improvement.)

About dependencies. So if we got it working with happy, there is still
the issue that we need to parse the individual fields. The way .cabal
files (and other files, like ghc-pkg input files) work is that we
parse the outline and then use individual parsers on the fields. For the
latter we use a type class with a parser and pretty printer. That
approach using a type class more or less requires that we use a parser
combinator approach, rather than a monolithic happy style parser. And
it's actually the field parsers that are a large part of the problem:
they give us no error messages and their performance is atrocious
(that's where we get the massive memory blowups). I think happy just
isn't suitable there, so I'd want to use parsec (or any other decent
combinator lib) for that part anyway.
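
The type-class arrangement described here can be sketched as follows. This is a simplified stand-in, not Cabal's actual class: the class name FieldValue, its methods, and the toy types are assumptions made for illustration, and ReadP from base is used only so that the example needs no extra packages. What it shows is the compositionality under discussion: the package-id parser is built from the name and version parsers rather than written as one monolithic grammar.

```haskell
import Data.Char (isAlphaNum, isDigit)
import Data.List (intercalate)
import Text.ParserCombinators.ReadP

-- Hypothetical class pairing a pretty printer with a parser for each type.
class FieldValue a where
  pretty :: a -> String
  parser :: ReadP a

newtype PackageName = PackageName String deriving (Eq, Show)
newtype Version = Version [Int] deriving (Eq, Show)
data PackageId = PackageId PackageName Version deriving (Eq, Show)

-- Simplification: names here are alphanumeric only (no hyphens).
instance FieldValue PackageName where
  pretty (PackageName n) = n
  parser = PackageName <$> munch1 isAlphaNum

instance FieldValue Version where
  pretty (Version ns) = intercalate "." (map show ns)
  parser = Version <$> sepBy1 (read <$> munch1 isDigit) (char '.')

-- The package-id parser is assembled from the two smaller parsers.
instance FieldValue PackageId where
  pretty (PackageId n v) = pretty n ++ "-" ++ pretty v
  parser = PackageId <$> parser <* char '-' <*> parser

-- Parse a complete field value of any type in the class.
parseField :: FieldValue a => String -> Maybe a
parseField s =
  case [x | (x, "") <- readP_to_S (parser <* eof) s] of
    (x : _) -> Just x
    []      -> Nothing
```

Other packages can add their own instances and reuse the existing parsers in the same way, which is hard to do with a single generated parser.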

 What is it about happy parser errors that you don't like? Do you know
 examples where parsec does a better job?

Happy doesn't really give parser errors at all as such. It tells you
where it failed and you can poke at the token stream and do what you
like. It doesn't tell you what production you're in, what set of tokens
it was expecting, nothing. Parsec tells us what tokens it was expecting
and it tells us what production it was in and it has code to take that
info and generate reasonable error messages from it (which I've extended
to include the line in question and a visual position indicator).
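
A small sketch of the kind of reporting parsec makes possible, assuming the parsec package is available. The names versionP, renderError and checkVersion are invented for this example and this is not the code from the new parser. It labels a production with <?>, then uses the position carried by ParseError to print the offending line with a caret under the error column.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical parser for a dotted version number. The <?> label names the
-- production so that it can appear in "expecting ..." messages.
versionP :: Parser [Int]
versionP = number `sepBy1` char '.'
  where
    number = read <$> many1 digit <?> "number"

-- Append the offending source line and a caret marking the error column
-- to parsec's own message.
renderError :: String -> ParseError -> String
renderError input err = unlines [show err, srcLine, caret]
  where
    pos = errorPos err
    srcLine =
      case drop (sourceLine pos - 1) (lines input) of
        (l : _) -> l
        []      -> ""
    caret = replicate (sourceColumn pos - 1) ' ' ++ "^"

-- Parse a whole string as a version, rendering any error for display.
checkVersion :: String -> Either String [Int]
checkVersion input =
  either (Left . renderError input) Right (parse (versionP <* eof) "<field>" input)
```

For an input such as "1..0", the rendered message ends with the input line followed by a caret under the second dot, which is the kind of feedback that neither ReadP nor a bare happy failure gives.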

The reason ghc's parser error messages are so bad is exactly because
happy doesn't really give us anything to work with. See frown for an
example of how we can do better, while still using an LALR(1) approach.

Duncan



Re: Advance notice that I'd like to make Cabal depend on parsec

2013-03-14 Thread Duncan Coutts

On Thu, 2013-03-14 at 16:44 +0000, Simon Peyton-Jones wrote:
 Yes I think that'd be a great plan.  It's bizarre that GHC depends on
 *all* of Cabal, but only uses a tiny part of it (more or less the
 Package data type I think).

The sensible way to split it (I think) would be like this:

cabal-lib:
  Distribution.*
  -- containing definitions of types, parsers and pretty printers
  -- including the InstalledPackageInfo

cabal-build-simple:
  Distribution.Simple.*
  -- the build system for Simple packages

cabal:
  -- the program, what is currently called cabal-install

And then the ghc package would only depend on the cabal-lib package. But
it's that package that is going to use bytestring, text, parsec etc, for
its type definitions and parser.

The InstalledPackageInfo and its parser is what ghc and ghc-pkg
primarily use (though there's the opportunity to share code for handling
package indexes) and that type and that parser are also going to end up
using text and parsec etc.

It'd be possible to split things out further and have
InstalledPackageInfo and the types it uses and a special parser just for
that with fewer dependencies, but I'm not sure that's really worth it
and it would duplicate things (the types and/or parsers shared by
InstalledPackageInfo and the source package description).

So all in all, the split I suggest above makes sense for its own reasons
but it wouldn't help ghc here, and a further split just to help ghc
would be rather annoying.

Duncan


Re: Advance notice that I'd like to make Cabal depend on parsec

2013-03-14 Thread Bardur Arantsson

On 03/14/2013 03:53 PM, Duncan Coutts wrote:
 Hi folks,
 
 Why did I choose parsec? Practicality dictates that I can only use
 things in the core libraries, and the nearest thing we have to that is
 the parser lib that is in the HP. I tried to use happy but I could not
 construct a grammar/lexer combo to handle the layout (also, happy is not
 exactly known for its great error messages).
 

Just thinking out loud here, but what about ditching the current format
for something that's simpler to parse/generate? Like, say, JSON?

Regards,





Re: Advance notice that I'd like to make Cabal depend on parsec

2013-03-14 Thread Jason Dagit

On Thu, Mar 14, 2013 at 11:01 AM, Bardur Arantsson s...@scientician.net wrote:

 On 03/14/2013 03:53 PM, Duncan Coutts wrote:
  Hi folks,
 
  Why did I choose parsec? Practicality dictates that I can only use
  things in the core libraries, and the nearest thing we have to that is
  the parser lib that is in the HP. I tried to use happy but I could not
  construct a grammar/lexer combo to handle the layout (also, happy is not
  exactly known for its great error messages).
 

 Just thinking out loud here, but what about ditching the current format
 for something that's simpler to parse/generate? Like, say, JSON?


I thought I heard someone say that most existing cabal files can be
converted to valid yaml by adding one token at the start? If the change was
that simple it might be doable. I think the trick is that we'd need to
expose this by only treating the file as yaml if the minimum cabal version
is >= 1.17 (or so).

In general these sorts of format changes are painful for users and I sense
that now might be a bad time to change it (user morale is already a bit low
with complaints of cabal hell, let's not exacerbate that by breaking
existing .cabal files).

Jason

Re: Advance notice that I'd like to make Cabal depend on parsec

2013-03-14 Thread Roman Cheplyaka

* Duncan Coutts duncan.cou...@googlemail.com [2013-03-14 17:12:14+]
 The InstalledPackageInfo and its parser is what ghc and ghc-pkg
 primarily use (though there's the opportunity to share code for handling
 package indexes) and that type and that parser are also going to end up
 using text and parsec etc.

Correct me if I'm wrong, but isn't it just a strange coincidence that
InstalledPackageInfo is serialised in the format similar to .cabal
format?

InstalledPackageInfos aren't supposed to be edited by hand and do not
need good error reporting. They can be serialized using any
serialization library.

(Then again, any serialization library like aeson would probably bring
more dependencies than you're considering...)

Roman


Re: Advance notice that I'd like to make Cabal depend on parsec

2013-03-14 Thread Duncan Coutts

On Thu, 2013-03-14 at 21:29 +0200, Roman Cheplyaka wrote:
 * Duncan Coutts duncan.cou...@googlemail.com [2013-03-14 17:12:14+]
  The InstalledPackageInfo and its parser is what ghc and ghc-pkg
  primarily use (though there's the opportunity to share code for handling
  package indexes) and that type and that parser are also going to end up
  using text and parsec etc.
 
 Correct me if I'm wrong, but isn't it just a strange coincidence that
 InstalledPackageInfo is serialised in the format similar to .cabal
 format?

It's not a very strange coincidence. The type is not specific to ghc,
it's defined in a compiler-neutral way by the original Cabal spec. So
since both the source package and installed package info were defined in
the Cabal spec, using the same kind of external syntax and sharing many
of the same types, they both ended up in the Cabal lib and share
the same parsers & pretty printers.

 InstalledPackageInfos aren't supposed to be edited by hand and do not
 need good error reporting. They can be serialized using any
 serialization library.

Right, it doesn't need good error reporting (though it's nice if it's
fast, which it isn't currently). The main advantage of the current
arrangement is that the source and installed package descriptions get to
share the same types and parser/pretty printer.

I think there's a slightly more general point here though. Why is it
that we don't have any good parser in the core packages? It's not just
Cabal that needs to parse things. We have two useless parsers in the
base package, ReadS and ReadP. Haskell is famous for its parser
combinators and yet our core infrastructure is stuck with only useless
ones!
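
For a concrete sense of what the parsers in base offer, here is a minimal
sketch of a ReadP parser for a dotted version number. The names versionP and
parseDottedVersion are made up for illustration and are not Cabal's API. A
failed parse simply yields no results, with no position or message, which is
one reason ReadP is a poor fit for user-facing error reporting.

```haskell
import Data.Char (isDigit)
import Text.ParserCombinators.ReadP
  (ReadP, char, eof, munch1, readP_to_S, sepBy1)

-- Hypothetical helper: one or more decimal components separated by dots,
-- e.g. "1.16.0".
versionP :: ReadP [Int]
versionP = sepBy1 (read <$> munch1 isDigit) (char '.')

-- Hypothetical wrapper: succeed only if the whole input is one version.
-- On failure readP_to_S returns an empty list, so all we can report is
-- Nothing; there is no error message or source position to pass on.
parseDottedVersion :: String -> Maybe [Int]
parseDottedVersion s =
  case readP_to_S (versionP <* eof) s of
    [(v, "")] -> Just v
    _         -> Nothing
```

Loaded into GHCi, parseDottedVersion "1.16.0" gives Just [1,16,0], while
parseDottedVersion "1..2" gives Nothing.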

Duncan



Re: Advance notice that I'd like to make Cabal depend on parsec

2013-03-14 Thread Duncan Coutts

On Thu, 2013-03-14 at 11:15 -0700, Jason Dagit wrote:
 On Thu, Mar 14, 2013 at 11:01 AM, Bardur Arantsson 
 s...@scientician.net wrote:
 
  On 03/14/2013 03:53 PM, Duncan Coutts wrote:
   Hi folks,
  
   Why did I choose parsec? Practicality dictates that I can only use
   things in the core libraries, and the nearest thing we have to that is
   the parser lib that is in the HP. I tried to use happy but I could not
   construct a grammar/lexer combo to handle the layout (also, happy is not
   exactly known for its great error messages).
  
 
  Just thinking out loud here, but what about ditching the current format
  for something that's simpler to parse/generate? Like, say, JSON?

Of course .cabal files are mainly written by humans, not machines, so we
should optimise for them. The grammar I've got now really isn't that
bad. In fact if we wanted to simplify it we'd rip out the bits that are
designed to make it easier to generate by programs: we'd eliminate the
explicit {} syntax and just use layout. Allowing either is what makes
the grammar more complex. But as I say, I'm satisfied that the grammar
is ok.

 I thought I heard someone say that most existing cabal files can be
 converted to valid yaml by adding one token at the start? If the change was
 that simple it might be doable. I think the trick is that we'd need to
 expose this by only treating the file as yaml if the minimum cabal version
 is >= 1.17 (or so).

I know people have compared it to yaml and suggested we just use yaml,
but I don't think it's that close syntactically. I did look into this
when I started and I think there are too many differences to make it
practical to switch to yaml (or a subset).

 In general these sorts of format changes are painful for users and I sense
 that now might be a bad time to change it (user morale is already a bit low
 with complaints of cabal hell, let's not exacerbate that by breaking
 existing .cabal files).

Right. I'm satisfied the format is basically ok, we don't need any
breaking changes.

-- 
Duncan Coutts, Haskell Consultant
Well-Typed LLP, http://www.well-typed.com/



Re: Advance notice that I'd like to make Cabal depend on parsec

2013-03-14 Thread Bardur Arantsson

On 03/14/2013 11:01 PM, Duncan Coutts wrote:
 On Thu, 2013-03-14 at 11:15 -0700, Jason Dagit wrote:
 On Thu, Mar 14, 2013 at 11:01 AM, Bardur Arantsson 
 s...@scientician.net wrote:

 On 03/14/2013 03:53 PM, Duncan Coutts wrote:
 Hi folks,

 Why did I choose parsec? Practicality dictates that I can only use
 things in the core libraries, and the nearest thing we have to that is
 the parser lib that is in the HP. I tried to use happy but I could not
 construct a grammar/lexer combo to handle the layout (also, happy is not
 exactly known for its great error messages).


 Just thinking out loud here, but what about ditching the current format
 for something that's simpler to parse/generate? Like, say, JSON?
 
 Of course .cabal files are mainly written by humans, not machines, so we
 should optimise for them.

I thought we were mostly talking about InstalledPackageInfo. That could
be in $EASILY_PARSEABLE_FORMAT without really breaking anything, right?

Another option if GHC really also needs to parse .cabal files:

- Introduce a format for Cabal files that's trivial to hand-code a
recursive descent parser for.
- Add a command in Cabal to generate that format from a .cabal file.
- Have cabal sdist automatically generate that file and put it into
the uploaded archive.
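
To make the first point concrete, here is a sketch of how small a hand-coded
recursive descent parser can be for a deliberately simple format. The format
itself is invented for this example and was not proposed in this thread: each
line is either "key: value", "name {" to open a section, or "}" to close one.
The names Item, items and parseDoc are likewise hypothetical.

```haskell
import Data.Char (isSpace)

-- A document is a list of fields and (possibly nested) sections.
data Item = Field String String | Section String [Item]
  deriving (Eq, Show)

-- Strip leading and trailing whitespace.
trim :: String -> String
trim = dropWhile isSpace . reverse . dropWhile isSpace . reverse

-- Parse items from a list of lines. The Bool says whether we are inside a
-- section and therefore expect a closing "}". Returns the parsed items and
-- the lines that remain after them.
items :: Bool -> [String] -> Either String ([Item], [String])
items nested []
  | nested    = Left "unexpected end of input: missing '}'"
  | otherwise = Right ([], [])
items nested (l : ls)
  | null t = items nested ls                       -- skip blank lines
  | t == "}" =
      if nested then Right ([], ls) else Left "unexpected '}'"
  | last t == '{' = do                             -- "name {" opens a section
      (body, rest)  <- items True ls               -- recurse into the section
      (more, rest') <- items nested rest
      Right (Section (trim (init t)) body : more, rest')
  | (key, ':' : val) <- break (== ':') t = do      -- "key: value"
      (more, rest) <- items nested ls
      Right (Field (trim key) (trim val) : more, rest)
  | otherwise = Left ("cannot parse line: " ++ t)
  where
    t = trim l

parseDoc :: String -> Either String [Item]
parseDoc = fmap fst . items False . lines
```

For example, parseDoc "name: foo\nlibrary {\n  build-depends: base\n}\n"
gives Right [Field "name" "foo", Section "library" [Field "build-depends" "base"]].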

Regards,




Re: Advance notice that I'd like to make Cabal depend on parsec

2013-03-14 Thread Conrad Parker

On 14 March 2013 22:53, Duncan Coutts duncan.cou...@googlemail.com wrote:

 I've been doing regression testing against hackage and I'm satisfied
 that the new parser matches closely enough. I've uncovered all kinds of
 horrors with .cabal files in the wild relying on quirks of the old
 parser. I've made adjustments for most of them but I will be breaking a
 half dozen old packages

When you say you've made adjustments for dodgy .cabal files in the
wild, do you mean that you'll send those maintainers patches that make
their cabal files less dodgy, or do you mean you've added hacks to
your parser to reproduce the quirky behaviour?

Conrad.

