Re: [Haskell-cafe] Design of extremely usable programming language libraries
Unfortunately you can only do traversals, not unfolds, with GADTs. That's because in an unfold, the return type is determined by the value itself and can vary among the produced results, whereas in a traversal it is determined by the input type. This means also that you cannot simply derive Data, because the derived instance will contain a gunfold function, which then will fail to typecheck. You can copy-paste the generated instance (-ddump-deriv) and simply remove the code for gunfold (or write your own deriver). The following compiles for me: https://gist.github.com/feuerbach/5668198 Roman * Andrey Chudnov [2013-05-28 17:29:10-0400] > Thanks for a prompt reply, Roman. > > On 05/28/2013 04:52 PM, Roman Cheplyaka wrote: > > Any syb-style library works with GADTs, by the virtue of dealing with > > value representations instead of type representations. > I tried to use syb, but the following code fails to typecheck for me. > What am I doing wrong? > > {-# LANGUAGE GADTs, EmptyDataDecls, MultiParamTypeClasses, > TypeFamilies #-} > > {-# LANGUAGE DeriveDataTypeable, StandaloneDeriving #-} > > > data HasHoles > > data Complete > > deriving instance Typeable HasHoles > > deriving instance Data HasHoles > > deriving instance Typeable Complete > > deriving instance Data Complete > > type family Holes a b :: * > > canHaveHolesT :: a -> b -> Holes a b > > canHaveHolesT _ _ = undefined > > type instance Holes HasHoles Complete = HasHoles > > type instance Holes Complete HasHoles = HasHoles > > type instance Holes HasHoles HasHoles = HasHoles > > type instance Holes Complete Complete = HasHoles > > > data Expression k a where > > EQuote :: a -> String -> Expression HasHoles a > > IntLit :: a -> Int -> Expression Complete a > > EArith :: a -> ArithOp -> Expression k1 a -> Expression k2 a -> > > Expression (Holes k1 k2) a > > deriving instance Typeable2 (Expression) > > deriving instance Data (Expression k a) > > data ArithOp = OpAdd > > | OpSub > > | OpMul > > | OpDiv > >deriving (Data, Typeable) > > Fails with: > > Couldn't match type `Complete' with `HasHoles' > > Expected type: a -> String -> Expression k a > > Actual type: a -> String -> Expression HasHoles a > > In the first argument of `z', namely `EQuote' > > In the first argument of `k', namely `z EQuote' > > When typechecking the code for `Data.Data.gunfold' > > in a standalone derived instance for `Data (Expression k a)': > > To see the code I am typechecking, use -ddump-deriv > > > > Not sure what you mean here — attoparsec does support unlimited > > lookahead, in the sense that a parser may fail arbitrarily late in the > > input stream, and backtrack to any previous state. Although attoparsec > > is a poor choice for programming language parsing, primarily because > > of the error messages. > I guess I have an outdated notion of attoparsec. But yes, error messages > seem to be the weak point of attoparsec. Also, the fact that it only > accepts bytestrings makes it harder (but no impossible, since we can > convert Strings to ByteStrings) to reuse the parser as a QuasiQuoter. > So, I'll rephrase my question. What's the best choice for a library for > parsing programming languages nowadays? > > ___ > Haskell-Cafe mailing list > Haskell-Cafe@haskell.org > http://www.haskell.org/mailman/listinfo/haskell-cafe ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Design of extremely usable programming language libraries
> > Not sure what you mean here — attoparsec does support unlimited > > lookahead, in the sense that a parser may fail arbitrarily late in the > > input stream, and backtrack to any previous state. Although attoparsec > > is a poor choice for programming language parsing, primarily because > > of the error messages. > I guess I have an outdated notion of attoparsec. But yes, error messages > seem to be the weak point of attoparsec. Also, the fact that it only > accepts bytestrings makes it harder (but no impossible, since we can > convert Strings to ByteStrings) to reuse the parser as a QuasiQuoter. > So, I'll rephrase my question. What's the best choice for a library for > parsing programming languages nowadays? Parsec is still widely popular since it's part of the HP, but I use uu-parsinglib as my first-choice parser. It comes with a lot of examples, good documentation, and many features I like (good error messages and auto error correction). I don't know how performance compares with parsec or attoparsec, but it's always been good enough for me. John L. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Design of extremely usable programming language libraries
Thanks for a prompt reply, Roman. On 05/28/2013 04:52 PM, Roman Cheplyaka wrote: > Any syb-style library works with GADTs, by the virtue of dealing with > value representations instead of type representations. I tried to use syb, but the following code fails to typecheck for me. What am I doing wrong? > {-# LANGUAGE GADTs, EmptyDataDecls, MultiParamTypeClasses, TypeFamilies #-} > {-# LANGUAGE DeriveDataTypeable, StandaloneDeriving #-} > data HasHoles > data Complete > deriving instance Typeable HasHoles > deriving instance Data HasHoles > deriving instance Typeable Complete > deriving instance Data Complete > type family Holes a b :: * > canHaveHolesT :: a -> b -> Holes a b > canHaveHolesT _ _ = undefined > type instance Holes HasHoles Complete = HasHoles > type instance Holes Complete HasHoles = HasHoles > type instance Holes HasHoles HasHoles = HasHoles > type instance Holes Complete Complete = HasHoles > data Expression k a where > EQuote :: a -> String -> Expression HasHoles a > IntLit :: a -> Int -> Expression Complete a > EArith :: a -> ArithOp -> Expression k1 a -> Expression k2 a -> > Expression (Holes k1 k2) a > deriving instance Typeable2 (Expression) > deriving instance Data (Expression k a) > data ArithOp = OpAdd > | OpSub > | OpMul > | OpDiv >deriving (Data, Typeable) Fails with: > Couldn't match type `Complete' with `HasHoles' > Expected type: a -> String -> Expression k a > Actual type: a -> String -> Expression HasHoles a > In the first argument of `z', namely `EQuote' > In the first argument of `k', namely `z EQuote' > When typechecking the code for `Data.Data.gunfold' > in a standalone derived instance for `Data (Expression k a)': > To see the code I am typechecking, use -ddump-deriv > Not sure what you mean here — attoparsec does support unlimited > lookahead, in the sense that a parser may fail arbitrarily late in the > input stream, and backtrack to any previous state. Although attoparsec > is a poor choice for programming language parsing, primarily because > of the error messages. I guess I have an outdated notion of attoparsec. But yes, error messages seem to be the weak point of attoparsec. Also, the fact that it only accepts bytestrings makes it harder (but no impossible, since we can convert Strings to ByteStrings) to reuse the parser as a QuasiQuoter. So, I'll rephrase my question. What's the best choice for a library for parsing programming languages nowadays? ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Design of extremely usable programming language libraries
* Andrey Chudnov [2013-05-28 16:36:14-0400] > * Does any generic traversal/transformation (uniplate-style) library > support GADTs? Any syb-style library works with GADTs, by the virtue of dealing with value representations instead of type representations. > * What is the best choice, performance- and memory-wise, for a parser > combinator library with support for arbitrary look-ahead? Parsec is > considered slow by some [1], but is it only in comparison with > attoparsec (which, unfortunately, doesn't support arbitrary > look-ahead)? Is there any parser library that performs better than > Parsec while still supporting arbitrary look-ahead. Not sure what you mean here — attoparsec does support unlimited lookahead, in the sense that a parser may fail arbitrarily late in the input stream, and backtrack to any previous state. Although attoparsec is a poor choice for programming language parsing, primarily because of the error messages. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Design of extremely usable programming language libraries
Dear Cafe, I'm exploring the design space of programming language libraries with enhanced usability and I'd your help and comments. I'll start with a few short questions, but offer a detailed discussion of the motivations and the problems I'm facing below. So, if you have interest in the subject or feel you can offer some insight, please, do read on. * Does any generic traversal/transformation (uniplate-style) library support GADTs? * What is the best choice, performance- and memory-wise, for a parser combinator library with support for arbitrary look-ahead? Parsec is considered slow by some [1], but is it only in comparison with attoparsec (which, unfortunately, doesn't support arbitrary look-ahead)? Is there any parser library that performs better than Parsec while still supporting arbitrary look-ahead. * Any multi-mode pretty printer libraries? By multi-mode I mean writing code once and being able to generate, say, both "pretty" and "minified" text representations of a program by changing just one parameter. Also, what's the most efficient pretty-printing library nowadays? Blaze? I've been using Haskell for quite a while now, primarily, for programming-language applications: program analysis, transformation and compilation. I'm sure many would agree that PL work is where Haskell shines. In the recent years new language features and libraries --namely, GADTs, Template Haskell, quasi-quotation and generic programming--- have appeared that could make working with languages even easier than before. That's why it's sad to see that none of the PL libraries seem to make good use of these features (however, I might be starting to understand why). So, I'm currently exploring the design space for a library that uses these advanced Haskell features for delivering better usability, and I'm having problems with implementing some of them. I welcome comments on both the motivations, overall design and the more technical aspects. I've omitted a few details because it's a long e-mail as is. If something is not clear or doesn't make sense, please, let me know. I'll start by listing the features that an "ideal" PL library should have, and that I've come to cherish as both a heavy user and a developer of such. The basic features (pretty much every library has them) include a parser (text->AST (abstract syntax tree)) and a pretty-printer (AST->text), as well as a Haskell representation of the AST that is somewhat easy to use. Pretty much every library has that --- although some might debate the ease of use of the AST representations. However, there are other features that, in my opinion, are essential to a PL library. The features are motivated by three requirements: static safety (as few run-time errors as possible), minimal code duplication (DRY) and ease of use and inspection of the code. 1) the pretty-printer should be multi-mode. One should be able to write code once and be able to generate different textual representations of the AST: - the "pretty" which is nice to the eye with white spaces, indentations etc. - minified, with minimum white space (while still being valid) - debuggable which inserts comments based on AST annotations - source-map generation - being able to generate colored LaTeX/HTML code would be nice, but non-essential 2) ASTs should be statically safe: you should only be able construct values that represent valid programs, or get a typechecker/compiler error otherwise. Languages that have syntactic productions that can appear in one context but not in another need GADTs with type witnesses to achieve that. In fact, such languages are often used to motivate GADTs in the first place [2]. And while the problem in [2] could have been solved by splitting the Expr datatype into two (IntExpr and BoolExpr), in some languages this can't be done (or produces awkward syntax trees). 3) a quasi-quoter with support for anti-quotation and quoted patterns. This also saves a lot of typing *and* makes your code less error-prone and easier to read. What is better (to both write and read)? > [js|#x# = (function (a, b) {return {t1: a + b, t2: a*b};})(#x#, #y#);|] or > ExprStmt def $ AssignExpr def x (CallExpr def (FuncExpr def Nothing > [Id def "a", Id def "b"] $ ReturnStmt def $ ObjectLit [(PropId def $ > Id def "t1", InfixExpr def OpAdd (VarRef def $ Id def "a") (VarRef def $ > Id def "b")), PropId def $ > Id def "t2", InfixExpr def OpMul (VarRef def $ Id def "a") (VarRef def $ > Id def "b")])) [x, y] The caveat here is that, to help ensure correctness, the quasi-quoter and the parser should share code as much code as possible. Ideally, there should be just one parser that has a switch for recognizing normal and quasi-quoted programs. However, that would require adding additional constructors representing anti-quotations to our AST. And with that the user might be able to generate invalid AST's and cause a run-time error.