Re: [Haskell-cafe] Regular Expressions without GADTs
Hello Conor, I personally think GADTs are a great idea, don't get me wrong. I am not arguing that we should all start programming using on a Church encoding. Neither I am arguing that this encoding can replace GADTs in all cases. Nevertheless, I believe that knowing about this encoding, in the first place, can be extremely valuable for programmers. Let's not forget that neither GADTs or multiple parameter type classes are Haskell 98! In fact, as far as I understood from the Haskell Workshop, we are a long way from those extensions becoming standard. My point is that if you are developing software, then you want to use something standard (this is, running on Hugs, Nhc, Ghc, Ehc, ...). This encoding tries to avoid extensions (not saying that it does for all cases). In the end, that is what most design patterns are: workarounds on missing language features. In this case missing features of Haskell 98 (but not GHC Haskell). Best Regards, Bruno Oliveira ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Regular Expressions without GADTs
On Tue, Oct 18, 2005 at 05:51:13PM +0100, Conor McBride wrote: | Neither Oleg nor Bruno translated my code; they threw away my | structurally recursive on-the-fly automaton and wrote combinator parsers | instead. That's why there's no existential, etc. The suggestion that | removing the GADT simplifies the code would be better substantiated if | like was compared with like. Here's a version in H98 + existentials. I'm not claiming it proves anything. module RegExp where import Control.Monad data RegExp tok a = Zero | One a | Check (tok - a) (tok - Bool) | Plus (RegExp tok a) (RegExp tok a) | forall b. Mult (RegExp tok (b - a)) (RegExp tok b) | forall e. Star ([e] - a) (RegExp tok e) I could just use Prod (One f) instead of fmap f, but the functor instance is mechanical: instance Functor (RegExp tok) where fmap f Zero = Zero fmap f (One v) = One (f v) fmap f (Check k p) = Check (f.k) p fmap f (Plus r1 r2) = Plus (fmap f r1) (fmap f r2) fmap f (Mult r1 r2) = Mult (fmap (f.) r1) r2 fmap f (Star k r) = Star (f.k) r Constructors one :: RegExp tok () one = One () check :: (tok - Bool) - RegExp tok tok check = Check id plus :: RegExp tok a - RegExp tok b - RegExp tok (Either a b) plus r1 r2 = Plus (fmap Left r1) (fmap Right r2) mult :: RegExp tok a - RegExp tok b - RegExp tok (a,b) mult r1 r2 = Mult (fmap (,) r1) r2 star :: RegExp tok a - RegExp tok [a] star = Star id Parsing parse :: RegExp tok a - [tok] - Maybe a parse r [] = empty r parse r (t : ts) = parse (divide t r) ts empty :: RegExp tok a - Maybe a empty Zero= mzero empty (One v) = return v empty (Check _ _) = mzero empty (Plus r1 r2)= empty r1 `mplus` empty r2 empty (Mult r1 r2)= empty r1 `ap` empty r2 empty (Star k _) = return (k []) divide :: tok - RegExp tok a - RegExp tok a divide t Zero = Zero divide t (One v) = Zero divide t (Check k p) | p t = One (k t) | otherwise = Zero divide t (Plus r1 r2) = Plus (divide t r1) (divide t r2) divide t (Mult r1 r2) = case empty r1 of Nothing - Mult (divide t r1) r2 Just f - Plus (Mult (divide t r1) r2) (fmap f (divide t r2)) divide t (Star k r) = Mult (fmap k_cons (divide t r)) (Star id r) where k_cons x xs = k (x:xs) ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Regular Expressions without GADTs
[EMAIL PROTECTED] wrote: I suspect that once you start producing values with the relevant properties (as I do) rather than just consuming them (as Oleg and Bruno do), the GADT method might work out a little neater. Actually, the code is pretty much your original code, with downcased identifiers. It faithfully implements that parser division approach. Still, there are no existentials. That's because this isn't quite my program either, although I'll grant you it's a lot closer. Give me back the uppercase identifiers and you've pretty much done it. That's to say, I want a datatype of regular expressions which happens (at least) to support parsing, rather than a parser library which supports (at least) regular languages. We can leave it to others to decide which is neater, but I'd remark that... I wouldn't say that GADT code is so much different. Perhaps the code below is a bit neater due to the absence of existentials, `case' statements, and local type annotations. ...I'm also irritated by the case statements, the local type annotations, and so on. I was rather surprised to find that I needed them. I bet it's not an essential or permanent problem, and it's certainly a problem we have planned not to have in Epigram. The existential also bothers me, but for different reasons. In the dependently typed version, RegExp isn't a GADT: it's just an ordinary datatype, and there's a perfectly ordinary function, Sem :: RegExp - *, which computes the type corresponding to a regular expression. The syntactic part of division is thus completely separated from the semantic part (glueing the head back on). The GADT version here replaces Sem by an index in the datatype, so sometimes I'm forced to write an existential in order to collect what would have been the result of applying Sem. You've avoided the existential by throwing away the syntax and keeping only the specific empty-and-divide semantics required for this particular example. In effect * I define regular expressions and show they possess at least the empty-and-divide semantics; * You define the empty-and-divide semantics and show that it's closed under at least the regular expression formers (nicely expressed via Bruno's class) Let's see... Here's the class of regular expression algebras: -- Bruno.Oliveira's type class class RegExp g where zero :: g tok Zero one:: g tok () check :: (tok - Bool) - g tok tok plus :: g tok a - g tok b - g tok (Either a b) mult :: g tok a - g tok b - g tok (a,b) star :: g tok a - g tok [a] Here's the specification of the intended semantics: data Parse tok t = Parse { empty :: Maybe t , divide :: tok - Parse tok t} Here's where you explain how stuff other than regular expressions also has the semantics. This is how the existential is made to disappear: you use liftP to fuse the head-glueing function into the parsers themselves. It's something like adding Map :: (s - t) - RegExp tok s - RegExp tok t to the syntax of regular expressions, isn't it? liftP f p = Parse{empty = liftM f (empty p), divide = \tok - liftP f (divide p tok)} And then you show that indeed the desired semantics is closed under RegExp formation. instance RegExp Parse where It's a classic instance of the Expression Problem. If you fix the elimination behaviour, it's easy to add new introduction behaviour; if you fix the introduction behaviour, it's easy to add new elimination behaviour. I think it's important to support both modes of working: type classes are good at the former, datatypes the latter, so let's be open to both. There's always a danger with small examples that there's not so much to choose between the two approaches---except that they scale up in completely different ways. It's important not to get into a sterile argument 'Oh, but I can extend the grammar', 'Oh, but I can extend the functionality', as if only one of these is ever important. I will happily cheer anyone who can help us to be lighter on our feet in jumping between the two. I'm not saying there's no value to 'poor man's' alternatives to GADTs. I certainly agree with Bruno (a) that it's important to know how to work within the standard; my experiences implementing the first version of Epigram have convinced me to be much more cautious about which Haskell extensions I'm willing to use in the second (b) that the pattern 'show type constructor f has such-an-such an algebra' is useful and worth abstracting. I was just about to send, when up popped Ross's program, which does give me a datatype, including just enough extra stuff to support functoriality. data RegExp tok a = Zero | One a | Check (tok - a) (tok - Bool) | Plus (RegExp tok a) (RegExp tok a) | forall b. Mult (RegExp tok (b - a)) (RegExp tok b) | forall e. Star ([e] - a) (RegExp tok e) This combines
Re: [Haskell-cafe] Regular Expressions without GADTs
On Wed, Oct 19, 2005 at 03:17:41PM +0100, Conor McBride wrote: I was just about to send, when up popped Ross's program, which does give me a datatype, including just enough extra stuff to support functoriality. data RegExp tok a = Zero | One a | Check (tok - a) (tok - Bool) | Plus (RegExp tok a) (RegExp tok a) | forall b. Mult (RegExp tok (b - a)) (RegExp tok b) | forall e. Star ([e] - a) (RegExp tok e) This combines several dodges rather neatly. You can see the traces of the translation from GADT to nested-datatype-with-equations, but instead of saying forall e. Star (Eq [e] a) (RegExp tok e), Ross has kept just the half of the iso he needs in order to extract the parsed value. The direction I've kept is the one that matters if everything is covariant, because forall a. F a - T (G a) ~= forall a, b. (G a - b) - F a - T b if F, G and T are functors. So we can translate data T :: * - * where C :: F a - T (G a) to data T b = forall a. C (G a - b) (F a) and the mechanical translation of your GADT is data RegExp tok a = Zero (Empty - a) Empty | One (() - a) () | Check (tok - a) (tok - Bool) (RegExp tok a) | forall b c. Plus (Either b c - a) (RegExp tok b) (RegExp tok c) | forall b c. Mult ((b, c) - a) (RegExp tok b) (RegExp tok c) | forall e. Star ([e] - a) (RegExp tok e) A little simplification (ignoring lifted types here and there) yields the version I gave. And of course we're no longer assuming that the function is half of an iso. Throwing away the other half more-or-less allows him to hide the head-glueing functions inside the grammar of the regular expressions. In effect, Map has been distributed through the syntax. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Regular Expressions without GADTs
On Wed, Oct 19, 2005 at 03:17:41PM +0100, Conor McBride wrote: It's nice that the syntax of regular expressions survives, but it has been somewhat spun in favour of the functionality required in the example. I'm not sure about that either -- the existential coding supports elimination too, if the result type is functorial (again specializing the general scheme): caseRegExp :: Functor r = r Empty - r () - ((tok - Bool) - r tok) - (forall a b. RegExp tok a - RegExp tok b - r (Either a b)) - (forall a b. RegExp tok a - RegExp tok b - r (a, b)) - (forall a. RegExp tok a - r [a]) - RegExp tok v - r v caseRegExp fZero fOne fCheck fPlus fMult fStar x = case x of Zero- fmap (const bot) fZero One v - fmap (const v) fOne Check k p - fmap k (fCheck p) Plus r1 r2 - fmap (either id id) (fPlus r1 r2) Mult r1 r2 - fmap (uncurry id)(fMult r1 r2) Star k r- fmap k (fStar r) where bot = error can't happen Not that this is comfortable to use, but it does show that existentials are as expressive, assuming the functor: they can't do RegExp t a - a - a. I would say that the availability of 'this workaround for this example, that workaround for that example', is evidence in *favour* of adopting the general tools which are designed to support many examples. If a job's worth doing, it's worth doing well. Is there a *uniform* workaround which delivers all the GADT functionality cheaply and cleanly? Kind of subjective, I suppose, but I haven't seen anything cheap enough for me. It seems the existential coding is uniform, if everything is covariant. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Regular Expressions without GADTs
Hello, You can also write this code in Haskell 98. Jeremy Gibbons and I have recently written a paper entitled TypeCase: A design pattern for type-indexed functions where we explore the use of some techniques from lightweight approaches to generic programming (Ralf Hinze and James Cheney's work) for other purposes. If you are interested you can take a look at: http://web.comlab.ox.ac.uk/oucl/work/bruno.oliveira/typecase.pdf Basically, we can use type classes and GADTs as two alternatives approaches to implement type-based case analysis. A well-known example of this is printf. With printf we want to construct a format string that determines the type of printf. Something similar happens here, we want to build a parser based on the type of the regular expression. There is nothing ingenious with this solution --- I basically translated Oleg's and Conor's code. In the paper we discuss in details different trade-offs between different implementations (e.g. GADT vs type classes). For instance, here is something you cannot do (easily) with the GADT approach: myParser :: MonadPlus m = Parse m tok Int myParser = Parse (\_ - return (1::Int)) test = asList (parse myParser ole) Basically, with the code that follows you can define your own customized Regular expressions that you can use with your parser. (Sorry for the stupid example, but just to give you the idea). Ultimately you can reason on your programs (of this kind!) using GADTs or multiple-parameter type classes with functional dependencies and them translate them into Haskell 98. Here is the code: module RegExps where import Monad newtype Zero = Zero Zero -- Zero type in Haskell 98 class RegExp g where zero :: g tok Zero one:: g tok () check :: (tok - Bool) - g tok tok plus :: g tok a - g tok b - g tok (Either a b) mult :: g tok a - g tok b - g tok (a,b) push :: tok - g tok r - g tok r star :: g tok a - g tok [a] newtype Parse m tok t = Parse {parse :: [tok] - m t} instance MonadPlus m = RegExp (Parse m) where zero= Parse (\_ - mzero) one = Parse (\l - case l of [] - return () _ - mzero) check p = Parse (\l - case l of [t] - if (p t) then return t else mzero _ - mzero) plus r1 r2 = Parse (\l - (liftM Left $ parse r1 l) `mplus` (liftM Right $ parse r2 l)) push tok r = Parse (\l - parse r (tok:l)) mult r1 r2 = Parse (\l - case l of [] - liftM2 (,) (parse r1 l) (parse r2 l) (t:ts) - parse (mult (push t r1) r2) ts `mplus` liftM2 (,) (parse r1 ([] `asTypeOf` ts)) (parse r2 (t:ts))) star r = Parse (\l - case l of [] - return [] ts - parse (mult r (star r)) ts = (\(x,xs) - return $ x : xs)) Problem with the monomorphism restriction p1 :: RegExp g = g Char ([Char], [Char]) p1 = mult (star (check (== 'a'))) (star (check (== 'b'))) p1 = (Mult (Star (Check (== 'a'))) (Star (Check (== 'b' asMayBe :: Maybe a - Maybe a asMayBe = id asList :: [a] - [a] asList = id testp = asMayBe $ parse (star (mult (star (check (== 'a'))) (star (check (== 'b') abaabaaa *RX testp Just [(a,b),(aa,b),(aaa,)] -- see alternative parses testp1 = take 3 $ asList $ parse (star (mult (star (check (== 'a'))) (star (check (== 'b') abaabaaa -- Parsing the list of integers ieven = even :: Int-Bool iodd = odd :: Int-Bool p2 :: RegExp g = g Int (Either (Int, (Int, [Int])) (Int, [Int])) p2 = plus (mult (check iodd) (mult (check iodd) (star (check ieven (mult (check ieven) (star (check iodd))) -- the parser is ambiguous. We can see the alternatives test2 = take 3 $ asList $ parse (star p2) [1::Int,1,2,3,3,4] Connor's code for empty. {- newtype Empty tok t = Empty {empty :: Maybe t} instance RegExp Empty where zero= Empty mzero one = Empty (return ()) check _ = Empty mzero plus r1 r2 = Empty ((return Left `ap` empty r1) `mplus` (return Right `ap` empty r2)) mult r1 r2 = Empty (return (,) `ap` empty r1 `ap` empty r2) star _ = Empty (return []) -} ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Regular Expressions without GADTs
Hi folks There is nothing ingenious with this solution --- I basically translated Oleg's and Conor's code. Neither Oleg nor Bruno translated my code; they threw away my structurally recursive on-the-fly automaton and wrote combinator parsers instead. That's why there's no existential, etc. The suggestion that removing the GADT simplifies the code would be better substantiated if like was compared with like. By the way, the combinator parsers, as presented, don't quite work: try asMayBe $ parse (Star One) I really must get to the bottom of this... with Oleg's code, or the analogous (star one) with Bruno's. But it's almost as easy a mistake to fix (I hope) as it is to make, so the point that the GADT isn't strictly necessary does stand. I wouldn't swear my code's correct either, but I don't think it loops on finite input. Oleg simulates GADTs with Ye Olde Type Class Trick, getting around the fact that type classes aren't closed by adding each necessary piece of functionality to the methods. By contrast, Bruno goes straight for the Church encoding, as conveniently packaged by type classes: even if you can't write down the (morally) *initial* RegExp algebra, you can write down what RegExp algebras are in general. These two approaches aren't mutually exclusive: you can use Bruno's algebras to capture the canonical functionality which should be supported for all types in Oleg's class. Then you're pretty much writing down that a GADT is an inductive relation, which perhaps you had guessed already... Where we came in, if you recall, was the use of a datatype to define regular expressions. I used a GADT to equip this type with some useful semantic information (namely a type of parsed objects), and I wrote a program (divide) which exploited this representation to compute the regexp the tail of the input must match from the regexp the whole input must match. This may be a bit of a weird way to write a parser, but it's a useful sort of thing to be able to do. There are plenty of other examples of 'auxiliary' types computed from types to which they relate in some interesting way, the Zipper being the a favourite. It's useful to have a syntax (Generic Haskell, GADTs etc) which makes that look as straightforward and concrete as possible. I'm sure the program I actually wrote can be expressed with the type class trick, just by cutting up my functions and pasting the bits into individual instances; the use of the existential is still available. I don't immediately see how to code this up in Bruno's style, but that doesn't mean it can't be done. Still, it might be worth comparing like with like. I suspect that once you start producing values with the relevant properties (as I do) rather than just consuming them (as Oleg and Bruno do), the GADT method might work out a little neater. And that's the point: not only does the GADT have all the disadvantages of a closed datatype, it has all the advantages too! It's easy to write new functions which both consume and produce data, without having to go back and change the definition. I didn't have to define Division or divide in order to say what a regular expression was. If I remember rightly, regular languages are closed under a whole bunch of useful stuff (intersection, complementation, ...) so someone who can remember better than me might implement the corresponding algorithms together with their related operations; they don't need to change the definition of RegExp to do that. Might we actually decide semantic subtyping (yielding a coercion) that way? So what's the point? GADTs are a very convenient way to provide data which describe types. Their encoding via type classes, or whatever, may be possible, but this encoding is not necessarily as convenient as the GADT. Whether or not the various encodings of GADTs deliver more convenience in this example is not clear, because the programs do not correspond. Moreover, once you have to start messing about with equality constraints, the coding to eliminate GADTs becomes a lot hairier. I implemented it back in '95 (when all these things had different names) and I was very glad not to have to do it by hand any more. But your mileage may vary Conor ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Regular Expressions without GADTs
Conor McBride wrote: Neither Oleg nor Bruno translated my code; they threw away my structurally recursive on-the-fly automaton and wrote combinator parsers instead. That's why there's no existential, etc. The suggestion that removing the GADT simplifies the code would be better substantiated if like was compared with like. The following code specifically tries to be as close to the original Conor McBride's code as possible. I only removed local type annotations (turn out unnecessary) and replaced `ap' with `liftM' as a matter of personal preference. Where we came in, if you recall, was the use of a datatype to define regular expressions. I used a GADT to equip this type with some useful semantic information (namely a type of parsed objects), and I wrote a program (divide) which exploited this representation to compute the regexp the tail of the input must match from the regexp the whole input must match. This may be a bit of a weird way to write a parser, but it's a useful sort of thing to be able to do... That particular feature is fully preserved. Data types are used to define regular expression: p = (Star (Mult (Star (Check (== 'a'))) (Star (Check (== 'b') testp = parse p abaabaaa Moreover, *RX :t p p :: Star (Mult (Star (Check Char)) (Star (Check Char))) the regular expression is apparent in the type of the `parser' (or its representation, to be precise). One can easily define data types Digit, Alphanumeric, WhiteSpace, etc. and so type will look more informative. Here's one step to regular types... Instead of GADT, a type class is used to associate semantic information with the data type (with labels). Typing is more explicit now. And that's the point: not only does the GADT have all the disadvantages of a closed datatype, it has all the advantages too! A type class can be closed too, thanks to functional dependencies. You were right: the code below is just the mechanical reshuffling of the original code, using the translation I come across a year ago. `empty' and `division' are now grouped by the parser type, which I personally like. It is hard to see any loss of expressiveness by getting rid of GADTs in this case. More explicit type is an advantage. Also an advantage is the fact that `pattern match' is guaranteed exhaustive. If I forgot to implement some particular parser, that would be a type error (rather than a pattern-match run-time error, as is the case with ADT/GADT). Moreover, once you have to start messing about with equality constraints, the coding to eliminate GADTs becomes a lot hairier. I implemented it back in '95 (when all these things had different names) and I was very glad not to have to do it by hand any more. The discovery of the presence of type equality in Haskell changes things a great deal. It would be interesting to see how your code of '95 will look now. Writing an equality and even _dis-equality_ constraint is just as simple as |TypeEq a b HTrue| or |TypeEq a b HFalse|. {-# OPTIONS -fglasgow-exts #-} module RX where import Control.Monad class RegExp r tok a | r tok - a where empty :: r - [tok] - Maybe a divide :: tok - r - Division tok a data Zero = Zero data One = One newtype Check tok = Check (tok - Bool) data Plus r1 r2 = Plus r1 r2 data Mult r1 r2 = Mult r1 r2 newtype Star r = Star r data Empty = Empty data Division tok x = forall r y. RegExp r tok y = Div r (y - x) parse :: (RegExp r tok x) = r - [tok] - Maybe x parse r [EMAIL PROTECTED] = empty r t parse r (t : ts) = case divide t r of Div q f - liftM f $ parse q ts nullt :: tok - [tok] nullt _ = [] instance RegExp Zero tok Empty where empty _ _ = mzero divide _ _ = Div Zero naughtE instance RegExp One tok () where empty _ _ = return () divide _ _ = Div Zero naughtE instance RegExp (Check tok) tok tok where empty _ _ = mzero divide t (Check p) | p t= Div One (const t) divide _ _ = Div Zero naughtE instance (RegExp r1 tok a, RegExp r2 tok b) = RegExp (Plus r1 r2) tok (Either a b) where empty (Plus r1 r2) t = (liftM Left $ empty r1 t) `mplus` (liftM Right $ empty r2 t) divide t (Plus r1 r2) = case (divide t r1, divide t r2) of (Div q1 f1, Div q2 f2) - Div (Plus q1 q2) (f1 +++ f2) instance (RegExp r1 tok a, RegExp r2 tok b) = RegExp (Mult r1 r2) tok (a,b) where empty (Mult r1 r2) t = liftM2 (,) (empty r1 t) (empty r2 t) divide t (Mult r1 r2) = case (empty r1 (nullt t), divide t r1, divide t r2) of (Nothing, Div q1 f1, _) - Div (Mult q1 r2) (f1 *** id) (Just x1, Div q1 f1, Div q2 f2) - Div (Plus (Mult q1 r2) q2) (either (f1 *** id) (((,) x1) . f2)) instance RegExp r tok a = RegExp (Star r) tok [a] where empty _ _ = return [] divide t (Star r) = case (divide t r) of
Re: [Haskell-cafe] Regular Expressions without GADTs
Conor McBride wrote: Neither Oleg nor Bruno translated my code; they threw away my structurally recursive on-the-fly automaton and wrote combinator parsers instead. That's why there's no existential, etc. The suggestion that removing the GADT simplifies the code would be better substantiated if like was compared with like. ... I'm sure the program I actually wrote can be expressed with the type class trick, just by cutting up my functions and pasting the bits into individual instances; the use of the existential is still available. I don't immediately see how to code this up in Bruno's style, but that doesn't mean it can't be done. Still, it might be worth comparing like with like. Please see the enclosed code. It is still in Haskell98 -- and works in Hugs. I suspect that once you start producing values with the relevant properties (as I do) rather than just consuming them (as Oleg and Bruno do), the GADT method might work out a little neater. Actually, the code is pretty much your original code, with downcased identifiers. It faithfully implements that parser division approach. Still, there are no existentials. I wouldn't say that GADT code is so much different. Perhaps the code below is a bit neater due to the absence of existentials, `case' statements, and local type annotations. {-- Haskell98! --} module RegExps where import Monad newtype Zero = Zero Zero -- Zero type in Haskell 98 -- Bruno.Oliveira's type class class RegExp g where zero :: g tok Zero one:: g tok () check :: (tok - Bool) - g tok tok plus :: g tok a - g tok b - g tok (Either a b) mult :: g tok a - g tok b - g tok (a,b) star :: g tok a - g tok [a] data Parse tok t = Parse { empty :: Maybe t , divide :: tok - Parse tok t} parse :: Parse tok a - [tok] - Maybe a parse p [] = empty p parse p (t:ts) = parse (divide p t) ts liftP f p = Parse{empty = liftM f (empty p), divide = \tok - liftP f (divide p tok)} liftP2 mf p1 p2 = Parse{empty = mf (empty p1) (empty p2), divide = \tok - liftP2 mf (divide p1 tok) (divide p2 tok)} lsum x y = (liftM Left x) `mplus` (liftM Right y) lprod x y = liftM2 (,) x y -- Conor McBride's parser division algorithm instance RegExp Parse where zero= Parse mzero (\_ - zero) one = Parse (return ()) (\_ - liftP (const ()) zero) check p = Parse mzero (\t - if p t then liftP (const t) one else liftP (const t) zero) plus r1 r2 = Parse (lsum (empty r1) (empty r2)) (\t - plus (divide r1 t) (divide r2 t)) mult r1 r2 = Parse (lprod (empty r1) (empty r2)) (\t - let (q1,q2) = (divide r1 t, divide r2 t) lp x1 = liftP (either id ((,) x1)) in maybe (mult q1 r2) (\x1 - lp x1 (plus (mult q1 r2) q2)) (empty r1)) star r = Parse (return []) (\t- liftP (uncurry (:)) (mult (divide r t) (star r))) p1 :: RegExp g = g Char ([Char], [Char]) p1 = mult (star (check (== 'a'))) (star (check (== 'b'))) testp = parse (star (mult (star (check (== 'a'))) (star (check (== 'b') abaabaaa {- *RX testp Just [(a,b),(aa,b),(aaa,)] -} testc = parse (star one) abracadabra -- Parsing the list of integers ieven = even :: Int-Bool iodd = odd :: Int-Bool p2 :: RegExp g = g Int (Either (Int, (Int, [Int])) (Int, [Int])) p2 = plus (mult (check iodd) (mult (check iodd) (star (check ieven (mult (check ieven) (star (check iodd))) test2 = parse (star p2) [1::Int,1,2,3,3,4] ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Regular Expressions without GADTs
The code seems a bit simpler, too. Do you really think so? To me replacing a GADT by class and instance declarations seems the wrong way round. We should not forget that the DT in GADT stands for `data type'. One could certainly argue that the gist of functional programming is to define a collection of data types and functions that operate on these types. The fact that with GADTs constructors can have more general types just allows us to use the functional design pattern more often (freeing us from the need or temptation to resort to type class hackery with multiple parameter type classes, undecidable instances, functional dependencies etc). Cheers, Ralf ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe