Semantic subtyping issue: Assume we have a function f of type f :: Reg (r*) -> ... to which we pass a value x of type Reg (r,r*). We have that (r,r*) is a semantic subtype of r*, hence, the code f x is accepted in languages such as XDuce/CDuce.
I'm just saying that the fact regexp can be represented via GADTs doesn't imply that we get the same expressive power of languages such as XDuce/CDuce. Martin Hongwei Xi writes: > Hi Martin: > > Thanks for the message. > > No, I am not on the list. > > Back then, we did something similar but for a different > purpose. The idea is like this: > > We wanted to represent an XML document as a pair: > > XML (r) * Reg (r) > > where r stands for some regexp and Reg(r) is a > singleton type. When performing search over XML(r), > we can use Reg(r) as some kind of pilot value > (which is a lot smaller than XML(r)) to guide the > search. For instance, it is often easy to tell from > Reg(r) that an item to be found is not in XML(r) and > thus we can skip searching XML(r) entirely. > This is sort of like indexing scheme in database. > > BTW, I am not sure what kind of 'semantic typing' > you have in mind. An example? > > --Hongwei > > Computer Science Department > Boston University > 111 Cummington Street > Boston, MA 02215 > > Email: [EMAIL PROTECTED] > Url: http://www.cs.bu.edu/~hwxi > Tel: +1 617 358 2511 (office) > Fax: +1 617 353 6457 (department) > > > > On Mon, 17 Oct 2005, Martin Sulzmann wrote: > > >> > >>Very interesting Conor. Do you know Xi et al's APLAS'03 paper? > >>(Hongwei, I'm not sure whether you're on this list). > >>Xi et al. use GRDTs (aka GADTs aka first-class phantom types) > >>to represent XML documents. There're may be some connections > >>between what you're doing and Xi et al's work. > >> > >>I believe that there's an awful lot you can do with GADTs > >>(in the context of regular expressions). But there're two things > >>you can't accomplish: > >>semantic subtyping and regular expression pattern matching > >>a la XDuce/CDuce. Unless somebody out there proves me wrong. > >> > >>Martin > >> > >> > >> > >>Hi folks, > >> > >>Inspired by Ralf's post, I thought I'd just GADTize a dependently typed=20 > >>program I wrote in 2001. > >> > >>Wolfgang Jeltsch wrote: > >> > >>> Now lets consider using an algebraic datatype for regexps: > >>> > >>> data RegExp > >>> =3D Empty | Single Char | RegExp :+: RegExp | RegExp :|: > >>> RegExpt | Ite= > >>r RegExp > >>> > >>>Manipulating regular expressions now becomes easy and safe =E2=80=93 you= > >> are just not=20 > >>>able to create "syntactically incorrect regular expressions" since durin= > >>g=20 > >>>runtime you don't deal with syntax at all. > >>> =20 > >>> > >> > >>A fancier variation on the same theme... > >> > >> > data RegExp :: * -> * -> * where > >> > Zero :: RegExp tok Empty > >> > One :: RegExp tok () > >> > Check :: (tok -> Bool) -> RegExp tok tok > >> > Plus :: RegExp tok a -> RegExp tok b -> RegExp tok (Either a b) > >> > Mult :: RegExp tok a -> RegExp tok b -> RegExp tok (a, b) > >> > Star :: RegExp tok a -> RegExp tok [a] > >> > >> > data Empty > >> > >>The intuition is that a RegExp tok output is a regular expression=20 > >>explaining how to parse a list of tok as an output. Here, Zero is the=20 > >>regexp which does not accept anything, One accepts just the empty=20 > >>string, Plus is choice and Mult is sequential composition; Check lets=20 > >>you decide whether you like a single token. > >> > >>Regular expressions may be seen as an extended language of polynomials=20 > >>with tokens for variables; this parser works by repeated application of=20 > >>the remainder theorem. > >> > >> > parse :: RegExp tok x -> [tok] -> Maybe x > >> > parse r [] =3D empty r > >> > parse r (t : ts) =3D case divide t r of > >> > Div q f -> return f `ap` parse q ts > >> > >>Example > >> > >>*RegExp> parse (Star (Mult (Star (Check (=3D=3D 'a'))) (Star (Check > >>(=3D=3D= > >>=20 > >>'b'))))) "abaabaaabbbb" > >>Just [("a","b"),("aa","b"),("aaa","bbbb")] > >> > >>The 'remainder' explains if a regular expression accepts the empty=20 > >>string, and if so, how. The Star case is a convenient=20 > >>underapproximation, ruling out repeated empty values. > >>=20 > >> > empty :: RegExp tok a -> Maybe a > >> > empty Zero =3D mzero > >> > empty One =3D return () > >> > empty (Check _) =3D mzero > >> > empty (Plus r1 r2) =3D (return Left `ap` empty r1) `mplus` > >> > (return Right `ap` empty r2) > >> > empty (Mult r1 r2) =3D return (,) `ap` empty r1 `ap` empty r2 > >> > empty (Star _) =3D return [] > >> > >>The 'quotient' explains how to parse the tail of the list, and how to=20 > >>recover the meaning of the whole list from the meaning of the tail. > >> > >> > data Division tok x =3D forall y. Div (RegExp tok y) (y -> x) > >> > >>Here's how it's done. I didn't expect to need scoped type variables, but=20 > >>I did... > >> > >> > divide :: tok -> RegExp tok x -> Division tok x > >> > divide t Zero =3D Div Zero naughtE > >> > divide t One =3D Div Zero naughtE > >> > divide t (Check p) | p t =3D Div One (const t) > >> > | otherwise =3D Div Zero naughtE > >> > divide t (Plus (r1 :: RegExp tok a) (r2 :: RegExp tok b)) =3D > >> > case (divide t r1, divide t r2) of > >> > (Div (q1 :: RegExp tok a') (f1 :: a' -> a), > >> > Div (q2 :: RegExp tok b') (f2 :: b' -> b)) -> > >> > Div (Plus q1 q2) (f1 +++ f2) > >> > divide t (Mult r1 r2) =3D case (empty r1, divide t r1, divide t r2) of > >> > (Nothing, Div q1 f1, _) -> Div (Mult q1 r2) (f1 *** id) > >> > (Just x1, Div q1 f1, Div q2 f2) -> > >> > Div (Plus (Mult q1 r2) q2) (either (f1 *** id) (((,) x1) . f2)) > >> > divide t (Star r) =3D case (divide t r) of > >> > Div q f -> Div (Mult q (Star r)) (\ (y, xs) -> (f y : xs)) > >> > >>Bureaucracy. > >> > >> > (***) :: (a -> b) -> (c -> d) -> (a, c) -> (b, d) > >> > (f *** g) (a, c) =3D (f a, g c) > >> > >> > (+++) :: (a -> b) -> (c -> d) -> Either a c -> Either b d > >> > (f +++ g) (Left a) =3D Left (f a) > >> > (f +++ g) (Right c) =3D Right (g c) > >> > >> > naughtE :: Empty -> x > >> > naughtE =3D undefined > >> > >>It's not the most efficient parser in the world (doing some algebraic=20 > >>simplification on the fly wouldn't hurt), but it shows the sort of stuff=20 > >>you can do. > >> > >>Have fun > >> > >>Conor > >>_______________________________________________ > >>Haskell-Cafe mailing list > >>Haskell-Cafe@haskell.org > >>http://www.haskell.org/mailman/listinfo/haskell-cafe > >> _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe