[Haskell-cafe] Perl-ish =~ operator
In my effort to turn Haskell into a language more like Perl (muahaha)[1], I got a bit fed up and implemented something like Perl 5's =~ binding operator (a.k.a. regex operator); I thought maybe somebody else here might find it useful. Perl has the concept of 'contexts': a function does something different depending on what type its caller expects back from the function. Sounds like a perfect abuse of type classes, to me :). Code follows: --- {-# OPTIONS -fglasgow-exts #-} {- Need this for instance Foo [String] declarations -} module PLRE where -- Perl-Like Regular Expressions import Text.Regex -- Perl-Like =~ operator, which changes behaviour depending on its calling -- context class RegExContext a where (=~) :: String - String - a instance RegExContext Bool where s =~ re = case matchRegex (mkRegex re) s of Nothing - False Just x - True instance RegExContext [String] where s =~ re = case matchRegex (mkRegex re) s of Nothing - [] Just x - x boolContextTest string regEx = case string =~ regEx of True - print True False - print False stringListContextTest string regEx = case string =~ regEx of (a:x) - print (First match: ++ a) _ - error No subexpression matches --- Some test output for you: *PLRE boolContextTest foo ^f True *PLRE boolContextTest foo ^g False *PLRE stringListContextTest foo ^(.) First match: f *PLRE stringListContextTest goo ^(.) First match: g *PLRE stringListContextTest ^(.) *** Exception: No subexpression matches Note that you have a fairly severe restriction if you want to use =~ in your code: the Haskell compiler must be able to determine a concrete type for the context that =~ is used in. i.e. if stringListContextTest was defined as: stringListContextTest string regEx = case string =~ regEx of (a:x) - print a _ - error No subexpression matches The compiler can't concretise a type for 'a', and it'll complain about not having an instance for RegExContext [a] (which is fair enough). Even with this restriction, I'm sure it'll still be useful. It shouldn't be a bit leap to define other Perl-ish operators in this fashion, such as !~, or even s/.../. Have the appropriate amount of fun! 1. Actually, I wanted to turn Haskell into a language more suitable for text processing, but that doesn't sound as evil. -- % Andre Pang : trust.in.love.to.save ___ Haskell-Cafe mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Perl-ish =~ operator
On 24/02/2004, at 1:30 AM, Andre Pang wrote: In my effort to turn Haskell into a language more like Perl (muahaha)[1], I got a bit fed up and implemented something like Perl 5's =~ binding operator (a.k.a. regex operator); I thought maybe somebody else here might find it useful. Perl has the concept of 'contexts': a function does something different depending on what type its caller expects back from the function. Sounds like a perfect abuse of type classes, to me :). And, just for kicks, here's Perl 6's smart match (~~) operator, which works much like ==, but in a more Do What I Mean! fashion, matching any type to any other type: --- {-# OPTIONS -fglasgow-exts -fallow-undecidable-instances -fallow-overlapping-instances #-} module SmartMatch where import Text.Regex class TruthValue a where truth :: a - Bool instance TruthValue Bool where truth = id instance TruthValue Int where truth 0 = False truth _ = True instance TruthValue String where truth = False truth _ = True instance TruthValue a = TruthValue [a] where truth = all truth class SmartMatch a b where (~~) :: a - b - Bool instance Eq a = SmartMatch a a where a ~~ b = a == b instance SmartMatch c d = SmartMatch d c where a ~~ b = b ~~ a instance TruthValue a = SmartMatch Bool a where a ~~ b = a == truth b instance SmartMatch String Regex where s ~~ re = case matchRegex re s of Nothing - False Just x - True --- -- % Andre Pang : trust.in.love.to.save ___ Haskell-Cafe mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Perl-ish =~ operator
On Tue, 24 Feb 2004 [EMAIL PROTECTED] wrote: In my effort to turn Haskell into a language more like Perl (muahaha)[1], I got a bit fed up and implemented something like Perl 5's =~ binding operator (a.k.a. regex operator); I thought maybe (snip) This reminds me that one thing I do miss from the regex stuff I've found so far in Haskell is Perl's ? operator for turning greedy matches into minimally-short ones. I can still usually do what I need to with Parsec, at least, but I just thought I'd mention the issue. -- Mark ___ Haskell-Cafe mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Perl-ish =~ operator
On Mon, Feb 23, 2004 at 12:09:12PM -0500, Mark Carroll wrote: On Tue, 24 Feb 2004 [EMAIL PROTECTED] wrote: In my effort to turn Haskell into a language more like Perl (muahaha)[1], I got a bit fed up and implemented something like Perl 5's =~ binding operator (a.k.a. regex operator); I thought maybe (snip) This reminds me that one thing I do miss from the regex stuff I've found so far in Haskell is Perl's ? operator for turning greedy matches into minimally-short ones. I can still usually do what I need to with Parsec, at least, but I just thought I'd mention the issue. yeah, ? is awesome. of course, if I had my way, I'd go back in time and make non-greedy matching the default. The silly longest match rule breaks a lot of the computational niceness of regular expressions. ah.. em. sorry for the random rant, to get things back on track, I started writing a Template Haskell module to give one something like perls =~ but it syntax checked the regular expression at compile time and compiled it to efficient code rather than using the Regex module. so I didn't have to feel bad about writing /foo/ as I know it will be compiled to just a simple substring match. perhaps this will inspire someone to finish the project? a standard pcre (pcre.org) binding would also be a cool thing to work on. I looked into writing one, but by default pcre is compiled without utf-8 support on my system, which made binding to it somewhat useless for my particular task. however, if we could put a binding in the standard library, perhaps building a version which works with haskell unicode if necessary that would be cool. John -- --- John Meacham - California Institute of Technology, Alum. - [EMAIL PROTECTED] --- ___ Haskell-Cafe mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Perl-ish =~ operator
On Mon, 23 Feb 2004, John Meacham wrote: (snip) a standard pcre (pcre.org) binding would also be a cool thing to work on. (snip) Heh - maybe a Cambridge computer science student could do it, having both PCRE's author and Haskell experts handy locally. (-: -- Mark ___ Haskell-Cafe mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Perl-ish =~ operator
On 24/02/2004, at 1:30 AM, [EMAIL PROTECTED] wrote: In my effort to turn Haskell into a language more like Perl (muahaha)[1], I got a bit fed up and implemented something like Perl 5's =~ binding operator (a.k.a. regex operator); I thought maybe somebody else here might find it useful. Perl has the concept of 'contexts': a function does something different depending on what type its caller expects back from the function. Sounds like a perfect abuse of type classes, to me :). Bonus round: I've managed to hack up something which simulates Perl's s/// operator, and also something to emulate its /e modifier (evaluate), so that you can run a function over the resulting subexpression matches. (You don't really _need_ the latter, because all you have to do is use the =~ operator in a [String] context and run your function over that, but hey, it's cool, and I get to abuse type classes even more :). Now, =~ has to change its behaviour not only depending on its context, but also depending on what operation you do with it: matching, substitution, etc. We differentiate each of the operations by giving them a different type: a simple match operation takes in a String (the regex to match), whereas substitution requires two strings: the string to match against, and the substitution string, i.e. (String, String). --- {-# OPTIONS -fglasgow-exts #-} -- Need this for instance Foo [String] declarations (look, don't even -- need undecidable or overlapping instances :) module PLRE where -- Perl-Like Regular Expressions import Maybe import Text.Regex -- Perl-Like =~ operator, which changes behaviour depending on its calling -- context class Bind op context where (=~) :: String - op - context -- (=~) :: String - String - Bool -- returns whether the regex matched or not instance Bind String Bool where s =~ re = case matchRegex (mkRegex re) s of Nothing - False Just x - True -- (=~) :: String - String - [String] -- returns a list of subexpression matches instance Bind String [String] where s =~ re = case matchRegex (mkRegex re) s of Nothing - [] Just x - x -- (=~) :: String - (String, String) - String -- substitution: foo =~ (f, g) = goo instance Bind (String, String) String where s =~ (re, sub) = case matchRegexAll (mkRegex re) s of Nothing - [] Just (before, _, after, _) - before ++ sub ++ after -- perl's /e modifier. We expect a function that takes in an argument of -- type [String] (and can output any type): in that argument, index 0 of -- the list is the original string to match against, index 1 (if it -- exists) is the first subexpression match, index 2 is the second -- subexpression match, etc. instance Bind (String, ([String] - context)) context where s =~ (re, fn) = case matchRegex (mkRegex re) s of Just matches - fn (s:matches) Nothing - fn [] -- or maybe this should be an error? boolContextTest string regEx = case string =~ regEx of True - print True False - print False stringListContextTest string regEx = case string =~ regEx of (a:x) - print (First match: ++ a) _ - error No subexpression matches --- For an example of how to use the /e-like operator: PLRE foo =~ (^(..), \l - map Char.ord (l!!1) ) :: [Int] [102,111] i.e. it (vaguely) resembles something like $foo =~ s/^(..)/ord $1/; One thing which would be really nice is to use implicit parameters for the subexpression match instead of passing the list of subexpression matches explicitly to the function, so that you could instead write: PLRE foo =~ (^(..), map Char.ord ?_1 ) :: [Int] [102,111] So ?_n maps nicely on to Perl's $n match variable (or \n, if you're a sed foo). I couldn't find any way for this to work, though, since implicit parameters aren't allowed in an instance declaration. -- % Andre Pang : trust.in.love.to.save ___ Haskell-Cafe mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell-cafe