[Haskell-cafe] Perl-ish =~ operator

2004-02-23 Thread ozone
In my effort to turn Haskell into a language more like Perl 
(muahaha)[1], I got a bit fed up and implemented something like Perl 
5's =~ binding operator (a.k.a. regex operator); I thought maybe 
somebody else here might find it useful.  Perl has the concept of 
'contexts': a function does something different depending on what type 
its caller expects back from the function.  Sounds like a perfect abuse 
of type classes, to me :).  Code follows:

---
{-# OPTIONS -fglasgow-exts #-}
{-  Need this for instance Foo [String] declarations -}
module PLRE where
-- Perl-Like Regular Expressions
import Text.Regex

-- Perl-Like =~ operator, which changes behaviour depending on its 
calling
-- context

class RegExContext a where
  (=~) :: String - String - a
instance RegExContext Bool where
  s =~ re = case matchRegex (mkRegex re) s of
Nothing - False
Just x - True
instance RegExContext [String] where
  s =~ re = case matchRegex (mkRegex re) s of
Nothing - []
Just x - x
boolContextTest string regEx =
  case string =~ regEx of
True - print True
False - print False
stringListContextTest string regEx =
  case string =~ regEx of
(a:x) - print (First match:  ++ a)
_ - error No subexpression matches
---
Some test output for you:

*PLRE boolContextTest foo ^f
True
*PLRE boolContextTest foo ^g
False
*PLRE stringListContextTest foo ^(.)
First match: f
*PLRE stringListContextTest goo ^(.)
First match: g
*PLRE stringListContextTest  ^(.)
*** Exception: No subexpression matches
Note that you have a fairly severe restriction if you want to use =~ in 
your code: the Haskell compiler must be able to determine a concrete 
type for the context that =~ is used in.  i.e. if stringListContextTest 
was defined as:

stringListContextTest string regEx =
  case string =~ regEx of
(a:x) - print a
_ - error No subexpression matches
The compiler can't concretise a type for 'a', and it'll complain about 
not having an instance for RegExContext [a] (which is fair enough).  
Even with this restriction, I'm sure it'll still be useful.  It 
shouldn't be a bit leap to define other Perl-ish operators in this 
fashion, such as !~, or even s/.../.  Have the appropriate amount of 
fun!

1. Actually, I wanted to turn Haskell into a language more suitable for 
text processing, but that doesn't sound as evil.

--
% Andre Pang : trust.in.love.to.save
___
Haskell-Cafe mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Perl-ish =~ operator

2004-02-23 Thread ozone
On 24/02/2004, at 1:30 AM, Andre Pang wrote:

In my effort to turn Haskell into a language more like Perl 
(muahaha)[1], I got a bit fed up and implemented something like Perl 
5's =~ binding operator (a.k.a. regex operator); I thought maybe 
somebody else here might find it useful.  Perl has the concept of 
'contexts': a function does something different depending on what type 
its caller expects back from the function.  Sounds like a perfect 
abuse of type classes, to me :).
And, just for kicks, here's Perl 6's smart match (~~) operator, which 
works much like ==, but in a more Do What I Mean! fashion, matching 
any type to any other type:

---
{-# OPTIONS -fglasgow-exts -fallow-undecidable-instances 
-fallow-overlapping-instances #-}

module SmartMatch where

import Text.Regex

class TruthValue a where
  truth :: a - Bool
instance TruthValue Bool where
  truth = id
instance TruthValue Int where
  truth 0 = False
  truth _ = True
instance TruthValue String where
  truth  = False
  truth _  = True
instance TruthValue a = TruthValue [a] where
  truth = all truth
class SmartMatch a b where
  (~~) :: a - b - Bool
instance Eq a = SmartMatch a a where
  a ~~ b = a == b
instance SmartMatch c d = SmartMatch d c where
  a ~~ b = b ~~ a
instance TruthValue a = SmartMatch Bool a where
  a ~~ b = a == truth b
instance SmartMatch String Regex where
  s ~~ re = case matchRegex re s of
Nothing - False
Just x - True
---
--
% Andre Pang : trust.in.love.to.save
___
Haskell-Cafe mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Perl-ish =~ operator

2004-02-23 Thread Mark Carroll
On Tue, 24 Feb 2004 [EMAIL PROTECTED] wrote:

 In my effort to turn Haskell into a language more like Perl
 (muahaha)[1], I got a bit fed up and implemented something like Perl
 5's =~ binding operator (a.k.a. regex operator); I thought maybe
(snip)

This reminds me that one thing I do miss from the regex stuff I've found
so far in Haskell is Perl's ? operator for turning greedy matches into
minimally-short ones. I can still usually do what I need to with Parsec,
at least, but I just thought I'd mention the issue.

-- Mark
___
Haskell-Cafe mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Perl-ish =~ operator

2004-02-23 Thread John Meacham
On Mon, Feb 23, 2004 at 12:09:12PM -0500, Mark Carroll wrote:
 On Tue, 24 Feb 2004 [EMAIL PROTECTED] wrote:
 
  In my effort to turn Haskell into a language more like Perl
  (muahaha)[1], I got a bit fed up and implemented something like Perl
  5's =~ binding operator (a.k.a. regex operator); I thought maybe
 (snip)
 
 This reminds me that one thing I do miss from the regex stuff I've found
 so far in Haskell is Perl's ? operator for turning greedy matches into
 minimally-short ones. I can still usually do what I need to with Parsec,
 at least, but I just thought I'd mention the issue.

yeah, ? is awesome. of course, if I had my way, I'd go back in time and
make non-greedy matching the default. The silly longest match rule
breaks a lot of the computational niceness of regular expressions.  ah..
em. sorry for the random rant, to get things back on track, I started
writing a Template Haskell module to give one something like perls =~
but it syntax checked the regular expression at compile time and
compiled it to efficient code rather than using the Regex module. so I
didn't have to feel bad about writing /foo/ as I know it will be
compiled to just a simple substring match. perhaps this will inspire
someone to finish the project?

a standard pcre (pcre.org) binding would also be a cool thing to work on. I looked
into writing one, but by default pcre is compiled without utf-8 support
on my system, which made binding to it somewhat useless for my
particular task. however, if we could put a binding in the standard
library, perhaps building a version which works with haskell unicode if
necessary that would be cool. 

John
-- 
---
John Meacham - California Institute of Technology, Alum. - [EMAIL PROTECTED]
---
___
Haskell-Cafe mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Perl-ish =~ operator

2004-02-23 Thread Mark Carroll
On Mon, 23 Feb 2004, John Meacham wrote:
(snip)
 a standard pcre (pcre.org) binding would also be a cool thing to work on.
(snip)

Heh - maybe a Cambridge computer science student could do it, having both
PCRE's author and Haskell experts handy locally. (-:

-- Mark
___
Haskell-Cafe mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Perl-ish =~ operator

2004-02-23 Thread ozone
On 24/02/2004, at 1:30 AM, [EMAIL PROTECTED] wrote:

In my effort to turn Haskell into a language more like Perl 
(muahaha)[1], I got a bit fed up and implemented something like Perl 
5's =~ binding operator (a.k.a. regex operator); I thought maybe 
somebody else here might find it useful.  Perl has the concept of 
'contexts': a function does something different depending on what type 
its caller expects back from the function.  Sounds like a perfect 
abuse of type classes, to me :).
Bonus round: I've managed to hack up something which simulates Perl's 
s/// operator, and also something to emulate its /e modifier 
(evaluate), so that you can run a function over the resulting 
subexpression matches.  (You don't really _need_ the latter, because 
all you have to do is use the =~ operator in a [String] context and run 
your function over that, but hey, it's cool, and I get to abuse type 
classes even more :).

Now, =~ has to change its behaviour not only depending on its context, 
but also depending on what operation you do with it: matching, 
substitution, etc.  We differentiate each of the operations by giving 
them a different type: a simple match operation takes in a String (the 
regex to match), whereas substitution requires two strings: the string 
to match against, and the substitution string, i.e. (String, String).

---
{-# OPTIONS -fglasgow-exts #-}
--  Need this for instance Foo [String] declarations (look, don't even
--  need undecidable or overlapping instances :)
module PLRE where
-- Perl-Like Regular Expressions
import Maybe
import Text.Regex
-- Perl-Like =~ operator, which changes behaviour depending on its 
calling
-- context

class Bind op context where
  (=~) :: String - op - context
-- (=~) :: String - String - Bool
-- returns whether the regex matched or not
instance Bind String Bool where
  s =~ re = case matchRegex (mkRegex re) s of
Nothing - False
Just x - True
-- (=~) :: String - String - [String]
-- returns a list of subexpression matches
instance Bind String [String] where
  s =~ re = case matchRegex (mkRegex re) s of
Nothing - []
Just x - x
-- (=~) :: String - (String, String) - String
-- substitution: foo =~ (f, g) = goo
instance Bind (String, String) String where
  s =~ (re, sub) = case matchRegexAll (mkRegex re) s of
Nothing - []
Just (before, _, after, _) - before ++ sub ++ after
-- perl's /e modifier.  We expect a function that takes in an argument 
of
-- type [String] (and can output any type): in that argument, index 0 of
-- the list is the original string to match against, index 1 (if it
-- exists) is the first subexpression match, index 2 is the second
-- subexpression match, etc.

instance Bind (String, ([String] - context)) context where
  s =~ (re, fn) = case matchRegex (mkRegex re) s of
Just matches - fn (s:matches)
Nothing - fn [] -- or maybe this should be an error?
boolContextTest string regEx =
  case string =~ regEx of
True - print True
False - print False
stringListContextTest string regEx =
  case string =~ regEx of
(a:x) - print (First match:  ++ a)
_ - error No subexpression matches
---
For an example of how to use the /e-like operator:

PLRE foo =~ (^(..), \l - map Char.ord (l!!1) ) :: [Int]
[102,111]
i.e. it (vaguely) resembles something like $foo =~ s/^(..)/ord $1/;

One thing which would be really nice is to use implicit parameters for 
the subexpression match instead of passing the list of subexpression 
matches explicitly to the function, so that you could instead write:

PLRE foo =~ (^(..), map Char.ord ?_1 ) :: [Int]
[102,111]
So ?_n maps nicely on to Perl's $n match variable (or \n, if you're a 
sed foo).  I couldn't find any way for this to work, though, since 
implicit parameters aren't allowed in an instance declaration.

--
% Andre Pang : trust.in.love.to.save
___
Haskell-Cafe mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell-cafe