[Haskell-cafe] Regex API ideas

2007-11-01 Thread ChrisK
Hi Bryan,

  I wrote the current regex API, so your suggestions are interesting to me.  The
same goes for anyone else's regex API opinions, of course.

Bryan O'Sullivan wrote:
 Ketil Malde wrote:
 
 Python used to do pretty well here compared
 to Haskell, with rather efficient hashes and text parsing, although I
 suspect ByteString IO and other optimizations may have changed that
 now. 

 
 It still does just fine.  For typical "munge a file with regexps, lists,
 and maps" tasks, Python and Perl remain on par with comparably written
 Haskell.  This is because the scripting-level code acts as a thin layer of
 glue around I/O, regexps, lists, and dicts, all of which are written in
 native code.
 
 The Haskell regexp libraries actually give us something of a leg down
 with respect to Python and Perl.

True, the pure Haskell library is not as fast as a C library.  In particular,
the current regex-tdfa handles lazy ByteStrings in a suboptimal manner.  This
may eventually be fixed.

But the native code libraries have also been wrapped in the same API, and they
are quite fast when combined with strict ByteStrings.

 The aggressive use of polymorphism in
 the return type of (=~) makes it hard to remember which of the possible
 return types gives me what information.  Not only did I write a regexp
 tutorial to understand the API in the first place, I have to reread it
 every time I want to match a regexp.

The (=~) operator uses many return types provided by the instances of
RegexContext.  These are all thin wrappers around the unpolymorphic return types
of the RegexLike class.  So (=~) could be avoided altogether, or another API
created.
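
To make the return-type polymorphism concrete, here is a small sketch,
assuming the regex-tdfa package (Text.Regex.TDFA); regex-posix and regex-pcre
export the same (=~).  The sample string, the pattern, and the binding names
are made up for illustration.  The type annotation alone selects which
RegexContext instance is used:

```haskell
import Text.Regex.TDFA ((=~))

-- Hypothetical sample input and pattern, for illustration only.
sample, pat :: String
sample = "foo 12 bar 345"
pat    = "[0-9]+"

-- Did the pattern match at all?
matched :: Bool
matched = sample =~ pat

-- Text of the first match (the empty string when there is none).
firstMatch :: String
firstMatch = sample =~ pat

-- (text before the first match, the match itself, text after it).
parts :: (String, String, String)
parts = sample =~ pat

-- Number of non-overlapping matches.
count :: Int
count = sample =~ pat

-- Every match, each given as [whole match, submatches...].
allMatches :: [[String]]
allMatches = sample =~ pat
```

The expression on the right-hand side is identical every time, which is
presumably why the result types are hard to keep apart from memory.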

 
 A suitable solution would be a return type of RegexpMatch a = Maybe a
 (to live alongside the existing types, but aiming to become the one
 that's easy to remember), with appropriate methods on a, but I don't
 have time to write up a patch.
 
 b

The (=~~) operator is the monadic counterpart of (=~) and allows for different
failure behaviors.  So using (=~~) with Maybe is already possible, and it gives
Nothing whenever there are zero matches.
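
A minimal sketch of that usage, again assuming regex-tdfa; firstNumber is a
hypothetical helper name, not part of any library:

```haskell
import Text.Regex.TDFA ((=~~))

-- Hypothetical helper: the first run of digits in a line, if any.
-- With (=~~) in the Maybe monad, "no match" becomes Nothing instead of
-- the empty string that (=~) would return at type String.
firstNumber :: String -> Maybe String
firstNumber line = line =~~ "[0-9]+"
```

Here firstNumber "a 42 b" should give Just "42", and firstNumber "abc" should
give Nothing.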

But more interesting to me is learning what API you would like to see.
What would you like the code that uses the API to be?
Could you sketch either the definition or the usage of your suggested
RegexpMatch type?

I don't use my own regex API much, so external feedback and ideas would be
wonderful.

-- 
Chris

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Regex API ideas

2007-11-01 Thread Bryan O'Sullivan

ChrisK wrote:
 The Haskell regexp libraries actually give us something of a leg down
 with respect to Python and Perl.

 True, the pure Haskell library is not as fast as a C library.

Actually, I wasn't referring to the performance of the libraries, merely 
to the non-stick nature of the API.  For my purposes, regex-pcre 
performs well (though I owe you some patches to make it, and other regex 
back ends, compile successfully out of the box).

 But more interesting to me is learning what API you would like to see.
 What would you like the code that uses the API to be?

Python's regexp API is pretty easy to use, and also to remember.  Here's 
what it does for match objects.


http://docs.python.org/lib/match-objects.html
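
For comparison, here is a rough Haskell sketch of a match value in that
spirit.  It is only an illustration under stated assumptions, not an existing
or proposed library API: the Match record and the search and group functions
are hypothetical names, built on top of (=~~) from regex-tdfa (POSIX syntax).

```haskell
import Text.Regex.TDFA ((=~~))

-- Hypothetical record loosely modelled on Python's match objects.
data Match = Match
  { matchStart  :: Int       -- offset of the whole match (cf. m.start())
  , matchEnd    :: Int       -- offset just past the match (cf. m.end())
  , matchGroups :: [String]  -- group 0 is the whole match (cf. m.group(n))
  } deriving (Eq, Show)

-- Hypothetical search: pattern first, then input; Nothing when no match.
search :: String -> String -> Maybe Match
search pat input = do
  (before, whole, _after, subs) <-
    input =~~ pat :: Maybe (String, String, String, [String])
  let s = length before
  pure (Match s (s + length whole) (whole : subs))

-- Look up a group by number; Nothing when the index is out of range.
group :: Int -> Match -> Maybe String
group n m
  | n >= 0 && n < length (matchGroups m) = Just (matchGroups m !! n)
  | otherwise                            = Nothing
```

With this, search "([a-z]+)=([0-9]+)" "set x=42 now" >>= group 2 should give
Just "42", and a failed search is simply Nothing, so there is one result type
to remember instead of many.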

b