Jason Dagit wrote:
> On Mon, Oct 5, 2009 at 6:25 PM, Trent W. Buck
> <[email protected]> wrote:
>> Ben Franksen <[email protected]> writes:
>>
>> > Jason Dagit wrote:
>> >> It's possible that regex-pcre gives better performance than
>> >> regex-posix
>> >
>> > I made some tests using the new criterion package (excellent for stuff
>> > like that) and found regex-pcre to be faster by a factor of 3 to 9,
>> > depending on regex and test string. I did not test it with darcs, but
>> > I took some random regexes from the standard boring file and a random
>> > file with a long path.
>>
>> Does the default ERE matcher use the OS's regexp implementation, or is
>> it purely done in Haskell?  In the former case, the OS and OS version
>> might be significant -- e.g. AIX 4 might have much slower EREs than a
>> recent GNU/Linux.
> 
> I don't think regex-posix is the OS's implementation.  Or if it is, then
> I'm not sure what magic is done to provide that implementation on Windows.
> Perhaps there is a bundled source version which is used if the OS
> doesn't provide an implementation?  I guess this makes performance
> testing worthwhile on Windows.
> 
> Ben, could you make the source for your test publicly available? (Why
> duplicate effort if we don't have to?)  Maybe someone will volunteer to
> crunch some numbers on Windows.

Sure, the file is attached. It's not much, though! I just played around with
a few regexes from the latest default boring file and a random path from my
darcs repo. It should be easy to extend. Note that I am not explicitly
compiling the regexes, so one could argue that this actually measures the
speed of the regex compiler plus the speed of the regex engine. That is also
easy to fix.
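
To illustrate the point about explicit compilation, here is a minimal sketch of how the regexes could be compiled once, up front, so that only matching remains in the code being measured. It uses only regex-tdfa for brevity, and the names `compiled` and `isBoring` are hypothetical, not part of the attached file. The criterion wiring is left out on purpose, since it depends on the criterion version in use.

```haskell
import Text.Regex.Base (makeRegex, matchTest)
import Text.Regex.TDFA (Regex)

-- Hypothetical helper: compile each pattern exactly once.
-- Two of the patterns from the attached file are reused here.
compiled :: [Regex]
compiled = map makeRegex
  [ "(^|/)autom4te\\.cache($|/)"
  , "\\.(obj|a|exe|so|lo|la)$"
  ]

-- Hypothetical helper: match a path against the precompiled regexes.
-- No pattern compilation happens inside this function.
isBoring :: String -> Bool
isBoring path = any (`matchTest` path) compiled

main :: IO ()
main = mapM_ (print . isBoring)
  [ "src/Darcs/Commands/ShowRepo.lhs"
  , "autom4te.cache/output.0"
  , "src/hello.so"
  ]
```

Benchmarking a function like `isBoring` instead of `=~` should separate engine speed from compiler speed, assuming the list of compiled regexes is forced before the timing starts.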

import Control.Monad
import Criterion.Main
import Text.Regex.Base
import qualified Text.Regex.PCRE
import qualified Text.Regex.Posix
import qualified Text.Regex.TDFA

texts = [
    "src/Darcs/Commands/ShowRepo.lhs"
  ]

regexes = [
    "(^|/)\\.waf-[[:digit:].]+-[[:digit:]]+($|/)",
    "(^|/)autom4te\\.cache($|/)",
    "\\.(obj|a|exe|so|lo|la)$"
  ]

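-- The three regex backends under comparison.
data Engine = Pcre | Posix | Tdfa deriving (Enum,Show)

-- Match text against re with the chosen engine. The Int argument is
-- ignored; presumably it is there so that the partial application
-- (test eng t re) is a function criterion can call.
test :: Engine -> String -> String -> Int -> Bool
test eng text re _ = text * re where
  -- Local operator that shadows Prelude's (*) with the engine's (=~).
  (*) = case eng of
    Posix -> (Text.Regex.Posix.=~)
    Tdfa -> (Text.Regex.TDFA.=~)
    Pcre -> (Text.Regex.PCRE.=~)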

main = defaultMain $
  flip map texts $ \t ->
    bgroup (show t ++ "\n") $
      flip map regexes $ \re ->
        bgroup (show re ++ "\n") $
          flip map [Pcre ..] $ \eng ->
            bench (show eng) $ test eng t re

_______________________________________________
darcs-users mailing list
[email protected]
http://lists.osuosl.org/mailman/listinfo/darcs-users
