Hi,

I'm trying (for the first time ever) to use RULES pragmas to achieve
some nice speedups in my bytestring parsing library. The relevant code
in my library's module is:

-- The module imports Control.Applicative which containes 'many' and 'some'.

-- | The parser @satisfy p@ succeeds for any byte for which the
-- supplied function @p@ returns 'True'.  Returns the byte that is
-- actually parsed.
satisfy :: (Word8 -> Bool) -> Parser Word8
satisfy p =
    Parser $ \s@(S bs pos eof) succ fail ->
        case S.uncons bs of
          Just (b, bs') -> if p b
                           then succ b (S bs' (pos + 1) eof)
                           else fail s
          Nothing       -> if eof
                           then fail s
                           else IPartial $ \x ->
                               case x of
                                 Just bs' -> retry (S bs' pos eof)
                                 Nothing  -> fail (S bs pos True)
            where retry s' = unParser (satisfy p) s' succ fail

-- | @byte b@ parses a single byte @[EMAIL PROTECTED]  Returns the parsed byte
-- (i.e. @b@).
byte :: Word8 -> Parser Word8
byte b = satisfy (== b)

-- ---------------------------------------------------------------------
-- Rewrite rules

satisfyMany :: (Word8 -> Bool) -> Parser S.ByteString
satisfyMany p = undefined  -- More efficient implementation goes here.

satisfySome :: (Word8 -> Bool) -> Parser S.ByteString
satisfySome p = undefined  -- More efficient implementation goes here.

{-# RULES

"fmap/pack/many/satisfy" forall p.
  fmap S.pack (many (satisfy p)) = satisfyMany p

"fmap/pack/some/satisfy" forall p.
  fmap S.pack (some (satisfy p)) = satisfySome p
  #-}

In another module where I use the library I have this code:

pHeaders :: Parser [(S.ByteString, S.ByteString)]
pHeaders = many header
    where
      header = liftA2 (,) fieldName (byte (c2w ':') *> spaces *> contents)
      fieldName = liftA2 (S.cons) letter fieldChars
      contents = liftA2 (S.append) (fmap S.pack $ some notEOL <* crlf)
                 (continuation <|> pure S.empty)
      continuation = liftA2 (S.cons) ((c2w ' ') <$
                                      some (oneOf (map c2w " \t"))) contents

-- It's important that all three of these definitions are kept on the
-- top level to have RULES fire correctly.
fieldChars = fmap S.pack $ many fieldChar

-- fieldChar = letter <|> digit <|> oneOf (map c2w "-_")
fieldChar = satisfy isFieldChar
    where
      isFieldChar b = (isDigit $ chr $ fromIntegral b) ||
                      (isAlpha $ chr $ fromIntegral b) ||
                      (b `elem` map c2w "-_")

I want the fieldChars use of 'fmap S.pack $ many fieldChar' to trigger
my rewrite rule "fmap/pack/many/satisfy" which it does in this case.
The trouble is that the rule only triggers when I make at least
fieldChars and fieldChar top-level definition and isFieldChar either a
named local definition in fieldChar or a top-level definition. If I
turn the predicate (isFieldChar) into to an anonymous lambda function
it doesn't trigger, if I make either fieldChars or fieldChars a local
defintion (in a where clause) of pHeaders it doesn't trigger. If I
make fieldChar a local definition in fieldChars it doesn't trigger,
etc.

It would be great if there was a way to make this a bit less fragile
and have the rule trigger more often as it is potentially a huge
performance win. I understand it's hard to guarantee that the rule
always triggers but now it triggers in rare cases.

-- Johan
_______________________________________________
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users

Reply via email to