Here you go:

    import Control.Monad (void)
    import Data.Char (ord)
    import Data.Word (Word8)
    import Control.Applicative
    import qualified Data.ByteString as B
    import Data.Attoparsec.ByteString
    import Pipes.Attoparsec (parsed)
    import Pipes.ByteString (fromHandle)
    import Pipes
    import qualified Pipes.Prelude as Pipes
    import qualified System.IO as IO

    main :: IO ()
    main = IO.withFile "euler42.txt" IO.ReadMode $ \handle -> do
        n <- Pipes.length $ void $  -- Ignore errors for simplicity
            parsed parser (fromHandle handle) >-> Pipes.filter isTriangle
        print n
      where
        isTriangle _ = True

    parser :: Parser Word8
    parser =
            fmap wordValue
        $   word8 quote
        *>  takeWhile1 (/= quote)
        <*  word8 quote
        <*  optional (word8 comma)
      where
        wordValue = B.foldl' (+) 0 . B.map (subtract 64)
        quote = fromIntegral (ord '"')
        comma = fromIntegral (ord ',')

The trick is to use `attoparsec` to parse only a single value at a time, so that it never backtracks further than one word. The reason `attoparsec` chokes on large files is that it must retain all of the input it has seen so far in order to backtrack.

`pipes-attoparsec` is designed so that you use `attoparsec` to define a backtracking parser for a small unit of input (i.e. a line or a word), and then after each successful parse you commit and never look back. The first argument to `parsed` is the parser for a single backtrackable element, and `pipes-attoparsec` makes sure to commit and discard the consumed input every time that parser succeeds. That's why it runs in constant space, unlike running `attoparsec` over the entire file.
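To see the commit-per-element pattern in isolation, here is a minimal sketch under some assumptions of my own (a hypothetical file `numbers.txt` containing one decimal integer per line). Backtracking is confined to a single line; once a line parses, `parsed` commits, so memory use stays constant no matter how large the file is:

```haskell
import Control.Monad (void)
import Data.Attoparsec.ByteString.Char8 (Parser, decimal, endOfLine)
import Pipes.Attoparsec (parsed)
import Pipes.ByteString (fromHandle)
import qualified Pipes.Prelude as P
import qualified System.IO as IO

-- One backtrackable element: a decimal integer terminated by a newline
lineInt :: Parser Int
lineInt = decimal <* endOfLine

main :: IO ()
main = IO.withFile "numbers.txt" IO.ReadMode $ \h -> do
    -- `void` ignores parse errors for simplicity, as in the solution above
    total <- P.sum (void (parsed lineInt (fromHandle h)))
    print total
```

If you care about errors, pattern-match on the `Either` that `parsed` returns instead of discarding it with `void`; the `Left` case gives you the `ParsingError` plus the leftover `Producer`.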

The rest is almost identical to what you proposed. The only difference is that I moved the `map` directly into the `Parser` logic instead of making it a separate pipe.

On 06/12/2014 09:13 PM, Данило Глинський wrote:
There is simple problem on Project Euler - http://projecteuler.net/problem=42 , which boils down to

    import Data.Char
    import Data.Word
    import Control.Applicative
    import qualified Data.ByteString as B
    import Data.Attoparsec.ByteString

    main = do
        result <- solution <$> parseFile <$> B.readFile "euler42.txt"
        print result

    solution :: [B.ByteString] -> Int
    solution =
        length . filter isTriangle . fmap wordValue
      where
        wordValue = B.foldl' (+) 0 . B.map (subtract 64)

    isTriangle = const True {- some secret function -}

    parseFile :: B.ByteString -> [B.ByteString]
    parseFile = either (const []) id .
        parseOnly (wordParser `sepBy1` word8 (char ','))
      where
        wordParser = word8 (char '"') *> takeWhile1 (/= char '"') <* word8 (char '"')
        char = fromIntegral . ord

Now the problem is: given a large input file (150 MB), this solution leads to an OutOfMemory error.

It seems like pipes is designed to solve this problem in the same compositional way, so the question is: what is the correct way to "pipify" this small program?

As I understand, there must be:
1) a Producer of ByteStrings, based on pipes-parse and pipes-attoparsec
2) some analogues of "map", "filter" and "length" on pipe streams
3) extracting the result (unpipify)

Part 2) seems easy. We should get something like

solution :: Producer B.ByteString IO () -> Producer Int IO ()
solution =
    P.length <-< P.filter isTriangle <-< P.map wordValue

but streaming from parsing and extracting is not that obvious. Could someone help me?
--
You received this message because you are subscribed to the Google Groups "Haskell Pipes" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To post to this group, send email to [email protected].
