On Sun, Jul 20, 2008 at 7:25 AM, Chaddaï Fouché <[EMAIL PROTECTED]> wrote:
>
> > That's exactly what I thought. But even if I remove the only 'try' I use
> the
> > memory consumption remains unchanged:
>
> It's true, but in your case your output is almost the raw input data,
> which means that even without a noxious "try", you still have the
> whole file in memory. Well hopefully not with your latest code, which
> I would really like to see.
Here is the part that actually changed:
-----------
split c str = let (p,ps) = aux str in (p:ps)
where
aux [] = ([],[])
aux (x:cs) = let (xs,xss) = aux cs in
if x == c then ([c],(xs:xss)) else ((x:xs),xss)
splitPred :: (Eq a) => (a -> Bool) -> [a] -> [[a]]
splitPred pr str = let (p,ps) = aux str in (p:ps)
where
aux [] = ([],[])
aux (x:cs) = let (xs,xss) = aux cs in
if pr x then ([],((x:xs):xss)) else ((x:xs),xss)
doOneFile :: String -> IO ()
doOneFile fname = do
t1 <- getCurrentTime
doesFileExist (fname ++ ".html") >>= \b -> if b then hPutStrLn stderr $
printf "File already processed, skipping: %s" fname else do
src <- readFile fname
out <- openFile (fname ++ ".html") WriteMode
hSetBuffering out (BlockBuffering (Just 64000))
hPutStrLn out "<html>"
hPutStrLn out "<body bgcolor=\"black\">"
hPutStrLn out "<meta http-equiv=\"Content-Type\" content=\"text/html;
charset=UTF-8\">"
hPutStrLn out "<span style=\"font-family: monospace; font-size: 13;\"
><span>"
let extractData = \p -> case p of
Right x -> x
Left err -> (trace . show $ err) []
let srcSplit = splitPred (`elem`"\n") src
let parsed = concatMap (extractData . parse mainParser fname) srcSplit
execStateT (hPrintHtml (St id)) (out,emptyStyle) -- wypisujemy pierwszy
wiersz
execStateT (mapM_ hPrintHtml parsed) (out,emptyStyle)
hPutStrLn out "</span></span>"
hPutStrLn out "</body>"
hPutStrLn out "</html>"
t2 <- getCurrentTime
hPutStrLn stderr $ printf "File %s processed. It took %s. File size was
%d characters." fname (show $ diffUTCTime t2 t1) (length src)
hClose out
------
The whole file is also attached. You will find there another (worse)
implementation of split and a little bit of code similar to thread pool
stuff.
On Sun, Jul 20, 2008 at 8:17 AM, John Meacham <[EMAIL PROTECTED]> wrote:
> On Sun, Jul 20, 2008 at 02:34:09AM +0400, Bulat Ziganshin wrote:
> > i think that Parsec library should hold entire file in memory only when
> > you use 'try' for whole file. otherwise it should omit data as
> > proceeded
>
> I do not believe that is the case, since the return type of runParser
> "Either ParseError a" means that before you can extract the result of
> the parse from the 'Right' branch, it must evaluate whether the result
> is 'Left' or 'Right' meaning it needs to parse the whole input in order
> to determine whether the parse was succesful.
>
It's true it has to parse the whole file, but it is not true it has to
reside in the memory: only the results must be there. In this case, when the
result is 1-1 transformation of input, it is true. But consider this
program:
----
module Main where
import Text.ParserCombinators.Parsec
par = eof <|> (char 'a' >> par)
alst = take 200000000 (repeat 'a')
main = print (runParser par () "" alst)
----
It runs in constant memory:
$ ./partest.exe +RTS -sstderr
C:\cygwin\home\Metharius\killer\killerPy\ansi2html\partest.exe +RTS -sstderr
Right ()
84,326,845,636 bytes allocated in the heap
22,428,536 bytes copied during GC
9,684 bytes maximum residency (1 sample(s))
13,848 bytes maximum slop
1 MB total memory in use (0 MB lost due to fragmentation)
Generation 0: 160845 collections, 0 parallel, 0.63s, 0.63s elapsed
Generation 1: 1 collections, 0 parallel, 0.00s, 0.00s elapsed
INIT time 0.02s ( 0.00s elapsed)
MUT time 54.31s ( 54.55s elapsed)
GC time 0.63s ( 0.63s elapsed)
EXIT time 0.00s ( 0.00s elapsed)
Total time 54.95s ( 55.17s elapsed)
%GC time 1.1% (1.1% elapsed)
Alloc rate 1,552,176,623 bytes per MUT second
Productivity 98.8% of total user, 98.4% of total elapsed
Best regards
Christopher Skrzętnicki
ansi2html.hs
Description: Binary data
_______________________________________________ Haskell-Cafe mailing list [email protected] http://www.haskell.org/mailman/listinfo/haskell-cafe
