dons:
> briqueabraque:
> >   Hi,
> > 
> >   I need to edit big text files (5 to 500 Mb). But I just need to 
> > change one or two small lines, and save it. What is the best way to do 
> > that in Haskell, without creating copies of the whole files?
> > 

Thinking further: since you want to avoid making a copy on disk, the
edited version has to be held in memory before it is written back over
the original. A strict ByteString is the best fit for that, for example:

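    import System.Environment (getArgs)
    import qualified Data.ByteString.Char8 as B

    main :: IO ()
    main = do
        [f] <- getArgs
        -- The strict readFile loads the whole file and closes it, so
        -- writing back to the same path afterwards is safe.
        B.writeFile f . B.unlines . map edit . B.lines =<< B.readFile f
      where
        -- Replace any line that starts with "Instances" by "EDIT".
        edit :: B.ByteString -> B.ByteString
        edit s | B.pack "Instances" `B.isPrefixOf` s = B.pack "EDIT"
               | otherwise                           = s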

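Compiled and run on a 100M file, this takes about 17 seconds of
wall-clock time, of which only about 2.3 seconds is CPU:

    $ ghc -O -funbox-strict-fields A.hs -package fps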
    $ time ./a.out /home/dons/data/100M
    ./a.out /home/dons/data/100M  1.54s user 0.76s system 13% cpu 17.371 total

You could probably tune this further.
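
One possible direction, sketched here under a strong assumption: if the
replacement text has exactly the same byte length as the text it
replaces, the file can be patched in place by seeking to the right
offset and overwriting only those bytes. This is a different technique
from the read-everything program above, not a tuned version of it. The
helper name overwriteAt, the scratch file demo.txt and the offset are
all made up for illustration; finding the offset of the line to change
is left out.

```haskell
import System.IO
import qualified Data.ByteString.Char8 as B

-- Hypothetical helper (not from the program above): overwrite bytes in
-- place, starting at a byte offset. Nothing is shifted, so the replacement
-- must be exactly as long as the text it overwrites.
overwriteAt :: FilePath -> Integer -> B.ByteString -> IO ()
overwriteAt path offset bytes =
    withBinaryFile path ReadWriteMode $ \h -> do
        hSeek h AbsoluteSeek offset   -- jump to the start of the target text
        B.hPut h bytes                -- overwrite only these bytes

main :: IO ()
main = do
    let path = "demo.txt"             -- hypothetical scratch file
    B.writeFile path (B.pack "header\nInstances\nfooter\n")
    -- "Instances" starts at byte 7 and is 9 bytes long, so "EDIT" is
    -- padded with spaces to 9 bytes to keep the file length unchanged.
    overwriteAt path 7 (B.pack "EDIT     ")
    B.readFile path >>= B.putStr
```

Only the overwritten bytes are written to disk, so the cost no longer
depends on the size of the file. The price is the fixed-length
restriction: a shorter replacement has to be padded, and a longer one
cannot be handled this way at all.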

-- Don