Re: [Haskell] A opportunity to lern (parsing huge binary file)

2011-03-19 Thread S. Doaitse Swierstra
The uu-parsinglib library supports every data type that is an instance of
Data.ListLike
(http://hackage.haskell.org/packages/archive/ListLike/3.0.1/doc/html/Data-ListLike.html#t:ListLike)
and thus input from Data.ByteString.Lazy.

A very small starting program can be found below. Note that here we ask for the
error corrections made during parsing at the end of the processing; that is
probably something you do not want to do, unless you only keep a very small
part of the input in the result. The parsers are online and do not hang on to
the input, so you essentially only access and keep the part of the result you
are interested in.

We find it a great help to have the error correction at hand since it makes it 
a lot easier to debug your parser. Here we just recognise any list of Word8's.

 Doaitse





{-# LANGUAGE MultiParamTypeClasses #-}
module ReadLargeBinaryFile where

import Text.ParserCombinators.UU
import Text.ParserCombinators.UU.BasicInstances
import Data.Word
import Data.ByteString.Lazy (ByteString,readFile)
import Prelude hiding (readFile)


-- Parser over a lazy ByteString of Word8 tokens, with an Integer position
type BS_Parser a = P (Str Word8 ByteString Integer) a

-- Every token consumed advances the position by one byte
instance IsLocationUpdatedBy Integer Word8 where
   advance pos _ = pos + 1

-- Recognise any list of Word8s
p :: BS_Parser [Word8]
p = pList (pSatisfy (const True) (Insertion "" 0 0))

main :: FilePath -> IO ()
main filename = do
  inp <- readFile filename
  let r@(a, errors) = parse ((,) <$> p <*> pEnd) (createStr 0 inp)
  putStrLn ("--  Result: " ++ show a)
  -- Asking for the errors forces the whole input to be processed
  if null errors
    then return ()
    else do putStr ("--  Correcting steps: \n")
            show_errors errors
  putStrLn "-- "
  where
    show_errors :: (Show a) => [a] -> IO ()
    show_errors = sequence_ . (map (putStrLn . show))



On 10 Mar 2011, at 16:36, Skeptic . wrote:

> Hi,
> I finally have an opportunity to learn Haskell (I'm a day-to-day Java
> programmer, but I'm also at ease with Scheme), parsing a huge (i.e. up to 50
> GB) binary file. The encoding is very stable, but it's not a flat struct
> array (i.e. it uses flags).
> Different outputs (i.e. text files) will be needed, some unknown at this
> time.
> Sounds to me a perfect real-world task to see what Haskell can offer.
>
> Any suggestions at how to structure the code or on which packages to look at
> is welcome.
>
> Thanks.


___
Haskell mailing list
Haskell@haskell.org
http://www.haskell.org/mailman/listinfo/haskell


Re: [Haskell] A opportunity to lern (parsing huge binary file)

2011-03-11 Thread Ketil Malde
Skeptic . skeptic2...@hotmail.com writes:

> I finally have an opportunity to learn Haskell (I'm a day-to-day Java
> programmer, but I'm also at ease with Scheme), parsing a huge (i.e. up
> to 50 GB) binary file. The encoding is very stable, but it's not a
> flat struct array (i.e. it uses flags).

I use binary 0.5; later versions can no longer read a list of items
lazily. I believe attoparsec has the same restriction.
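
A minimal sketch of one way to get a lazily produced list of items, assuming a
newer release of the binary package than the ones discussed in this thread, one
that provides runGetOrFail in Data.Binary.Get. The Item layout (one tag byte
followed by a big-endian 16-bit value) and the names getItem and lazyItems are
hypothetical and only serve as illustration; this is not the approach described
above, which relies on the lazy behaviour of binary 0.5 itself.

```haskell
import qualified Data.ByteString.Lazy as BL
import Data.Binary.Get (Get, getWord8, getWord16be, runGetOrFail)
import Data.Word (Word8, Word16)

-- Hypothetical item: one tag byte followed by a big-endian 16-bit value.
data Item = Item { itemTag :: Word8, itemValue :: Word16 }
  deriving (Eq, Show)

-- Parser for exactly one item.
getItem :: Get Item
getItem = Item <$> getWord8 <*> getWord16be

-- Run the single-item parser repeatedly over the remaining input.
-- Each list cell is produced as soon as its item has been decoded,
-- so a consumer can process items without the whole file in memory.
lazyItems :: BL.ByteString -> [Item]
lazyItems input
  | BL.null input = []
  | otherwise =
      case runGetOrFail getItem input of
        -- Assumption: stop quietly on truncated or malformed input;
        -- a real program would probably want to report the error.
        Left _                -> []
        Right (rest, _, item) -> item : lazyItems rest
```

Combined with Data.ByteString.Lazy.readFile, something like
mapM_ print . lazyItems should then stream through the file, provided nothing
else keeps a reference to the start of the list.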

-k
-- 
If I haven't seen further, it is by standing in the footprints of giants



Re: [Haskell] A opportunity to lern (parsing huge binary file)

2011-03-11 Thread Felipe Almeida Lessa
On Fri, Mar 11, 2011 at 8:11 AM, Ketil Malde ke...@malde.org wrote:
> I use binary 0.5 (later versions can no longer read a list of items
> lazily.  I believe attoparsec has the same restriction.)

You can define a parser for one item using attoparsec and then get all
items using an enumerator with attoparsec-enumerator.

Cheers,

-- 
Felipe.



Re: [Haskell] A opportunity to lern (parsing huge binary file)

2011-03-11 Thread Nick Ingolia
I do the same, but using iteratee and attoparsec-iteratee.

Best,
--Nick

On 2011 Mar 11, at 07:51 EST, Felipe Almeida Lessa wrote:

> On Fri, Mar 11, 2011 at 8:11 AM, Ketil Malde ke...@malde.org wrote:
>> I use binary 0.5 (later versions can no longer read a list of items
>> lazily.  I believe attoparsec has the same restriction.)
>
> You can define a parser for one item using attoparsec and then get all
> items using an enumerator with attoparsec-enumerator.
>
> Cheers,
>
> --
> Felipe.

Nick Ingolia
n...@ingolia.org






[Haskell] A opportunity to lern (parsing huge binary file)

2011-03-10 Thread Skeptic .


Hi,
I finally have an opportunity to learn Haskell (I'm a day-to-day Java
programmer, but I'm also at ease with Scheme), parsing a huge (i.e. up to 50
GB) binary file. The encoding is very stable, but it's not a flat struct array
(i.e. it uses flags).
Different outputs (i.e. text files) will be needed, some unknown at this time.
Sounds to me like a perfect real-world task to see what Haskell can offer.

Any suggestions on how to structure the code or on which packages to look at
are welcome.

Thanks.


Re: [Haskell] A opportunity to lern (parsing huge binary file)

2011-03-10 Thread Piyush P Kurur
On Thu, Mar 10, 2011 at 10:36:27AM -0500, Skeptic . wrote:
> Hi,
>
> I finally have an opportunity to learn Haskell (I'm a day-to-day
> Java programmer, but I'm also at ease with Scheme), parsing a huge
> (i.e. up to 50 GB) binary file. The encoding is very stable, but
> it's not a flat struct array (i.e. it uses flags).   Different
> outputs (i.e. text files) will be needed, some unknown at this
> time.   Sounds to me a perfect real-world task to see what Haskell
> can offer.

Maybe you can try attoparsec. I have not tried it, but would like to
hear about your experience.

Regards

ppk



Re: [Haskell] A opportunity to lern (parsing huge binary file)

2011-03-10 Thread Don Stewart
ppk:
> On Thu, Mar 10, 2011 at 10:36:27AM -0500, Skeptic . wrote:
>> Hi,
>>
>> I finally have an opportunity to learn Haskell (I'm a day-to-day
>> Java programmer, but I'm also at ease with Scheme), parsing a huge
>> (i.e. up to 50 GB) binary file. The encoding is very stable, but
>> it's not a flat struct array (i.e. it uses flags).   Different
>> outputs (i.e. text files) will be needed, some unknown at this
>> time.   Sounds to me a perfect real-world task to see what Haskell
>> can offer.
>
> Maybe you can try attoparsec. I have not tried it but will like to
> hear your experience

attoparsec or Data.Binary 
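
As a rough illustration of the Data.Binary option for a format that "uses
flags", here is a small sketch built on the Get monad from Data.Binary.Get.
The record layout is entirely made up for the example (a flag byte whose bit 0
announces a 16-bit id and whose bit 1 announces a 32-bit timestamp), as are the
names Record, getRecord and optionalField; the actual file format is not
described in this thread.

```haskell
import Data.Binary.Get (Get, getWord8, getWord16be, getWord32be, runGet)
import Data.Bits (testBit)
import qualified Data.ByteString.Lazy as BL
import Data.Word (Word16, Word32)

-- Hypothetical record: a flag byte decides which optional fields follow.
--   bit 0 set -> a big-endian 16-bit id follows
--   bit 1 set -> a big-endian 32-bit timestamp follows
data Record = Record
  { recId   :: Maybe Word16
  , recTime :: Maybe Word32
  } deriving (Eq, Show)

-- Read the flag byte first, then only the fields it announces.
getRecord :: Get Record
getRecord = do
  flags <- getWord8
  rid   <- optionalField (testBit flags 0) getWord16be
  time  <- optionalField (testBit flags 1) getWord32be
  return (Record rid time)

-- Run the given parser only when the corresponding flag is set.
optionalField :: Bool -> Get a -> Get (Maybe a)
optionalField present getter
  | present   = Just <$> getter
  | otherwise = return Nothing
```

For instance, runGet getRecord applied to the bytes 3, 0, 5, 0, 0, 0, 9 would
read both fields, while a flag byte of 0 yields a record with neither.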

-- Don
