Re: [Haskell-cafe] file splitter with enumerator package

2011-07-26 Thread yi huang
On Tue, Jul 26, 2011 at 12:19 PM, yi huang wrote:
> Actually, I'm wondering how to do exception handling and resource cleanup
> in an iteratee, e.g. your `writer` iteratee. I found it difficult, because
> iteratees are designed to let the enumerator manage resources.
>
>
I've found the answer for myself,

Re: [Haskell-cafe] file splitter with enumerator package

2011-07-25 Thread yi huang
Actually, I'm wondering how to do exception handling and resource cleanup in an iteratee, e.g. your `writer` iteratee. I found it difficult, because iteratees are designed to let the enumerator manage resources.

On Sat, Jul 23, 2011 at 2:41 AM, Eric Rasmussen wrote:
> Hi everyone,
>
> A friend of mine rec

Re: [Haskell-cafe] file splitter with enumerator package

2011-07-25 Thread David McBride
I feel like there is a slightly better way to code this, by splitting the file-outputting part from the part that counts and checks for newlines, like so:

    run_ $ (EB.enumFile "file.txt" $= toChunksnl 4096) $$ toFiles filelist

    toFiles [] = error "expected infinite file list"
    toFiles (f:fs) = do

Re: [Haskell-cafe] file splitter with enumerator package

2011-07-25 Thread Eric Rasmussen
I just found another solution that seems to work, although I don't fully understand why. In my original function where I used EB.take to strictly read in a Lazy ByteString and then L.hPut to write it out to a handle, I now use this instead (full code in the annotation here: http://hpaste.org/49366)

Re: [Haskell-cafe] file splitter with enumerator package

2011-07-25 Thread David McBride
Well I was going to say:

    import Data.Text.IO as T
    import Data.Enumerator.List as EL
    import Data.Enumerator.Text as ET

    run_ $ (ET.enumHandle fp $= ET.lines) $$ EL.mapM_ T.putStrLn

for example. But it turns out this actually concatenates the lines together and prints one single string at the end.

Re: [Haskell-cafe] file splitter with enumerator package

2011-07-25 Thread Yves Parès
Okay, so there, the chunks (xs) will be lines of Text, and not just random blocks. Isn't there a primitive like printChunks in the enumerator library, or are we forced to handle Chunks and EOF by hand?

2011/7/25 David McBride
> blah = do
>   fp <- openFile "file" ReadMode
>   run_ $ (ET.enumHandle

Re: [Haskell-cafe] file splitter with enumerator package

2011-07-25 Thread David McBride
    blah = do
      fp <- openFile "file" ReadMode
      run_ $ (ET.enumHandle fp $= ET.lines) $$ printChunks True

printChunks is super duper simple:

    printChunks printEmpty = continue loop where
      loop (Chunks xs) = do
        let hide = null xs && not printEmpty
        CM.unless hide

Re: [Haskell-cafe] file splitter with enumerator package

2011-07-25 Thread Yves Parès
Sorry, I'm only beginning to understand iteratees, but then how do you access each line of text output by the enumeratee "lines" within an iteratee?

2011/7/24 Felipe Almeida Lessa
> On Sun, Jul 24, 2011 at 12:28 PM, Yves Parès wrote:
> > If you used Data.Enumerator.Text, you would maybe benefit

Re: [Haskell-cafe] file splitter with enumerator package

2011-07-24 Thread Eric Rasmussen
Since the program only needs to finish a line after it's made a bulk copy of a potentially large chunk of a file (could be 25-500 MB), I was hoping to find a way to copy the large chunk in constant memory and without inspecting the individual bytes/characters. I'm still having some difficulty wit

Re: [Haskell-cafe] file splitter with enumerator package

2011-07-24 Thread Felipe Almeida Lessa
On Sun, Jul 24, 2011 at 12:28 PM, Yves Parès wrote:
> If you used Data.Enumerator.Text, you would maybe benefit from the "lines"
> function:
>
> lines :: Monad m => Enumeratee Text Text m b

It gets arbitrary blocks of text and outputs lines of text.

> But there is something I don't get with that sign

Re: [Haskell-cafe] file splitter with enumerator package

2011-07-24 Thread Yves Parès
If you used Data.Enumerator.Text, you would maybe benefit from the "lines" function:

    lines :: Monad m => Enumeratee Text Text m b

But there is something I don't get with that signature: why isn't it:

    lines :: Monad m => Enumeratee Text [Text] m b

??

2011/7/23 Eric Rasmussen
> Hi Felipe,
>
> Thank

Re: [Haskell-cafe] file splitter with enumerator package

2011-07-22 Thread Eric Rasmussen
Hi Felipe,

Thank you for the very detailed explanation and help. Regarding the first point, for this particular use case it's fine if the user-specified file size is extended by the length of a partial line (it's a compact CSV file, so if the user breaks a big file into 100 MB chunks, each chunk wou

Re: [Haskell-cafe] file splitter with enumerator package

2011-07-22 Thread Felipe Almeida Lessa
There is one problem with your algorithm. If the user asks for 4 GiB, then the program will create files of *at least* 4 GiB. So the user would need to ask for less, maybe 3.9 GiB. Even so, there's some danger, because there could be a 0.11 GiB line in the file. Now, the biggest problem your c
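
The "at least" behaviour described above can be illustrated with a small pure sketch. This is not the code from the thread: `splitAtLine` and `splitAll` are hypothetical helpers, they work on an in-memory `String` rather than on enumerator streams, and they make no attempt at constant-memory operation. They only show why cutting at the next newline after the requested size can overshoot that size by up to one full line.

```haskell
module Main where

-- Hypothetical helper (not from the thread): take at least n characters,
-- then extend the cut to the end of the current line so that no line is
-- broken in the middle.
splitAtLine :: Int -> String -> (String, String)
splitAtLine n s
  | null rest || endsWithNewline bulk = (bulk, rest)  -- cut is already on a line boundary
  | otherwise =
      case after of
        []       -> (bulk ++ lineTail, [])            -- last line has no trailing newline
        (_ : xs) -> (bulk ++ lineTail ++ "\n", xs)    -- keep the newline with its line
  where
    (bulk, rest)      = splitAt n s
    (lineTail, after) = break (== '\n') rest
    endsWithNewline t = not (null t) && last t == '\n'

-- Hypothetical helper: split the whole input into chunks of at least n
-- characters, each ending on a line boundary.
splitAll :: Int -> String -> [String]
splitAll _ [] = []
splitAll n s  = chunk : splitAll n rest
  where (chunk, rest) = splitAtLine n s

main :: IO ()
main = do
  let chunks = splitAll 5 "ab\ncdef\ngh\n"
  mapM_ print chunks
  -- The first chunk is 8 characters long although only 5 were requested.
  print (map length chunks)  -- prints [8,3]
```

With a requested size of 5, the first chunk grows to 8 characters because the cut falls in the middle of the line "cdef". A single very long line would make the overshoot correspondingly large, which is the danger pointed out above.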

[Haskell-cafe] file splitter with enumerator package

2011-07-22 Thread Eric Rasmussen
Hi everyone,

A friend of mine recently asked if I knew of a utility to split a large file (4 GB in his case) into arbitrarily-sized files on Windows. Although there are a number of file-splitting utilities, the catch was it couldn't break in the middle of a line. When the standard "why don't you us