Yes. `P.fold` is definitely strict in its accumulator, and your `Map`
fold looks correct to me. As an aside, you should probably be building a
`Set` instead of a `Map`.
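To make the `Set` suggestion concrete, here is a minimal sketch. The name `uniqueValues` and the small in-memory producer built with `each` are only illustrative stand-ins for your `names` pipeline, not anything taken from your code:

```haskell
{-# LANGUAGE OverloadedStrings #-}

import           Data.ByteString (ByteString)
import qualified Data.Set        as S
import           Pipes           (Producer, each)
import qualified Pipes.Prelude   as P

-- Collect the distinct elements of a stream into a Set.
-- 'uniqueValues' is a hypothetical helper name used for illustration.
uniqueValues :: Monad m => Producer ByteString m () -> m (S.Set ByteString)
uniqueValues = P.fold (flip S.insert) S.empty id

main :: IO ()
main = do
    -- A tiny stand-in for the real 'names' producer.
    s <- uniqueValues (each ["b", "a", "b", "c"])
    print (S.size s)     -- 3
    print (S.findMax s)  -- "c"
```

Since `S.insert` already handles the case where the element is present, the explicit `member` check is not needed, and there is no dummy `1` value to store per key.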
Are you sure that it's the map that is leaking space? You should
profile your heap using the instructions here:
https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/prof-heap.html
Also, how are `bslines` and `toVector` implemented?
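The reason I ask: slicing operations on a strict `ByteString` (`take`, `drop`, `split`, ...) return values that share the original buffer, so a small field stored in the map can keep a much larger chunk alive. I have not seen your `bslines` or `toVector`, so this is only a guess at what may be happening. If it does apply, copying the field before storing it is one way to drop the reference. A minimal sketch, where `fieldCopy` is a hypothetical helper and a comma-separated line with enough fields is assumed:

```haskell
{-# LANGUAGE OverloadedStrings #-}

import qualified Data.ByteString.Char8 as B

-- Hypothetical helper: extract the i-th comma-separated field and copy it,
-- so the result no longer shares the buffer of the whole input line.
-- Assumes the line has at least (i + 1) fields; (!!) fails otherwise.
fieldCopy :: Int -> B.ByteString -> B.ByteString
fieldCopy i line = B.copy (B.split ',' line !! i)

main :: IO ()
main = do
    let line = "alice,30,london"
    B.putStrLn (fieldCopy 0 line)  -- prints: alice
```

In your pipeline the equivalent would be an extra `P.map B.copy` stage after selecting the column, but a heap profile should confirm whether retained buffers are actually the problem before you change anything.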
On 8/21/2015 9:33 PM, Alexey Raga wrote:
Sorry for another beginner's question, but is P.fold strict or lazy on
its accumulator?
In the documentation it says that it is "Strict fold of the elements
of a 'Producer'", but here is what I see:
I have a big CSV file (50M rows), and one of the columns contains
about 4.5M unique values. I fold these values into a Data.Map.Strict:

    names :: Producer ByteString IO ()
    names = bslines
        >-> P.map toVector
        >-> P.map (V.! 25)

    main :: IO ()
    main = do
        m <- P.fold (\s a -> if M.member a s then s else M.insert a 1 s)
                    M.empty id names
        Prelude.print $ M.size m
        Prelude.print $ M.findMax m
        Prelude.print "ok"

The output suggests that the map is of the right size, and the max
element is correct. This takes ~6.4GB of RAM.
Now I extract these 4.5M unique values into a separate file and run
the same code again (only the column index is changing, nothing else).
The output is the same (same size, same max element), except that now
it only takes 1.2GB of RAM to run.
Am I right in suspecting that laziness causes this issue? If so, where
is it, and how can I fix it?
Cheers,
Alexey.
--
You received this message because you are subscribed to the Google
Groups "Haskell Pipes" group.
To unsubscribe from this group and stop receiving emails from it, send
an email to haskell-pipes+unsubscr...@googlegroups.com.
To post to this group, send email to haskell-pipes@googlegroups.com.