Hi Iavor,
Thank you very much for this. It's nice to know that we have the ability
in Haskell to be as frugal (or profligate) with memory as R when working
with data frames. I should say that this number of fields is actually quite
low in the data science world; data sets with 500 columns are not uncommon.
Hello,
when you parse the CSV fully, you end up creating a lot of small bytestring
objects, and each of these adds some overhead. The vectors holding the fields
add some additional overhead on top of that. All of this adds up when you have
as many fields as you do. An alternative would be to use a different
representation, one that avoids allocating a separate bytestring per field.
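For concreteness, here is a minimal sketch of where that per-field overhead
comes from. The figures in the comments (roughly 32 bytes of header per strict
ByteString on a 64-bit GHC, and the 1,000,000-row example) are illustrative
assumptions, not measurements from this thread:

> import qualified Data.ByteString.Char8 as BC
>
> main :: IO ()
> main = do
>   -- Splitting a row shares the underlying byte buffer, but each
>   -- slice is still a fresh ByteString heap object of its own.
>   let row    = BC.pack "alice,30,engineer,london"
>       fields = BC.split ',' row   -- one new ByteString per field
>   print fields
>   -- Back-of-envelope estimate (assumed sizes for a 64-bit GHC):
>   -- ~32 bytes of header per field means 1e6 rows * 500 columns
>   -- costs on the order of 16 GB in headers alone, before counting
>   -- the vector spines that hold the fields.
>   let rows = 1000000 :: Integer
>       cols = 500
>       hdr  = 32
>   print (rows * cols * hdr)

The buffer contents are shared, so the cost is not the field data itself but
the per-slice headers and the spines of the vectors pointing at them.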
Sounds bad. But it'll need someone with bytestring expertise to debug. Maybe
there's an underlying GHC problem, or maybe it's a shortcoming of bytestring.
Simon