Hi Luke, Brian,

oh my god it works!

I'm just saying it because that's a first!

Late last night I had already suspected that laziness might be one of the reasons why the memoization technique showed no effect on memory consumption and after Brian's email that's exactly where I tried again, using Bang Patterns.


IT WORKED!

The memory consumption is now down by 20 fold!

Thank you guys very very much!

Günther



Am 06.09.2009, 03:24 Uhr, schrieb Brian Sniffen <[email protected]>:

Remember that each string will still be read, so you may still
allocate a great deal of memory and generate a great deal of garbage.
You will just release it before long, rather than hold it for the life
of your program.  Remember also that the memoization itself is lazy:
it does no good if you read all the file in, then force the
memoizations along with the computation of interest.

-Brian

2009/9/5 Günther Schmidt <[email protected]>:
Hi Luke,

thanks, this is some very good advice as I find many duplicates in the data
I have to iterate over.

However so far I'm unable to tell wether this actually works or not, I tried it a couple of times under different settings but it showed to difference in
memory consumption. The same mem peeks as before.

Do you have some code that where you could see a before and after?

Günther


Am 05.09.2009, 19:38 Uhr, schrieb Luke Palmer <[email protected]>:

2009/9/5 Günther Schmidt <[email protected]>:

Hi all,

I'm reading in a data of 216k records into a map of Key, Values Pairs,
the
values being strings.

As it happens out of 216k String values there really are only about 6.6k
distinct string values, so I could save a lot of RAM if I was able to
"insert" only actually *new* string values into the map and use
references
to (string) values that already are in memory instead.

Is there a container that would, if I wanted to insert an element, return
a
pair of either the previously inserted, equal value and the container
unchanged, or the new, previously unknown value and the new container
amended by that element?

I believe a memoization of the identity function will do what you want:

import qualified Data.MemoCombinators as Memo

share = Memo.list Memo.char id

Then pass any string through share to make/get a cached version.

You might want to limit the scope of share -- eg. put it in a where
clause for the function where you're using it -- so that it doesn't
eat memory for the lifetime of your program, only for when you need
it.

Luke

_______________________________________________
Haskell-Cafe mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/haskell-cafe





_______________________________________________
Haskell-Cafe mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/haskell-cafe

Reply via email to