Re: [Haskell-cafe] UArray Word16 Word32 uses twice as much memory as it should?

Arne Dehli Halvorsen Wed, 19 Nov 2008 10:34:04 -0800

Bulat Ziganshin wrote:

Hello Arne,


Wednesday, November 19, 2008, 11:57:01 AM, you wrote:

finding that it uses about twice as much memory as I had anticipated.

Hello, and thank you for your reply.

it may be
1) GC problem (due to GC haskell programs occupies 2-3x more memory
than actually used)

I wasn't aware of that - but it should be possible to trigger a GC after loading a whole lot of data?
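A minimal sketch of forcing a collection once loading is done, using `performGC` from `System.Mem`. The loader `loadRatings` is a hypothetical stand-in that builds one tiny array; it is not code from this thread. Note that a major collection reclaims garbage inside the Haskell heap but does not necessarily hand memory back to the operating system.

```haskell
import Control.Exception (evaluate)
import Data.Array.Unboxed (UArray, listArray)
import Data.Word (Word32)
import System.Mem (performGC)

-- Hypothetical stand-in for the real loader: builds one small unboxed array
-- and forces it, so the construction work is done before we collect.
loadRatings :: IO (UArray Int Word32)
loadRatings = evaluate (listArray (0, 9) [1 .. 10])

-- Load, then ask the runtime for a major collection so that garbage
-- produced while parsing is reclaimed before the next phase allocates.
loadThenCollect :: IO (UArray Int Word32)
loadThenCollect = do
  arr <- loadRatings
  performGC
  return arr
```

Running the program with `+RTS -s` prints the runtime's own memory statistics, which should help tell live data apart from GC overhead.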

2) additional data (you not said how long each small array. you should
expect 10-30 additional bytes used for every array)

The arrays represent the netflix data set: 100 000 000 ratings, given for 17770 films.

For each of the films, I want to hold (on average, roughly) 5600 ratings, held as one person id (32-bit) and one rating (8-bit), in the respective arrays.

(In addition, I want to be able to load the inversion of this data: for all persons, I want to hold their ratings in a similar way: 16-bit film id, 8-bit rating. There are 480000 persons, so this should be on average 200 entries per person.)
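As a rough check on the expected footprint, the arithmetic can be written out from the totals quoted in this message. This is a sketch that counts payload bytes only; the 30-byte per-array figure is the upper end of the 10-30 bytes suggested in the quoted reply, and all names below are made up for illustration.

```haskell
-- Figures quoted in the message.
nRatings, nFilms, nPersons :: Integer
nRatings = 100000000
nFilms   = 17770
nPersons = 480000

-- Payload bytes only: no array headers, no GC overhead.
perFilmBytes :: Integer
perFilmBytes = nRatings * (4 + 1)    -- 32-bit person id + 8-bit rating

perPersonBytes :: Integer
perPersonBytes = nRatings * (2 + 1)  -- 16-bit film id + 8-bit rating

-- Two small arrays (ids and ratings) per film and per person,
-- at an assumed 30 bytes of overhead each.
headerBytes :: Integer
headerBytes = 30 * 2 * (nFilms + nPersons)

-- Average number of entries per film and per person (rounded down).
avgPerFilm, avgPerPerson :: Integer
avgPerFilm   = nRatings `div` nFilms
avgPerPerson = nRatings `div` nPersons
```

Under these assumptions the per-film layout needs about 500 MB of payload and the per-person layout about 300 MB, with roughly 30 MB of per-array overhead, at about 5,600 entries per film and 200 per person.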

I have coded a few approaches to inverting this, but I can't allocate the array before traversing the data, because I don't know the sizes.


How can one go about inverting this data in memory?
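One possible way around the unknown sizes is a two-pass counting approach: count the ratings per person first, turn the counts into start offsets, allocate the flat arrays once, then fill them in a second traversal. The sketch below is only an illustration, not a tested solution for the full data set. The `Rating` triple, the `ByPerson` record, and the assumption that person ids have already been mapped to dense indices `0 .. nPersons - 1` are all hypothetical. The input is a list for brevity; in practice each pass would re-read the source so the list is not retained between passes.

```haskell
import Control.Monad (forM_)
import Control.Monad.ST (ST, runST)
import Data.Array.ST (STUArray, freeze, newArray, readArray, writeArray)
import Data.Array.Unboxed (UArray, (!))
import Data.Word (Word16, Word8)

-- One rating as read from the film-major data: (film id, person index, rating).
type Rating = (Word16, Int, Word8)

-- Person-major layout: person p owns slots offsets!p .. offsets!(p+1) - 1.
data ByPerson = ByPerson
  { offsets :: UArray Int Int
  , filmIds :: UArray Int Word16
  , stars   :: UArray Int Word8
  }

invert :: Int -> [Rating] -> ByPerson
invert nPersons rs = runST $ do
  cursor <- newArray (0, nPersons) 0 :: ST s (STUArray s Int Int)
  -- Pass 1: count ratings per person, stored one slot to the right.
  forM_ rs $ \(_, p, _) -> do
    c <- readArray cursor (p + 1)
    writeArray cursor (p + 1) (c + 1)
  -- Prefix sums turn the counts into start offsets.
  forM_ [1 .. nPersons] $ \p -> do
    prev <- readArray cursor (p - 1)
    c    <- readArray cursor p
    writeArray cursor p (prev + c)
  total <- readArray cursor nPersons
  offs  <- freeze cursor  -- immutable copy; cursor is mutated below
  films <- newArray (0, total - 1) 0 :: ST s (STUArray s Int Word16)
  vals  <- newArray (0, total - 1) 0 :: ST s (STUArray s Int Word8)
  -- Pass 2: drop each rating into the next free slot of its person.
  forM_ rs $ \(f, p, r) -> do
    i <- readArray cursor p
    writeArray films i f
    writeArray vals i r
    writeArray cursor p (i + 1)
  -- freeze copies; unsafeFreeze from Data.Array.Unsafe would avoid the copy.
  fs <- freeze films
  vs <- freeze vals
  return (ByPerson offs fs vs)

-- All (film id, rating) pairs recorded for one person.
ratingsOf :: ByPerson -> Int -> [(Word16, Word8)]
ratingsOf bp p =
  [ (filmIds bp ! i, stars bp ! i)
  | i <- [offsets bp ! p .. offsets bp ! (p + 1) - 1] ]
```

With this layout there are three flat arrays in total rather than two small arrays per person, so the per-array overhead mentioned in the quoted reply is paid only a few times.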

It seems that any kind of laziness will fill the whole memory before I have traversed the whole set - and if I use several accumArrays, it seems that it will hold the whole uncompacted dataset in memory between accumArrays.
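For comparison, a single `accumArray` into an unboxed array can be sketched as below. Because a `UArray` stores its elements unboxed, no per-slot thunks can accumulate; whether the input list itself is retained depends on how it is produced and on whether anything else still refers to it, which this sketch does not address. The function name and the dense person indices are assumptions for illustration.

```haskell
import Data.Array.Unboxed (UArray, accumArray, elems)

-- Count how many ratings each person has, given one person index per rating.
-- Person indices are assumed to lie in 0 .. nPersons - 1.
countsPerPerson :: Int -> [Int] -> UArray Int Int
countsPerPerson nPersons people =
  accumArray (+) 0 (0, nPersons - 1) [ (p, 1) | p <- people ]
```

For example, `elems (countsPerPerson 3 [0, 2, 0, 1])` gives one count per person, in index order.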

Ideally I want to hold all ratings as well as statistics for all films, and the same for all the persons - and then have room to spare for running an algorithm...


Best regards,
Arne D Halvorsen

_______________________________________________
Haskell-Cafe mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/haskell-cafe
