On 26/03/2010 20:28, Mads Lindstrøm wrote:
Hi

For some time I have been thinking about an idea, which could limit
Haskell's memory footprint. I don't know if the idea is crazy or clever,
but I would love to hear peoples thoughts about it. The short story is,
I propose that the garbage collector should not just reclaim unused
memory, it should also diminish the need for pointers, by replacing
nested data structures with larger chunks of consecutive memory. In
other words, I would diminish the need for pointers for arbitrary
recursive data types, just as replacing linked lists with arrays
eliminate the need for pointers.

I will explain my idea by an example of a data type we all know and
love:

data List a = Cons a (List a) | Nil

each Cons cell uses two pointers - one for the element and one for the
rest of the list. If we could somehow merge the element and the rest of
the list into consecutive memory, we would be able to eliminate the
pointer to the rest of list. On 64 bit architectures merging would save
us 8 bytes of "wasted" memory. If we could merge n elements together we
could save n*8 bytes of memory.

The trouble with techniques like this is that they break the uniformity of the representation, and complexity leaks all over the place. Various similar ideas have been tried in the past, though not with Haskell as far as I'm aware: CDR-coding and BiBOP spring to mind.

64 bit pointers are wasteful in another way, as nobody has anywhere near
2^64 bytes of memory. And nobody is going to have that much memory for
decades to come. Thus, we could use the 8 most significant bits for
something besides pointing to memory, without loosing any significant
ability to address our memories. This would be similar to how GHC uses
some of the least significant bits for pointer tagging.

Unfortunatley you don't know whereabouts in your address space your memory is located: the OS could do randomised allocation and give you a pages all over the place, so in fact you might need all 64 bits. Yes you can start doing OS-specific things, but that always leads to pain later (I know, I've been there).

In the Java community there has been experimentation with using different representations to avoid the 64-bit pointer overhead: e.g. shifting a 32-bit pointer to the left by 2 bits in order to access 16GB of memory. Personally I'm not keen on doing this kind of trick, mainly because in GHC it would be a compile-time switch; Java has it easy here because they can make the choice at runtime. Also we already use the low 2 bits of pointers in GHC for pointer tagging.

Cheers,
        Simon
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Reply via email to