On Wed, 2007-06-20 at 16:53 -0700, Stefan O'Rear wrote: > On Wed, Jun 20, 2007 at 04:49:55PM -0700, David Roundy wrote: > > To expand on that terse (but very true) statement, a list of Word8 > > increases the space usage by a factor of probably around an order of > > magnitude (two pointers + 1 byte vs 1 byte), completely destroys your > > Three pointers. > > > [ INFO PTR (like a tag but not quite) ] > [ PTR to Word8 (these are hashconsed, thankfully) ] > [ PTR to next value ]
So that's 12 bytes on a 32bit box, or 24 or a 64bit one, to represent one byte of data. For comparison, ByteStrings have a bigger overhead but a lower linear factor: sizeof [Word8] of length n : 32bit: n * 12 64bit: n * 24 sizeof ByteString of length n : 32bit: 40 + n (or 32 + n for shared bytestrings like substrings) 64bit: 80 + n (or 64 + n for shared) Incidentally a more space efficient representation that could preserve the same operations speeds (like O(1) substring) would be: data ByteString = BS ByteArray# Int Int which would be 4 unshared words and 2 shared words rather than the current 5 unshared and 4 shared that we get from using ForeignPtrs. The smallest possible would be 2 words overhead by just using a ByteArray#, but that sacrifices O(1) substring which is pretty important for a functional style. Duncan _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe