On Nov 18, 2008, at 7:03 AM, Bulat Ziganshin wrote:

Hello Tillmann,

Tuesday, November 18, 2008, 2:46:47 PM, you wrote:

>> Why should a Haskell hash table need more memory than a Python hash
>> table? I've heard that Data.HashTable is bad, so maybe writing a good
>> one could be an option.

> About Data.HashTable: it uses one huge array to store all the entries.
> The catch is that the GC needs to scan the entire array on every
> (major) GC.

Actually, the scan on every major (full) GC is unavoidable. What *can* be avoided is a scan on every *minor* GC that occurs after an update. I forget the exact strategy here, but I know that a single write used to cause the entire array to be re-scanned on each minor GC; what I don't remember is when/if the array ever transitions back to a state where minor GC stops scanning it.

> Using an array of hash tables may improve the situation a lot.

Yes, this would be worth trying. Understanding the current GC strategy would make it easier to choose the right tradeoffs here: we expect n insertions to touch O(n) subtables, so repeated insertion will make matters worse if we're not careful.
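For concreteness, here is a minimal sketch of the "array of subtables" idea. This is not the Data.HashTable API; the names (SegTable, insertSeg, etc.) are made up for illustration, and the sub-tables are plain Data.Map values behind IORefs rather than real hash tables. The point is only the shape: the top level is a fixed, *immutable* array, and each mutable cell is small, so a write dirties one small sub-table instead of one huge mutable array that the GC must rescan.

```haskell
import Data.Array (Array, listArray, (!))
import Data.IORef (IORef, newIORef, readIORef, modifyIORef')
import qualified Data.Map.Strict as M
import Control.Monad (forM)

-- A segmented table: an immutable top-level array whose cells each
-- hold a small, independently mutable sub-table.
data SegTable v = SegTable
  { segCount :: Int
  , segments :: Array Int (IORef (M.Map Int v))
  }

-- Allocate n empty sub-tables.
newSegTable :: Int -> IO (SegTable v)
newSegTable n = do
  refs <- forM [0 .. n - 1] (const (newIORef M.empty))
  return (SegTable n (listArray (0, n - 1) refs))

-- Pick the sub-table for a key (trivial "hash": key mod segCount).
segmentFor :: SegTable v -> Int -> IORef (M.Map Int v)
segmentFor t k = segments t ! (k `mod` segCount t)

-- A write mutates only one small IORef'd map, not a huge array.
insertSeg :: SegTable v -> Int -> v -> IO ()
insertSeg t k v = modifyIORef' (segmentFor t k) (M.insert k v)

lookupSeg :: SegTable v -> Int -> IO (Maybe v)
lookupSeg t k = M.lookup k <$> readIORef (segmentFor t k)

main :: IO ()
main = do
  t <- newSegTable 64
  mapM_ (\k -> insertSeg t k (k * k)) [0 .. 999]
  v <- lookupSeg t 123
  print v
```

Whether this actually helps depends on the GC details above: if a single write still marks the whole top-level structure dirty, nothing is gained, which is why measuring with +RTS -S on each variant matters.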

-Jan-Willem Maessen

> Plus, check GC times for every version: +RTS -Soutfile


> --
> Best regards,
>  Bulat                            mailto:[EMAIL PROTECTED]

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

