On Nov 18, 2008, at 7:03 AM, Bulat Ziganshin wrote:
> Hello Tillmann,
> Tuesday, November 18, 2008, 2:46:47 PM, you wrote:
>> Why should a Haskell hash table need more memory than a Python hash
>> table? I've heard that Data.HashTable is bad, so maybe writing a good
>> one could be an option.
> about Data.HashTable: it uses one huge array to store all the entries.
> the catch is that the GC needs to scan the entire array on every (major) GC.
Actually, the scan on every major (full) GC is unavoidable. What
*can* be avoided is a scan on every *minor* GC that occurs after an
update. I forget what the exact strategy is here, but I know that one
write used to cause the entire array to be re-scanned; what I don't
remember is when/if the array transitions back to a state where it
isn't being scanned by minor GC anymore.
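The shape of the data structure being discussed can be sketched with a single big boxed mutable array (using `Data.Array.IO` from the `array` package that ships with GHC, not Data.HashTable's actual internals); the GC behavior is as described above, the code only shows why one write is problematic:

```haskell
import Data.Array.IO

main :: IO ()
main = do
  -- One big boxed mutable array: the same shape as Data.HashTable's
  -- single bucket array holding every entry.
  buckets <- newArray (0, 999999) Nothing :: IO (IOArray Int (Maybe String))
  -- A single write dirties the whole array; per the discussion above,
  -- the collector of the time would then re-scan all 10^6 slots on
  -- subsequent minor GCs to find the one new pointer.
  writeArray buckets 123 (Just "hello")
  v <- readArray buckets 123
  print v
```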
> using an array of hash tables may improve the situation a lot
Yes, this would be worth trying. Understanding the current GC
strategy would make it easier to make the right tradeoffs here; we
expect n insertions will touch O(n) subtables, so repeated insertion
will make life worse if we're not careful.
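One way the array-of-subtables idea could look (a sketch only, under assumptions: `IORef`'d `Data.Map`s stand in for the subtables, and `numSub`, `SplitTable`, and the toy hash are made-up names, not anything from Data.HashTable):

```haskell
import Data.Array (Array, listArray, (!))
import Data.Bits ((.&.))
import Data.IORef
import qualified Data.Map as M

-- A fixed, immutable spine of small mutable subtables,
-- selected by the low bits of the key's hash.
newtype SplitTable k v = SplitTable (Array Int (IORef (M.Map k v)))

numSub :: Int
numSub = 16  -- arbitrary; a power of two so (.&.) picks a subtable

newTable :: IO (SplitTable k v)
newTable = do
  refs <- mapM (const (newIORef M.empty)) [1 .. numSub]
  return (SplitTable (listArray (0, numSub - 1) refs))

subFor :: (k -> Int) -> SplitTable k v -> k -> IORef (M.Map k v)
subFor hash (SplitTable spine) k = spine ! (hash k .&. (numSub - 1))

-- Each insert mutates only one small subtable; the spine itself is
-- never written, so the GC never has to re-scan all of it after a write.
insertT :: Ord k => (k -> Int) -> SplitTable k v -> k -> v -> IO ()
insertT hash t k v = modifyIORef' (subFor hash t k) (M.insert k v)

lookupT :: Ord k => (k -> Int) -> SplitTable k v -> k -> IO (Maybe v)
lookupT hash t k = M.lookup k <$> readIORef (subFor hash t k)
```

The tradeoff flagged above still applies: n insertions spread across up to `numSub` subtables, so the subtable count has to be weighed against how many small mutable objects each collection ends up tracking.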
-Jan-Willem Maessen
> plus check GC times for every version: +RTS -Soutfile
> --
> Best regards,
> Bulat mailto:[EMAIL PROTECTED]
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe