braver wrote:
I dump the results of a computation as a Data.Trie of [(Int,Float)].  It
contains about 5 million entries, each a list of 35 or fewer pairs.
It takes 8 minutes to load with Data.Binary and look up a single
key.  What can take so long?  If I change from compressed to
uncompressed (and then decode), it takes the same time...  It's not IO;
the CPU is loaded at 100%.

The Binary instance for Trie is based on the old Binary instance for Data.IntMap. There were some corner-case performance issues with the latter which were recently fixed[1], but I haven't had a chance to look at the new instance or to figure out whether the changes would also be relevant for Trie. So this could be one source of your problem.

[1] Alas, I can't find the thread discussing the new instance ATM.


I'm now thinking of using cereal.  Given I have Data.Binary in place,
what needs to be changed to work with cereal?  Is it binary-
compatible?  How can one construct a cereal instance for Data.Trie?

If you send me an instance, I can apply the patch.
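
For reference, here is a minimal sketch of what a cereal instance could look like, assuming the simplest possible encoding: the trie is written out as its association list via Data.Trie.toList and rebuilt with Data.Trie.fromList. The SerialTrie wrapper and the roundTrip helper are hypothetical names, introduced only to avoid an orphan instance and to make the example runnable; they are not part of either library. This makes no attempt to match the byte layout of the existing Data.Binary instance, so dumps written with Data.Binary would have to be re-encoded, and it has not been benchmarked on tries of this size.

```haskell
module Main where

import qualified Data.ByteString.Char8 as B
import Data.Serialize (Serialize (..), decode, encode)
import qualified Data.Trie as T

-- Hypothetical newtype wrapper (not part of bytestring-trie or cereal),
-- used here only so the instance below is not an orphan.
newtype SerialTrie a = SerialTrie { unSerialTrie :: T.Trie a }

-- Assumption: encode the trie as a plain list of (key, value) pairs,
-- relying on cereal's existing instances for lists, tuples and ByteString.
instance Serialize a => Serialize (SerialTrie a) where
  put = put . T.toList . unSerialTrie
  get = SerialTrie . T.fromList <$> get

-- Hypothetical helper: encode a trie to a strict ByteString and decode it again.
roundTrip :: Serialize a => T.Trie a -> Either String (T.Trie a)
roundTrip = fmap unSerialTrie . decode . encode . SerialTrie

main :: IO ()
main = do
  let t :: T.Trie [(Int, Float)]
      t = T.fromList
            [ (B.pack "foo", [(1, 0.5), (2, 1.5)])
            , (B.pack "bar", [(3, 2.5)])
            ]
  case roundTrip t of
    Left err -> putStrLn ("decode failed: " ++ err)
    Right t' -> print (T.lookup (B.pack "foo") t')
```

If the instance were added to the package itself, the wrapper would be unnecessary and the same two methods could be defined directly on Trie.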

--
Live well,
~wren
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
