#4342: Review containers changes
----------------------------------+-----------------------------------------
    Reporter:  igloo              |        Owner:              
        Type:  bug                |       Status:  new         
    Priority:  high               |    Milestone:  7.2.1       
   Component:  libraries (other)  |      Version:  7.1         
    Keywords:                     |     Testcase:              
   Blockedby:                     |   Difficulty:              
          Os:  Unknown/Multiple   |     Blocking:              
Architecture:  Unknown/Multiple   |      Failure:  None/Unknown
----------------------------------+-----------------------------------------
 The recent containers changes have caused an allocation regression in GHC:
 {{{
 in T1969:
    containers-0.3,                allocations: 246,402,492
    containers + dons/tibbe/milan, allocations: 283,370,180 (+15%)

 in T3294:
    containers-0.3,                allocations: 832,307,880
    containers + dons/tibbe/milan, allocations: 950,107,000 (+14%)
 }}}
 Some of the patches were also pushed in a hurry in the buildup to the 7.0
 release. We should review them.

 {{{
 Fri Sep 24 16:49:46 BST 2010  Milan Straka <[email protected]>
   * Fix warnings in Data.Map and Data.Set.
   Ignore-this: cb2c0a8ecf0a57acc5941ba841aa7c40

   Only trivial changes.

 Fri Sep 24 16:33:53 BST 2010  Milan Straka <[email protected]>
   * Finish the started worker/wrapper transformation.
   Ignore-this: baeb24573242beb56c3bbe7ca67f5ff7

   Some methods (insert, lookup) were not modified as the rest
   (like insertWith, delete, ...). Also the `seq` were missing
   sometimes.

 Fri Sep 24 16:26:42 BST 2010  Milan Straka <[email protected]>
   * Merge all the OPTIONS and LANGUAGE module pragmas.
   Ignore-this: 86067abf13f0501f29c13ec7c877533c

 Fri Sep 24 16:20:08 BST 2010  Milan Straka <[email protected]>
   * Remove most INLINE from Map, Set, IntMap and IntSet.
   Ignore-this: c88c4ede21c06bfda20af131c232a720

   Because of a code bloat the INLINEs cause, remove most of
   them. The only INLINEs left are the following:
   - only in Set and Map, because in IntMap and IntSet the specialization
     does not help
   - only on functions which need Ord
   - only on 'small' functions, namely member, notMember, lookup*,
     insert*, delete*, adjust*, alter*, update*

   All other functions of Map, Set, IntMap and IntSet are marked INLINABLE,
   even if they are recursive.

   The INLINEs left are only a short-term solution. In the long run the
   auto-specialization of INLINABLE methods seems a good way (maybe
   SPECIALIZABLE).

 Fri Sep 24 12:07:05 BST 2010  Milan Straka <[email protected]>
   * Comment tests and benchmarks on foldlWithKey' which
   Ignore-this: 71b988389e6ae9a78ea3b0e20156ca2f
   was commented recently by Ian Lynagh.

 Thu Sep 23 13:56:04 BST 2010  Milan Straka <[email protected]>
   * Worker/wrapper transformation for Data.IntSet.
   Ignore-this: b0228582818f7bfb690d0853022a7809

 Tue Sep 21 12:58:21 BST 2010  Milan Straka <[email protected]>
   * Compile only the benchmark source, not the Data/*.hs.
   Ignore-this: f94d9e3ffe126cd057d23490c973a4e9

 Tue Sep 21 11:32:25 BST 2010  Milan Straka <[email protected]>
   * Add criterion-based benchmark for IntSet.hs.
   Ignore-this: 3d31a820830c7382748626bc9a1ba54

   The benchmark is nearly identical copy of Set.hs benchmark.

 Tue Sep 21 11:28:02 BST 2010  Milan Straka <[email protected]>
   * Add a testsuite for Data.IntSet.
   Ignore-this: e55484ee185e71915452bdf2a7b2a2b3

 Tue Sep 21 10:18:28 BST 2010  Milan Straka <[email protected]>
   * Further improve Data.Set balance function.
   Ignore-this: f23be37859224e9bbe919a3c0a71fdc6

   As suggested by Kazu Yamamoto, we split balance to balanceL and
   balanceR, which handle only one-sided inbalance, but need fewer
   tests than balance.

   As nearly all functions modifying the structure use balance, this
   results in speedup of many functions. On my 32-bit GHC 6.12.1,
   11% speedup for insert, 12% speedup for delete.

 Tue Sep 21 10:15:47 BST 2010  Milan Straka <[email protected]>
   * Further improve Data.Map balance function.
   Ignore-this: 8abfd027142a5183b2b5282e96ccb414

   As suggested by Kazu Yamamoto, we split balance to balanceL and
   balanceR, which handle only one-sided inbalance, but need fewer
   tests than balance.

   As nearly all functions modifying the structure use balance, this
   results in speedup of many functions. On my 32-bit GHC 6.12.1,
   20% speedup for insert, 7% speedup for delete, 5% speedup for update.

 Tue Sep 21 10:05:07 BST 2010  Milan Straka <[email protected]>
   * Changing delta to 3 in Data.Set.
   Ignore-this: a47d0c542ed9cee99ad6b17c52c977a1

   Only possible values are 3 and 4. The value 3 has much faster inserts,
   value 4 slightly faster deletes, so choosing 3.

   Also changed the inequalities to rebalance only when one subtree
   is _strictly_ larger than delta * the other one, to mimic the behaviour
   from the proof (both from the Adams' and from the one to come).

 Tue Sep 21 10:03:58 BST 2010  Milan Straka <[email protected]>
   * Changing delta to 3 in Data.Map.
   Ignore-this: 85f733f836b65b2b1038383ddb92e8e1

   Only possible values are 3 and 4. The value 3 has much faster inserts,
   value 4 slightly faster deletes, so choosing 3.

   Also changed the inequalities to rebalance only when one subtree
   is _strictly_ larger than delta * the other one, to mimic the behaviour
   from the proof (both from the Adams' and from the one to come).

 Tue Sep 14 16:04:42 BST 2010  Milan Straka <[email protected]>
   * Correct Data.Set Arbitrary instance never to return unbalanced trees.
   Ignore-this: b5c70fa98a56f225b8eb5faf420677b0

   The previous instance sometimes returned unbalanced trees,
   which broke the tests.

   Also the new instance mimics Data.Map instance more closely in the shape
   of the generated trees.

 Tue Sep 14 15:58:41 BST 2010  Milan Straka <[email protected]>
   * Correct Data.Map Arbitrary instance never to return unbalanced trees.
   Ignore-this: 114bbcc63acdb16b77140ea56aeb0a95

   The previous instance sometimes returned unbalanced trees,
   which broke the tests.

 Tue Sep 14 15:20:10 BST 2010  Milan Straka <[email protected]>
   * Improve Data.Set benchmark.
   Ignore-this: 9b878ae3aa5a43ef083abfd7f9b22513

   Add union, difference and intersection to Data.Set benchmark.

 Tue Sep 14 15:17:07 BST 2010  Milan Straka <[email protected]>
   * Improve benchmark infrastructure and Data.Map benchmark.
   Ignore-this: 67e8dafcb4abcb9c726b9b29c7c320fd

   Renamed Benchmarks.hs to Map.hs, as it only benchmarks Data.Map.
   Improve the Makefile to work with multiple benchmarks.
   Add union, difference and intersection to Data.Map benchmark.

 Tue Sep 14 15:04:17 BST 2010  Milan Straka <[email protected]>
   * Improve the performance of Data.Set balance function.
   Ignore-this: 577c511c219695b8d483af546c7387e8

   The balance function is now one monolithic function, which allows
   to perform all pattern-matches only once.

   Nearly all functions modifying Data.Map use balance.
   The improvements are 12% for insert, 14% for delete (GHC 6.12.1).

 Tue Sep 14 15:02:17 BST 2010  Milan Straka <[email protected]>
   * Improve the performance of Data.Map balance function.
   Ignore-this: 951181e035fcac90674dff3300350a1

   The balance function is now one monolithic function, which allows
   to perform all pattern-matches only once.

   Nearly all functions modifying Data.Map use balance.
   The improvements are 7-11% for various insert*, delete*, alter,
   update or intersection functions (GHC 6.12.1).

 Tue Sep 14 14:57:25 BST 2010  Milan Straka <[email protected]>
   * Improve performance of Data.Set union and difference operations.
   Ignore-this: 6dc4a186ea060b9cdb9e783db71ca280

   Use datatype storing evaluated bound instead of high-order functions.
   The improvements are over 25% for both union and difference (GHC
 6.12.1).

 Tue Sep 14 14:46:14 BST 2010  Milan Straka <[email protected]>
   * Improve performance of Data.Map union* and difference* operations.
   Ignore-this: 35b23a40ef33e9fa14eb81fdee4b152d

   Use datatype storing evaluated bound instead of high-order functions.
   The improvements are 22% for union and 20% for difference (GHC 6.12.1).

 Mon Sep 13 17:51:32 BST 2010  Milan Straka <[email protected]>
   * Make the Set store the elements evaluated (bang added).
   Ignore-this: b3f230db5bf30d93d3fddf2c81c5f3b4

 Tue Aug 31 13:43:52 BST 2010  Johan Tibell <[email protected]>
   * Improved performance of Data.Set
   Ignore-this: 38a304a0408d29a2956aa9a1fc0ce755

   Performance improvements are due to manually applying the
   worker/wrapper transformation and strictifying the keys.

   Average speed-up is 32% on a 2GHz Core 2 Duo on OS X 10.5.8

 Tue Aug 31 13:42:25 BST 2010  Johan Tibell <[email protected]>
   * Added benchmarks for Data.Set
   Ignore-this: fcacf88761034b8c534d936f0b336cc0

 Tue Aug 31 13:40:30 BST 2010  Johan Tibell <[email protected]>
   * Added a test suite for Data.Set
   Ignore-this: f430dc302c0fcb8b5d62db2272a1d6f7

   Expression coverage: 74%

 Thu Sep 23 13:08:38 BST 2010  [email protected]
   * Remove use of lambdas with irrefutable patterns
   Ignore-this: c36e90a0258c0d5262684c585c321419

 Wed Sep 15 14:51:03 BST 2010  Ian Lynagh <[email protected]>
   * Revert the recent contentious changes
   Ignore-this: fe4f71ff1ade51c11421dc9974aa0fda
   These will probably be tidied up and reinstated later, but this gets
   the package back to a releasable state.

 Tue Aug 31 12:45:55 BST 2010  Simon Marlow <[email protected]>
   * fix warnings
   Ignore-this: 53df71bc054a779b8ad2dad89c09e02d

 Tue Aug 31 10:34:46 BST 2010  Don Stewart <[email protected]>
   * Missing MagicHash for IntSet
   Ignore-this: d075f760adb9a2aa0ee04676e38a07cc

 Tue Aug 31 10:33:16 BST 2010  Don Stewart <[email protected]>
   * Performance improvements for Data.IntMap (worker/wrapper and inlining)
   Ignore-this: 206036448558d270f0eb85ef4cd55368

 Tue Aug 31 10:32:40 BST 2010  Don Stewart <[email protected]>
   * Add criterion-based benchmarking for IntMap
   Ignore-this: d7d85b9afb513532cc30f5b51a3f825e

 Tue Aug 31 10:32:02 BST 2010  Don Stewart <[email protected]>
   * Add comprehensive testsuite for IntMap
   Ignore-this: d455fedbc615e5b63ac488e605550557

 Tue Aug 31 10:29:56 BST 2010  Don Stewart <[email protected]>
   * -O2 -fregs-graph is a uniform 10% improvements for IntMap
   Ignore-this: 2372cf4be945fe7939d0af94e32c567f

 Sun Aug 29 13:26:28 BST 2010  Don Stewart <[email protected]>
   * Major bump (new functions, clarified strictness properties, vastly
 better performance)
   Ignore-this: 9bfbc58ecaa24a86be37b8c4cb043457

 Sun Aug 29 13:21:47 BST 2010  Don Stewart <[email protected]>
   * Add two new functions: foldlWithKey' and insertLookupWithKey'
   Ignore-this: a2f112653ba38737fe1b38609e06c314

   These two functions use strict accumulators, compared to their existing
   counterparts (which are lazy left folds, that appear not to be useful).
   Performance is significantly better.


 Sun Aug 29 12:46:11 BST 2010  Don Stewart <[email protected]>
   * Add a criterion-based benchmark suite for Data.Map
   Ignore-this: ec61668f5bcb78bd15b72e2728c01c19

   This adds a criterion-based micro-benchmarking suite for Data.Map. It
   can be used to measure performance improvements for individual top-level
   functions.

   Examples here: http://is.gd/eJHIE


 Sun Aug 29 17:33:29 BST 2010  Don Stewart <[email protected]>
   * Missed base case for updateAt worker. Spotted by Jan-Willem Maessen
   Ignore-this: b8daf1c55c163c16f50c3b54cca2dba1

 Sun Aug 29 13:02:45 BST 2010  Don Stewart <[email protected]>
   * Performance improvements to Data.Map
   Ignore-this: b4830cddfa6d62e4883f4e0f58ac4e57

   Applied several standard transformations to improve the performance of
   code:

       * Worker/wrapper of all recursive functions with constant arguments
       * Inlining of all (non-recursive) wrappers
       * Consistent use of strict keys

   Average performance improvements across common API (with GHC 6.12.3):

       * Linux / x86_64 / 2.6Ghz i7        : 48%
       * Mac OSX 10.5 / x86 / 2 Ghz Xeon   : 36%

   Graphs and raw data: http://is.gd/eJHIE

   This patch is (mostly) orthogonal to the algorithmic changes suggested
   by Milan Straka in his HW 2010 paper:

 http://research.microsoft.com/~simonpj/papers/containers/containers.pdf

   Those changes could be added separately, for some additional
 improvments.

   Work carried out over 28/29th August, 2010 in Utrecht, NL, by Johan
 Tibell
   and Don Stewart.


 Sun Aug 29 12:35:45 BST 2010  Don Stewart <[email protected]>
   * Add a comprehensive testsuite for Data.Map
   Ignore-this: 891e7fe6bac3523868714ac1ff51c0a3

   This patch adds a joint quickcheck2 / hunit testsuite, with coverage of
   91% of top level functions (remaining features are mostly in instances).

   The coverage data is here:

       http://code.haskell.org/~dons/tests/containers/hpc_index.html

   No bugs were found. It includes unit tests for known past bugs
   (balancing).
 }}}

-- 
Ticket URL: <http://hackage.haskell.org/trac/ghc/ticket/4342>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler
_______________________________________________
Glasgow-haskell-bugs mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/glasgow-haskell-bugs

Reply via email to