Re: Feedback request: priority queues in containers

2010-03-17 Thread Simon Marlow

On 17/03/2010 00:17, Louis Wasserman wrote:


I tested, and this implementation actually performs better if the spine
is maintained lazily, so we'll test that version.


May I request that, unless there's a significant speedup from using a 
strict spine, that you use a lazy spine where possible.  The reason 
being that lazy data structures work much better in a parallel setting: 
a strict spine is a course-grained lock on the whole operation, whereas 
a lazy spine corresponds to a fine-grained locking strategy.


Apart from this, as you were ;)

Cheers,
Simon
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Feedback request: priority queues in containers

2010-03-17 Thread Roman Leshchinskiy
On 17/03/2010, at 03:16, Louis Wasserman wrote:

 I'm not willing to do this sort of typeclass wrapper thing, primarily because 
 nothing else in containers does -- even though we might have a Mapping type 
 class that handles both IntMap and Map, we don't.
 
 I'm inclined to let that design choice stand, as far as containers is 
 concerned.  It would make perfect sense to write a new package with such a 
 type class and offering instances for the containers priority queue 
 implementations, but I prefer to stick with the style that containers already 
 seems to use -- that is, exporting separate modules without a unifying type 
 class, but with nearly-identical method signatures.

FWIW, vector does both. It defines most vector operations generically and then 
exports appropriate specialisations for each concrete vector type. I think this 
is the most flexible and convenient approach. I just wish Haskell had some kind 
of support for it.

Roman


___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Feedback request: priority queues in containers

2010-03-16 Thread Tyson Whitehead
On March 16, 2010 09:29:06 Louis Wasserman wrote:
 I'd like to request some more feedback on the
 proposedhttp://hackage.haskell.org/trac/ghc/ticket/3909implementation
 for priority queues in containers.  Mostly, I feel like
 adding a new module to containers should be contentious, and there hasn't
 been as much griping or contention as I expected.  The silence is feeling
 kind of eerie!

Not sure if this is an appropriate question at all as I haven't looked at the 
code, but would it be possible to put any primary functionality into a class.

I'm thinking something along the lines of how the vector code works.  This 
allows you to use all the higher-order functions (i.e., those implemented 
using the primary functions) on a different underlying implementation.

I've found this particularly useful in wrapping Perl data types.  For the Perl 
array, all I had to do was write an class instance for the vector module, and 
I have all these higher-order functions I could use from existing code.

It would be very nice to have had something similar to do for the hash tables.  
Even to just provide a native haskell immutable look into the data so 
Haskell code can extract the components it needs with standard functions.

Cheers!  -Tyson

PS:  I'm still working on the wrapping, so I might change my mind as to how 
useful this really is, but thought I should throw it out there.


signature.asc
Description: This is a digitally signed message part.
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Feedback request: priority queues in containers

2010-03-16 Thread Louis Wasserman
I'm not willing to do this sort of typeclass wrapper thing, primarily
because nothing else in containers does -- even though we might have a
Mapping type class that handles both IntMap and Map, we don't.

I'm inclined to let that design choice stand, as far as containers is
concerned.  It would make perfect sense to write a new package with such a
type class and offering instances for the containers priority queue
implementations, but I prefer to stick with the style that containers
already seems to use -- that is, exporting separate modules without a
unifying type class, but with nearly-identical method signatures.

Louis Wasserman
wasserman.lo...@gmail.com
http://profiles.google.com/wasserman.louis


On Tue, Mar 16, 2010 at 11:10 AM, Tyson Whitehead twhiteh...@gmail.comwrote:

 On March 16, 2010 09:29:06 Louis Wasserman wrote:
  I'd like to request some more feedback on the
  proposedhttp://hackage.haskell.org/trac/ghc/ticket/3909implementation
  for priority queues in containers.  Mostly, I feel like
  adding a new module to containers should be contentious, and there hasn't
  been as much griping or contention as I expected.  The silence is feeling
  kind of eerie!

 Not sure if this is an appropriate question at all as I haven't looked at
 the
 code, but would it be possible to put any primary functionality into a
 class.

 I'm thinking something along the lines of how the vector code works.  This
 allows you to use all the higher-order functions (i.e., those implemented
 using the primary functions) on a different underlying implementation.

 I've found this particularly useful in wrapping Perl data types.  For the
 Perl
 array, all I had to do was write an class instance for the vector module,
 and
 I have all these higher-order functions I could use from existing code.

 It would be very nice to have had something similar to do for the hash
 tables.
 Even to just provide a native haskell immutable look into the data so
 Haskell code can extract the components it needs with standard functions.

 Cheers!  -Tyson

 PS:  I'm still working on the wrapping, so I might change my mind as to how
 useful this really is, but thought I should throw it out there.

___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Feedback request: priority queues in containers

2010-03-16 Thread Milan Straka
Hi,

I second that choice. There have been some attempts at using type
classes, notably the Edison library, but it never got widespread.
So I would follow the current containers design.

Milan

 I'm not willing to do this sort of typeclass wrapper thing, primarily
 because nothing else in containers does -- even though we might have a
 Mapping type class that handles both IntMap and Map, we don't.
 
 I'm inclined to let that design choice stand, as far as containers is
 concerned.  It would make perfect sense to write a new package with such a
 type class and offering instances for the containers priority queue
 implementations, but I prefer to stick with the style that containers
 already seems to use -- that is, exporting separate modules without a
 unifying type class, but with nearly-identical method signatures.
 
 Louis Wasserman
 wasserman.lo...@gmail.com
 http://profiles.google.com/wasserman.louis
 
 
 On Tue, Mar 16, 2010 at 11:10 AM, Tyson Whitehead twhiteh...@gmail.comwrote:
 
  On March 16, 2010 09:29:06 Louis Wasserman wrote:
   I'd like to request some more feedback on the
   proposedhttp://hackage.haskell.org/trac/ghc/ticket/3909implementation
   for priority queues in containers.  Mostly, I feel like
   adding a new module to containers should be contentious, and there hasn't
   been as much griping or contention as I expected.  The silence is feeling
   kind of eerie!
 
  Not sure if this is an appropriate question at all as I haven't looked at
  the
  code, but would it be possible to put any primary functionality into a
  class.
 
  I'm thinking something along the lines of how the vector code works.  This
  allows you to use all the higher-order functions (i.e., those implemented
  using the primary functions) on a different underlying implementation.
 
  I've found this particularly useful in wrapping Perl data types.  For the
  Perl
  array, all I had to do was write an class instance for the vector module,
  and
  I have all these higher-order functions I could use from existing code.
 
  It would be very nice to have had something similar to do for the hash
  tables.
  Even to just provide a native haskell immutable look into the data so
  Haskell code can extract the components it needs with standard functions.
 
  Cheers!  -Tyson
 
  PS:  I'm still working on the wrapping, so I might change my mind as to how
  useful this really is, but thought I should throw it out there.
 

 ___
 Libraries mailing list
 librar...@haskell.org
 http://www.haskell.org/mailman/listinfo/libraries

___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Feedback request: priority queues in containers

2010-03-16 Thread Jim Apple
 Specifically, we use a binomial heap, which offers

 O(log n) worst-case union
 O(log n) worst-case extract (this in particular was a key objective of ours)
 amortized O(1), worst-case O(log n) insertion.  (In a persistent context,
 the amortized performance bound does not necessarily hold.)

Why not use Okasaki  Brodal's Optimal Purely Functional Priority
Queues? They offer worst case:

* O(1) union, findMin, and insert
* O(lg n) deleteMin

http://www.eecs.usma.edu/webs/people/okasaki/jfp96/index.html
ftp://ftp.daimi.au.dk/pub/BRICS/Reports/RS/96/37/BRICS-RS-96-37.pdf
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Feedback request: priority queues in containers

2010-03-16 Thread Jean-Marie Gaillourdet

Hi,

I think this is my first post to this list although I've been reading it 
for a long time. So, please, be patient with me and my post.


On 03/16/2010 02:29 PM, Louis Wasserman wrote:

* We offer Functor, Foldable, and Traversable instances that do not
  respect key ordering.  All are linear time, but Functor and
  Traversable in particular assume the function is monotonic.  The
  Foldable instance is a good way to access the elements of the
  priority queue in an unordered fashion.  (We also export
  mapMonotonic and traverseMonotonic, and encourage the use of those
  functions instead of the Functor or Traversable instances.)


So, it is possible to break the invariants of your priority queue by 
using fmap with a non-monotonic function, right? I see that it might be 
usefull to have such instances, sometimes.


As it is not possible to hide instances, once they are definded, I'd 
propose to not offer those instances by default. Instead you could 
provide implementations of all the instance functions necessary to 
define this instances yourself. Or one could have a newtype wrapper 
around the priority queue for which instances of Function, Foldable, and 
Traversable are defined. This would allow to activate the nice 
instances on demand but to stay safe by default.


Best regards,
Jean-Marie
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Feedback request: priority queues in containers

2010-03-16 Thread kahl


Louis Wasserman wrote:
  
  I'm not willing to do this sort of typeclass wrapper thing, primarily
  because nothing else in containers does -- even though we might have a
  Mapping type class that handles both IntMap and Map, we don't.
  
  I'm inclined to let that design choice stand, as far as containers is
  concerned.  It would make perfect sense to write a new package with such a
  type class and offering instances for the containers priority queue
  implementations, but I prefer to stick with the style that containers
  already seems to use -- that is, exporting separate modules without a
  unifying type class, but with nearly-identical method signatures.

Just an aside (and shameless plug ;-): Since the signatures overlap so much,
it is in fact easy to wrap these modules into instances
for many possible different type classes that one might consider using for
containers --- I have a tool that mechanises this instance generation,
available at:

  http://sqrl.mcmaster.ca/~kahl/Haskell/ModuleTools/


More about this in the forthcoming TFP 2009 proceedings paper:

@InCollection{Kahl-2009_TFP,
  author =   {Wolfram Kahl},
  title ={Haskell Module Tools for Liberating Type Class Design},
  crossref =  {TFP2009},
  pages = {129--144},
  chapter = {9},
  abstract ={Design of Haskell type class hierarchies for complex purposes,
 including for standard containers, is a non-trivial exercise,
 and evolution of such designs
 is additionally hampered by the large overhead
 of connecting to existing implementations.

 We systematically discuss this overhead,
 and propose a tool solution, implemented using the GHC API,
 to automate its generation.}
}

@Book{TFP2009,
  title = {Trends in Functional Programming, {TFP 2009}},
  booktitle = {Trends in Functional Programming, {TFP 2009}},
  year =  2010,
  editor ={Zolt\'an Horv{\'a}th and Vikt\'oia Zs{\'o}k and Peter Achten and 
Pieter Koopman},
  address =   {UK},
  publisher = {Intellect},
  note = {(In press)}
}



Wolfram
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Feedback request: priority queues in containers

2010-03-16 Thread Louis Wasserman

 So, it is possible to break the invariants of your priority queue by

using fmap with a non-monotonic function, right? I see that it might be

usefull to have such instances, sometimes.


As it is not possible to hide instances, once they are definded, I'd

propose to not offer those instances by default. Instead you could

provide implementations of all the instance functions necessary to

define this instances yourself. Or one could have a newtype wrapper

around the priority queue for which instances of Function, Foldable, and

Traversable are defined. This would allow to activate the nice

instances on demand but to stay safe by default.


H.  I suggest:

   - Functor and Traversable should be modified as you suggest, that is, we
   continue exporting mapMonotonic and traverseMonotonic, but don't export
   Functor or Traversable instances.
   - I'm still going to insist that we retain Foldable.  The most important
   reason is that we don't lose any invariants as a result of a fold, and the
   second reason is that reexporting functions named foldr and foldl would
   be awkward.

Making this change now.

Louis Wasserman
wasserman.lo...@gmail.com
http://profiles.google.com/wasserman.louis
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Feedback request: priority queues in containers

2010-03-16 Thread Louis Wasserman
Why not use Okasaki  Brodal's Optimal Purely Functional Priority

Queues? They offer worst case:


* O(1) union, findMin, and insert

* O(lg n) deleteMin

The primary reason that we don't use this implementation is, quoting from
the paper you yourself linked to,

 Our data structure is reasonably efficient in practice;
 however, there are several competing data structures that, although
 not asymptotically optimal, are somewhat faster than ours in practice.

Hence, our work is primarily of theoretical interest.


The current implementation is considerably faster than Okasaki's
implementation in practice, based on our benchmarks.  Furthermore, the
asymptotics are really pretty good, and the constant factors appear to be
relatively low.

I wrote a pretty efficient skew binomial heap implementation -- the first
step of Okasaki's approach -- and found it was already slower than the
optimized binomial heap.  I'm not sure whether or not I uploaded that
benchmark, though.  I'll do that at some point today, just to keep everyone
happy.

Louis Wasserman
wasserman.lo...@gmail.com
http://profiles.google.com/wasserman.louis
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Feedback request: priority queues in containers

2010-03-16 Thread Louis Wasserman

 I'm not sure about some of that. Imperative queues sometimes offer

O(1) decreaseKey and O(lg n) delete by storing pointers to elements in

a priority queue. The usual purely functional PQs can only offer O(n)

decreaseKey and O(n) delete. Purely functional PSQs can offer O(lg n)

decreaseKey and O(lg n) delete.


Minimum spanning tree is a common application for PQs that makes good

use of decreaseKey.

That example did occur to me, but I feel okay about following the examples
of Java and C++ STL, which offer priority queue implementations, but those
implementations don't support decreaseKey.

You might be able to convince me that we should offer PSQueues in addition
to PQueues, but I don't think they should be merged, for the following
reason.  Using the PSQueue package, which is essentially a copy of Ralf
Hinze's original implementation, yields the following benchmark for
heapsorting 25000 Ints:

Binomial:   0.000   3.240   2.180   4.000   8.001
PSQ:8.001  13.241   2.882  12.001  24.002

I'm really not okay with that kind of performance loss for added
functionality that not everyone needs.

Louis Wasserman
wasserman.lo...@gmail.com
http://profiles.google.com/wasserman.louis


On Tue, Mar 16, 2010 at 1:58 PM, Louis Wasserman
wasserman.lo...@gmail.comwrote:


 Why not use Okasaki  Brodal's Optimal Purely Functional Priority

 Queues? They offer worst case:


 * O(1) union, findMin, and insert

 * O(lg n) deleteMin

 The primary reason that we don't use this implementation is, quoting from
 the paper you yourself linked to,

 Our data structure is reasonably efficient in practice;
 however, there are several competing data structures that, although
 not asymptotically optimal, are somewhat faster than ours in practice.

 Hence, our work is primarily of theoretical interest.


 The current implementation is considerably faster than Okasaki's
 implementation in practice, based on our benchmarks.  Furthermore, the
 asymptotics are really pretty good, and the constant factors appear to be
 relatively low.

 I wrote a pretty efficient skew binomial heap implementation -- the first
 step of Okasaki's approach -- and found it was already slower than the
 optimized binomial heap.  I'm not sure whether or not I uploaded that
 benchmark, though.  I'll do that at some point today, just to keep everyone
 happy.

 Louis Wasserman
 wasserman.lo...@gmail.com
 http://profiles.google.com/wasserman.louis

___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Feedback request: priority queues in containers

2010-03-16 Thread Jim Apple
 I wrote a pretty efficient skew binomial heap implementation -- the first
 step of Okasaki's approach -- and found it was already slower than the
 optimized binomial heap.  I'm not sure whether or not I uploaded that
 benchmark, though.  I'll do that at some point today, just to keep everyone
 happy.

The skew binomial heaps you implemented should only be asymptotically
faster than the usual binomial heaps on one special case: comparing a
binomial heap in a state that would case an \Omega(lg n) time cascade
on insert to the worst-case O(1) insert of binomial heaps.

I think it would also be worth comparing binomial heap merge against
Brodal-Okasaki heap merge.

Finally, if speed is the ultimate goal, I suspect the clever nested
type approach to guaranteeing binomial tree shape slows things down a
bit. For reference, the type in the latest patch is:

data BinomForest rk a = Nil
  | Skip !(BinomForest (Succ rk) a)
  | Cons {-# UNPACK #-} !(BinomTree rk a)
!(BinomForest (Succ rk) a)

data BinomTree rk a = BinomTree a (rk a)

data Succ rk a = Succ {-# UNPACK #-} !(BinomTree rk a) (rk a)

data Zero a = Zero

I suspect the following might be faster:

data BinomForest2 a = Empty
| NonEmpty a [BinomTree2 a]

data BinomTree2 a = BinomTree2 a [BinomTree2 a]

This eliminates the Skip constructor, which contributes only to the
nested type guarantee.
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Feedback request: priority queues in containers

2010-03-16 Thread Louis Wasserman

 I suspect the following might be faster:

 data BinomForest2 a = Empty
| NonEmpty a [BinomTree2 a]

 data BinomTree2 a = BinomTree2 a [BinomTree2 a]

 This eliminates the Skip constructor, which contributes only to the
 nested type guarantee.


Ehehehe.  This is something I'm pretty proud of inventing, because your
implementation is actually significantly *slower*.  The reason is
essentially that I can do a lot more constructor unpacking when I have
access to that much type information about the structure of the tree at each
level.

You didn't quite implement it correctly, because among other things, we
*need* to track tree rank, at least for the roots, if it's not encoded in
the type.  Here are your data types, with everything unpacked as much as
possible.

data BinomQueue a = Nil | Cons {-# UNPACK #-} !Int {-# UNPACK #-} !(BinHeap
a) (BinomQueue a)
-- equivalent to [(Int, BinHeap a)], but unrolled
-- we only need to track ranks in the roots
data BinomSubtree a = Nil' | Cons' {-# UNPACK #-} !(BinHeap a) (BinomSubtree
a)
-- equivalent to [BinHeap a], but unrolled
data BinHeap a = Bin a (BinomSubtree a)

I tested, and this implementation actually performs better if the spine is
maintained lazily, so we'll test that version.  The full implementation
(that is, the basic methods: insert, extract, fromList, toAscList) of your
approach was attached to the ticket
herehttp://hackage.haskell.org/trac/ghc/attachment/ticket/3909/QuickBinom.hs.
 Feel free to ghc-core it, or tinker with the implementation, but I've done
a fair bit of tinkering, and my results haven't changed.

Running a benchpress test on heapsorting 25000 Ints, calling performGC after
every run of each method.  SBinomial stands for sparse binomial heap,
which is your approach.

With -O2, +RTS -H128m:
  minmean+/-sd  medianmax
Binomial:0.000   3.440   2.204   4.000   8.001
SBinomial:  24.001  28.562   5.600  28.001  48.003  (ratio: 8.3x slower
average)
With -O2:
  minmean+/-sd  medianmax
Binomial:4.000   8.801   2.606   8.001  24.002
SBinomial:  32.002  41.763   8.007  42.003  64.004  (ratio: 4.7x slower
average)
Without optimization, with +RTS -H128m:
  min  mean+/-sdmedianmax
Binomial: 4.001   10.0413.1408.001   20.002
SBinomial:   64.004   76.8058.790   76.005  100.006  (ratio:  7.6x
slower average)
Without optimization:
Binomial:12.000   19.7615.328   16.001   40.002
SBinomial:   72.004   90.126   11.906   88.006  120.008  (ratio: 4.6x slower
average)

These times are measured in milliseconds.  Conclusion: my implementation is
*massively faster*, by a factor of at least 4.5x.  (I spent the last half
hour trying to optimize SBinomial -- it really is that big a difference, and
it's not going to change.)

Here's why I think that's the case: even though we might have the Skip
constructor, how many Skip constructors are there, total?  On average, half
the forest is made up of Skips, so there are 1/2 log n Skip values there.

But the thing is that the sparse binomial tree representation can't do
anywhere near as much unpacking; in particular, the set of children of each
node is not a single-constructor type.  That's an O(n) increase in
allocation, all for an O(log n) shortening of the spine.  That's why it's a
bad plan.  Most of the work is being done in allocations of nodes in the
*trees,* rather than along the spine among the roots.  In this area, the
crazy, type-system-exploiting approach actually does much less allocation,
because it can do a lot more constructor unpacking.  Let's test this
hypothesis:

My type-magical implementation,  -O2, +RTS -sstderr (no -H128m this time):

 3,050,272,052 bytes allocated in the heap
 240,340,552 bytes copied during GC
   1,087,992 bytes maximum residency (201 sample(s))
  53,136 bytes maximum slop
   3 MB total memory in use (0 MB lost due to fragmentation)

  Generation 0:  5703 collections, 0 parallel,  0.48s,  0.53s elapsed
  Generation 1:   201 collections, 0 parallel,  0.22s,  0.24s elapsed

  INIT  time0.00s  (  0.00s elapsed)
  MUT   time4.52s  (  4.74s elapsed)
  GCtime0.70s  (  0.77s elapsed)
  EXIT  time0.00s  (  0.00s elapsed)
  Total time5.23s  (  5.51s elapsed)

  %GC time  13.5%  (13.9% elapsed)

  Alloc rate674,200,547 bytes per MUT second

  Productivity  86.5% of total user, 82.2% of total elapsed

The sparse binomial forest representation, same options:

   5,612,965,672 bytes allocated in the heap
 563,671,500 bytes copied during GC
   1,967,576 bytes maximum residency (202 sample(s))
 107,212 bytes maximum slop
   5 MB total memory in use (1 MB lost due to fragmentation)

  Generation 0: 10602 collections, 0 parallel,  1.60s,  1.67s elapsed
  Generation 1:   202 collections, 0 parallel,  0.28s,  0.30s elapsed

  INIT  time0.00s  (  0.00s elapsed)
  MUT   time