Re: [Haskell-cafe] Fwd: signficant improvements to the containers package

2010-06-25 Thread Ketil Malde
Don Stewart d...@galois.com writes:

 Some people might be quite excited by Milan's work on significant
 performance improvements to the containers package...

Yes, this is great news - both a well written article and an important
piece of work on a cornerstone of the Haskell libraries.

But I am also somewhat disturbed that, all this time I've been using
Data.Set/Map, AVL has apparently been a much faster alternative, and I
never got around to trying it out.  Is there a downside to using AVL
compared to Data.Set/Map?  And would similar improvements apply to it?

When performance is starting to matter, it is often because I have large
Sets or Maps - it would also be interesting to compare the memory
requirement of these data structures.  (No matter how much faster a
HashMap is, if it exceeds physical RAM it will be outperformed by any
structure that manages to fit).

-k
-- 
If I haven't seen further, it is by standing in the footprints of giants
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Fwd: signficant improvements to the containers package

2010-06-25 Thread Stephen Tetley
On 25 June 2010 05:58, Ivan Miljenovic ivan.miljeno...@gmail.com wrote:

 The reason this came up: Thomas Berekneyi wanted to use such classes
 for the rewrite of FGL, and when he discussed it on #haskell people
 generally indicated that edison was the best out there but was a bit
 long in the tooth and something like it should be re-written (though
 no-one seemed to volunteer... hmmm... :p).

Hi Ivan

Note that if new-FGL depends on new packages it might not be
acceptable for the Platform; see the thread that this message kicks
off:

http://www.haskell.org/pipermail/libraries/2009-August/012378.html

Best wishes

Stephen
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Fwd: signficant improvements to the containers package

2010-06-25 Thread Ivan Lazar Miljenovic
Stephen Tetley stephen.tet...@gmail.com writes:

 On 25 June 2010 05:58, Ivan Miljenovic ivan.miljeno...@gmail.com wrote:

 The reason this came up: Thomas Berekneyi wanted to use such classes
 for the rewrite of FGL, and when he discussed it on #haskell people
 generally indicated that edison was the best out there but was a bit
 long in the tooth and something like it should be re-written (though
 no-one seemed to volunteer... hmmm... :p).

 Hi Ivan

 Note that if new-FGL depends on new packages it might not be
 acceptable for the Platform; see the thread that this message kicks
 off:

 http://www.haskell.org/pipermail/libraries/2009-August/012378.html

That's been brought up already, and yes I know.

-- 
Ivan Lazar Miljenovic
ivan.miljeno...@gmail.com
IvanMiljenovic.wordpress.com
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Fwd: signficant improvements to the containers package

2010-06-25 Thread David Menendez
On Fri, Jun 25, 2010 at 12:58 AM, Ivan Miljenovic
ivan.miljeno...@gmail.com wrote:
 On 25 June 2010 14:41, David Menendez d...@zednenem.com wrote:
 On Thu, Jun 24, 2010 at 3:08 AM, Ivan Miljenovic
 ivan.miljeno...@gmail.com wrote:
 As an aside, Alex Mason and I are discussing the possibility of taking
 advantage of AusHack *shameless plug* to write some kind of classes
 for the different types of containers with a hierarchy.  I know about
 ListLike, but there doesn't seem to be any applicable classes for
 generic containers (i.e. the abstract API of a Set; something like
 ListLike would then be an extension on it) and for lookup data
 structures (Map k a, [(k, a)], etc.).

 Be sure to look into Okasaki's work on Edison. It has classes for
 sequences (list-like structures) and collections (sets, heaps) and
 associations (maps, priority queues) and a paper discussing the design
 decisions.

 Yeah, we will be.

 The reason this came up: Thomas Berekneyi wanted to use such classes
 for the rewrite of FGL, and when he discussed it on #haskell people
 generally indicated that edison was the best out there but was a bit
 long in the tooth and something like it should be re-written (though
 no-one seemed to volunteer... hmmm... :p).

Edison could use some re-thinking; the state of the art in Haskell has
advanced, and there are new classes like Data.Foldable and
Data.Traversable to consider.

In my mind, the big question is whether Sequence and Assoc should be
constructor classes or use type families/fundeps. I lean towards the
former, but it means that things like ByteSting can't be instances of
Sequence.

 For example: it's a little weird that edison re-exports Data.Set and
 uses it for the instance with a type alias (same with map, Seq, etc.)
 rather than just using Data.Set itself.

I believe that's so the implementations export a common interface. If
you're in a situation where you want to use a specific implementation,
Edison is designed so that you can import just the implementation
module and avoid the overhead of the class system while still making
it easy to switch implementations in the future.

  I also find the
 structuralInvariant and instanceName fields to be a little odd,

I believe structuralInvariant is there for testing.

I'm not sure what instanceName is for. It's possible Edison predates
Data.Typeable, in which case instanceName might be useful for similar
purposes.

-- 
Dave Menendez d...@zednenem.com
http://www.eyrie.org/~zednenem/
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Fwd: signficant improvements to the containers package

2010-06-24 Thread Ivan Miljenovic
On 24 June 2010 16:57, Don Stewart d...@galois.com wrote:
 Some people might be quite excited by Milan's work on significant
 performance improvements to the containers package...

This looks quite nice; good work Milan!!!

As an aside, Alex Mason and I are discussing the possibility of taking
advantage of AusHack *shameless plug* to write some kind of classes
for the different types of containers with a hierarchy.  I know about
ListLike, but there doesn't seem to be any applicable classes for
generic containers (i.e. the abstract API of a Set; something like
ListLike would then be an extension on it) and for lookup data
structures (Map k a, [(k, a)], etc.).

-- 
Ivan Lazar Miljenovic
ivan.miljeno...@gmail.com
IvanMiljenovic.wordpress.com
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Fwd: signficant improvements to the containers package

2010-06-24 Thread Stephen Tetley
On 24 June 2010 08:08, Ivan Miljenovic ivan.miljeno...@gmail.com wrote:

 As an aside, Alex Mason and I are discussing the possibility of taking
 advantage of AusHack *shameless plug* to write some kind of classes
 for the different types of containers with a hierarchy.  I know about
 ListLike, but there doesn't seem to be any applicable classes for
 generic containers (i.e. the abstract API of a Set; something like
 ListLike would then be an extension on it) and for lookup data
 structures (Map k a, [(k, a)], etc.).

There are some classes for bulk types in Simon Peyton-Jones's paper
Bulk Types with Class.

Personally, I'll take a lot of convincing that a class approach will
be better than using the module system...
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Fwd: signficant improvements to the containers package

2010-06-24 Thread Ivan Miljenovic
On 24 June 2010 17:15, Stephen Tetley stephen.tet...@gmail.com wrote:

 There are some classes for bulk types in Simon Peyton-Jones's paper
 Bulk Types with Class.

Cool, I'll have a look.

 Personally, I'll take a lot of convincing that a class approach will
 be better than using the module system...

My rational for a class approach is that rather than having your
library spit out a list of values, etc. you let the consumer pick
which data type they prefer (if they're going to be just converting
your list into a Set, then why not give them a Set to start with?).

-- 
Ivan Lazar Miljenovic
ivan.miljeno...@gmail.com
IvanMiljenovic.wordpress.com
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Fwd: signficant improvements to the containers package

2010-06-24 Thread Stephen Tetley
On 24 June 2010 08:20, Ivan Miljenovic ivan.miljeno...@gmail.com wrote:

 My rational for a class approach is that rather than having your
 library spit out a list of values, etc. you let the consumer pick
 which data type they prefer (if they're going to be just converting
 your list into a Set, then why not give them a Set to start with?).


That's a bit more interesting, at least to my tastes.

In my own work I often use specialised container types: e.g. rather
than use a 'safe list' library that avoids certain list ops or wraps
them as Maybe's, I prefer using a OneList:

data OneList a = One a | Cons a (OneList a)

With structures like a OneList their construction is more 'primary'
than their manipulation. That's to say clearly when you build a
OneList it can't be empty (and you obviously don't want it to be), but
as a structure its not conducive to many manipulations, e.g: you can't
filter it, as filtering can produce an empty list but the data type
can't.

Some algebra of constructors and destructors would be useful here,
to get the filterInto function for example:

filterInto :: Consable c = (a - Bool) - OneList a - c a

Consable could be the Coll class from Bulk Types with Class or
probably better, a smaller one with just empty and cons, or even just
cons inheriting Monoid. Unfortunately inheriting Monoid puts
legitimacy on (++) which we know is bad on regular lists, and should
discouraged where feasible. Similarly size in the Coll class is bad on
regular lists... it goes on until will have single operation type
classes for all the collection operations.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Fwd: signficant improvements to the containers package

2010-06-24 Thread Ivan Miljenovic
On 24 June 2010 17:55, Stephen Tetley stephen.tet...@gmail.com wrote:
 On 24 June 2010 08:20, Ivan Miljenovic ivan.miljeno...@gmail.com wrote:

 My rational for a class approach is that rather than having your
 library spit out a list of values, etc. you let the consumer pick
 which data type they prefer (if they're going to be just converting
 your list into a Set, then why not give them a Set to start with?).


 That's a bit more interesting, at least to my tastes.

 In my own work I often use specialised container types: e.g. rather
 than use a 'safe list' library that avoids certain list ops or wraps
 them as Maybe's, I prefer using a OneList:

 data OneList a = One a | Cons a (OneList a)

Hmmm.

/me makes a note to see whether or not having an empty value is a
requirement of such classes...

 With structures like a OneList their construction is more 'primary'
 than their manipulation. That's to say clearly when you build a
 OneList it can't be empty (and you obviously don't want it to be), but
 as a structure its not conducive to many manipulations, e.g: you can't
 filter it, as filtering can produce an empty list but the data type
 can't.

Yeah, which can be a problem.

 Some algebra of constructors and destructors would be useful here,
 to get the filterInto function for example:

 filterInto :: Consable c = (a - Bool) - OneList a - c a

 Consable could be the Coll class from Bulk Types with Class or
 probably better, a smaller one with just empty and cons, or even just
 cons inheriting Monoid. Unfortunately inheriting Monoid puts
 legitimacy on (++) which we know is bad on regular lists, and should
 discouraged where feasible. Similarly size in the Coll class is bad on
 regular lists... it goes on until will have single operation type
 classes for all the collection operations.

If you inherit Monoid, then Bytestring can't be an instance of the class...

This dichotomy is a problem: we'd like to have as many possible types
be instances of such classes as possible; however, if we do that then
we can't use pre-existing classes such as Functor since Set isn't an
instance of Functor, etc.

Anyway, since I've brought the whole thing up now rather than in the
blog post I was planning to write this weekend, what are people's
thoughts on preferred extension used:

* MPTCs + fundeps

Pros: older tech, more widely understood/used, ListLike already uses it

Cons: I don't like the fact that the element type becomes a first
class member of the type class; I'd prefer writing SetLike s =
Value s - s - s to SetLike s v = v - s - s as IMHO the class
represents the data type/structure and not the value it stores (that's
a sub-part).

* Type Families (actually using an Associated Type)

Pros: IMHO cleaner type sigs (see above), in Fun with Type Families,
the authors call Type Families more like functional programming
whereas fundeps are more like logic programming, and also state that
type families play nicer with GADTs, etc.

Cons: newer tech, so still being developed and used, etc.  In
particular, superclass constraints are not yet implemented in GHC
(i.e. can't say something like class (SetLike l, Value l ~ (k, v)) =
MapLike l where ...

Unlike the choice of which gets used in FGL, I would like to think
that this kind of set of classes would be more widely used in the
community and will use whichever type of extension seems to be
consensus as the better one (at least for this situation).  If anyone
knows a good site to be able to make an actual poll for this, that
would probably be even better (or should we just have a wiki page
where people put their names under whichever they think is better and
then just tally up the number of names for each?).

-- 
Ivan Lazar Miljenovic
ivan.miljeno...@gmail.com
IvanMiljenovic.wordpress.com
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Fwd: signficant improvements to the containers package

2010-06-24 Thread John Meacham
On Wed, Jun 23, 2010 at 11:57:54PM -0700, Don Stewart wrote:
 Some people might be quite excited by Milan's work on significant
 performance improvements to the containers package...

Ah. This all looks excellent! Currently jhc spends about 30-40% of its
time in the containers package according to profiling  so improvements
there will notably speed up compilation. Though, I already modifed a lot
to use IntSets instead of Sets and independently discovered the
HashSet/HashMap trick, but the gain should still be signifigant. The
focus on tree-union is particularly cool, as I know a lot of that goes
on inside the compiler.

John

-- 
John Meacham - ⑆repetae.net⑆john⑈ - http://notanumber.net/
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Fwd: signficant improvements to the containers package

2010-06-24 Thread David Menendez
On Thu, Jun 24, 2010 at 3:08 AM, Ivan Miljenovic
ivan.miljeno...@gmail.com wrote:
 As an aside, Alex Mason and I are discussing the possibility of taking
 advantage of AusHack *shameless plug* to write some kind of classes
 for the different types of containers with a hierarchy.  I know about
 ListLike, but there doesn't seem to be any applicable classes for
 generic containers (i.e. the abstract API of a Set; something like
 ListLike would then be an extension on it) and for lookup data
 structures (Map k a, [(k, a)], etc.).

Be sure to look into Okasaki's work on Edison. It has classes for
sequences (list-like structures) and collections (sets, heaps) and
associations (maps, priority queues) and a paper discussing the design
decisions.

-- 
Dave Menendez d...@zednenem.com
http://www.eyrie.org/~zednenem/
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Fwd: signficant improvements to the containers package

2010-06-24 Thread Ivan Miljenovic
On 25 June 2010 14:41, David Menendez d...@zednenem.com wrote:
 On Thu, Jun 24, 2010 at 3:08 AM, Ivan Miljenovic
 ivan.miljeno...@gmail.com wrote:
 As an aside, Alex Mason and I are discussing the possibility of taking
 advantage of AusHack *shameless plug* to write some kind of classes
 for the different types of containers with a hierarchy.  I know about
 ListLike, but there doesn't seem to be any applicable classes for
 generic containers (i.e. the abstract API of a Set; something like
 ListLike would then be an extension on it) and for lookup data
 structures (Map k a, [(k, a)], etc.).

 Be sure to look into Okasaki's work on Edison. It has classes for
 sequences (list-like structures) and collections (sets, heaps) and
 associations (maps, priority queues) and a paper discussing the design
 decisions.

Yeah, we will be.

The reason this came up: Thomas Berekneyi wanted to use such classes
for the rewrite of FGL, and when he discussed it on #haskell people
generally indicated that edison was the best out there but was a bit
long in the tooth and something like it should be re-written (though
no-one seemed to volunteer... hmmm... :p).

For example: it's a little weird that edison re-exports Data.Set and
uses it for the instance with a type alias (same with map, Seq, etc.)
rather than just using Data.Set itself.  I also find the
structuralInvariant and instanceName fields to be a little odd,
whereas strict can now be shifted off to DeepSeq (or whatever the
class is called).

-- 
Ivan Lazar Miljenovic
ivan.miljeno...@gmail.com
IvanMiljenovic.wordpress.com
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe