Re: [Haskell-cafe] Map unionWith generalization

2010-01-28 Thread Hans Aberg

On 28 Jan 2010, at 03:09, Twan van Laarhoven wrote:

For example, in Map String Integer (sparse representation of  
monomials) compute the minimum value of all associative pairs with  
the same key (the gcd); if only one key is present, the absent  
should be treated as having value 0. So

 unionWith min xs ys
will not work, because unionWith will always apply the identity to  
the remaining value when one key is missing, whereas it should be  
sent to 0.


If missing keys represent 0, then wouldn't this work?

   intersectionWith min xs ys


Yes, that works for the gcd function using min. Thank you - this  
function is used a lot, so it is good to have it simple.


For the general problem, one still needs a better interface. For (-),  
though missing keys represent 0, if the first is missing, the second  
should be negated. And if both are present, one should filter out keys  
with 0 value. Right now, this is a combination of negate, (+), and  
filter (/= 0).


  Hans


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Map unionWith generalization

2010-01-27 Thread Jan-Willem Maessen

On Jan 27, 2010, at 5:53 AM, Hans Aberg wrote:

 I need ideally some generalizations of unionWith and unionWithKey, for 
 efficiency matters (i.e. avoiding conversions and traversing the maps more 
 than once). I could use a modification of the code in Map.hs, but then the 
 problem is that the module Map interface does not export the constructors of 
 data Map. So suggestions are welcome.
 
 For example, in Map String Integer (sparse representation of monomials) 
 compute the minimum value of all associative pairs with the same key (the 
 gcd); if only one key is present, the absent should be treated as having 
 value 0. So
  unionWith min xs ys
 will not work, because unionWith will always apply the identity to the 
 remaining value when one key is missing, whereas it should be sent to 0.
 
 So here, one would want:
  (a - c) - (b - c) - (a - b - c) - Map k a - Map k b - Map k c
 where the two first functions are applied when the first or second key is 
 missing.

Ah, the swiss army knife function on maps.  This particular formulation works 
well for the application you describe above, where you're completely traversing 
both maps.  The following really grubby variant can be used to implement 
asymptotically efficient variations of union, intersection, difference, etc., 
etc:

swissArmy ::
  (Map k a - Map k c) --- Used to process submaps unique to the left map
  (Map k b - Map k c) --- Used to process submaps unique to the right map
  (a - b - Maybe c) - -- Used to process a single common entry
  Map k a - Map k b - Map k c

Then your function appears to be:

-- helper
just2 :: (a - b - c) - a - b - Maybe c
just2 f a b = Just (f a b)

swissArmy (fmap (const 0)) (fmap (const 0)) (just2 gcd)

Here are unionWith and intersectionWith:

unionWith f = swissArmy id id (just2 f)
intersectionWith = swissArmy (const empty) (const empty) (just2 f)
differenceWith = swissArmy id (const empty) (\a b - Nothing)

When throwing together tree-like data structures, I often start by writing this 
function; it handles most of the bulk operations I might want to do as 
one-liners.  It's got a messy, ugly type signature, but it does everything you 
want as efficiently as you want.*

-Jan-Willem Maessen

* Actually, this is only true if you add the key as an argument to the third 
function, so that you can also encode unionWithKey etc!  I've skipped that here.

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Map unionWith generalization

2010-01-27 Thread Hans Aberg

On 27 Jan 2010, at 14:56, Jan-Willem Maessen wrote:


So here, one would want:
(a - c) - (b - c) - (a - b - c) - Map k a - Map k b - Map  
k c
where the two first functions are applied when the first or second  
key is missing.


Ah, the swiss army knife function on maps.  This particular  
formulation works well for the application you describe above, where  
you're completely traversing both maps.  The following really grubby  
variant can be used to implement asymptotically efficient variations  
of union, intersection, difference, etc., etc:


swissArmy ::
 (Map k a - Map k c) --- Used to process submaps unique to the  
left map
 (Map k b - Map k c) --- Used to process submaps unique to the  
right map

 (a - b - Maybe c) - -- Used to process a single common entry
 Map k a - Map k b - Map k c


I'm not sure why you want to throw in functions between maps in the  
two first arguments. Then there is no requirement that single keys are  
preserved.


But it is a good idea to have a Maybe so that one can remove keys on  
the fly.



Then your function appears to be:

-- helper
just2 :: (a - b - c) - a - b - Maybe c
just2 f a b = Just (f a b)

swissArmy (fmap (const 0)) (fmap (const 0)) (just2 gcd)

Here are unionWith and intersectionWith:

unionWith f = swissArmy id id (just2 f)
intersectionWith = swissArmy (const empty) (const empty) (just2 f)
differenceWith = swissArmy id (const empty) (\a b - Nothing)

When throwing together tree-like data structures, I often start by  
writing this function; it handles most of the bulk operations I  
might want to do as one-liners.  It's got a messy, ugly type  
signature, but it does everything you want as efficiently as you  
want.*


My guess is that is when you write things from scratch.

I wanted to add these function on top of the module Data.Map, but then  
that seems not to be possible, as the constructors are not exported. I  
could make a copy of this file, and then make my own variation.


  Hans


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Map unionWith generalization

2010-01-27 Thread Jan-Willem Maessen

On Jan 27, 2010, at 9:42 AM, Hans Aberg wrote:

 On 27 Jan 2010, at 14:56, Jan-Willem Maessen wrote:
 
 So here, one would want:
 (a - c) - (b - c) - (a - b - c) - Map k a - Map k b - Map k c
 where the two first functions are applied when the first or second key is 
 missing.
 
 Ah, the swiss army knife function on maps.  This particular formulation 
 works well for the application you describe above, where you're completely 
 traversing both maps.  The following really grubby variant can be used to 
 implement asymptotically efficient variations of union, intersection, 
 difference, etc., etc:
 
 swissArmy ::
 (Map k a - Map k c) --- Used to process submaps unique to the left map
 (Map k b - Map k c) --- Used to process submaps unique to the right map
 (a - b - Maybe c) - -- Used to process a single common entry
 Map k a - Map k b - Map k c
 
 I'm not sure why you want to throw in functions between maps in the two first 
 arguments. Then there is no requirement that single keys are preserved.
 
 But it is a good idea to have a Maybe so that one can remove keys on the fly.

A good point.  Technically, one should delimit the scope of the first two 
arguments:

data KeyPreservingMapOperation k a b = AlwaysEmpty | Identity | MapMaybeWithKey 
(k - a - Maybe b)

perform :: KeyPreservingMapOperation a b - Map k a - Map k b
perform AlwaysEmpty = const empty
perform Identity = id
perform (MapMaybeWithKey f) = mapMaybeWithKey f

 When throwing together tree-like data structures, I often start by writing 
 this function; it handles most of the bulk operations I might want to do as 
 one-liners.  It's got a messy, ugly type signature, but it does everything 
 you want as efficiently as you want.*
 
 My guess is that is when you write things from scratch.

Yes.  On the other hand, I've repeatedly run into the same problem you're 
describing: a api that doesn't give me an efficient way to perform an operation 
I know to be very simple.  If every map provided an operation like this one, 
then I can avoid writing my own library from scratch when I need it --- 
especially when from scratch means fork the code and add what I need.

So, library implementors: think hard about your swiss army knife combinators. 
 You end up with messy functions with gigantic signatures.  On the other hand, 
you can often add a couple of judicious INLINE annotations and remove tons of 
code from the rest of your library.  Then expose them, and mark them as the 
functions of last resort that they truly are.

I bet there's even a fusion framework in here somewhere.

-Jan___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Map unionWith generalization

2010-01-27 Thread Hans Aberg

On 27 Jan 2010, at 16:33, Jan-Willem Maessen wrote:

I'm not sure why you want to throw in functions between maps in the  
two first arguments. Then there is no requirement that single keys  
are preserved.


But it is a good idea to have a Maybe so that one can remove keys  
on the fly.


A good point.  Technically, one should delimit the scope of the  
first two arguments:


data KeyPreservingMapOperation k a b = AlwaysEmpty | Identity |  
MapMaybeWithKey (k - a - Maybe b)


perform :: KeyPreservingMapOperation a b - Map k a - Map k b
perform AlwaysEmpty = const empty
perform Identity = id
perform (MapMaybeWithKey f) = mapMaybeWithKey f


I'm thinking about
  (k - Maybe a - Maybe b - Maybe c) - Map k a - Map k b - Map k c
The first two Maybe's tell if the keys are present, the last if one  
wants it in the resulting map.


When throwing together tree-like data structures, I often start by  
writing this function; it handles most of the bulk operations I  
might want to do as one-liners.  It's got a messy, ugly type  
signature, but it does everything you want as efficiently as you  
want.*


My guess is that is when you write things from scratch.


Yes.  On the other hand, I've repeatedly run into the same problem  
you're describing: a api that doesn't give me an efficient way to  
perform an operation I know to be very simple.  If every map  
provided an operation like this one, then I can avoid writing my own  
library from scratch when I need it --- especially when from  
scratch means fork the code and add what I need.


So, library implementors: think hard about your swiss army knife  
combinators.  You end up with messy functions with gigantic  
signatures.  On the other hand, you can often add a couple of  
judicious INLINE annotations and remove tons of code from the rest  
of your library.  Then expose them, and mark them as the functions  
of last resort that they truly are.


One can transverse the product of keys. Then I'm thinking about
  (k1 - k2 - a - b - Maybe c - Maybe(k, c)) - Map k1 a - Map  
k2 b - Map k c
The first Maybe tells if the key is already present; the second if one  
wants it inserted.


The idea in both cases is to combine the modifying functions into one.  
This simplifies the interface.


  Hans


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Map unionWith generalization

2010-01-27 Thread Jan-Willem Maessen

On Jan 27, 2010, at 10:54 AM, Hans Aberg wrote:

 On 27 Jan 2010, at 16:33, Jan-Willem Maessen wrote:
 
 I'm not sure why you want to throw in functions between maps in the two 
 first arguments. Then there is no requirement that single keys are 
 preserved.
 
 But it is a good idea to have a Maybe so that one can remove keys on the 
 fly.
 
 A good point.  Technically, one should delimit the scope of the first two 
 arguments:
 
 data KeyPreservingMapOperation k a b = AlwaysEmpty | Identity | 
 MapMaybeWithKey (k - a - Maybe b)
 
 perform :: KeyPreservingMapOperation a b - Map k a - Map k b
 perform AlwaysEmpty = const empty
 perform Identity = id
 perform (MapMaybeWithKey f) = mapMaybeWithKey f
 
 I'm thinking about
  (k - Maybe a - Maybe b - Maybe c) - Map k a - Map k b - Map k c
 The first two Maybe's tell if the keys are present, the last if one wants it 
 in the resulting map.

That has the same behavior semantically, but it's no more efficient than 
performing a unionWith followed by a filter.  For example, consider 
implementing:

xs `intersection` singleton k v

in this way.  With the function given, the complexity is necessarily O(size 
xs): we must traverse every key/value pair in xs.  By contrast, by aggregating 
the operations on keys that exist only in a single map, we can write functions 
like intersection with the desired complexity (which is O(log (size xs)) in 
this case).

 Yes.  On the other hand, I've repeatedly run into the same problem you're 
 describing: a api that doesn't give me an efficient way to perform an 
 operation I know to be very simple.  If every map provided an operation like 
 this one, then I can avoid writing my own library from scratch when I need 
 it --- especially when from scratch means fork the code and add what I 
 need.
 
 So, library implementors: think hard about your swiss army knife 
 combinators.  You end up with messy functions with gigantic signatures.  On 
 the other hand, you can often add a couple of judicious INLINE annotations 
 and remove tons of code from the rest of your library.  Then expose them, 
 and mark them as the functions of last resort that they truly are.
 
 One can transverse the product of keys. Then I'm thinking about
  (k1 - k2 - a - b - Maybe c - Maybe(k, c)) - Map k1 a - Map k2 b - 
 Map k c
 The first Maybe tells if the key is already present; the second if one wants 
 it inserted.

Traversing cross products is a very different operation from zipping in the key 
space.  Again I wouldn't want to mistakenly substitute one for the other!  But 
in this case I think you'll find that you're already well served by the 
functions that already exist---adding this function doesn't enable you to do 
anything more efficiently (at least in a big-O sense).

 The idea in both cases is to combine the modifying functions into one. This 
 simplifies the interface.

Understood, and with a sufficiently smart compiler we might analyze the 
behavior of the function and do the right thing with the single-function 
interface in both cases.  I have yet to encounter a compiler that is both smart 
and reliable on this count.  That is why I've found it necessary to expose 
these kinds of functions.

-Jan

 
  Hans
 
 

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Map unionWith generalization

2010-01-27 Thread Hans Aberg

On 27 Jan 2010, at 21:29, Jan-Willem Maessen wrote:


I'm thinking about
(k - Maybe a - Maybe b - Maybe c) - Map k a - Map k b - Map k c
The first two Maybe's tell if the keys are present, the last if one  
wants it in the resulting map.


That has the same behavior semantically, but it's no more efficient  
than performing a unionWith followed by a filter.


It would not be the complexity, but the constant factor, by not having  
to transverse it twice. I'm not sure how it works in a functional  
language, and the trees must be rebalanced, too.



For example, consider implementing:

xs `intersection` singleton k v

in this way.  With the function given, the complexity is necessarily  
O(size xs): we must traverse every key/value pair in xs.  By  
contrast, by aggregating the operations on keys that exist only in a  
single map, we can write functions like intersection with the  
desired complexity (which is O(log (size xs)) in this case).


That is a good point.


One can transverse the product of keys. Then I'm thinking about
(k1 - k2 - a - b - Maybe c - Maybe(k, c)) - Map k1 a - Map  
k2 b - Map k c
The first Maybe tells if the key is already present; the second if  
one wants it inserted.


Traversing cross products is a very different operation from zipping  
in the key space.  Again I wouldn't want to mistakenly substitute  
one for the other!


For the first one, think of sums of sparse matrices or polynomials,  
and the second, products.


But in this case I think you'll find that you're already well served  
by the functions that already exist---adding this function doesn't  
enable you to do anything more efficiently (at least in a big-O  
sense).


Yes, I can't implements (-) directly; it must be a combination of (+)  
and negate, and the 0 must be filtered out in an extra pass. And for  
gcd of monomials, it is now a combination of lcm, (*) and (/), instead  
of a single pass. Unfortunately, these operations are used a lot, so  
getting a smaller constant factor is important.


The idea in both cases is to combine the modifying functions into  
one. This simplifies the interface.


Understood, and with a sufficiently smart compiler we might analyze  
the behavior of the function and do the right thing with the single- 
function interface in both cases.  I have yet to encounter a  
compiler that is both smart and reliable on this count.  That is why  
I've found it necessary to expose these kinds of functions.


By your example above, there may be a conflict between a general,  
simple interface, and implementing things like intersections.


  Hans


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Map unionWith generalization

2010-01-27 Thread Twan van Laarhoven

Hans Aberg wrote:
For example, in Map String Integer (sparse representation of monomials) 
compute the minimum value of all associative pairs with the same key 
(the gcd); if only one key is present, the absent should be treated as 
having value 0. So

  unionWith min xs ys
will not work, because unionWith will always apply the identity to the 
remaining value when one key is missing, whereas it should be sent to 0.


If missing keys represent 0, then wouldn't this work?

intersectionWith min xs ys


Twan
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe