Re: [Haskell-cafe] Optimization problem

2006-09-15 Thread Brian Brunswick
On 15/09/06, Lennart Augustsson [EMAIL PROTECTED] wrote:
On Sep 14, 2006, at 03:05 , Rohan Drape wrote: splitStreams [(3,x),(1,y),(3,z),(2,w)] [(3,[x,z]),(1,[y]),(2,[w])] [snip] Furthermore it should work on infinite lists. It can't eat the whole
 list before producing any output. This doesn't seem to make sense?Only at the end of the list can you know that you've collected all the events for each channel.If you output anything before scanning to the end, you'd not know if there
 were perhaps more events on that channel?It makes good sense.Each list will of events will be evaluatedlazily, so thing will appear there as they appear in the input.I don't think you can do it in Haskell without some magic in the IO/
ST monad.But even if you could do it, its very hard to safely /use/ the result.Looking for the next event on any channel runs the risk that actually thereare no more events ever on that channel, and a resultant scan to the end of the infinite list!
-- [EMAIL PROTECTED]
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Optimization problem

2006-09-15 Thread Rohan Drape
 It makes good sense.  Each list will of events will be evaluated  
 lazily, so thing will appear there as they appear in the input.

Indeed, thankyou.  On a closer inspection I can in fact see that
although the first value, (chn,[msgs]), will never appear, one can
nonetheless start reading the [msgs] of any channel, precisely because
they are in order in the input.  Apologies for the noise, slowly
learning...

Regards,
Rohan

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Optimization problem

2006-09-15 Thread Lennart Augustsson
I agree, the function can be tricky to use.  But that's not our  
problem, we are only to implement it. :)


On Sep 15, 2006, at 05:28 , Brian Brunswick wrote:




On 15/09/06, Lennart Augustsson [EMAIL PROTECTED] wrote: On  
Sep 14, 2006, at 03:05 , Rohan Drape wrote:


 splitStreams [(3,x),(1,y),(3,z),(2,w)]
 [(3,[x,z]),(1,[y]),(2,[w])]

 [snip]

 Furthermore it should work on infinite lists. It can't eat the  
whole

 list before producing any output.

 This doesn't seem to make sense?  Only at the end of the list can  
you

 know that you've collected all the events for each channel.  If you
 output anything before scanning to the end, you'd not know if there
 were perhaps more events on that channel?
It makes good sense.  Each list will of events will be evaluated
lazily, so thing will appear there as they appear in the input.

I don't think you can do it in Haskell without some magic in the IO/
ST monad.



But even if you could do it, its very hard to safely /use/ the result.
Looking for the next event on any channel runs the risk that  
actually there
are no more events ever on that channel, and a resultant scan to  
the end of the infinite list!


--
[EMAIL PROTECTED] 
Shortsig_rules!

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Optimization problem

2006-09-15 Thread Ross Paterson
On Fri, Sep 15, 2006 at 05:13:29AM +0200, Bertram Felgenhauer wrote:
 Just to prove the point, here's the same code with balancing:

How embarrassing.  Still, your code is quite subtle.  As penance, here's
my explanation, separating the main idea from the knot-tying.

The ingredients are a map type with an insertion function

data Key k a = ...
instance Functor (Map k)

insert :: Ord k = k - a - Map k a - Map k a

with a partial inverse

uninsert :: Ord k = k - Map k () - Map k a - (a, Map k a)

satisfying

uninsert k (fmap (const ()) m) (insert k v m) = (v, m)

for any map m not containing k.  We also need an update function

update :: Ord k = k - (a - a) - Map k x - Map k a - Map k a

where the two map arguments have the same shape.  Then splitSeq becomes:

splitSeq :: Ord a = [(a,b)] - [(a,[b])]
splitSeq = fst . splitSeq' Leaf

splitSeq' :: Ord a = Map a () - [(a,b)] - ([(a,[b])], Map a [b])
splitSeq' bp [] = ([], map (const []) bp)
splitSeq' bp ((a,b):xs)
  | member a bp =
let (l, m) = splitSeq' bp xs
in  (l, update a (b:) bp m)
  | otherwise =
let bp' = insert a () bp
(l, m') = splitSeq' bp' xs
(bs, m) = uninsert a bp m'
in ((a, b : bs) : l, m)

Applying a tupling transformation to insert+uninsert gives your version.

It's interesting that these composed transformations don't seem to cost
too much.

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Optimization problem

2006-09-14 Thread Henning Thielemann

On Wed, 13 Sep 2006, Magnus Jonsson wrote:

 When programming the other day I ran into this problem. What I want to do is a
 function that would work like this:
 
 splitStreams::Ord a=[(a,b)]-[(a,[b])]
 
  splitStreams [(3,x),(1,y),(3,z),(2,w)]
 [(3,[x,z]),(1,[y]),(2,[w])]

Interestingly we use such a routine in Haskore for splitting a sequence of
notes into sequences of notes of equal instruments. It's implemented
rather the same way like your version.

It's 'slice' in
   http://darcs.haskell.org/haskore/src/Haskore/Basic/TimeOrderedList.lhs
  but it is a bit more complicated because it has to manage time
information.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Optimization problem

2006-09-14 Thread Matthias Fischmann


 Magnus Jonsson [EMAIL PROTECTED] writes:
 
 splitStreams::Ord a=[(a,b)]-[(a,[b])]
 
 splitStreams [(3,x),(1,y),(3,z),(2,w)]
 [(3,[x,z]),(1,[y]),(2,[w])]
 
  [...]
 
 But is there any way to write it such that each element is touched
 only once? Or at least an O(n*log(m)) algorithm?
 
 I guess it should be possible to use a quicksort-like algorithm,
 splitting the stream into three streams with keys less than, equal,
 and greater than the first element, and recurse on the less/equal
 streams?

something like this, then?

splitStreams :: Ord a = [(a, b)] - [(a, [b])]
splitStreams [] = []
splitStreams ((a, b) : t) = (a, b : bs) : splitStreams t'
where
(bs, t') = foldr f ([], []) t
f z@(a', b') (bs, t') = if a' == a
then (b' : bs, t')
else (bs, z : t')

(i guess i should use a strictness '!' for (xs, ys), but i am still
running ghc-6.5, and i don't like the case constructions that much.
does this bite me here?  i didn't check, really.)

who can tune this any further?



cheers,
m.


signature.asc
Description: Digital signature
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Optimization problem

2006-09-14 Thread Bertram Felgenhauer
Magnus Jonsson wrote:
 splitStreams::Ord a=[(a,b)]-[(a,[b])]
 
 splitStreams [(3,x),(1,y),(3,z),(2,w)]
 [(3,[x,z]),(1,[y]),(2,[w])]

 I'm afraid this algorithm is O(n*m) time in the worst case, where n is the 
 length of the input list, and m is the number of unique channels.
 
 But is there any way to write it such that each element is touched only 
 once? Or at least an O(n*log(m)) algorithm?

This can be done. It's an interesting puzzle to make it work for infinite
lists. For finite lists, sortBy + groupBy + map easily do the job.

The problem is dealing with holes in data structures in Haskell, which
are to be filled in later. This can be done by providing blueprints for
them. Let's define a basic map:

 data Map k a  = Node k a (Map k a) (Map k a) | Leaf

We will use Map k () as blueprints - the () indicate where the holes
are. Next we define a lookup function, which takes a blueprint and
evaluates the correct hole in a second map:

 lookup :: Ord k = k - Map k () - Map k a - Maybe a
 lookup _ Leaf_  = Nothing
 lookup k (Node k' _ l r) ~(Node _ a' l' r') = case compare k k' of
 LT - lookup k l l'
 EQ - Just a'
 GT - lookup k r r'

As you can see, the structure of the second argument is forced
by the first argument. The lazy pattern assures that we don't look
at the second argument too early.

In a similar fashion, we can define an update function:

 update :: Ord k = k - (a - a) - Map k x - Map k a - Map k a
 update k f (Node k' _ l r) ~(Node _ a' l' r') = case compare k k' of
 LT - Node k' a' (update k f l l') r'
 EQ - Node k' (f a') l' r'
 GT - Node k' a' l' (update k f r r')

Next comes insert. Insert takes a blueprint and inserts a key into it.
It also takes a map and removes the corresponding hole from it. To
simplify the code it does no balancing. [1]

 insert :: Ord k = k - Map k () - Map k a - (Map k (), Map k a)
 insert k Leaf_
 = (Node k () Leaf Leaf, Leaf)
 insert k (Node k' _ l r) ~(Node _ a' l' r')
 = case compare k k' of
 LT - let (m, m') = insert k l l' in
   (Node k' () m r, Node k' a' m' r')
 EQ - error inserting existing element
 GT - let (m, m') = insert k r r' in
   (Node k' () l m, Node k' a' l' m')

We also need a map, defined in the usual fashion, without the blueprint.
A version with blueprint can also be defined, but it isn't necessary for
our problem:

 map :: (a - b) - Map k a - Map k b
 map _ Leaf   = Leaf
 map f (Node k a l r) = Node k (f a) (mapMap f l) (mapMap f r)


We can now build the splitStream function, using the following helper
function:

 splitSeq' :: Ord a = Map a () - [(a,b)] - ([(a,[b])], Map a [b])
 splitSeq' bp [] = ([], map (const []) bp)
 splitSeq' bp ((a,b):xs) = case lookup a bp bp of
 Just _ - let (l, m) = splitSeq' bp xs in (l, update a (b:) bp m)
 _  - let (bp', m) = insert a bp m'
   (l, m')  = splitSeq' bp' xs
 in
   ((a, b : (fromJust $ lookup a bp' m')) : l, m)

splitSeq' takes a blueprint for a map with all keys seen so far, and
a list tail. It returns the result list for all new keys, and a map
(corresponding to the given blueprint) with the tails of the seen
elements.

The in the base case it just fills the blueprint with empty lists and
returns to the caller.

If a new element is seen, insert is used to create a new blueprint,
including the new key a, which is passed to the recursive call of
splitSeq'. The resulting map from that call is threaded back into
insert, which gives us a new map without the a key which matches
the structure of the original blueprint.

Finally we can define splitSeq as follows:

 splitSeq :: Ord a = [(a,b)] - [(a,[b])]
 splitSeq = fst . splitSeq' Leaf

A quick test:
*Main let s = splitSeq ([(3,'x'),(1,'y'),(3,'z'),(2,'w')] ++ repeat (4,' '))
*Main s !! 0
(3,xzInterrupted.  (I pressed ^C)
*Main s !! 2
(2,wInterrupted.   (ditto)
*Main s !! 3
(4,   ...

Is there a simpler way to do this? I don't know.

I'm also unsure whether it is a good idea - unless you use several
threads to process the list the code will produce a lot of thunks,
and eat a lot of memory.

The code above provides a maximum amount of lazyness while using
O(n log m) time. Depending on the circumstances processing the list in
chunks and using techniques like the above to combine the result will
be better.

enjoy,

Bertram

[1] Balancing can be done with the information in the blueprint, and
mapping back is easily done by doing the transformation on the tree
in reverse.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Optimization problem

2006-09-14 Thread Ross Paterson
Here's my attempt, also using the quicksort idea, but using two passes
instead of tying the knot:

 import Data.Set hiding (map)

First a binary search tree, with a lookup function:

 data Tree k v = Node (Tree k v) k v (Tree k v)

 get :: Ord a = a - Tree a b - b
 get a (Node l k v r) = case compare a k of
 LT - get a l
 EQ - v
 GT - get a r

There's no empty case: we'll ensure that we only search for keys that
are in the tree.

Now to make a tree of lists from the list of pairs:

 mkTree :: Ord a = [(a,b)] - Tree a [b]
 mkTree [] = error unused
 mkTree ((a,b):abs) = Node l a (b:bs) r
   where l = mkTree [(a',b') | (a',b') - abs, a'  a]
 r = mkTree [(a',b') | (a',b') - abs, a'  a]
 bs = [b' | (a',b') - abs, a' == a]

It remains to extract from this tree the list of second elements
corresponding to the each distinct first element in the input list:

 splitStreams :: Ord a = [(a,b)] - [(a,[b])]
 splitStreams abs = [(a, get a t) | a - uniq (map fst abs)]
   where t = mkTree abs

where uniq computes the list of unique elements of a list:

 uniq :: Ord a = [a] - [a]
 uniq = u empty
   where u s [] = []
 u s (x:xs)
   | member x s = u s xs
   | otherwise  = x : u (insert x s) xs

There's no rebalancing, so it suffers from the same problems as
quicksort.

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Optimization problem

2006-09-14 Thread Bruno Oliveira
Hello, 

I think it is possible to do it in O(n) (if you change the representation 
of the output stream). 

Note that the solution is quite similar toTwan van Laarhoven, 
and it should be trivial use Map instead of Rel.

Here is my take on it:

The type of the output stream:

 type Rel a b = a - [b]

Here are cons and nil:

 cons :: Eq a = (a,b) - Rel a b - Rel a b
 cons (x,y) l z
| x == z= y : l x
| otherwise  = l z

 nil :: Rel a b
 nil _ = []

and a lookUp function:

 lookUp :: Rel a b - a - [b]
 lookUp = id

Finally the splitStreams algorithm:

 splitStreams :: Eq a = [(a,b)] - Rel a b
 splitStreams = foldr cons nil

Here is a test with finite lists:

 fin = splitStreams [(3,'x'),(1,'y'),(3,'z'),(2,'w')]

and here are the console tests:

*General7 lookUp fin 1
y
*General7 lookUp fin 2
w
*General7 lookUp fin 3
xz

Now with infinite lists:

 inf = splitStreams (map (\x - (0, x)) [1..])

and here a case where it terminates with infinite lists:

*General7 take 10 (lookUp inf 0)
[1,2,3,4,5,6,7,8,9,10]

Cheers,

Bruno Oliveira


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Optimization problem

2006-09-14 Thread Bruno Oliveira



Hello Magnus,



You are right. Silly me. I was thinking of Rel as a kind of Hashtable. In this case, 

I think it should be possible to have O(n), since "cons" would only need constant 

time.



Cheers,



Bruno



On Thu, 14 Sep 2006 13:11:05 -0400 (EDT), Magnus Jonsson wrote:



Thanks Bruno.



However I think this is still O(n*m). As far as I understand your code it 

is a longwinded way to say "[b | (a,b) - input, b == 

myChannelOfInterest]". This is fine if you are only interested in one 

channel, and for that case I agree it's O(n), but if you are interested in 

many different channels, it will take O(n*m). Here's why I think your 

code is equivalent to using a filter/list-comprehension:



 cons :: Eq a = (a,b) - Rel a b - Rel a b

 cons (x,y) l z

| x == z= y : l x

| otherwise  = l z



z is the channel that the user is interested in.

x is the channel of the current message in the list.

y is the message content.

l is a function for searching the rest of the list.



For each element of the list you create a function that compares the 

requested channel to a (in (a,b)). If it's the same, you return that 

element followed by a continued search down the list. If you don't find 

it, you forward the request to the next function down the list. That's 

exactly the same as what the filter function does.



It is possible I made a mistake somewhere and if I did, let me know.



/ Magnus



On Thu, 14 Sep 2006, Bruno Oliveira wrote:



 Hello,



 I think it is possible to do it in O(n) (if you change the representation

 of the output stream).



 Note that the solution is quite similar toTwan van Laarhoven,

 and it should be trivial use "Map" instead of "Rel".



 Here is my take on it:



 The type of the output stream:



 type Rel a b = a - [b]



 Here are cons and nil:



 nil :: Rel a b

 nil _ = []



 and a lookUp function:



 lookUp :: Rel a b - a - [b]

 lookUp = id



 Finally the splitStreams algorithm:



 splitStreams :: Eq a = [(a,b)] - Rel a b

 splitStreams = foldr cons nil



 Here is a test with finite lists:



 fin = splitStreams [(3,'x'),(1,'y'),(3,'z'),(2,'w')]



 and here are the console tests:



 *General7 lookUp fin 1

 "y"

 *General7 lookUp fin 2

 "w"

 *General7 lookUp fin 3

 "xz"



 Now with infinite lists:



 inf = splitStreams (map (\x - (0, x)) [1..])



 and here a case where it terminates with infinite lists:



 *General7 take 10 (lookUp inf 0)

 [1,2,3,4,5,6,7,8,9,10]



 Cheers,



 Bruno Oliveira





 ___

 Haskell-Cafe mailing list

 Haskell-Cafe@haskell.org

 http://www.haskell.org/mailman/listinfo/haskell-cafe








___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Optimization problem

2006-09-14 Thread Ross Paterson
On Thu, Sep 14, 2006 at 05:22:05PM +0200, Bertram Felgenhauer wrote:
 [much subtle code]
 We can now build the splitStream function, using the following helper
 function:
 
  splitSeq' :: Ord a = Map a () - [(a,b)] - ([(a,[b])], Map a [b])

This works for infinite lists if there is no balancing, but if insert does
balancing, the top of the map will not be available until the last key
is seen, so splitSeq' could only be used for finite chunks.  Then you'll
need a way to put the partial answers together.  It might be possible
to amortize the cost of that for an appropriate choice of chunk length.
It would also cost some laziness: the chunked version would work for
infinite lists, but wouldn't produce all the output for a partial list.

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Optimization problem

2006-09-14 Thread Lennart Augustsson

On Sep 14, 2006, at 03:05 , Rohan Drape wrote:


splitStreams [(3,x),(1,y),(3,z),(2,w)]

[(3,[x,z]),(1,[y]),(2,[w])]


[snip]


Furthermore it should work on infinite lists. It can't eat the whole
list before producing any output.


This doesn't seem to make sense?  Only at the end of the list can you
know that you've collected all the events for each channel.  If you
output anything before scanning to the end, you'd not know if there
were perhaps more events on that channel?
It makes good sense.  Each list will of events will be evaluated  
lazily, so thing will appear there as they appear in the input.


I don't think you can do it in Haskell without some magic in the IO/ 
ST monad.


In LML we had an array construction function that did almost exactly  
what the O.P. asked for.


-- Lennart

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Optimization problem

2006-09-14 Thread Bertram Felgenhauer
Ross Paterson wrote:
 On Thu, Sep 14, 2006 at 05:22:05PM +0200, Bertram Felgenhauer wrote:
  [much subtle code]
  We can now build the splitStream function, using the following helper
  function:
  
   splitSeq' :: Ord a = Map a () - [(a,b)] - ([(a,[b])], Map a [b])
 
 This works for infinite lists if there is no balancing, but if insert does
 balancing, the top of the map will not be available until the last key
 is seen, so splitSeq' could only be used for finite chunks.  Then you'll
 need a way to put the partial answers together.

Just to prove the point, here's the same code with balancing:

 SNIP HERE (end marked with ) 

module SplitSeq (splitSeq) where

import Prelude hiding (lookup, map)

-- our map
data Map k a  = Node !Int k a (Map k a) (Map k a) | Leaf deriving Show

size :: Map k a - Int
size Leaf = 0
size (Node s _ _ _ _) = s

member :: Ord k = k - Map k a - Bool
member _ Leaf  = False
member k (Node _ k' _ l r) = case compare k k' of
LT - member k l
EQ - True
GT - member k r

-- insert key into blueprint and extract the corresponding value from
-- the second argument, threading it backward through all operations
insert :: Ord k = k - Map k () - Map k a - (Map k (), a, Map k a)
insert k Leaf~(Node _ _ a _ _)
= (Node 1 k () Leaf Leaf, a, Leaf)
insert k (Node s k' _ l r) node
= case compare k k' of
LT - let (m, a, l'')  = insert  k l l'
  (m', a', l', r') = balance k' m r node
  in  (m', a, Node s k' a' l'' r')
EQ - error inserting existing element
GT - let (m, a, r'')  = insert  k r r'
  (m', a', l', r') = balance k' l m node
  in  (m', a, Node s k' a' l' r'')

-- balance and co are taken from Data.Map and adapted
balance k l r node
| size l + size r = 1 = let Node _ _ a l' r' = node
 in  (mkNode k () l r, a, l', r')
| size r = 5 * size l = rotateL k l r node
| size l = 5 * size r = rotateR k l r node
| otherwise= let Node _ _ a l' r' = node
 in  (mkNode k () l r, a, l', r')

rotateL k l r@(Node _ _ _ l' r') node
| size l'  2*size r' = singleL k l r node
| otherwise   = doubleL k l r node

rotateR k l@(Node _ _ _ l' r') r node
| size r'  2*size l' = singleR k l r node
| otherwise   = doubleR k l r node

singleL k l (Node s k' _ m r)
~(Node _ _ a ~(Node _ _ a' l' m') r') =
(mkNode k' () (mkNode k () l m) r,
a', l', Node s k' a m' r')

singleR k (Node s k' _ l m) r
~(Node _ _ a l' ~(Node _ _ a' m' r')) =
(mkNode k' () l (mkNode k () m r), 
a', Node s k' a l' m', r')

doubleL k l (Node s k' _ (Node s' k'' _ ml mr) r)
~(Node _ _ a ~(Node _ _ a' l' ml') ~(Node _ _ a'' mr' r')) =
(mkNode k'' () (mkNode k () l ml) (mkNode k' () mr r),
a', l', Node s k' a'' (Node s' k'' a ml' mr') r')

doubleR k (Node s k' _ l (Node s' k'' _ ml mr)) r
~(Node _ _ a ~(Node _ _ a' l' ml') ~(Node _ _ a'' mr' r')) =
(mkNode k'' () (mkNode k' () l ml) (mkNode k () mr r),
a'', Node s k' a' l' (Node s' k'' a ml' mr'), r')

-- make a new node with the correct size
mkNode k x l r = Node (size l + size r + 1) k x l r

-- update the element associated with the given key
update :: Ord k = k - (a - a) - Map k x - Map k a - Map k a
update k f (Node s k' _ l r) ~(Node _ _ a' l' r') = case compare k k' of
LT - Node s k' a' (update k f l l') r'
EQ - Node s k' (f a') l' r'
GT - Node s k' a' l' (update k f r r')

-- standard map function, no blueprints here
map :: (a - b) - Map k a - Map k b
map _ Leaf = Leaf
map f (Node s k a l r) = Node s k (f a) (map f l) (map f r)

-- finally, define splitSeq
splitSeq :: Ord a = [(a,b)] - [(a,[b])]
splitSeq = fst . splitSeq' Leaf

splitSeq' :: Ord a = Map a () - [(a,b)] - ([(a,[b])], Map a [b])
splitSeq' bp [] = ([], map (const []) bp)
splitSeq' bp ((a,b):xs) = case member a bp of
True  - let (l, m) = splitSeq' bp xs in (l, update a (b:) bp m)
False - let (bp', a', m) = insert a bp m'
 (l, m')  = splitSeq' bp' xs
 in
 ((a, b : a') : l, m)


The balancing code is adopted from Data.Map. I added threading back the
result map (from splitSeq) to the balancing operations.

This results in the promised O(n*log m) time. The cost associated with
a lazy pattern (it creates a closure for each bound variable, IIRC)
is quite high but constant, so each call to insert takes O(log m)
time, including future forces of the lazy patterns.

Likewise, member and update are also O(log m).

Example test:

Prelude SplitSeq print $ take 11 $ snd $ splitSeq ([(n `mod` 1, 'a') | n 
- [1..10]] ++ [(n, 'b') | n - [0..]]) !! 9534
aab
(1.07 secs, 137026252 bytes)

The nonbalancing version would take ages.

regards,

Bertram
___
Haskell-Cafe mailing 

Re: [Haskell-cafe] Optimization problem

2006-09-13 Thread Twan van Laarhoven

Magnus Jonsson wrote:

Dear Haskell Cafe,

When programming the other day I ran into this problem. What I want to 
do is a function that would work like this:


splitStreams::Ord a=[(a,b)]-[(a,[b])]


splitStreams [(3,x),(1,y),(3,z),(2,w)]


[(3,[x,z]),(1,[y]),(2,[w])]


A O(n log(n)) algorithm is easy if you use Data.Map:

 import qualified Data.Map as Map

 splitStreamsMap :: Ord a = [(a,b)] - Map.Map a [b]
 splitStreamsMap = foldl add Map.empty
  where add (a,b) m = Map.insertWith (++) a [b]

 splitStreams = Map.fromList . splitStreamsMap

Twan
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Optimization problem

2006-09-13 Thread Magnus Jonsson
Nice try Twan but your example fails on infinite lists. I cleaned up 
your example so that it compiles:


import qualified Data.Map as Map

splitStreamsMap :: Ord a = [(a,b)] - Map.Map a [b]
splitStreamsMap = foldl add Map.empty
  where add m (a,b) = Map.insertWith (++) a [b] m

splitStreams :: Ord a = [(a,b)] - [(a,[b])]
splitStreams = Map.toList . splitStreamsMap

It fails to return a value on this test:

take 2 $ snd $ head $ splitStreams (map (\x - (0 ,x)) [1..])

/ Magnus

On Thu, 14 Sep 2006, Twan van Laarhoven wrote:


Magnus Jonsson wrote:

Dear Haskell Cafe,

When programming the other day I ran into this problem. What I want to do 
is a function that would work like this:


splitStreams::Ord a=[(a,b)]-[(a,[b])]


splitStreams [(3,x),(1,y),(3,z),(2,w)]


[(3,[x,z]),(1,[y]),(2,[w])]


A O(n log(n)) algorithm is easy if you use Data.Map:


import qualified Data.Map as Map

splitStreamsMap :: Ord a = [(a,b)] - Map.Map a [b]
splitStreamsMap = foldl add Map.empty
 where add (a,b) m = Map.insertWith (++) a [b]

splitStreams = Map.fromList . splitStreamsMap


Twan


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Optimization problem

2006-09-13 Thread Magnus Jonsson

On Thu, 14 Sep 2006 [EMAIL PROTECTED] wrote:


Any approach, even sieving, will struggle with infinite
lists, won't it?

   (take 2 . snd . head . splitStreams) [(i, i) | i -
[0..]]


Yes, if you expect two messages but only one comes then you'll 
wait forever, true.




Regards,
Tom

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe