Re: [Haskell-cafe] Re: Optimization problem

2006-09-28 Thread Ross Paterson
On Tue, Sep 19, 2006 at 01:52:02PM +0100, Conor McBride wrote:
 [EMAIL PROTECTED] wrote:
 Btw, why are there no irrefutable patterns for GADTs?
 
 Just imagine
 
  data Eq a b where Refl :: Eq a a
 
  coerce :: Eq a b - a - b
  coerce ~Refl a = a
 
 coerce undefined True :: String
 
 Bang you're dead. Or rather... Twiddle you're dead.

Does anything go wrong with irrefutable patterns for existential types?

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


RE: [Haskell-cafe] Re: Optimization problem

2006-09-28 Thread Simon Peyton-Jones
| Does anything go wrong with irrefutable patterns for existential
types?

Try giving the translation into System F.

Simon
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: Optimization problem

2006-09-28 Thread Ross Paterson
On Thu, Sep 28, 2006 at 03:22:25PM +0100, Simon Peyton-Jones wrote:
 | Does anything go wrong with irrefutable patterns for existential
 types?
 
 Try giving the translation into System F.

Hmm, that's not quite as satisfying as Conor's answer for GADTs.

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: Optimization problem

2006-09-20 Thread Ross Paterson
On Tue, Sep 19, 2006 at 08:06:07PM +0200, [EMAIL PROTECTED] wrote:
 For our optimization problem, it's only a matter of constructors on the
 right hand side. They should pop out before do not look on any
 arguments, so it's safe to cry so you just know, i'm a Just.

It seems the appropriate encoding would be for the shape to be an
inductive datatype and the contents (as well as the lists) to be
coinductive, as access to the contents is gated through the shape.

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Optimization problem

2006-09-19 Thread apfelmus
Ross Paterson wrote:
 On Sun, Sep 17, 2006 at 01:08:04PM +0200, [EMAIL PROTECTED] wrote:
 Ross Paterson wrote:
 It's interesting that these composed transformations don't seem to cost
 too much.
 That the composed transformations are indeed cheap is not necessarily
 disturbing.
 
 I meant the composition of the transformations of the tree (update or
 reverse insert) that Bertram does for each list element.  They're cheap
 because they're bounded by the depth of the tree, and even cheaper if
 you're probing some other part of the tree.

Me too :) I tried to point out that separating uninsert from in insert
increases time complexity. In general

   uninsert :: Ord k = k - Map k () - Map k a - (a, Map k a)

   fst $ uninsert _ bp m == differenceWith (curry snd) bp m

with needs O(size of blueprint) time. This is to be compared with O(log
(size of blueprint)) for the combined insertdelete.

That there (likely) must be such a difference can already be seen from
the types of insertdelete and (insert,uninsert) !

But you already pointed out that something like

   insertdelete :: Ord k = k - Map shape k -
   (exists shape' . (Map shape' k, Map shape' a - (a, Map shape a))

is the best type for insertdelete. Here, it is clear that insertdelete
likely can do a fast uninsert.


Btw, why are there no irrefutable patterns for GADTs? I mean, such a sin
should be shame for a non-strict language...


Regards,
apfelmus

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: Optimization problem

2006-09-19 Thread Ross Paterson
On Tue, Sep 19, 2006 at 01:41:51PM +0200, [EMAIL PROTECTED] wrote:
 Btw, why are there no irrefutable patterns for GADTs?

Not GADTs, but existential types (whether done with GADTs or not).
They can't be analysed with irrefutable patterns, of which let bindings
are a special case:

http://www.haskell.org/ghc/docs/6.4.2/html/users_guide/type-extensions.html#id3143545

See the second restriction.  Irrefutable patterns aren't mentioned
there, but they're the general case.  See also

http://www.haskell.org/pipermail/haskell-cafe/2006-May/015609.html
http://www.haskell.org/pipermail/glasgow-haskell-users/2006-May/010278.html

though I don't buy the rationale there.  Hugs has no such restriction.

 I mean, such a sin should be shame for a non-strict language...

It certainly bites in this case.  We could define

data TLeaf
data TNode s1 s2

data MapShape s k where
SLeaf :: MapShape TLeaf k
SNode :: !Int - k - MapShape s1 k - MapShape s2 k -
MapShape (TNode s1 s2) k

data MapData s a where
DLeaf :: MapData TLeaf a
DNode :: a - MapData s1 a - MapData s2 a -
MapData (TNode s1 s2) a

data InsertResult s k =
forall s'. InsertResult
(MapShape s' k)
(forall a. MapData s' a - (a, MapData s a))

insert :: Ord k = k - MapShape s k - InsertResult s k

and have the compiler check that the transformations on the shape match
the transformations on the data, but first we need to turn lots of lets
into cases and erase the tildes.  Of course the resulting program no
longer works, but it does have verifiably correct transformations.

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: Optimization problem

2006-09-19 Thread Conor McBride

Hi folks

[EMAIL PROTECTED] wrote:

Btw, why are there no irrefutable patterns for GADTs? I mean, such a sin
should be shame for a non-strict language...
  



Just imagine

 data Eq a b where Refl :: Eq a a

 coerce :: Eq a b - a - b
 coerce ~Refl a = b

coerce undefined True :: String

Bang you're dead. Or rather... Twiddle you're dead.

Moral: in a non-total language, if you're using indexing to act as 
evidence for something, you need to be strict about checking the 
evidence before you act on it, or you will be vulnerable to the 
blandishments of the most appalling liars.


As Randy Pollack used to say to us when we were children, the best thing 
about working in a strongly normalizing language is not having to 
normalize things.


All the best

Conor

This message has been checked for viruses but the contents of an attachment
may still contain software viruses, which could damage your computer system:
you are advised to perform your own checks. Email communications with the
University of Nottingham may be monitored as permitted by UK legislation.

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: Optimization problem

2006-09-19 Thread Robert Dockins


On Sep 19, 2006, at 8:52 AM, Conor McBride wrote:


Hi folks

[EMAIL PROTECTED] wrote:
Btw, why are there no irrefutable patterns for GADTs? I mean, such  
a sin

should be shame for a non-strict language...




Just imagine

 data Eq a b where Refl :: Eq a a

 coerce :: Eq a b - a - b
 coerce ~Refl a = b



I think you mean:

 coerce ~Refl x = x



coerce undefined True :: String

Bang you're dead. Or rather... Twiddle you're dead.

Moral: in a non-total language, if you're using indexing to act as  
evidence for something, you need to be strict about checking the  
evidence before you act on it, or you will be vulnerable to the  
blandishments of the most appalling liars.


As Randy Pollack used to say to us when we were children, the best  
thing about working in a strongly normalizing language is not  
having to normalize things.


All the best

Conor



Rob Dockins

Speak softly and drive a Sherman tank.
Laugh hard; it's a long way to the bank.
  -- TMBG



___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: Optimization problem

2006-09-19 Thread Conor McBride

Robert Dockins wrote:


On Sep 19, 2006, at 8:52 AM, Conor McBride wrote:


Hi folks

[EMAIL PROTECTED] wrote:
Btw, why are there no irrefutable patterns for GADTs? I mean, such a 
sin

should be shame for a non-strict language...




Just imagine

 data Eq a b where Refl :: Eq a a

 coerce :: Eq a b - a - b
 coerce ~Refl a = b



I think you mean:

 coerce ~Refl x = x


Indeed I do. My original program was much safer!

Thanks

Conor


This message has been checked for viruses but the contents of an attachment
may still contain software viruses, which could damage your computer system:
you are advised to perform your own checks. Email communications with the
University of Nottingham may be monitored as permitted by UK legislation.

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Optimization problem

2006-09-19 Thread apfelmus
Conor McBride wrote:

 Just imagine
 
 data Eq a b where Refl :: Eq a a
 
 coerce :: Eq a b - a - b
   Robert Dockins wrote:
 coerce ~Refl x = x 

 coerce undefined True :: String
 
 Bang you're dead. Or rather... Twiddle you're dead.

Mom, he's scaring me! ;)

Who says this function may not be strict in the first argument despite
the irrefutable pattern? Also, for the sarcastically inclined, death is
semantically equivalent to _|_. As is (fix id :: a - b), too.

Alas, one can say

   dontcoerce :: Eq a (Maybe b) - a - Maybe b
   dontcoerce ~Refl x = Just x

and

   dontcoerce' _|_ == (\x - Just (x :: b(~Refl)))
   == \(x::_|_) - Just (x :: _|_)
   == \_ - Just _|_

The type of x on the right-hand side inherently depends on ~Refl and
should be _|_ if ~Refl is. Type refinement makes the types depend on
values, after all.

For our optimization problem, it's only a matter of constructors on the
right hand side. They should pop out before do not look on any
arguments, so it's safe to cry so you just know, i'm a Just.

Maybe one can remedy things by introducing a _|_ on type-level with the
only inhabitant _|_ :: _|_. That would treat objections Ross Paterson
referenced:
 See the second restriction.  Irrefutable patterns aren't mentioned
 there, but they're the general case.  See also
 
 http://www.haskell.org/pipermail/haskell-cafe/2006-May/015609.html
 http://www.haskell.org/pipermail/glasgow-haskell-users/2006-May/010278.html

Admittedly, it's a thin ice to trample on, but otherwise, for this
special splitSeq-problem, one runs into the more haste less speed
dilemma (i mean Wadler's paper ). Bertram's lazy algorithm actually is
an online-algorithm and it should remain one when making it type safe.

Regards,
apfelmus

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: Optimization problem

2006-09-18 Thread Ross Paterson
On Sun, Sep 17, 2006 at 01:08:04PM +0200, [EMAIL PROTECTED] wrote:
 Ross Paterson wrote:
  It's interesting that these composed transformations don't seem to cost
  too much.
 
 That the composed transformations are indeed cheap is not necessarily
 disturbing.

I meant the composition of the transformations of the tree (update or
reverse insert) that Bertram does for each list element.  They're cheap
because they're bounded by the depth of the tree, and even cheaper if
you're probing some other part of the tree.

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: Optimization problem

2006-09-18 Thread Ross Paterson
On Mon, Sep 18, 2006 at 12:23:11AM +0100, Ross Paterson wrote:
 To prove that (even for partial and infinite lists ps)
 
   splitSeq ps = [(a, seconds a ps) | a - nub ps]
 
 [...] we can establish, by induction on the input list,
 
 (1)   fst (splitSeq' s ps) =
   [(a, seconds a ps) | a - nub ps, not (member a s)]
 (2)   member x s  =
   get x s (snd (splitSeq' s ps)) = seconds x ps

Oops, nub ps should be nub (map fst ps) in each case.

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Optimization problem

2006-09-17 Thread apfelmus
Ross Paterson wrote:

 How embarrassing.  Still, your code is quite subtle.  As penance, here's
 my explanation, separating the main idea from the knot-tying.
 
 The ingredients are a map type with an insertion function
 [...]
   insert :: Ord k = k - a - Map k a - Map k a
 
 with a partial inverse
 
   uninsert :: Ord k = k - Map k () - Map k a - (a, Map k a)
 
 [...]
 Applying a tupling transformation to insert+uninsert gives your version.
 
 It's interesting that these composed transformations don't seem to cost
 too much.

That the composed transformations are indeed cheap is not necessarily
disturbing. Let's call Bertram's original insert version insertdelete
from now on. When analyzing the magic behind, you do

ross_analysis :: ((bp,map') - (bp',map)) - (bp-bp', (bp,map') - map)
ross_analysis insertdelete = (insert, uninsert)

Of course a tupling transformation will reverse this

tupletiple :: (bp-bp', (bp,map') - map) - ((bp,map') - (bp',map))
tupletiple (insert,uninsert) = insertdelete

But as @djinn will tell you, ross_analysis cannot be realized :) So
the original version possesses additional computational power, it can
reverse everything it's doing with bp on the map'. uninsert does not
have information about the single steps that have been done, it only
knows what should come out. From that, it would have to reconstruct
quickly what happened, if it wants to be fast.

Regards,
apfelmus

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: Optimization problem

2006-09-17 Thread Ross Paterson
On Thu, Sep 14, 2006 at 07:51:59PM +0200, Bertram Felgenhauer wrote:
 It's a result of thinking about lazy evaluation, and
 especially lazy patterns (and let bindings) for some time. A wiki article
 that helped me a lot to understand these is
 
   http://www.haskell.org/hawiki/TyingTheKnot
 
 I'd like to point out the trustList function there which uses the idea
 of encoding the structure of a term and its actual values in different
 arguments, i.e. a blueprint.

One view of your device is as separating the shape (blueprint) from the
contents, e.g. one can split a finite map type

data Map k a  = Node !Int k a (Map k a) (Map k a) | Leaf

into a pair of types

data MapShape k = SNode !Int k (MapShape k) (MapShape k) | SLeaf
data MapData a = DNode a (MapData a) (MapData a) | DLeaf

(We don't even need DLeaf, as it's never examined.)
We need functions

fill :: a - MapShape k - MapData a
update :: Ord k = k - (a - a) - MapShape k -
MapData a - MapData a
insert :: Ord k = k - MapShape k -
(MapShape k, MapData a - (a, MapData a))

In a dependently typed language we could parameterize MapData with shapes
to give more precise types:

fill :: a - forall s :: MapShape k. MapData s a
update :: Ord k = k - (a - a) -
forall s :: MapShape k.  MapData s a - MapData s a
insert :: Ord k = k - forall s :: MapShape k.
(exists s' :: MapShape k, MapData s' a - (a, MapData s a))

hinting that the function returned by insert reverses the effect of
the insertion.  It's not possible to code this with GADTs, because
existentials are incompatible with irrefutable patterns, at least
in GHC.

Anyway, splitSeq then becomes

splitSeq :: Ord a = [(a,b)] - [(a,[b])]
splitSeq = fst . splitSeq' SLeaf

splitSeq' :: Ord a = MapShape a - [(a,b)] - ([(a,[b])], MapData [b])
splitSeq' bp [] = ([], fill [] bp)
splitSeq' bp ((a,b):xs)
  | member a bp = let (l, m) = splitSeq' bp xs
  in  (l, update a (b:) bp m)
  | otherwise   = let (bp', extract) = insert a bp
  (l, m')= splitSeq' bp' xs
  (bs, m)= extract m'
  in  ((a, b:bs) : l, m)

Again, no knot-tying is required.

To prove that (even for partial and infinite lists ps)

splitSeq ps = [(a, seconds a ps) | a - nub ps]

where

seconds :: Eq a = a - [(a,b)] - [b]
seconds a ps = [b | (a', b) - ps, a' == a]

we need another function

get :: Ord k = k - MapShape k - MapData a - a

and the properties

member k s  =
get k s (fill v s) = v

member k s  =
get k s (update k f s m) = f (get k s m)
member k s /\ member x s /\ x /= k  =
get x s (update k f s m) = get x s m

not (member k s) /\ insert k s = (s', extract)  =
member x s' = member x s || x == k
not (member k s) /\ insert k s = (s', extract)  =
fst (extract m) = get k s' m
not (member k s) /\ insert k s = (s', extract) /\ member x s  =
get x s (snd (extract m)) = get x s' m

Then we can establish, by induction on the input list,

(1) fst (splitSeq' s ps) =
[(a, seconds a ps) | a - nub ps, not (member a s)]
(2) member x s  =
get x s (snd (splitSeq' s ps)) = seconds x ps

This is enough to sustain the induction and obtain the desired property,
but it's a bit inelegant not to have an exact characterization of the
second component.  It seems essential to observe the map data only
though get.

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: Optimization problem

2006-09-17 Thread Donald Bruce Stewart
ross:
 On Thu, Sep 14, 2006 at 07:51:59PM +0200, Bertram Felgenhauer wrote:
  It's a result of thinking about lazy evaluation, and
  especially lazy patterns (and let bindings) for some time. A wiki article
  that helped me a lot to understand these is
  
http://www.haskell.org/hawiki/TyingTheKnot
  
  I'd like to point out the trustList function there which uses the idea
  of encoding the structure of a term and its actual values in different
  arguments, i.e. a blueprint.
 
 One view of your device is as separating the shape (blueprint) from the
 contents, e.g. one can split a finite map type
 
   data Map k a  = Node !Int k a (Map k a) (Map k a) | Leaf
 
 into a pair of types
 
   data MapShape k = SNode !Int k (MapShape k) (MapShape k) | SLeaf
   data MapData a = DNode a (MapData a) (MapData a) | DLeaf

...

Nice description.

Ross, I added a wiki page for this technique. Would you like to
either elaborate on the wiki, or include the text of your email to it?

http://haskell.org/haskellwiki/Separating_shape_and_content

Cheers, 
  Don
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Optimization problem

2006-09-15 Thread apfelmus
Ross Paterson wrote:
 On Thu, Sep 14, 2006 at 05:22:05PM +0200, Bertram Felgenhauer wrote:
 [much subtle code]
 We can now build the splitStream function, using the following helper
 function:

 splitSeq' :: Ord a = Map a () - [(a,b)] - ([(a,[b])], Map a [b])
 
 This works for infinite lists if there is no balancing, but if insert does
 balancing, the top of the map will not be available until the last key
 is seen, so splitSeq' could only be used for finite chunks.  Then you'll
 need a way to put the partial answers together.  It might be possible
 to amortize the cost of that for an appropriate choice of chunk length.
 It would also cost some laziness: the chunked version would work for
 infinite lists, but wouldn't produce all the output for a partial list.

No. The point is that in
   let blueprint = empty
   (_,m) = splitSeq' blueprint $ concat $ repeat [(1,'a'),(2,'b')],
the map m contains only as many keys as there are in blueprint, i.e. not
a single one! After munching the first (1,'a'), the first recursive call
to splitSeq', the returned map m' fulfills
   toList m' == [(1,a...)]

The trick is to throw away all keys from the future, so that there is no
need to wait on them.

Also, if your argument would have been correct, then the version without
balancing wouldn't work either because insert already exchanges Leafs
for Nodes in m. So the top of the map would be unavailable until all
Leafs have been exchanged.

Regards,
apfelmus

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Optimization problem

2006-09-15 Thread apfelmus
[EMAIL PROTECTED] wrote:
 type BiMap a b = (Map.Map a b, Map.Map b a)

Actually BiMap is not needed at all, it suffices to have

 splitStreams :: Ord a = [(a,b)] - [(a,[b])]
 splitStreams xs =
 takeWhile (not . null . snd) $ toList $ splitStreams' Map.empty xs

 splitStreams' :: Ord a = Map.Map a Position - [(a,b)] - Imp (a,[b])
 splitStreams' map [] =
fmap (const (undefined,[])) $ fromList [1..]
 splitStreams' map ((a,b):xs) =
update fun pos $ splitStreams' map' xs
where
fun ~(_,bs) = (a,b:bs)
sz  = Map.size map
pos = Map.findWithDefault (sz+1) a map
map'=
   (if Map.member a map then id else Map.insert a (sz+1)) map

Regards,
apfelmus

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Optimization problem

2006-09-14 Thread Bertram Felgenhauer

I wrote:

[1] Balancing can be done with the information in the blueprint, and
mapping back is easily done by doing the transformation on the tree
in reverse.


I should add that this possibility was the main reason for dealing with
blueprints at all. As Ross Paterson's solution shows, it's possible to
get simpler code without balancing the tree.

regards,

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: Optimization problem

2006-09-14 Thread Magnus Jonsson
Thanks Bertran for your solution. I have difficulty understanding it so I 
can't comment on it right now but I'll try to have a look at it. Do you 
know of any article that explains this technique, starting from very 
simple examples?


/Magnus

On Thu, 14 Sep 2006, Bertram Felgenhauer wrote:


I wrote:

[1] Balancing can be done with the information in the blueprint, and
mapping back is easily done by doing the transformation on the tree
in reverse.


I should add that this possibility was the main reason for dealing with
blueprints at all. As Ross Paterson's solution shows, it's possible to
get simpler code without balancing the tree.

regards,

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: Optimization problem

2006-09-14 Thread Bertram Felgenhauer
Magnus Jonsson wrote:
 Thanks Bertran for your solution. I have difficulty understanding it so I 
 can't comment on it right now but I'll try to have a look at it. Do you 
 know of any article that explains this technique, starting from very 
 simple examples?

Not really. It's a result of thinking about lazy evaluation, and
especially lazy patterns (and let bindings) for some time. A wiki article
that helped me a lot to understand these is

  http://www.haskell.org/hawiki/TyingTheKnot

I'd like to point out the trustList function there which uses the idea
of encoding the structure of a term and its actual values in different
arguments, i.e. a blueprint.

HTH,

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Optimization problem

2006-09-14 Thread apfelmus
Bertram Felgenhauer wrote:

 splitSeq' :: Ord a = Map a () - [(a,b)] - ([(a,[b])], Map a [b])
 splitSeq' bp [] = ([], map (const []) bp)
 splitSeq' bp ((a,b):xs) = case lookup a bp bp of
 Just _ - let (l, m) = splitSeq' bp xs in (l, update a (b:) bp m)
 _  - let (bp', m) = insert a bp m'
   (l, m')  = splitSeq' bp' xs
 in
   ((a, b : (fromJust $ lookup a bp' m')) : l, m)
 
 splitSeq' takes a blueprint for a map with all keys seen so far, and
 a list tail. It returns the result list for all new keys, and a map
 (corresponding to the given blueprint) with the tails of the seen
 elements.
 
 The in the base case it just fills the blueprint with empty lists and
 returns to the caller.
 
 If a new element is seen, insert is used to create a new blueprint,
 including the new key a, which is passed to the recursive call of
 splitSeq'. The resulting map from that call is threaded back into
 insert, which gives us a new map without the a key which matches
 the structure of the original blueprint.

Very interesting! So the map with the seen tails matches the blueprint
and therefore throws away information (the future keys) from the future.
This is what effectively allows the key-tree structure to be rebalanced.
   If one would return the tails-map with all future keys, it would be
_|_ because the key-order in the tree depends on the future keys and
changes everytime when a new key is found.

I thought that there can only be a static solution, i.e. like the one
Ross Paterson presented where the structure (I mean the outermost
constructors) of the returned tree are determined before the future.
This obviously excludes rebalancing.

I found a static solution by trying to fit the key-tails pairs into an
infinite tails-map which is filled first comes first:
  map ! 1 := (first key seen, tails)
  map ! 2 := (second key seen, tails)
By keeping another key-position-map around which records where each key
has been inserted, so that the future knows where to put the tails. The
point is that the structure of tails-map is known before the future
comes as its keys are just 1,2,3,... and so on.

It remains to construct such an infinite random access list, but this is
turns out to be even easier than finite random access lists (when using
non-uniform recursion from Okasaki's chapter 10) :)

 data Imp a = Imp a (Imp (a,a)) deriving (Show)

 instance Functor Imp where
fmap h ~(Imp x xs) = Imp (h x) (fmap (\(x,y) - (h x, h y)) xs)

 update :: (a - a) - Position - Imp a - Imp a
 update f 1 ~(Imp x xs) = Imp (f x) xs
 update f n ~(Imp x xs) = Imp x $ update f2 (n `div` 2) xs
where
f2 ~(y,z) = if even n then (f y, z) else (y, f z)

Note that we can use irrefutable patterns without hesitation as there is
only one constructor.

Folding over an infinite thing is strange but here we are

 fold :: (a - b - b) - Imp a - b
 fold f ~(Imp x xs) = f x (fold f2 xs)
where
f2 (x,y) z = f x (f y z)

It's not so strange anymore when we realize that this can be used to
convert it into an infinite list

 toList = fold (:)

The implementation of fromList is fun as well, so I won't tell it. As
fold and unfold can be defined generically for Mu f where f is a
functor, i wonder how to deal with it in the case of non-uniform recursion.


For splitStreams, the key-position-map is needed in both directions, so
we quickly define a bijective map

 type BiMap a b= (Map.Map a b, Map.Map b a)

 switch :: BiMap a b - BiMap b a
 switch (x,y) = (y,x)

with the usual operations (empty, member, size etc.) omitted here.

Now comes splitStreams itself:

 splitStreams :: Ord a = [(a,b)] - [(a,[b])]
 splitStreams xs =
takeWhile (not . null . snd) $ toList $ splitStreams' empty xs

 splitStreams' :: Ord a = BiMap a Position - [(a,b)] - Imp (a,[b])
 splitStreams' bimap [] =
fmap (\i - (findWithDefault undefined i $ switch bimap,[])) $
fromList [1..]
 splitStreams' bimap ((a,b):xs) =
update fun pos $ splitStreams' bimap' xs
where
fun ~(_,bs) = (a,b:bs)
sz  = size bimap
pos = findWithDefault (sz+1) a bimap
bimap'  =
   (if member a bimap then id else insert a (sz+1)) bimap

Note that update actually generates fresh constructors, so the structure
of our tails-Imp is not really static. But this is no problem as the
form of the constructors is completely known: there is only one.

Regards,
apfelmus

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe