Re: [Haskell-cafe] Much faster complex monad stack based on CPS state

2011-09-28 Thread Bas van Dijk
On 27 September 2011 01:07, Nicu Ionita nicu.ion...@acons.at wrote:
 I wonder why the transformers library does not use this kind of state monad
 definition.

One disadvantage of ContT and I guess any CPS based monad transformer
is that they interact badly with exception handling functions like
catch.

See [1] for a bug that was caused because of using catch in ContT:

Because of this reason I don't provide a MonadTransControl instance
for ContT in monad-control[2].

Regards,

Bas

[1] http://thread.gmane.org/gmane.comp.lang.haskell.cafe/76262/
[2] http://hackage.haskell.org/package/monad-control

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Much faster complex monad stack based on CPS state

2011-09-28 Thread Yves Parès
Interesting, so what have you used to get that speedup?
A monad stack of ContT and State (*)? Just the Cont monad?

(*) If so, were you using the strict version of State?

Would it be possible to see the differences between the 2 versions of you
code?

2011/9/27 Nicu Ionita nicu.ion...@acons.at

 Hello list,

 Starting from this emails (http://web.archiveorange.com/**
 archive/v/nDNOvSM4JT3GJRSjOm9Phttp://web.archiveorange.com/archive/v/nDNOvSM4JT3GJRSjOm9P
 **) I could refactor my code (a UCI chess engine, with complex functions,
 in which the search has a complex monad stack) to run twice as fast as with
 even some hand unroled state transformer! So from 23-24 kilo nodes per
 second it does now 45 to 50 kNps! And it looks like there is still some
 improvement room (I have to play a little bit with strictness annotations
 and so on).

 (Previously I tried specializations, then I removed a lot of polimorphism,
 but nothing helped, it was like hitting a wall.)

 Even more amazingly is that I could program it although I cannot really
 understand the Cont  ContT, but just taking the code example from Ryan
 Ingram (newtype ContState r s a = ...) and looking a bit at the code from
 ContT (from the transformers library), and after fixing some compilation
 errors, it worked and was so fast.

 I wonder why the transformers library does not use this kind of state monad
 definition. Or does it, and what I got is just because of the unrolling? Are
 there monad (transformers) libraries which are faster? I saw the library
 kan-extensions but I did not understand (yet) how to use it.

 Nicu

 __**_
 Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/**mailman/listinfo/haskell-cafehttp://www.haskell.org/mailman/listinfo/haskell-cafe

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Much faster complex monad stack based on CPS state

2011-09-28 Thread Ertugrul Soeylemez
Bas van Dijk v.dijk@gmail.com wrote:

 Because of this reason I don't provide a MonadTransControl instance
 for ContT in monad-control[2].

Is that even possible?  I tried hard to come up with just a MonadFix
instance for CPS-based monads, and I failed.  I would think that
MonadTransControl is just as hard, if not even harder.


Greets,
Ertugrul


-- 
nightmare = unsafePerformIO (getWrongWife = sex)
http://ertes.de/



___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Much faster complex monad stack based on CPS state

2011-09-28 Thread Bas van Dijk
On 28 September 2011 14:25, Ertugrul Soeylemez e...@ertes.de wrote:
 Bas van Dijk v.dijk@gmail.com wrote:

 Because of this reason I don't provide a MonadTransControl instance
 for ContT in monad-control[2].

 Is that even possible?

I once tried and failed so I believe it's not possible.

Bas

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Much faster complex monad stack based on CPS state

2011-09-28 Thread Thomas Schilling
Well, you can get something close with the help of IORefs, but I
forgot the details.  I believe this is the paper that explains it:

Value recursion in the continuation monad by Magnus Carlsson
http://www.carlssonia.org/ogi/mdo-callcc.pdf

On 28 September 2011 15:15, Bas van Dijk v.dijk@gmail.com wrote:
 On 28 September 2011 14:25, Ertugrul Soeylemez e...@ertes.de wrote:
 Bas van Dijk v.dijk@gmail.com wrote:

 Because of this reason I don't provide a MonadTransControl instance
 for ContT in monad-control[2].

 Is that even possible?

 I once tried and failed so I believe it's not possible.

 Bas

 ___
 Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/mailman/listinfo/haskell-cafe




-- 
Push the envelope. Watch it bend.

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Much faster complex monad stack based on CPS state

2011-09-28 Thread Nicu Ionita

Am 28.09.2011 14:05, schrieb Yves Parès:

Interesting, so what have you used to get that speedup?
A monad stack of ContT and State (*)? Just the Cont monad?

This is a module with a state monad transformer that I used before (the 
name STPlus is misleading - and sorry for the long email):


{-# LANGUAGE RankNTypes, MultiParamTypeClasses, FlexibleInstances #-}

module Search.SearchMonad (
STPlus,
return, (=),
get, put, gets, modify,
lift, liftIO,
runSearch, execSearch
) where

import Control.Monad
import Control.Monad.State hiding (lift, gets, modify)

newtype STPlus s m a = STPlus { runSTPlus :: s - m (a, s) }
{-# INLINE runSTPlus #-}

instance Monad m = Monad (STPlus s m) where
{-# INLINE return #-}
return v = STPlus (\s - return (v, s))
{-# INLINE (=) #-}
(=)= bindSTPlus

{-# INLINE bindSTPlus #-}
bindSTPlus :: Monad m = STPlus s m a - (a - STPlus s m b) - STPlus s m b
bindSTPlus ms f = STPlus $ \s - case runSTPlus ms s of
 m - m = \(v', s') - case f v' of
fv - 
runSTPlus fv s'


instance Monad m = MonadState s (STPlus s m) where
{-# INLINE get #-}
get   = STPlus $ \s - return (s,  s)
{-# INLINE put #-}
put s = STPlus $ \_ - return ((), s)

instance MonadIO m = MonadIO (STPlus s m) where
{-# INLINE liftIO #-}
liftIO = lift . liftIO

runSearch :: Monad m = STPlus s m a - s - m (a, s)
runSearch = runSTPlus

execSearch ms s = liftM snd $ runSearch ms s

{-# INLINE lift #-}
lift :: Monad m = m a - STPlus s m a
lift m = STPlus $ \s - m = \v - return (v, s)

{-# INLINE gets #-}
gets :: Monad m = (s - a) - STPlus s m a
-- gets f = STPlus $ \s - return (f s, s)
gets f = STPlus $ \s - case f s of fs - return (fs, s)

{-# INLINE modify #-}
modify :: Monad m = (s - s) - STPlus s m ()
modify f = STPlus $ \s - case f s of fs - return ((), fs)

And this is how the module looks now:

{-# LANGUAGE RankNTypes, MultiParamTypeClasses, FlexibleInstances #-}

module Search.SearchMonadCPS (
STPlus,
return, (=),
get, put, gets, modify,
lift, liftIO,
runSearch, execSearch
) where

import Control.Monad
import Control.Monad.State hiding (lift, gets, modify)

newtype STPlus r s m a = STPlus { runSTPlus :: s - (a - s - m r) - m r }

instance Monad (STPlus r s m) where
return a = STPlus $ \s k - k a s
c = f  = STPlus $ \s0 k - runSTPlus c s0 $ \a s1 - runSTPlus (f 
a) s1 k


instance MonadState s (STPlus r s m) where
get   = STPlus $ \s k - k s  s
put s = STPlus $ \_ k - k () s

instance MonadIO m = MonadIO (STPlus r s m) where
{-# INLINE liftIO #-}
liftIO = lift . liftIO

runSearch :: Monad m = STPlus (a, s) s m a - s - m (a, s)
runSearch c s = runSTPlus c s $ \a s0 - return (a, s0)

execSearch ms s = liftM snd $ runSearch ms s

{-# INLINE lift #-}
lift :: Monad m = m a - STPlus r s m a
lift m = STPlus $ \s k - m = \a - k a s

{-# INLINE gets #-}
gets :: Monad m = (s - a) - STPlus r s m a
gets f = STPlus $ \s k - k (f s) s

{-# INLINE modify #-}
modify :: Monad m = (s - s) - STPlus r s m ()
modify f = STPlus $ \s k - k () (f s)

And then I have (in different modules):

Client code (starting an PV search to a given depth):

type CtxIO = ReaderT Context IO

bestMoveCont :: Int - MyState - Maybe Int - [Move] - [Move] - CtxIO 
IterResult

bestMoveCont ... = do
 ...
 ((sc, path, rmvsf), statf) - runSearch (alphaBeta abc) stati
 ...

Search framework:

class Monad m = Node m where
staticVal :: m Int  -- static evaluation of a node
materVal  :: m Int  -- material evaluation (for prune purpose)
genEdges :: Int - Int - Bool - m ([Move], [Move])  -- generate 
all legal edges

genTactEdges :: m [Move]  -- generate all edges in tactical positions
...
type Search m a = forall r. STPlus r PVState m a

alphaBeta :: Node m = ABControl - m (Int, [Move], [Move])
alphaBeta abc = do
let !d = maxdepth abc
rmvs = Alt $ rootmvs abc
lpv  = Seq $ lastpv abc
searchReduced a b = pvRootSearch a   b  d lpv rmvs True
searchFull= pvRootSearch salpha0 sbeta0 d lpv rmvs False
r - if useAspirWin
...
pvRootSearch :: Node m = Int - Int - Int - Seq Move - Alt Move - Bool
 - Search m (Int, Seq Move, Alt Move)
...

And then the chess specific implementation of the game state in another 
module:


type Game r m = STPlus r MyState m
...
instance CtxMon m = Node (Game r m) where
staticVal = staticVal0
materVal  = materVal0
genEdges = genMoves
...
genMoves :: CtxMon m = Int - Int - Bool - Game r m ([Move], [Move])
genMoves depth absdp pv = do

Nicu

(*) If so, were you using the strict version of State?

Would it be possible to see the differences between the 2 versions of 
you code?


2011/9/27 Nicu Ionita nicu.ion...@acons.at mailto:nicu.ion...@acons.at

Hello list,

Starting from this emails

Re: [Haskell-cafe] Much faster complex monad stack based on CPS state

2011-09-28 Thread Nicu Ionita

Am 28.09.2011 02:35, schrieb Ryan Ingram:
My guess is that Cont plays really nicely with GHC's inliner, so 
things that end up looking like


 return x = \y - ...

get optimized really well

return x = f
-- inline =
= ContState $ \s0 k - runCS (return x) s0 $ \a s1 - runCS (f a) s1 k
-- inline return
= ContState $ \s0 k - runCS (ContState $ \s2 k2 - k2 x s2) s0 $ 
\a s1 - runCS (f a) s1 k

-- runCS record selector
= ContState $ \s0 k - (\s2 k2 - k2 x s2) s0 $ \a s1 - runCS (f 
a) s1 k

-- beta
= ContState $ \s0 k - (\k2 - k2 x s0) $ \a s1 - runCS (f a) s1 k
-- beta
= ContState $ \s0 k - (\a s1 - runCS (f a) s1 k) x s0
-- beta
= ContState $ \s0 k - runCS (f x) s0 k

and then further inlining of f can take place.


I was even thinking - and this would have been the next idea to try if I 
couldn't get your example code to run so fast - to define some rules for 
the state monad (transformer) to fuse such expressions like


m = f = g = ...

or even

modify f = modify g = modify (g . f)

and perhaps other variations, so that it would perhaps end up in some 
nice combination of f and g, avoiding the intermediate tuples, hopefully 
with better performance. But then I did not follow it, and I want to 
concentrate on further improvements with the new code. The way is still 
long, because the top engines (written in C or C++) can do about 10 mil 
nps on my machine :-)


Nicu

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Much faster complex monad stack based on CPS state

2011-09-27 Thread Ryan Ingram
My guess is that Cont plays really nicely with GHC's inliner, so things that
end up looking like

 return x = \y - ...

get optimized really well

return x = f
-- inline =
= ContState $ \s0 k - runCS (return x) s0 $ \a s1 - runCS (f a) s1 k
-- inline return
= ContState $ \s0 k - runCS (ContState $ \s2 k2 - k2 x s2) s0 $ \a s1
- runCS (f a) s1 k
-- runCS record selector
= ContState $ \s0 k - (\s2 k2 - k2 x s2) s0 $ \a s1 - runCS (f a) s1
k
-- beta
= ContState $ \s0 k - (\k2 - k2 x s0) $ \a s1 - runCS (f a) s1 k
-- beta
= ContState $ \s0 k - (\a s1 - runCS (f a) s1 k) x s0
-- beta
= ContState $ \s0 k - runCS (f x) s0 k

and then further inlining of f can take place.

On Mon, Sep 26, 2011 at 4:07 PM, Nicu Ionita nicu.ion...@acons.at wrote:

 Hello list,

 Starting from this emails (http://web.archiveorange.com/**
 archive/v/nDNOvSM4JT3GJRSjOm9Phttp://web.archiveorange.com/archive/v/nDNOvSM4JT3GJRSjOm9P
 **) I could refactor my code (a UCI chess engine, with complex functions,
 in which the search has a complex monad stack) to run twice as fast as with
 even some hand unroled state transformer! So from 23-24 kilo nodes per
 second it does now 45 to 50 kNps! And it looks like there is still some
 improvement room (I have to play a little bit with strictness annotations
 and so on).

 (Previously I tried specializations, then I removed a lot of polimorphism,
 but nothing helped, it was like hitting a wall.)

 Even more amazingly is that I could program it although I cannot really
 understand the Cont  ContT, but just taking the code example from Ryan
 Ingram (newtype ContState r s a = ...) and looking a bit at the code from
 ContT (from the transformers library), and after fixing some compilation
 errors, it worked and was so fast.

 I wonder why the transformers library does not use this kind of state monad
 definition. Or does it, and what I got is just because of the unrolling? Are
 there monad (transformers) libraries which are faster? I saw the library
 kan-extensions but I did not understand (yet) how to use it.

 Nicu

 __**_
 Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/**mailman/listinfo/haskell-cafehttp://www.haskell.org/mailman/listinfo/haskell-cafe

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Much faster complex monad stack based on CPS state

2011-09-26 Thread Nicu Ionita

Hello list,

Starting from this emails 
(http://web.archiveorange.com/archive/v/nDNOvSM4JT3GJRSjOm9P) I could 
refactor my code (a UCI chess engine, with complex functions, in which 
the search has a complex monad stack) to run twice as fast as with even 
some hand unroled state transformer! So from 23-24 kilo nodes per second 
it does now 45 to 50 kNps! And it looks like there is still some 
improvement room (I have to play a little bit with strictness 
annotations and so on).


(Previously I tried specializations, then I removed a lot of 
polimorphism, but nothing helped, it was like hitting a wall.)


Even more amazingly is that I could program it although I cannot really 
understand the Cont  ContT, but just taking the code example from Ryan 
Ingram (newtype ContState r s a = ...) and looking a bit at the code 
from ContT (from the transformers library), and after fixing some 
compilation errors, it worked and was so fast.


I wonder why the transformers library does not use this kind of state 
monad definition. Or does it, and what I got is just because of the 
unrolling? Are there monad (transformers) libraries which are faster? I 
saw the library kan-extensions but I did not understand (yet) how to use it.


Nicu

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe