I'll make a random comment... Tim Newsham wrote: > I am writing some code with complex nested state. I have a question > about performance with respect to the State monad and the Reader monad.
A side comment is this code: > instance MonadReader s (State s) where > ask = get > local f m = fmap (fst . runState m . f) get or > local f m = do s <- get > put (f s) > a <- m > put s > return a With an instance like this you can mix in generic MonadReader code. Feel free to replace State with your own instance of MonadState. > This is somewhat long, so a quick summary of the question up front: > Can the compiler optimize out state updates that dont change the > state? I am not the compiler expert, but I think this is unlikely. For GHC you should compile with optimizations and dump the Core code to see if it does what you want. (-ddump-simp option I think). > The question inline with the code: > >> module Test where >> import Control.Monad.State >> import Control.Monad.Reader > > I have some nested state. This is a simplified example where > we have one record inside another. In my real-world example there > is more nesting and there are lists and maps involved as well. > >> data T1 = T1 { f1 :: Int, f2 :: T2 } deriving(Show) >> data T2 = T2 { f3 :: Int, f4 :: Int } deriving(Show) > > I want to build generic modifiers and reuse them often. > A good example is modifying a numeric value: > >> adjNum :: (Num a) => a -> State a () >> adjNum n = modify (+ n) This is not much shorthand. But by way of example it is fine. > > I'm going to be writing state code for my T1 structure which is my > master structure. If I'm going to be able to reuse adjNum I am ^^^^^^^^^^^^^ This is a slightly odd engineering goal. > going to have to run a nested state action inside an enclosing ^^^^^^^^^^^^^^^^^^^^^^^^^ You want performance but this pushes more work to the compiler. If this were all functional instead of Monadic it might be simpler to start with. > state monad. I can build a lifter that does this as long as > I know how to extract the nested state and set it back in the > enclosing state: > >> withInnerM :: (o -> i) -> (i -> o -> o) -> State i a -> State o a >> withInnerM gettor settor act = do >> outer <- get >> let inner = gettor outer >> (ret, inner') = runState act inner >> outer' = settor inner' outer >> put outer' >> return ret > > Now we can make lifters for each of the fields: > >> withF1M = withInnerM f1 (\f r -> r {f1=f}) >> withF2M = withInnerM f2 (\f r -> r {f2=f}) >> withF3M = withInnerM f3 (\f r -> r {f3=f}) >> withF4M = withInnerM f4 (\f r -> r {f4=f}) Main *main comment* is to separate manipulating the complex data structure from the State commands. And to make it more abstract: get1,get3,get4 :: T1 -> Int get2 :: T1 -> T2 get1 = f1 get2 = f2 get3 = f3 . get2 get4 = f4 . get2 put1,put3,put4 :: Int -> T1 -> T1 put2 :: T2 -> T1 -> T1 put1 x o = o {f1=x} put2 x o = o {f2=x} put3 x o = mod2 (\o2 -> o2 {f3=x}) o put4 x o = mod2 (\o2 -> o2 {f4=x}) o mod1,mod3,mod4 :: (Int->Int) -> T1 -> T1 mod2 :: (T2->T2) -> T1 -> T1 mod1 f o = put1 (f (get1 o)) o mod2 f o = put2 (f (get2 o)) o mod3 f o = put3 (f (get3 o)) o mod4 f o = put4 (f (get4 o)) o And note the different but important design choice: You don't need to know how the data is nested to access a field. If you insert a T1'and'a'half data field "between" T1 and T2 then you just need to update get2/put2 to fix (get|put|mod)(3|4) and this also fixes all the code that uses any of these functions. > which lets us write some state code for T1 using building blocks like > adjNum. For example, the following code will add a value to > f1, add another value to f2's f3 and finally return the value of f2's > f4: > >> tweakT1 :: Int -> Int -> State T1 Int >> tweakT1 v1 v3 = do >> withF1M $ adjNum v1 >> withF2M $ withF3M $ adjNum v3 >> withF2M $ withF4M $ get The above would break if you added T1'and'a'half since it needs to know the structure of the data. This is why withF3M is not a good abstraction. Now for my version of tweakT1: -- My choice is to use a strict modify that also returns the new value modify' :: (MonadState a m) => (a -> a) -> m a modify' f = do x <- liftM f get put $! x return x -- Now tweakT1 can be a one-liner or longer tweakT1,tweakT1'long :: (MonadState T1 f) => Int -> Int -> f Int tweakT1 v1 v3 = liftM get4 (modify' (mod1 (+ v1) . mod3 (+ v3))) tweakT1'long v1 v3 = do modify (mod1 (+ v1)) modify (mod3 (+ v3)) liftM get4 get If you want something like adjNum, how about adjNum' :: (Num a) => a -> ((a->a) -> s->s) -> State s a adjNum' x mod = modify' (mod (+ x)) tweakT1'adj v1 v3 = do adjNum' v1 mod1 adjNum' v2 mod3 liftM get4 get The compiler may or may not optimize the mod1 and mod3 into one T1 construction instead of two. If you modify parts 1 and 3 of the state in tandem a lot then mod13 g1 g3 o@(T1 {f1=v1,f2=v2@(T2 {f3=v3})}) = o {f1=g1 v1,f2=v2 {f3=g3 v3}} will certainly avoid reconstructing T1 twice. > My question here has to do with efficiency. In order to update > f2's f3 a new T2 had to be constructed and used to construct a > new T1. There's no way around this (I assume). True. The number of T1's and T2's constructed is the issue. Reading the -dump-simpl Core text is the best way to check. Using {-# INLINE ... #-} could help. > But when doing > a mere get of f4, there's no reason why we should have to build a > new T2 and then a new T1 since nothign changed. But that's how > the code is written. The "run a nested state action" decision combined with always building on withInnerM with calls put is the culprit. > We could write a different lifter that does not perform a state > put on the return path. If we did this with the state monad then > someone could accidentally use that non-modify version of the lifter > and lose an update. So instead I chose to use the Reader monad and > let the type system enforce the difference between lifters that > modify (M) and those that just read (R). This involved two kinds > of lifters. > > The first lifter runs a Reader action inside of a State monad: > >> withRead :: Reader s a -> State s a >> withRead act = do >> s <- get >> return $ runReader act s That type signature is withReader :: Reader T1 a -> State T1 a withReader act = liftM (runReader act) ask -- or withReader act = liftM (runReader act) get but this is not directly helpful. Running nested monads is not the best course. > > The second lifter runs a nested Reader action inside of an enclosing > Reader monad. It is similar in spirit to withInnerM: > >> withInnerR :: (o -> i) -> Reader i a -> Reader o a >> withInnerR gettor act = do >> i <- asks gettor >> return $ runReader act i > > again we can generate lifters for each field: > >> withF1R = withInnerR f1 >> withF2R = withInnerR f2 >> withF3R = withInnerR f3 >> withF4R = withInnerR f4 > > And finally we can use a reader monad to eliminate the extra state > updates from our previous tweakT1 implementation: > >> tweakT1' :: Int -> Int -> State T1 Int >> tweakT1' v1 v3 = do >> withF1M $ adjNum v1 >> withF2M $ withF3M $ adjNum v3 >> withRead $ withF2R $ withF4R $ ask And that was a very roundabout way to derive liftM get4 == withRead . withF2R . withF4R >> main = do >> let x = T1 1 (T2 3 4) >> print $ runState (tweakT1 5 6) x >> print $ runState (tweakT1' 5 6) x > > My question here is: is it worth it? > > If I run > >> v1 = T2 1 1 >> v2 = v1 {f3 = 1} > > is the compiler smart enough to notice that the record update doesn't > result in a change, and avoid constructing an entirely new T2? > If so, then I think the original implementation TweakT1 would be > about as efficient as the more complicated TweakT1'. Otherwise, > I think the latter would be a lot more efficient when the state is > large and complex. > > Tim Newsham > http://www.thenewsh.com/~newsham/ > _______________________________________________ > Haskell-Cafe mailing list > Haskell-Cafe@haskell.org > http://www.haskell.org/mailman/listinfo/haskell-cafe _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe