Yeah, feel free to reproduce it anywhere you like > On Jun 12, 2017, at 3:40 AM, [email protected] wrote: > > Hello, Gabriel! Thank a lot for such detailed explanation. It's so good that > I want to save it in my blog (with author name sure), but only if you do not > mind and allow me to do it. > > Best regards, > Paul > > понедельник, 12 июня 2017 г., 6:23:29 UTC+3 пользователь Gabriel Gonzalez > написал: > I think it's important to distinguish between two separate concepts: "fusion" > vs "one-pass". "Fusion" refers to avoiding the allocation of intermediate > data structures when you transform a stream multiple times whereas "one-pass" > means that you don't traverse the sequence of elements twice when you > transform the stream multiple times (i.e. you go over the stream in one > pass). You can have a "one-pass" implementation without "fusion" but you > cannot have "fusion" without a "one-pass" implementation. > > "Fusion" is purely an optimization, meaning that whether or not an > implementation uses "fusion" only affects your program's performance but > won't affect its behavior. However, "one-pass" is not just an optimization: > one-pass versus multiple pass changes the behavior of your program, > especially once your stream has effects like in these streaming libraries. > > Out of the two properties, "one-pass" is *much* more important. The reason > why is that "one-pass" ensures that certain functor laws hold. To see why, > let's consider a case where they *don't* hold, which is `Data.List.mapM`. > Normally, you mtigh expect the following functor laws to hold: > > Data.List.mapM (f <=< g) = Data.List.mapM f . Data.List.mapM g > > Data.List.mapM return = id > > However, the above two laws don't actually hold for `Data.List.mapM`. For > example, the first law does not hold because the order of effects are not the > same for the left-hand and right-hand sides of the equation. The left-hand > side of the equation interleaves the effects of `f` and `g` whereas the > right-hand side runs all of `g`'s effects first followed by all of `f`'s > effects. The second equation is also wrong because `Data.List.mapM` > misbehaves on infinite lists: > > Data.List.mapM return (repeat x) = _|_ > > id (repeat x) = repeat x > > This is the root of why `Data.List.mapM` is "bad" > > However, the streaming libraries have their own versions of `mapM` which do > obey the above functor laws. For example, if you take the `list-transformer` > library and define: > > mapM :: (a -> m b) -> ListT m a -> ListT m b > mapM f as = do > a <- as > lift (f a) > > ... then this *does* obey the following functor laws: > > mapM (f <=< g) = mapM f . mapM g > > mapM return = id > > For the first equation, both sides of the equation interleave the effects of > `f` and `g`. For the second equation, both sides of the equation behave > correctly on infinite `ListT` streams. These functor laws hold because > `ListT` is has the "one-pass" property. > > So to answer your question: it's not exactly the Haskell `Functor` type class > per se that is important here, but the functor laws are important (for a more > general notion of functor) in establishing why a single pass implementation > matters. > > > On Wed, Jun 7, 2017 at 12:02 PM, aqu... <[email protected] <javascript:>> > wrote: > > They will compile successive transformations (i.e. multiple > > maps/mapMs/filters, for example) into a single pass over the data. > > This is true for the > > `streaming`/`pipes`/`list-transformer`/`logict`/`conduit`/`turtle` > > libraries or any "ListT-done-right" implementation. All of these > > libraries do not rely on any special GHC optimizations or rewrite > > rules. They will still go over the data in one pass even if you > > disable all optimizations. > > So, this is so, because of implementation of their Functor and > Applicative instances? Am I right, that there occurs this combination > ("single pass")? > > But what about Data.List and Prelude's `map` (in GHC.Base)? If I'll > do > > map fn1 (map fn2 lst) > > ? > > > > > This is *not* true for `Control.Monad.mapM` or the `ListT` type from > > `transformers` (a.k.a. "ListT done wrong"), which is why I wanted to > > make sure which `mapM` you were talking about. > > OK, I got it. > > > > > > > > On Jun 7, 2017, at 12:32 AM, [email protected] <javascript:> wrote: > > > > > > Hello, Gabriel! > > > > > > Real module (which is compilable) is big, so I'll show on email's > > > top extraction from it (little snippets, with imports and > > > functions) and will add compilable module (I used it for > > > experiments) at the end of the email - because it's not very > > > small :) So, compilable code is on the bottom of the mail. > > > > > > Snippets: > > > > > > ... > > > > > > import Streaming > > > import qualified Streaming.Prelude as S > > > > > > ... > > > > > > (|>) = flip ($) > > > > > > ... > > > > > > flow = items > > > |> S.mapM (liftIO . doRequest connection) > > > |> S.map (liftM fun1) > > > |> S.zip fun2 > > > |> S.filter filter1 > > > |> S.mapM fun3 > > > > > > ... > > > > > > So, I supposed that serial "map"s/"zip"s/"filter"s will be compiled > > > into one loop with serial application of functions-arguments of > > > that "map"s/"zip"s/"filter"s. My Python background "says" me that > > > maps/filters/etc are types, not functions, so they are > > > "combinatorable": I mean no problem to combine serial flat > > > iterations into one iteration. I am newbie in Haskell and I don't > > > know is it true for Haskell... may be it should happens on > > > optimization phase of compilation, I don't know. > > > > > > Does it happen automatically, on GHC side, or developer should take > > > special actions when works with Streaming/Pipes/Conduits? Is it the > > > same for usual maps/filters from Prelude/Data.List? I supposed 99% > > > that GHC reduces ASTs then optimizes the result and we get at the > > > end flat simple iterations and so on, and developer should not > > > think about such things totally. But I had a doubt crept in, so I > > > decided to ask... > > > > > > Another example is: > > > > > > where flow = getItems connection > > > |> S.filter getInteresting > > > |> S.map fun1 > > > |> fixItems fun2 > > > |> S.mapM fun3 > > > |> S.mapM fun4 > > > ... > > > > > > where fixItems iterates over stream's items and "yields" them with > > > S.yield, like this: > > > > > > ... > > > > > > import qualified Control.Monad.Trans.State as > > > ST import qualified Control.Monad.Trans.Writer as W > > > > > > ... > > > > > > fixItems :: Monad m => FixItem e ps st -> Stream (Of e) (ST.StateT > > > st m) Result -> Stream (Of e) (ST.StateT st m) Result fixItems fi = > > > loop where loop str = do > > > e <- lift $ S.next str > > > e' <- case e of > > > Left err -> return err > > > Right (e', str') -> > > > let (fixes, problems) = W.runWriter $ fixItem fi e' > > > fix = mconcat fixes > > > in (lift $ ST.modify $ reportProblems fi problems e') > > > >> case fix of > > > ItemFix ff -> (S.yield $ ff e') > > > ItemSkip -> pure () > > > >> loop str' > > > return NoResult -- FIXME how to return stream result? > > > ... > > > > > > > > > Next is compilable module. I experimented with "fixing" of stream's > > > items; code is terrible, I suppose, but it was experiment ;): > > > > > > {-# LANGUAGE FlexibleContexts #-} > > > module Main where > > > > > > import Control.Monad > > > import Control.Monad.Trans.Reader > > > import Control.Monad.Trans.State > > > import Control.Monad.Trans.Writer > > > import Data.Functor.Identity > > > import Data.Traversable > > > import Streaming > > > import qualified Streaming.Prelude as S > > > > > > > > > type M = StateT [String] IO > > > > > > subgen :: Int -> S.Stream (S.Of Int) M Int > > > subgen n = > > > if odd n then do { return 0 } > > > else do > > > S.yield (n*100) > > > S.yield (n*1000) > > > lift $ modify (++["!!!"]) > > > return 0 > > > > > > gen :: S.Stream (S.Of Int) M Int > > > gen = do > > > S.yield 0; S.yield 100; S.yield 101; S.yield 102; S.yield 103; > > > S.yield 104; S.yield 105; S.yield 106 liftIO $ putStrLn "enter x: " > > > x <- liftIO getLine > > > let n = read x::Int > > > S.yield n > > > lift $ modify (++["gen1"]) > > > lift $ modify (++["gen2"]) > > > return 0 -- $ do > > > > > > proc1 :: S.Stream (S.Of Int) M Int -> S.Stream (S.Of Int) M Int > > > proc1 str = do > > > st <- lift get > > > loop str st > > > where > > > loop str st = do > > > e <- lift $ S.next str > > > e' <- case e of > > > Left err -> return err > > > Right (e', str') -> > > > (if e' == 100 then (lift $ put $ st ++ ["proc1"]) else > > > pure ()) >> subgen e' >> (S.yield $ e' + 123) >> loop str' st > > > return 1 > > > > > > data Cr = Cr { > > > crName :: String > > > , crAge :: Int > > > } deriving Show > > > > > > data FixItem a = FixItem (a -> a) | SkipItem > > > instance Monoid (FixItem a) where > > > mempty = FixItem id > > > mappend SkipItem _ = SkipItem > > > mappend _ SkipItem = SkipItem > > > mappend (FixItem f) (FixItem g) = FixItem (f . g) > > > > > > fixWhen :: Monad m => Bool -> m (FixItem a) -> m (FixItem a) > > > fixWhen cond act = if cond then act else return $ FixItem id > > > > > > fixcr1 :: Int -> Writer [String] (FixItem Int) > > > fixcr1 n = > > > let (fixes, errs) = runWriter $ sequence [ > > > fixWhen (n == 100) (tell ["panic:"++show n++"==100"] >> > > > return SkipItem) , fixWhen (n > 100) (tell ["err1:"++show n > > > ++">100"] >> return (FixItem (10+))) , fixWhen (n > 1) (tell > > > ["err2:"++show n++">1"] >> return (FixItem (100+))) ] > > > in > > > writer (mconcat fixes, errs) > > > > > > fixItems :: (Monad m, Num a) => Stream (Of Int) (StateT [String] m) > > > a -> Stream (Of Int) (StateT [String] m) a fixItems = loop where > > > loop str = do > > > e <- lift $ S.next str > > > e' <- case e of > > > Left err -> return err > > > Right (e', str') -> > > > let (fix, errs) = runWriter $ fixcr1 e' > > > in (lift $ modify (++errs)) > > > >> case fix of > > > FixItem ff -> (S.yield $ ff e') > > > SkipItem -> pure () > > > >> loop str' > > > return 1 > > > > > > > > > (|>) = flip ($) > > > > > > main :: IO () > > > main = do > > > p <- runStateT flow [] > > > print p > > > print "end." > > > where flow = gen > > > |> fixItems > > > |> S.map (+100) > > > |> S.mapM_ (liftIO . print) > > > > > > and to build it I included in cabal 2 dependencies: > > > ... > > > , transformers >= 0.5 > > > , streaming > > > ... > > > > > > > > > /Best regards, Paul > > > > > > > > > Could you provide an example that compiles, including imports? The > > > reason I ask is that I'm not sure which mapM that you are using > > >> On Jun 6, 2017, at 1:53 AM, aqu...@ <>gmail.com <http://gmail.com/> > > >> <http://gmail.com/ <http://gmail.com/>> wrote: > > >> > > >> Hello, everyone. Because I'm newbie, my question may seem naive. > > >> but: I'm doing something like: > > >> do > > >> str' <- S.mapM fun1 str > > >> str'' <- S.mapM fun2 str' > > >> -- so on > > >> and I suppose that real iteration (if I use `mapM` or `iterM`, map > > >> other functions of streaming) happens only one and resulting code > > >> after compilation will looks like for item in str: item' = fun1 > > >> item item'' = fun2 item' > > >> -- so on > > >> Am I right? Or are there some pitfalls here which we should > > >> remember to accomplish such result? > > >> > > >> > > >> /Best regards, Paul > > >> > > >> > > >> > > >> > > >> -- > > >> You received this message because you are subscribed to the Google > > >> Groups "Haskell Pipes" group. To unsubscribe from this group and > > >> stop receiving emails from it, send an email to haskell-pipe...@ > > >> <>googlegroups.com <http://googlegroups.com/> <http://googlegroups.com/ > > >> <http://googlegroups.com/>>. To post to this > > >> group, send email to haskel...@ <>googlegroups.com > > >> <http://googlegroups.com/> > > >> <http://googlegroups.com/ <http://googlegroups.com/>>. > > > > > > > > > -- > > > You received this message because you are subscribed to the Google > > > Groups "Haskell Pipes" group. To unsubscribe from this group and > > > stop receiving emails from it, send an email to > > > [email protected] <javascript:> > > > <mailto:[email protected] <javascript:>>. To > > > post to > > > this group, send email to [email protected] <javascript:> > > > <mailto:[email protected] <javascript:>>. > > > > > > -- > Best regards, > Paul a.k.a. 6apcyk > > -- > You received this message because you are subscribed to the Google Groups > "Haskell Pipes" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to [email protected] <javascript:>. > To post to this group, send email to [email protected] <javascript:>. > > > -- > You received this message because you are subscribed to the Google Groups > "Haskell Pipes" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to [email protected] > <mailto:[email protected]>. > To post to this group, send email to [email protected] > <mailto:[email protected]>.
-- You received this message because you are subscribed to the Google Groups "Haskell Pipes" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To post to this group, send email to [email protected].
