My use case is the following: 10 to 60 independent IO computations, each taking about a quarter of a second or less, to be run on a machine with at most 2 to 4 CPUs.
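For reference, here is a minimal base-only sketch of what running such independent IO actions concurrently looks like — roughly what `mapConcurrently` from `async` does under the hood. `concurrentMap` is a hypothetical name, and unlike the real library it does no exception handling or cancellation.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM)

-- Fork one thread per action and collect results through MVars.
-- (No exception handling: if an action throws, takeMVar blocks forever.
-- async's mapConcurrently handles this properly.)
concurrentMap :: (a -> IO b) -> [a] -> IO [b]
concurrentMap f xs = do
  vars <- forM xs $ \x -> do
    v <- newEmptyMVar
    _ <- forkIO (f x >>= putMVar v)
    pure v
  mapM takeMVar vars

main :: IO ()
main = do
  rs <- concurrentMap (\n -> pure (n * 2)) [1 .. 10 :: Int]
  print rs  -- [2,4,6,8,10,12,14,16,18,20]
```

With only a few dozen short-lived actions, the per-thread overhead here is small; the interesting question is how the runtime schedules them across capabilities.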
What would be my best bet performance-wise?

- the usual sequential `mapM` (casual testing seems to suggest it is slower, but the difference is not that big either)
- `parallel-io` (casual testing seems to suggest it is the fastest option)
- `mapConcurrently` (for some reason always a bit slower than `parallel-io`)
- the `Par` monad or another form of "true parallelism" (my feeling is that it does not really fit the use case, because there aren't many CPU cores and the worker tasks don't take much time either)

As a note, not passing any `-Nx` RTS flag seems to offer the best speed on multithreaded CPUs. I guess only benchmarking would tell for sure, but in case one of you has some tips/advice on the matter ...

On Thu, Nov 5, 2015 at 5:21 PM, Pierre Radermecker <p.radermec...@gmail.com> wrote:
> There is also parallel-io. FWIW I have a project where `parallel-io`
> seems to give me better performance compared to `mapConcurrently` from
> async, though I was willing to believe the opposite, as I was about to
> replace the first with the second (`parallel-io` is an older project).
> I have never really understood why, nor played with more accurate
> benchmarking of it.
>
> I would be interested in hearing any practical experience on the
> subject. I guess the Par monad is really interesting when you have
> more free cores available for your application (is there a good lucky
> number here?)
>
> On Thu, Nov 5, 2015 at 2:22 PM, Michael Thompson
> <practical.wis...@gmail.com> wrote:
>> Are you thinking of regular pure parallelism, as with `parallel` or
>> `monad-par`, or of something fancier like the work-stealing example in the
>> pipes concurrency tutorial (which isn't itself appropriate here, I think,
>> since the order of events is important)?
>>
>> If you are thinking of pure parallelism, here is a flat-footed approach. In
>> choosing a batch size you would be surveying the whole producer, so you
>> can't think inside the pipeline.
>> You can first freeze each batch to a list
>> or something, say
>>
>>     batched :: Monad m => Int -> Producer a m x -> Producer [a] m x
>>     batched n p = L.purely folds L.list (view (chunksOf n) p)
>>
>> then resume piping with something like
>>
>>     >>> :t \n f p -> batched n p >-> P.mapM (runParIO . parMap f) >-> P.concat  -- or P.map (runPar . parMap f)
>>     \n f p -> batched n p >-> P.mapM (runParIO . parMap f) >-> P.concat
>>       :: NFData c =>
>>          Int -> (a -> c) -> Producer a IO r -> Producer c IO r
>>
>> The equivalent could be done with `async`. You'd have to think out whether
>> waiting to accumulate a batch and then processing it simultaneously before
>> continuing would be an improvement on processing blocks as they come.
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "Haskell Pipes" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to haskell-pipes+unsubscr...@googlegroups.com.
>> To post to this group, send email to haskell-pipes@googlegroups.com.
>
> --
> Pierre

--
Pierre
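The freeze-a-batch-then-run pattern discussed above can also be sketched without pipes, using only base. `batchedMap` and `concurrentMap` are hypothetical names, and exception handling is omitted (the real `async` library provides it); capping each batch at roughly the number of capabilities keeps the number of in-flight actions small on a 2-4 CPU machine.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM)

-- Fork one thread per action in a batch and collect the results.
-- (No exception handling; async's mapConcurrently does this properly.)
concurrentMap :: (a -> IO b) -> [a] -> IO [b]
concurrentMap f xs = do
  vars <- forM xs $ \x -> do
    v <- newEmptyMVar
    _ <- forkIO (f x >>= putMVar v)
    pure v
  mapM takeMVar vars

-- Split the input into batches of size n, like chunksOf in the pipes
-- example above, but on plain lists.
chunksOf :: Int -> [a] -> [[a]]
chunksOf _ [] = []
chunksOf n xs = let (h, t) = splitAt n xs in h : chunksOf n t

-- Freeze a batch, run it concurrently, then move on to the next batch,
-- so at most n actions are in flight at a time.
batchedMap :: Int -> (a -> IO b) -> [a] -> IO [b]
batchedMap n f xs = concat <$> mapM (concurrentMap f) (chunksOf n xs)

main :: IO ()
main = do
  rs <- batchedMap 4 (\x -> pure (x + 1)) [1 .. 10 :: Int]
  print rs  -- [2,3,4,5,6,7,8,9,10,11]
```

Whether batching beats simply launching everything at once is exactly the trade-off Michael raises: with only 10 to 60 quarter-second IO actions, the batch boundary forces the fastest actions to wait for the slowest in their batch.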