Send Beginners mailing list submissions to
        beginners@haskell.org

To subscribe or unsubscribe via the World Wide Web, visit
        http://mail.haskell.org/cgi-bin/mailman/listinfo/beginners
or, via email, send a message with subject or body 'help' to
        beginners-requ...@haskell.org

You can reach the person managing the list at
        beginners-ow...@haskell.org

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Beginners digest..."


Today's Topics:

   1. Re:  Parallel Processing in Libraries? (Sébastien Leblanc)


----------------------------------------------------------------------

Message: 1
Date: Sun, 19 Jan 2020 15:42:51 -0500
From: Sébastien Leblanc <s...@sebleblanc.net>
To: beginners@haskell.org
Subject: Re: [Haskell-beginners] Parallel Processing in Libraries?
Message-ID: <8b786165-cb63-de7f-43f0-33a0b083e...@sebleblanc.net>
Content-Type: text/plain; charset="utf-8"

> I'd like to have a library which utilized parallel programming (mostly for 
> map-reduce tasks). 

Parallel computation is a complex topic, and many of the available
solutions will not apply to a given problem. I find that the simplest
and most often overlooked way to implement parallel computation is to
break the problem down into chunks, each of which can easily be computed
by a single core.
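
As a rough illustration of the chunking idea, here is a minimal sketch
using only base. The names chunksOf, processChunk and
sumOfSquaresChunked are made up for this example, and the sum of squares
is just a stand-in for real per-chunk work. Nothing here runs in
parallel by itself; the point is only that each chunk is independent.

```haskell
import Data.List (foldl')

-- | Split a list into chunks of at most n elements (n must be positive).
chunksOf :: Int -> [a] -> [[a]]
chunksOf n xs
  | n <= 0    = error "chunksOf: chunk size must be positive"
  | null xs   = []
  | otherwise = let (h, t) = splitAt n xs in h : chunksOf n t

-- | The "map" step: work done on one chunk, independent of all others.
-- (Hypothetical workload: sum of squares.)
processChunk :: [Int] -> Int
processChunk = foldl' (\acc x -> acc + x * x) 0

-- | The "reduce" step: combine the per-chunk results.
sumOfSquaresChunked :: Int -> [Int] -> Int
sumOfSquaresChunked n = foldl' (+) 0 . map processChunk . chunksOf n
```

Because processChunk only looks at its own chunk, each call could be
handed to a separate worker without changing the result.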

Multi-threading is another topic entirely. While it is often related to
parallelization, it is also often used only to put more cores to work
(to improve performance) rather than to implement an algorithm that is
otherwise serial. You then have to deal with all the difficulties of
multi-threading: synchronization, shared memory access, thread-safe
code, and errors that can affect the whole process rather than a single
worker.
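
To make the synchronization point concrete, here is a small sketch
using forkIO and MVar from base. The function countWithThreads is a
hypothetical example, not something from the original question: several
threads increment one shared counter, and the MVar serializes access to
it.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar
  (modifyMVar_, newEmptyMVar, newMVar, putMVar, readMVar, takeMVar)
import Control.Monad (forM_, replicateM_)

-- | Start `workers` threads that each increment a shared counter
-- `perWorker` times, wait for all of them, and return the final count.
countWithThreads :: Int -> Int -> IO Int
countWithThreads workers perWorker = do
  counter <- newMVar (0 :: Int)   -- shared state, guarded by the MVar
  done    <- newEmptyMVar         -- each thread signals completion here
  forM_ [1 .. workers] $ \_ -> forkIO $ do
    -- modifyMVar_ takes the value, applies the update, and puts it back,
    -- so no two threads update the counter at the same time.
    replicateM_ perWorker (modifyMVar_ counter (\n -> pure $! n + 1))
    putMVar done ()
  replicateM_ workers (takeMVar done)  -- wait for every thread to finish
  readMVar counter
```

Note that if a thread died from an exception before its putMVar, the
waiting loop would never receive that signal, which is the kind of
error handling the paragraph above warns about.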

Thus, once you have broken the problem down into individual components,
one way to implement parallelization is with a task system built on a
message queue. The principle is pretty simple: using a message broker
such as RabbitMQ (or a brokerless messaging library such as ZeroMQ),
workers connect to the message queue and listen for messages from a
controller telling them to process some data using some function. First
come, first served. Once a worker is done, it sends the reply back to
the message queue or makes the result available by some other means (for
example, in a folder with shared access). The worker could then notify
other workers to further process this data.
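
The controller/worker pattern can be sketched inside a single process.
This is only an illustration under stated assumptions: an in-memory
Chan from base stands in for the message queue (it is not RabbitMQ or
ZeroMQ), threads stand in for worker processes, and Task, Reply, worker
and runController are hypothetical names.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.Chan (Chan, newChan, readChan, writeChan)
import Control.Exception (evaluate)
import Control.Monad (forM_, replicateM, replicateM_)
import Data.List (sortOn)

data Task  = Task Int [Int]   -- task id and the data to process
data Reply = Reply Int Int    -- task id and the computed result

-- | A worker takes tasks from the queue, first come, first served.
-- `Nothing` is a shutdown message from the controller.
worker :: Chan (Maybe Task) -> Chan Reply -> IO ()
worker tasks replies = loop
  where
    loop = do
      msg <- readChan tasks
      case msg of
        Nothing -> pure ()
        Just (Task i xs) -> do
          r <- evaluate (sum xs)       -- do the work in this worker
          writeChan replies (Reply i r)
          loop

-- | The controller enqueues one task per chunk, then collects replies.
-- Replies arrive in any order, so they are sorted by task id.
runController :: Int -> [[Int]] -> IO [(Int, Int)]
runController nWorkers chunks = do
  let n = max 1 nWorkers               -- at least one worker
  tasks   <- newChan
  replies <- newChan
  replicateM_ n (forkIO (worker tasks replies))
  forM_ (zip [0 ..] chunks) $ \(i, xs) -> writeChan tasks (Just (Task i xs))
  replicateM_ n (writeChan tasks Nothing)   -- one shutdown message each
  rs <- replicateM (length chunks) (readChan replies)
  pure (sortOn fst [(i, r) | Reply i r <- rs])
```

With a real broker the channels would be replaced by network queues and
the tasks would need to be serialized, but the shape of the controller
and worker loops would stay much the same.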

The elegance of this system lies in how flexible the actual hardware
architecture that processes the load can be. For example, using some
cloud provider, you can automatically spawn more VMs to process a higher
load of data, and discard the VMs once they have been idle long enough. On a
single machine, you can fire up one or as many processes as you have
computing cores (if your workload is almost 100% CPU-bound) and let the
OS take care of scheduling the tasks.
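
For the single-machine case, one possible way to choose the number of
worker processes is sketched below. getNumProcessors is a real function
from GHC.Conc in base; the sizing rule in workersFor and both function
names are assumptions made for this example, suited only to workloads
that are almost entirely CPU-bound.

```haskell
import GHC.Conc (getNumProcessors)

-- | Hypothetical sizing rule: never more workers than cores, never more
-- workers than pending tasks, and always at least one worker.
workersFor :: Int -> Int -> Int
workersFor cores pendingTasks = max 1 (min cores pendingTasks)

-- | Apply the rule to the machine this program is running on.
suggestedWorkers :: Int -> IO Int
suggestedWorkers pendingTasks = do
  cores <- getNumProcessors
  pure (workersFor cores pendingTasks)
```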

I am not aware of a complete task processing library for Haskell; for an
example of a mature project that provides such features, check out
Celery for Python. If you cannot find a substitute, I suppose you could
always use Celery and run your Haskell code from inside the Python
interpreter.
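
One way this could look on the Haskell side, assuming the task runner
starts the Haskell program as a subprocess and exchanges text over
stdin and stdout (an assumption, not something Celery prescribes). The
line format and the names handleLine and handleTasks are hypothetical.

```haskell
import Text.Read (readMaybe)

-- | Process one task: a line of whitespace-separated integers, reduced
-- to their sum. Unparseable input yields an error line instead of a crash.
handleLine :: String -> String
handleLine l = case mapM readMaybe (words l) :: Maybe [Int] of
  Just xs -> show (sum xs)
  Nothing -> "error: could not parse task"

-- | Process a whole input stream, one task per line.
-- Wrapping this as `main = interact handleTasks` would turn it into a
-- filter that an external task runner can call.
handleTasks :: String -> String
handleTasks = unlines . map handleLine . lines
```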




------------------------------

End of Beginners Digest, Vol 139, Issue 7
*****************************************
