Re: [Haskell] select(2) or poll(2)-like function?

Jeremy Gibbons Mon, 18 Apr 2011 05:05:10 -0700

Please can this discussion be moved to haskell-cafe?

  http://www.haskell.org/haskellwiki/Mailing_Lists


Ta.
Jeremy

On 18 Apr 2011, at 12:55, Mike Meyer wrote:

On Mon, 18 Apr 2011 12:56:39 +0200
Ertugrul Soeylemez <[email protected]> wrote:

Mike Meyer <[email protected]> wrote:

On Mon, 18 Apr 2011 11:07:58 +0200
Johan Tibell <[email protected]> wrote:

On Mon, Apr 18, 2011 at 9:13 AM, Mike Meyer <[email protected]> wrote:

I always looked at it the other way 'round: threading is a hack to
deal with system inadequacies like poor shared memory performance
or an inability to get events from critical file types.

Real processes and event-driven programming provide more robust,
understandable and scalable solutions.
<end rant>


We need to keep two things separate: threads as a way to achieve
concurrency and as a way to achieve parallelism [1].


Absolutely. Especially because you shouldn't have to deal with
concurrency if all you want is parallelism. Your reference [1] covers
why this is the case quite nicely (and is essentially the argument for
"understandable" in my claim above).


You also don't need Emacs/Vim, if all you want is to write a simple
plain text file. There is nothing wrong with concurrency, because you
are confusing the high level model with the low level implementation.
Concurrency is nothing but a design pattern, and GHC shows that a high
level design pattern can be mapped to efficient low level code.


Possibly true. The question is - can it be mapped to a design that's
as robust and scalable as the ones I'm used to working on?

In Haskell you should not use explicit, manual OS threading/forking for
the same reason you shouldn't write machine code manually.


That's a good thing - providing it doesn't compromise robustness and
scalability.

It's useful to use non-determinism (i.e. concurrency) to model a
server processing multiple requests. Since requests are independent
and shouldn't impact each other we'd like to model them as
such. This implies some level of concurrency (whether using threads
or processes).


But because the requests are independent, you don't need concurrency
in this case - parallelism is sufficient.

Perhaps Haskell is the wrong language for you. How about programming in
C/C++?  I think you want more control over low level resources than
Haskell gives you. But I suggest having a closer look at concurrency.


Personally, I don't want to have to worry about low-level resources,
or even concurrency. Having to do so feels too much like having to
explicitly allocate and free memory, or worry about register
allocations. But if I have to do those things to get robustness and
scalability until the languages start being able to deal with it, then
I need the RTS to get out of the way and let me do my job.

If I'm using a value that needs protection from concurrent access
without providing that protection, I want the system to give me an
error. At run-time is acceptable, but compile time is better. I want
the system to make sure the concurrent protection mechanisms work
properly - no deadlocks, no stuck process, etc - without my having to
do anything but indicate which values need such protection.
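
For illustration, a minimal sketch of how GHC's STM comes close to this wish, assuming the stm package that ships with GHC. The names increment and countConcurrently are hypothetical. A TVar can only be read or written inside an STM transaction, so unprotected access is rejected at compile time as a type error. There are no locks to take in the wrong order, although a transaction that waits on a condition that never becomes true can still block.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Concurrent.STM (TVar, atomically, modifyTVar', newTVarIO, readTVarIO)
import Control.Monad (replicateM, replicateM_)

-- Add one to the shared counter. The TVar type forces every access to go
-- through STM, and `atomically` runs the update as one transaction.
increment :: TVar Int -> IO ()
increment counter = atomically (modifyTVar' counter (+ 1))

-- Start `workers` threads that each increment the counter `perWorker`
-- times, wait for all of them, and return the final value.
countConcurrently :: Int -> Int -> IO Int
countConcurrently workers perWorker = do
  counter <- newTVarIO 0
  dones <- replicateM workers $ do
    done <- newEmptyMVar
    _ <- forkIO (replicateM_ perWorker (increment counter) >> putMVar done ())
    pure done
  mapM_ takeMVar dones
  readTVarIO counter

main :: IO ()
main = countConcurrently 8 1000 >>= print
```

No update is lost here, so the program prints 8000 however the threads are scheduled.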

The unix process model works quite well. Compared to a threaded model,
this is more robust (if a process breaks, you can kill and restart it
without affecting other processes, whereas if a thread breaks,
restarting the process and all the threads in it is the only safe
option) and scalable (you're already doing ipc, so moving processes
onto more systems is easy, and trivial if you design for it). The
events handled by a single process are simple enough that your
callback/event spaghetti can line up in nice, straight strands.

When writing concurrent code you don't care about how the RTS maps it to
processes and threads.  GHC chose threads, probably because they are
faster to create/kill and consume less memory.  But this is an
implementation detail the Haskell developer should not have to worry
about.


So - what happens when a thread fails for some reason? I'm used to
dealing with systems that run 7x24 for weeks or even months on
end. Hardware hiccups, network failures, bogus input, hung clients,
etc. are all just facts of life. I need the system to keep running
properly in the face of all those, and I need them to disrupt the
world as little as possible.

Given that the RTS has taken control over this stuff, I sort of expect
it to take care of noticing a dead process and restarting it as
well. All of which is fine by me.
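
For illustration, a minimal sketch of restarting a failed Haskell thread by hand, assuming only GHC's base library. To my knowledge the RTS does not restart a dead thread on its own: an uncaught exception ends that one thread and the others keep running, so any supervision has to be written in the program. The names supervise and flakyWorker are hypothetical.

```haskell
import Control.Concurrent (forkFinally)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Exception (SomeException)
import Data.IORef (atomicModifyIORef', newIORef)

-- Run a worker in its own Haskell thread; if it dies with an exception,
-- start a fresh one, up to a retry limit. Returns Right on success, or
-- Left with the last exception once the retries are exhausted.
supervise :: Int -> IO a -> IO (Either SomeException a)
supervise retries worker = do
  done <- newEmptyMVar
  _ <- forkFinally worker (putMVar done)
  outcome <- takeMVar done
  case outcome of
    Left _ | retries > 0 -> supervise (retries - 1) worker
    _ -> pure outcome

-- Hypothetical flaky worker: fails on its first two runs, then succeeds
-- and returns the number of the successful attempt.
flakyWorker :: IO (IO Int)
flakyWorker = do
  attempts <- newIORef (0 :: Int)
  pure $ do
    n <- atomicModifyIORef' attempts (\k -> (k + 1, k + 1))
    if n < 3
      then ioError (userError ("attempt " ++ show n ++ " failed"))
      else pure n

main :: IO ()
main = do
  worker <- flakyWorker
  result <- supervise 5 worker
  putStrLn (either (\e -> "gave up: " ++ show e)
                   (\n -> "succeeded on attempt " ++ show n)
                   result)
```

This only covers failures that surface as Haskell exceptions. A crash of the whole process (for example in foreign code) still takes every thread with it, which is the robustness gap relative to separate unix processes.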

We don't need to do this. We can keep a concurrent programming model
and get the execution efficiency of an event driven model. This is
what GHC's I/O manager achieves. On top of that we also get
parallelism for free. Another way to look at it is that GHC provides
the scheduler (using a thread for the event loop and a separate
worker pool) that you end up writing manually in event driven
frameworks.
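
For illustration, a minimal sketch of the thread-per-request style described here, assuming only GHC's base library. The names handleRequest and serveAll are hypothetical, and threadDelay merely stands in for a blocking read or write on a socket. Each request gets its own Haskell thread, and a blocked Haskell thread does not tie up an OS thread.

```haskell
import Control.Concurrent (forkIO, threadDelay)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM)

-- Hypothetical request handler: blocks briefly (standing in for I/O),
-- then produces a reply.
handleRequest :: Int -> IO Int
handleRequest n = do
  threadDelay 10000  -- 10 ms; the RTS runs other threads meanwhile
  pure (n * 2)

-- Start one Haskell thread per request and collect the replies in
-- request order.
serveAll :: [Int] -> IO [Int]
serveAll requests = do
  slots <- forM requests $ \n -> do
    slot <- newEmptyMVar
    _ <- forkIO (handleRequest n >>= putMVar slot)
    pure slot
  mapM takeMVar slots

main :: IO ()
main = do
  replies <- serveAll [1 .. 10000]
  print (sum replies)
```

The handler is written as straight-line blocking code, with no callbacks, yet ten thousand of them are in flight at once.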


So my question is - can I still get the robustness/scalability
features I get from the unix process model using haskell? In
particular, it seems like ghc starts threads I don't ask it to, and
using both threads & forks for parallelism causes even more headaches
than concurrency (at least on unix & unix-like systems), so just
replicating the process model won't work well. Do any of the haskell
parallel processing tools work across multiple systems?


Effectively no (unless you want to use the terribly outdated GPH
project), but that's a shortcoming of the current RTS, not of the design
patterns you use in Haskell. By design Haskell programs are well suited
for an auto-distributing RTS.  It's just that no such RTS exists for
recent versions of the common compilers.


So is anyone working on such a package for haskell? I know clojure's
got some people working on making STM work in a distributed
environment, but that's outside the goals of the core team.

In other words: Robustness and scalability should not be your business
in Haskell.  You should concentrate on understanding and using the
concurrency concept well.  And just to encourage you:  I write
productive concurrent servers in Haskell, which scale very well and
probably better than an equivalent C implementation would.  Reason: A
Haskell thread is not mapped to an operating system thread (unless you
used forkOS).  When it is advantageous, the RTS can well decide to let
another OS thread continue a running Haskell thread.  That way the
active OS threads are always utilized as efficiently as possible.  It
would be a pain to get something like that with explicit threading and
even more, when using processes.


Well, *someone* has to worry about robustness and scalability. Users
notice when their two minute system builds start taking four minutes
(and will be at my door wanting me to fix it) because something didn't
scale fast enough, or have to be run more than once because a failing
component build wasn't restarted properly. I'm willing to believe that
haskell lets you write more scalable code than C, but C's tools for
handling concurrency suck, so that should be true in any language
where someone actually thought about dealing with concurrency beyond
locks and protected methods. The problem is, the only language I've
found where that's true that *also* has reasonable tools to deal with
scaling beyond a single system is Eiffel (which apparently abstracts
things even further than haskell - details like how concurrency is
achieved or how many concurrent operations you can have are configured
when you start an application, *not* when writing it). Unfortunately,
Eiffel has other problems that make it undesirable.

That's why the RTS lets you choose the number of OS threads only instead
of giving you low level control over the threads.  It spawns as many
threads as you ask it to spawn and manages them with its own strategy.
The only way to manipulate this strategy is by deciding whether a
particular Haskell thread is bound (forkOS) or not (forkIO).
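
For illustration, a minimal sketch of the forkIO/forkOS distinction, assuming only GHC's base library. The helper boundIn is a hypothetical name. The count reported by getNumCapabilities is what the +RTS -N flag sets on a program linked with -threaded, and forkOS is only usable with that threaded RTS, so the sketch checks rtsSupportsBoundThreads first.

```haskell
import Control.Concurrent
  ( ThreadId, forkIO, forkOS, getNumCapabilities, isCurrentThreadBound
  , newEmptyMVar, putMVar, rtsSupportsBoundThreads, takeMVar )

-- Run a check in a new thread created by the given fork function and
-- report whether that thread is bound to an OS thread.
boundIn :: (IO () -> IO ThreadId) -> IO Bool
boundIn fork = do
  result <- newEmptyMVar
  _ <- fork (isCurrentThreadBound >>= putMVar result)
  takeMVar result

main :: IO ()
main = do
  caps <- getNumCapabilities
  putStrLn ("capabilities (set with +RTS -N): " ++ show caps)
  unbound <- boundIn forkIO
  putStrLn ("forkIO thread bound? " ++ show unbound)
  if rtsSupportsBoundThreads
    then do
      bound <- boundIn forkOS
      putStrLn ("forkOS thread bound? " ++ show bound)
    else putStrLn "non-threaded RTS: forkOS is unavailable"
```

Built with ghc -threaded and run with +RTS -N4, the first line reports four capabilities. The program itself never names an individual OS thread.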


Does the programmer have to worry about such trivia as the number of
threads to use?

    <mike
--
Mike Meyer <[email protected]>               http://www.mired.org/consulting.html

Independent Software developer/SCM consultant, email for more information.


O< ascii ribbon campaign - stop html mail - www.asciiribbon.org

_______________________________________________
Haskell mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/haskell


[email protected]
  Oxford University Computing Laboratory,    TEL: +44 1865 283508
  Wolfson Building, Parks Road,              FAX: +44 1865 283531
  Oxford OX1 3QD, UK.
  URL: http://www.comlab.ox.ac.uk/oucl/people/jeremy.gibbons.html
