On Fri, Aug 19, 2011 at 09:07:55AM +0200, Laurent Bercot wrote:
> > One of the big reasons I now consider threads one of the best (but not
> > always "the best", of course) approach to non-blocking operations is
> > that they're the only idiom that allows one to mechanically transform
> > blocking code into a non-blocking form.
> 
>  Multiprocessing does that too.

Multiprocessing can accomplish *some* of the same things, but not all,
and it has a lot more troublesome corner cases and the transformation
is not entirely mechanical or transparent. In particular:

- A library can't fork new processes without the calling application
  being aware of this and tip-toeing around it. You have to make sure
  neither consumes the wait/exit status wanted by the other, and there
  can be only one (process-wide) SIGCHLD handler (not to mention
  signals are evil).

- Unless you're sure you'll never use library code that uses threads,
  it's unsafe to keep the same program image after fork; you need to
  use exec. There are many reasons continuing in the child without
  calling exec may be dangerous if threads have been used, and
  although pthread_atfork was designed to address these issues, it
  suffers from unfixable conceptual flaws that make it nearly useless.

- Multi-process code must be able to handle the case where one or more
  of its processes terminates unexpectedly; restarting the terminated
  process can be nontrivial if the first process to discover it's
  missing is not the parent but one of its siblings or an unrelated
  process. Process ids are not meaningful except to the parent because
  there are race conditions where they can be reused. All of these
  issues disappear if you use threads instead, because (1) it's
  impossible to terminate a single thread; any asynchronous
  termination kills the whole process, and (2) there is no
  parent/child hierarchy among threads, and any thread in a process
  can join any joinable thread, not just ones it created.
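
To illustrate point (2), here is a minimal sketch, assuming POSIX
threads, in which a thread that did not create the worker joins it and
collects its result. The names worker, joiner, and join_from_sibling
are hypothetical, made up for this illustration only.

```c
#include <pthread.h>
#include <stdint.h>

/* Hypothetical worker: squares its argument and returns it as the
 * thread's exit value. */
static void *worker(void *arg)
{
    intptr_t n = (intptr_t)arg;
    return (void *)(n * n);
}

/* Hypothetical "sibling": joins a thread it did not create and passes
 * the worker's exit value along as its own. */
static void *joiner(void *arg)
{
    pthread_t target = *(pthread_t *)arg;
    void *result = NULL;

    if (pthread_join(target, &result) != 0)
        return (void *)(intptr_t)-1;
    return result;
}

/* Starts a worker and a sibling; the sibling, not the creator, joins
 * the worker. Returns the worker's result, or -1 on error. */
long join_from_sibling(long n)
{
    pthread_t w, j;
    void *result = NULL;

    if (pthread_create(&w, NULL, worker, (void *)(intptr_t)n) != 0)
        return -1;
    if (pthread_create(&j, NULL, joiner, &w) != 0) {
        pthread_join(w, NULL);
        return -1;
    }
    /* w stays valid here because we block until the sibling is done. */
    if (pthread_join(j, &result) != 0)
        return -1;
    return (long)(intptr_t)result;
}
```

With processes, the equivalent of joiner would have no reliable way to
wait for a sibling: only the parent can reap the child, and the pid
may already have been reused.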

Note that just using threads does not commit you to making heavy use
of memory sharing. You can still write your code very similarly to how
it would work with fork, except that all the ugly process handling
goes away, and your program's commit charge requirements go down.
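
As a rough sketch of that style, assuming POSIX threads and pipes, the
worker below talks to the rest of the program only through a pipe,
much as a forked child would, but there is no SIGCHLD, no wait status,
and no pid to track. The names producer and run_pipe_worker are
hypothetical.

```c
#include <pthread.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical worker: it is handed only the write end of a pipe, as
 * a forked child would be, and touches no other shared state. */
static void *producer(void *arg)
{
    int fd = *(int *)arg;
    static const char msg[] = "hello from worker";
    ssize_t n = write(fd, msg, sizeof msg - 1);

    (void)n;    /* a real program would handle short or failed writes */
    close(fd);  /* closing the write end signals EOF to the reader */
    return NULL;
}

/* Runs the worker and copies its output into buf (len must be > 0).
 * Returns the number of bytes read, or -1 if setup fails. */
ssize_t run_pipe_worker(char *buf, size_t len)
{
    int p[2];
    pthread_t t;
    ssize_t total = 0, n;

    if (pipe(p) != 0)
        return -1;
    if (pthread_create(&t, NULL, producer, &p[1]) != 0) {
        close(p[0]);
        close(p[1]);
        return -1;
    }
    /* Plain blocking read loop, same as reading from a child. */
    while ((size_t)total < len &&
           (n = read(p[0], buf + total, len - (size_t)total)) > 0)
        total += n;
    close(p[0]);
    pthread_join(t, NULL);  /* replaces waitpid and exit-status handling */
    return total;
}
```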

> > One of the most significant results of this observation in relation to
> > writing *new* code is the fact that, with threads, you can actually
> > use stdio FILEs for most or all of your io, even in programs dealing
> > with asynchronous events from many sources, reducing or eliminating
> > the need to roll your own buffering.
> 
>  From my point of view, you have it backwards, probably because of your
> dedication to standards.
>  stdio sucks, badly. Why WOULD you want to use it for most or all of
> your IO ? To me, your saying "writing threaded code allows me to use
> stdio" sounds like "by contorting myself, I can fit into that ill-tailored
> suit".

From my point of view, you have it backwards, probably because of your
dedication to non-thread-based event-driven IO. If your goal is to be
able to use something like select/poll, stdio sucks, yes. But
otherwise the only major flaw in stdio is the coarse-grained error
handling. I wouldn't use stdio for updating a database file where you
need full control over order of writes, atomicity, locking, ability to
back out operations that fail to complete, etc. but for the vast
majority of IO tasks, it's acceptable to treat any write failure as an
unrecoverable failure.

One area where stdio inherently beats any third-party library is that
you can write arbitrarily long formatted strings using fprintf without
the need for space to buffer the whole result. If you want a
printf-like function for your own buffering system, you need to
allocate space for the entire output, then use snprintf, and handle
the possibility of allocation failure. Or you can write your own
formatting routines, and then... welcome to NIH hell...
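
To make the comparison concrete, here is a sketch of the two
approaches in standard C99. format_alloc is a hypothetical helper of
the kind a custom buffering layer would need (measure, allocate,
format, and report allocation failure); emit_record is the
hypothetical stdio equivalent, which needs none of that.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper for a hand-rolled buffer library: formats into
 * freshly allocated memory sized to hold the entire output. Returns
 * NULL on formatting or allocation failure; the caller must free(). */
char *format_alloc(const char *fmt, ...)
{
    va_list ap, ap2;
    int len;
    char *buf;

    va_start(ap, fmt);
    va_copy(ap2, ap);
    len = vsnprintf(NULL, 0, fmt, ap);  /* first pass: measure only */
    va_end(ap);
    if (len < 0) {
        va_end(ap2);
        return NULL;
    }
    buf = malloc((size_t)len + 1);
    if (buf)
        vsnprintf(buf, (size_t)len + 1, fmt, ap2);  /* second pass */
    va_end(ap2);
    return buf;
}

/* stdio version: the output goes through the FILE's own buffer, so no
 * allocation proportional to the output length is needed here.
 * Returns 0 on success, -1 on write error. */
int emit_record(FILE *f, const char *name, int value)
{
    return fprintf(f, "%s=%d\n", name, value) < 0 ? -1 : 0;
}
```

Every caller of format_alloc has to check for NULL and free the
result; a caller of emit_record only has to check for a write error.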

>  I have written my own buffer library a long time ago, without even
> really thinking about asynchronism - just because I was fed up with
> stdio. And when I needed to perform asynchronous IO, lo and behold, unlike
> stdio, my buffer library was still working and usable. I have never come
> back to stdio since, and never had any regrets.

This works fine as long as all the code is your code. It doesn't work
so well when you're integrating code from multiple places. Good luck
getting the whole world to adopt your buffering library...

Rich
_______________________________________________
busybox mailing list
[email protected]
http://lists.busybox.net/mailman/listinfo/busybox
