Re: [sqlite] Multi-threading.

Paul G Sat, 16 Jul 2005 07:26:21 -0700

----- Original Message ----- 
From: "Kervin L. Pierre" <[EMAIL PROTECTED]>
To: <sqlite-users@sqlite.org>
Sent: Saturday, July 16, 2005 9:31 AM
Subject: Re: [sqlite] Multi-threading.

> Paul G wrote:
> >
> > richard's advice is solid. use async io/event loops if possible, separate
> > processes if possible, threads as a last resort, in that order. the grey
> > area is the 'if possible' test, since it's a multi-way tradeoff between
> > performance, simplicity and provable (to an extent) correctness. i fully
> > expect that a lot of folks *do* need to use threads and the probability of
> > that being the case on windows is much higher than on posixish platforms.
> >
>
> I agree with you, but it doesn't seem like
> you're exactly concurring with what DRH
> said though.

right, i'm not concurring with *everything* he wrote. eg. structured
programming wasn't meant to reduce buggage directly, but rather to improve
maintainability of code (which has the indirect effect of reducing buggage).
however, his main point that arbitrary use of a shared address space with
concurrent access is a bit like drinking and skydiving (ie a bad idea which
you come to regret very fast but never get the chance to regret for very
long ;) is indeed very cogent.

the key is understanding that nothing is ever black and white. the more
restrictions you place on usage of shared data structures, the fewer
concurrency-related issues you are going to have, but take it far enough and
you've got the equivalent of ipc api-wise.

> I'm guessing that that 'if possible' test
> is huge, and very frequently will fail.

right. event loops are great for i/o heavy apps, but you will still need to
farm out compute heavy units of work to a thread pool (this is a very common
pattern), not to mention the fact that a lot of external dependencies (eg db
client libs) do not provide an async api. separate processes are fine for a
certain range of tasks, but as soon as you need to move a lot of data back
and forth or spawn a lot of them, your performance tanks. threads have
neither of the above issues, but you then need to figure out why the thing
crashes when jupiter is in sagittarius and the moon is full (hint: many
threadsafe libs aren't, pain invariably ensues ;).
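
a minimal sketch of the "event loop, with compute heavy work farmed out to a
thread pool" pattern described above, assuming python's asyncio and
concurrent.futures. the names heavy_digest, handle_request and main are
hypothetical and only there to show the shape, not a definitive
implementation:

```python
import asyncio
import hashlib
from concurrent.futures import ThreadPoolExecutor


def heavy_digest(payload: bytes, rounds: int = 1000) -> str:
    """Stand-in for a compute heavy unit of work (repeated hashing)."""
    digest = payload
    for _ in range(rounds):
        digest = hashlib.sha256(digest).digest()
    return digest.hex()


async def handle_request(pool: ThreadPoolExecutor, payload: bytes) -> str:
    # The coroutine yields here, so the event loop can keep servicing other
    # events while a pool thread runs the blocking function.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(pool, heavy_digest, payload)


async def main(payloads):
    with ThreadPoolExecutor(max_workers=4) as pool:
        # Results come back in the same order as the payloads.
        return await asyncio.gather(
            *(handle_request(pool, p) for p in payloads)
        )


if __name__ == "__main__":
    for hex_digest in asyncio.run(main([b"a", b"b", b"c"])):
        print(hex_digest)
```

the same run_in_executor call is one way to wrap a blocking client library
that has no async api. note that in cpython the global interpreter lock limits
how much cpu-bound work threads can actually run in parallel, so this shows
the structure rather than a speedup.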

when you're making the call on which way to go, you invariably consider all
the tradeoffs and go with the least worst of your options given your
constraints (which model fits the largest part of the app best, things you
need to interface with, capabilities of your team, deadlines, whether you're
going to still be there to take the flogging when the fit hits the shan
etc).

> Why suffer the context switch when you don't
> have to?  Would you write a non-trivial GUI
> program today without using threads?  Why
> subject the users to the added latency IPC
> is going to introduce?

if the language i'm writing it in has a good async io framework, the app is
not compute bound and the windowing toolkit supports it, most definitely.
consider the fact that the gui handling is (in most modern windowing
toolkits) an event loop and compute heavy tasks are farmed off to a thread
pool and you'll start to see the big picture.

> The funny thing is eventually multi-process
> apps go to MMap for IPC, for performance, and
> then run into a lot of the same issues they'd
> have to deal with if they were threaded.

performance must suck enough for this to happen; this is not always the
case. beyond that, there is a big difference between taking a shared address
space for granted and being forced to think about access to shared data
structures from a psychological point of view. i must confess, however, that
i usually oscillate between the pure event loop, event loop + task queue +
thread pool and pure multithreading for the things i write (no separate
processes included here, with the hybrid model being the most common by
far), but that is down to the things i usually need to do and the fact that
i generally trust myself to do a decent job of fencing my shared data access
given the relative rigidity of the task queue + thread pool pattern.
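
a rough sketch of the task queue + thread pool pattern, assuming python's
queue and threading modules. worker and run_pool are hypothetical names. the
point is the rigidity: the only data the threads share are the two queues,
which do their own locking, so there is nothing else to fence:

```python
import queue
import threading


def worker(tasks: queue.Queue, results: queue.Queue) -> None:
    """Pull work items until a None sentinel arrives."""
    while True:
        item = tasks.get()
        if item is None:
            break
        # The "work" here is just squaring; results leave via a queue,
        # never via a shared mutable structure.
        results.put((item, item * item))


def run_pool(values, n_workers: int = 3) -> dict:
    tasks: queue.Queue = queue.Queue()
    results: queue.Queue = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(tasks, results))
        for _ in range(n_workers)
    ]
    for t in threads:
        t.start()
    for v in values:
        tasks.put(v)
    for _ in threads:
        tasks.put(None)  # one sentinel per worker
    for t in threads:
        t.join()
    # All producers have finished, so draining until empty is safe here.
    out = {}
    while not results.empty():
        key, value = results.get()
        out[key] = value
    return out


if __name__ == "__main__":
    print(run_pool(range(5)))
```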

> And as far as the 'threads introduce
> difficult to debug' errors; Isn't that the
> age-old performance versus complexity trade-
> off?

not necessarily. i routinely see multithreaded applications which would have
a simpler and more performant event loop implementation. this is a bit of a
contradiction given my previous statement that the two are isomorphic, but
in the real world mt apps are often suboptimally implemented due to the
complexity of an optimal implementation. iow, given locking which is fine
grained enough, the overhead of locking shouldn't differ significantly from
the overhead of the event loop, but locking which is fine grained enough is
too tough and you see lock contention (very hard to debug) and
inefficiencies elsewhere because the deadline has come and passed while
everyone was debugging elusive random data corruption/memory
leak/what-have-you somewhere. i'm sure you've seen apps which are so complex
by the time they're feature complete that no one dares to muck with them to
optimize them.
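
to make "locking which is fine grained enough" a bit more concrete, here is a
small sketch assuming python's threading module. StripedCounter is a
hypothetical class: instead of one global lock around a shared table, keys are
hashed onto several independent locks, so threads touching different stripes
do not contend. it illustrates the idea only and says nothing about how any
particular app does it:

```python
import threading


class StripedCounter:
    """Per-key counters guarded by one lock per stripe, not one global lock."""

    def __init__(self, stripes: int = 8) -> None:
        self._locks = [threading.Lock() for _ in range(stripes)]
        self._counts = [dict() for _ in range(stripes)]

    def _stripe(self, key) -> int:
        return hash(key) % len(self._locks)

    def incr(self, key) -> None:
        i = self._stripe(key)
        # Only threads hitting the same stripe wait on each other.
        with self._locks[i]:
            self._counts[i][key] = self._counts[i].get(key, 0) + 1

    def get(self, key) -> int:
        i = self._stripe(key)
        with self._locks[i]:
            return self._counts[i].get(key, 0)


def hammer(counter: StripedCounter, n_threads: int = 4, n_incr: int = 1000):
    """Increment keys 0..9 from several threads at once."""
    def bump() -> None:
        for n in range(n_incr):
            counter.incr(n % 10)

    threads = [threading.Thread(target=bump) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


if __name__ == "__main__":
    c = StripedCounter()
    hammer(c)
    print(sum(c.get(k) for k in range(10)))
```

the cost is visible even in a toy: every access has to pick the right lock,
and anything that needs a consistent view across stripes needs all of them,
which is where the hard to debug cases start.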

> Processes are easier to use, but very often
> perform worse under the same conditions as
> the more complex threaded application.  That
> is a fact many of us can not look past easily.

agreed, for certain types of work this is the case. all about tradeoffs
really - do you want an intermittently crashing fast app or a solid slouchy
app? sometimes you just want something that works, as soon as possible. at
other times, certain performance levels are a hard requirement and you just
have to make sure you write your mt code right, debug it till it's fixed or
die trying ;) in yet another situation (mtas are a good example), you just
don't need to do that much ipc for the performance hit to be anything but
negligible.

> PS. It's funny, this discussion seems like it's
> been taken right from an early '90s newsgroup :)

ahh, i was busy coding in asm back then ;) being serious again for a moment,
i don't feel rehashing this over and over is particularly valuable -
everyone's point of view is likely to be derived from their experience and
trying to agree on an objective view of parallel realities has never seemed
like a worthwhile endeavour to me. personally, i'm just happy i've got tools
which let me forget about all this complexity most of the time these days -
hlls are now practical for the majority of things i need to do, thanks to
mr. moore ;)

-p
