I think we should sample a range of parallel problems so we can
benchmark implementations based on some objective criteria.
Concurrency problems exist on a scale that ranges from
"embarrassingly parallel" to "inherently sequential" and can be
benchmarked by Amdahl's Law: Some tasks can not be computed in
parallel, and the minimum time to compute a parallel task depends on
the maximal serially computed component. This is an ideal that will
allow objective scoring the "concurrency potential" of a given
problem with Big-O notation.
http://en.wikipedia.org/wiki/Embarrassingly_parallel
http://en.wikipedia.org/wiki/Amdahl%27s%5Flaw
[notice the urlencoded title in the URL looks like Amdahl's Flaw]
In reality, doing computing in parallel comes with two overhead
factors. First you have to distribute the workload. That means
setting up each parallel execution context and dealing out the
workload. Second, you have to coalesce (some of) the computed
products of those parallel operations. Besides benchmarking
performance, any given approach/tool/API may require different levels
of effort to implement/read/debug those two chunks.
So I guess what I'm suggesting is a three dimensional matrix of
benchmarks: x Frameworks/APIs/Tools, y representative sample problems
on the scale of concurrency potential, and z performance measurements.
On Jun 9, 2009, at 9:39 AM, Pete wrote:
On Jun 8, 2009, at 5:10 PM, Paul Boddie wrote:
On http://wiki.python.org/moin/Concurrency/99Bottles
even made a remark on the "99 Concurrent Bottles of Beer" page
which appears
to have been wide of the mark - that one would surely try and use
operating
system features in the given example in order to provide a more
optimal
To clarify: my objective with this page is to give potential users
a general sense of what code using a variety of toolkits/libraries/
techniques looks like. The technical term for such a collection is
a chrestomathy: "a collection of similar programs written in
various programming languages, for the purpose of demonstrating
differences in syntax, semantics and idioms for each language"
http://en.wikipedia.org/wiki/Program_Chrestomathy
AFAIC, such a collection is *not* the place for optimal solutions -
that would be appropriate a benchmark (something that's probably
worth doing as well). Accordingly, I'd encourage submitters to
minimize dependencies on external libraries (other than the toolkit
being demonstrated, obviously) and focus on clarity &
comprehensibility for new users.[0]
implementation - and I note that Glyph appears to regard the
stated problem
as not really being concurrent.
How should we extend the problem on the Wiki to be something that
doesn't have
a workable serial solution?
The particular problem (tail | grep) came out of Beaz's class and
was incredibly helpful for comparing generators vs. coroutines. We
*should* find a problem that is actually concurrent - how about
tail|grep'ing multiple input files?
Does anyone have any suggestions of more realistic problems, or
are we back at the level of Wide Finder?
I don't see realism as the primary goal here - we could just use
tail & grep after all. ;-) That said, ideas for reasonable
benchmarks would be helpful - thoughts?
--Pete
[0] I'm -0 on the use of time.sleep() & assuming input comes in
full lines._______________________________________________
concurrency-sig mailing list
[email protected]
http://mail.python.org/mailman/listinfo/concurrency-sig
_______________________________________________
concurrency-sig mailing list
[email protected]
http://mail.python.org/mailman/listinfo/concurrency-sig