Re: perl's threads

Shlomi Fish Sat, 12 Feb 2011 05:08:56 -0800

Hi Rob,

Well, since we are already discussing threads in general, and not in Perl in 
particular, I might as well contribute to the discussion.

On Wednesday 09 Feb 2011 15:03:04 Rob Coops wrote:
> Just adding a little to this discussion then. :-)
> 
> Threads regardless of the language are meant to do multiple things in
> parallel. For instance I'm at the moment working on a tool that extracts
> vast amounts of data from a web service as the calls to that service are
> pretty much independent and the data on my side is first stored in a
> database before further processing (using separate tables for each of the
> calls) there is very little point in doing these actions sequentially.
> After all both the database and the webservice are fully capable of
> handling hundreds or even thousands of requests at a time without fail so
> why not run things in parallel?

Well, there are some downsides, like overloading the network, CPU, etc., and it
may even get you blocked from the web service if it's not under your control.
Parallelising like that won't always increase performance.

> I found that even with a minimal amount of data I can speed up the process
> a lot by simply creating an object that deals with the database and using
> this as a base class to build the various API calling objects on. Each
> object then is fully self contained deals with the retrieval of its own
> data massaging that in a format that fits in its own tables and uses its
> parents code to deal with the database insertion.
> 
> It depends on your intended usage of threads, if you are thinking along the
> lines of a multi threaded process that operates on shared data, where the
> threads are constantly sharing outcomes and using the results other threads
> have produced to continue their processing so they can supply the yet other
> threads with data... think again. Though of course possible in Perl it is
> also a very, very complex thing to do and the performance will simply not
> be there at least not when comparing to other solutions.
> But as said for simple parallel and independent actions using threads in
> Perl is not such a bad idea.

Well, what you are describing (which I think is called cooperative
multi-tasking) is tricky to do right with complex software, and a mess to debug
if you have bugs (which you likely will) or if you try to extend it a little.
What "The Algorithm Design Manual" (a very good book) recommends instead is
non-cooperative multi-tasking, where each task works on a different set of
results with as little cooperation as possible (possibly in a round-robin
fashion).

> 
> The idea that threads are always bad regardless of programming language or
> are bad in many languages is simply not true. Threads certainly have a place
> in programming and in many if not all programming languages. With the many
> core processors of today and tomorrow threads will become a much more
> important and useful tool in the programmer's arsenal.
> The main thing is that for a very long time programmers have had very
> little real use for threads as most computers had a single core with 1 or
> in later years 2 threads per core. That simple fact has convinced many
> programmers that threads are not all that they are hyped up to be. After
> all you are still waiting on the hardware to become available and if you
> run more than 2 threads at a time the time the processor needs for
> switching between threads removes many of the performance benefits threads
> could offer.

Well, from what I understood, threads were also useful on multi-processor
machines (including many non-clustered supercomputers), but in that case
processes are often a good substitute.

> Now though that one sees processors that can handle up to 32 threads and
> soon even 128 threads at a time multiple threads will prove to be a lot
> more useful than most programmers have given them credit for. As the
> processor does not need to do all the switching between threads and can
> simply continue working on the thread without interruption the additional
> speed that literally processing in parallel can bring will be very
> noticeable. 

Agreed. Though it's possible that forked processes that don't call execve()
(see perldoc -f exec) will still be a good substitute as well.
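
To make the fork-without-exec idea concrete, here is a minimal sketch in Perl.
It assumes a Unix-like system with a working fork(); the run_in_children helper
and the squaring worker are made up for the illustration and are not taken from
any real module. Each child is a copy of the same program (no exec), works on
its own job in its own memory, and sends its result back over a pipe, so no
locks are needed:

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper: run $worker->($job) for each job in its own forked
# child (without exec) and return the results in job order.
sub run_in_children {
    my ($worker, @jobs) = @_;
    my @children;
    for my $job (@jobs) {
        pipe(my $reader, my $writer) or die "pipe failed: $!";
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {
            # Child: same program image, private copy of the memory.
            close $reader;
            print {$writer} $worker->($job);
            close $writer;              # flushes the result into the pipe
            POSIX::_exit(0);            # leave without re-flushing the parent's buffers
        }
        close $writer;                  # parent keeps only the read end
        push @children, [$pid, $reader];
    }
    my @results;
    for my $child (@children) {
        my ($pid, $reader) = @$child;
        local $/;                       # slurp the child's whole output
        my $out = <$reader>;
        close $reader;
        waitpid($pid, 0);
        push @results, $out;
    }
    return @results;
}

my @squares = run_in_children(sub { $_[0] ** 2 }, 1 .. 4);
print "@squares\n";
```

The catch is that the children share nothing after the fork, so every result
has to be passed back explicitly, which is why this suits mostly independent
tasks best.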

> The main problem that developers will find in terms of raw
> performance is the I/O of these systems which will not be able to provide
> data fast enough to keep the threads from having to wait for data to
> become available. Of course the old monster of how do you deal with
> multiple threads that do need to exchange data will be back on the table
> and certainly if you are doing funky stuff such as AI where many different
> inputs are processed at the same time and all can and should to an extent
> influence each other then you will spend many nights waking up screaming
> in the middle of the night as you got tangled in the multiple thread
> nightmare again.

Well, this "funky stuff such as AI" is not limited to AI (not that you said it
is), and I've witnessed or read about several attempts to implement complex C
or C++ servers (for UNIX/POSIX, Win32, or both) using threads and lots and
lots of locks, and such attempts often fail.

Regarding AI, I only have some experience (and not very formal experience) in
game AI, which normally involves building graphs (often with directed cycles)
of positions, calculating the moves between them, pruning positions and
evaluating them. There are many popular graph traversal algorithms for that,
and normally (if we take a typical game such as Chess or a card Solitaire
variant) you'll have a hard time holding all the states in memory and it will
take some time to traverse them all. Manipulating this graph using several
scans in several threads will be a pretty hard task.

For Freecell Solver ( http://fc-solve.berlios.de/ ), which is currently written
in C for performance reasons, I've implemented a system of "instances", each of
which manages an entire collection of positions. An instance has several "hard
threads", each of which was supposed to be an individual POSIX or Win32 thread
and which are aimed at reducing the amount of locking. Each hard thread in turn
has many "soft threads", which is a fancy name for an individual scan. What I
do is switch tasks between the various soft threads (which can be specified by
the user) until I reach the final state.
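
As a toy illustration of that kind of task switching (this is not Freecell
Solver's actual code, which is in C), here is a minimal sketch in Perl. It
assumes each scan can be modelled as a closure that advances by one step and
reports whether it has reached the final state; make_scan, run_round_robin and
the step counts are hypothetical names and numbers chosen for the example:

```perl
use strict;
use warnings;

# Hypothetical stand-in for a scan ("soft thread"): a closure that advances
# by one step per call and returns true once it has reached the final state.
sub make_scan {
    my ($steps_needed) = @_;    # assumed number of steps until solved
    my $done = 0;
    return sub { return ++$done >= $steps_needed };
}

# Give every scan a quota of steps in turn until one of them finishes.
# Returns the index of the scan that finished first, or -1 if the step
# budget (checked between rounds) runs out first.
sub run_round_robin {
    my ($scans, $quota, $max_steps) = @_;
    my $used = 0;
    while ($used < $max_steps) {
        for my $i (0 .. $#$scans) {
            for (1 .. $quota) {
                $used++;
                return $i if $scans->[$i]->();
            }
        }
    }
    return -1;
}

my @scans  = (make_scan(50), make_scan(12), make_scan(30));
my $winner = run_round_robin(\@scans, 5, 1000);
print "scan $winner reached the final state first\n";
```

Everything here runs in a single OS thread, so the scans never need locks; the
price is that only one of them makes progress at any given moment.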

Now, I still have not implemented the cooperative multi-threading, and the
scheme I figured out for implementing it with locks
(see http://groups.yahoo.com/group/fc-solve-discuss/message/377 ), which is
based on an older version of the code, seems incredibly hard to get right and
very complex. What I did instead was implement a multi-threaded program that
has a different instance (i.e. a separate collection of states) in every thread
and solves a range of Freecell deals in a round-robin fashion. This does not
help with a single deal, naturally, which can still be very time consuming,
especially if it cannot be solved (which is the case for many Freecell deals
with fewer than 4 freecells in the waste).

> 
> As for Perl the threads implementation it offers is not too bad and quite
> convenient for the simpler parallel run scenario. For serious performance
> in the multi threaded arena Perl is simply not the tool for the job, it is
> as simple as that.

Maybe it's true. Of course, Larry Wall commented here - 

http://perl.org.il/presentations/larry-wall-present-continuous-future-perfect/transcript.html

that:

[quote]
"If I wanted it fast, I'd write it in C" - That's almost a direct quote from 
the original awk page.
[/quote]

He said it in the context of several "Irrationalities" in other languages, so
I assume he meant that even the implementation of a high-level language (HLL)
should perform fast enough to be usable. On the other hand, a friend of mine
once said "How can you have a programming language that will be good for
anything if you can't even have such a screwdriver?", and Perl 5 algorithmic
code is an order of magnitude or two slower than the equivalent C code. You
don't normally notice it, though, because you stick to routines written in C
(regular expressions, XS code, etc.), because modern computers are so fast
that it doesn't seem too significant, or because you're doing a lot of I/O or
other non-CPU-bound operations.

Nevertheless, you are right that Perl is not the right language for everything
(or for everyone), and most likely neither is any other language, for many
reasons.

> But any developer that tells you that threads are not worth the trouble
> they cause or anything along those lines is clearly an experienced
> programmer who unfortunately has not realized yet that the computer world
> has changed incredibly over the past few years.
> 

Well, it depends on what you want to do. If you have a 4-core machine and your
task finishes in 1 minute without threads, then implementing threads to make
it 4 times faster *in the best case* [Amdahl] is not worth it. On the other
hand, if you're going to deploy a simulation that takes days on end to run and
you have a very high performance machine with many processors and cores, then
it's a good idea to look into parallelisation. And if you're writing or
maintaining an OS for servers, or a service program that is expected to be
deployed on very powerful machines, then you would likely want to make it
parallelised too.
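
To put rough numbers on the best case: Amdahl's law says that if a fraction p
of the run time can be parallelised over n cores, the speedup is at most
1 / ((1 - p) + p / n). A small Perl sketch (amdahl_speedup is just an
illustrative helper name, and the parallel fractions are assumed values) shows
how quickly the ideal 4x shrinks:

```perl
use strict;
use warnings;

# Amdahl's law: upper bound on the speedup when a fraction $p of the
# run time can be parallelised over $n cores.
sub amdahl_speedup {
    my ($p, $n) = @_;
    return 1 / ((1 - $p) + $p / $n);
}

# Assumed parallel fractions, all on the 4-core machine from the text.
for my $p (1.0, 0.9, 0.5) {
    printf "parallel fraction %.1f on 4 cores: at most %.2fx\n",
        $p, amdahl_speedup($p, 4);
}
```

So even in the ideal case the 1-minute task still takes 15 seconds, and if
only half of it can be parallelised the bound is 1.6x, i.e. about 37.5 seconds.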

It's always a question of cost vs. benefit.

Regards,

        Shlomi Fish

[Amdahl] - http://en.wikipedia.org/wiki/Amdahl%27s_law

-- 
-----------------------------------------------------------------
Shlomi Fish       http://www.shlomifish.org/
Why I Love Perl - http://shlom.in/joy-of-perl

Chuck Norris can make the statement "This statement is false" a true one.

Please reply to list if it's a mailing list post - http://shlom.in/reply .

