On Wed, Mar 10, 2004 at 08:44:56PM +0100, allan juul wrote:
> hi
>
> i have a problem while trying to build a spider using perl threads.
> Consider the program below which is just an example to get going.
> i wish to hit a certain site's frontpage for any number of times (for
> example 300)
> i imagine that since there's a lot of content on the page each request
> will take some time to process, and therefore i imagine it would be nice
> to delegate the tasks using threads.
> my problem is in this very naive example that the unthreaded version is
> much faster.
>
> two questions:
> 1) is there something wrong with the threaded code ?
Yes. You fire off one thread, wait for it to finish, then start another
thread, i.e. only one worker thread is ever active at a time. You need to
do something like:

    push @threads, threads->new(\&lwp) for 1..$MAX;  # spawn MAX threads
    $_->join for @threads;                           # reap all threads

However, you probably don't want to do this anyway, because:

* hitting a website with 300 simultaneous requests may be considered
  antisocial;

* Perl threads work by copying most of the image of the Perl interpreter,
  so spawning 300 threads may well cause you to run out of memory. Also,
  starting a thread is a time-consuming process, and you shouldn't do so
  unless the thread is going to be long-lived; i.e. if each thread will
  do many requests, fine, but if it's just going to do a single request
  and exit, then the startup time will dominate;

* unless you need to make use of the specific features of Perl threads
  (such as shared variables), you'll be a lot better off (on UNIX systems
  at least) using fork instead - this is a lot more efficient.

Dave.

-- 
A power surge on the Bridge is rapidly and correctly diagnosed as a
faulty capacitor by the highly-trained and competent engineering staff.
    -- Things That Never Happen in "Star Trek" #9
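[A minimal, self-contained sketch of the spawn-then-join pattern described
above. It assumes a threads-enabled perl; `worker` and the value of `$MAX`
are placeholders for the poster's `&lwp` fetch routine and request count,
with no actual network I/O:]

```perl
#!/usr/bin/perl
use strict;
use warnings;
use threads;

my $MAX = 5;    # stand-in for the poster's 300 requests

# Placeholder for the real &lwp routine that fetches the page.
sub worker {
    my ($id) = @_;
    return "worker $id done";
}

# Spawn ALL the threads first...
my @threads;
push @threads, threads->new(\&worker, $_) for 1 .. $MAX;

# ...and only then join them, so they all run concurrently.
# join() returns each thread's return value, in spawn order here.
my @results = map { $_->join } @threads;
print "$_\n" for @results;
```

[The key difference from the poster's loop is that no `join` happens
until every thread has been created, so the workers overlap instead of
running one after another.]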