Toby asks:
| Good question.. John, do you have an answer?

I wrote about that before seeing this message.

| On a similar note (no pun intended), I'm actually quite impressed at how
| efficient John's program is.. He's really quite a hand at Perl.. Perl
| programs are notoriously CPU hungry.. John's program runs really tight..
| That machine also serves up about 10 moderate traffic websites, runs lpd
| for a couple printers, and has the Thunderstone search engine periodically
| cranking away.. I never even notice John's program running away in the
| background..

An interesting aspect of the perl story is that its performance in many cases is competitive with even fairly good C code. There have been a number of reports of people who decided to rewrite an important perl program in C, and found that the C version was slower. The perl gang has learned some good tricks, and unless you know a lot about what you're doing, you'll have trouble matching what they've learned over the years.

The main reason that a perl program can gobble cpu is that some things are very easy in perl that are difficult in most other languages. The language includes symbol-table lookups in a deceptively simple form: a kind of array (a hash, or associative array) that takes character strings as subscripts. It's so easy to use that perl programmers learn to use it for everything.

Anyone who has ever written a table lookup routine knows how much cpu time it takes. In most other languages, a symbol table is a big hairy deal that you use only as a last resort. In perl, you use them because it's easy. And if you don't understand the implications, you can end up with a very greedy little program. If you understand, it's just another very handy tool. I use tables a lot, but I'm always aware that that very simple indexing operation is expensive. Still, the perl interpreter has some of the most sophisticated table-handling routines known. Unless you're a real expert, you aren't going to improve on them.

Perl can also gobble memory.
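
As a quick illustration of the string-subscripted arrays described above, here is a minimal sketch. The word list and the `count_words` name are made up for the example; nothing here comes from the search bot itself.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: tally how often each string occurs.
# %count is a hash, i.e. an "array" subscripted by character strings.
sub count_words {
    my %count;
    $count{$_}++ for @_;    # every ++ here is a table lookup by string key
    return \%count;
}

# Made-up sample data, for illustration only.
my $count = count_words(qw(perl c perl awk perl c));

# $count->{$w} looks as cheap as $array[3], but each use hashes the key.
for my $w (sort keys %$count) {
    print "$w: $count->{$w}\n";
}
```

The lookup is a single line of punctuation, which is why it is so tempting to put one inside every inner loop.
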
One of the features of the language is the ability to "slurp up" (a technical term) an entire file into an array of strings. It only takes a few characters of punctuation:

    @data = <FILE>;

This reads the entire contents of FILE into the data array, one line per element. It's fast and easy, and there are a lot of things that will operate on the entire array. Then the command

    @data = ();

frees the space. This is a powerful part of perl. But if you aren't aware of what it does, it can produce a monster program.

My search bot doesn't do this. In fact, it uses fixed-length reads, to avoid the problems of web sites, like Mac sites, that don't have line feeds within their pages.

| Of course having dual CPUs on there and a lot of RAM helps :-)

Yes, and my code is single-threaded, so it shouldn't ever use more than one cpu. It spends most of its time waiting for a TCP connection to go through. This typically takes longer than reading the data. A web search program that makes only one connection at a time really can't use much cpu time. Most of its time will be spent waiting on network events.

OTOH, I've been contemplating stuffing some info into a database ...
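
To make the difference between slurping and fixed-length reads concrete, here is a rough sketch of the two styles. It is not the search bot's actual code: the subroutine names, the 4096-byte default chunk size, and the byte-counting task are all assumptions chosen for illustration.

```perl
use strict;
use warnings;

# Slurp style: the whole file is held in memory at once.
# If the file has no line feeds, it arrives as one huge "line".
sub count_bytes_slurp {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "can't open $path: $!";
    binmode($fh);
    my @data = <$fh>;
    close($fh);
    my $total = 0;
    $total += length($_) for @data;
    return $total;
}

# Fixed-length reads: at most $chunk bytes are buffered at a time,
# whatever the file size or line-ending convention.
sub count_bytes_chunked {
    my ($path, $chunk) = @_;
    $chunk ||= 4096;    # assumed default, not taken from the original program
    open(my $fh, '<', $path) or die "can't open $path: $!";
    binmode($fh);
    my ($buf, $total) = ('', 0);
    while (1) {
        my $n = read($fh, $buf, $chunk);
        die "read error on $path: $!" unless defined $n;
        last if $n == 0;    # 0 means end of file
        $total += $n;
    }
    close($fh);
    return $total;
}
```

Both routines return the same count for the same file; the difference is that the chunked version's memory use stays bounded by the chunk size, while the slurp version's grows with the file.
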
