Re: [abcusers] abc repository similiar to olga.net?

John Chambers Mon, 03 Mar 2003 21:04:42 -0800

Toby asks:
| Good question.. John, do you have an answer?

I wrote about that before seeing this message.


| On a similar note (no pun intended), I'm actually quite impressed at how
| efficient John's program is.. He's really quite a hand at Perl.. Perl
| programs are notoriously CPU hungry.. John's program runs really tight..
| That machine also serves up about 10 moderate traffic websites, runs lpd
| for a couple printers, and has the Thunderstone search engine periodically
| cranking away.. I never even notice John's program running away in the
| background..

An interesting aspect of the perl story is that its performance in
many cases is competitive with even fairly good C code. There have
been a number of reports of people who decided to rewrite an important
perl program in C and found that the C version was slower. The perl
gang has learned some good tricks, and unless you know a lot about
what you're doing, you'll have trouble matching what they've learned
over the years.

The main reason that a perl program can gobble cpu is that some
things are very easy in perl that are difficult in most other
languages. The language includes symbol-table lookups in a
deceptively simple form: the hash, a kind of array that takes
character strings as subscripts. It's so easy to use that perl
programmers learn to use it for everything. Anyone who has ever
written a table lookup routine knows how much cpu time it takes. In
most other languages, a symbol table is a big hairy deal that you use
only as a last resort. In perl, you use them because it's easy. And
if you don't understand the implications, you can end up with a very
greedy little program. If you understand, it's just another very
handy tool. I use tables a lot, but I'm always aware that that very
simple indexing operation is expensive. But the perl interpreter has
some of the most sophisticated table-handling routines known. Unless
you're a real expert, you aren't going to improve on them.
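
For anyone who hasn't seen it, here is a minimal sketch of the kind
of table lookup I mean. The sub name and the sample words are made up
for illustration; they aren't from my search bot.

```perl
use strict;
use warnings;

# Count how often each string occurs. %count is a hash: an array
# whose subscripts are character strings rather than integers.
sub count_words {
    my %count;
    $count{$_}++ for @_;    # every subscript here is a full table lookup
    return \%count;
}

# Hypothetical sample data; any list of strings would do.
my $counts = count_words(qw(reel jig reel hornpipe jig reel));

for my $word (sort keys %$counts) {
    print "$word: $counts->{$word}\n";
}
```

The one-line loop is all it takes, which is exactly why it's so
tempting to use it everywhere.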

Perl can also gobble memory. One of the features of the language is
the ability to "slurp up" (a technical term) an entire file into an
array of strings. It only takes a few characters of punctuation:
   @data = <FILE>;
This reads the entire contents of FILE into the data array, one line
per element. It's fast and easy, and there are a lot of things that
will operate on the entire array. Then the command
   @data = ();
frees the space. This is a powerful part of perl. But if you aren't
aware of what it does, it can produce a monster program. My search
bot doesn't do this. In fact, it uses fixed-length reads, to avoid
the problems of web sites, like Mac sites, that don't have line feeds
within their pages. With no line feeds, a line-oriented read pulls in
the whole page as a single string.
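
To make the difference concrete, here is a rough sketch comparing the
two styles on a file whose lines end in bare carriage returns. This
is not the search bot's code; the sub name, the 8-byte block size and
the sample text are all assumptions chosen for the demonstration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Read a file in fixed-size blocks. Memory use stays near $block_size
# no matter how long the "lines" are. Returns the total bytes read.
sub count_bytes_in_blocks {
    my ($path, $block_size) = @_;
    open(my $fh, '<', $path) or die "Can't open $path: $!";
    binmode($fh);
    my ($total, $buf) = (0, '');
    while (my $n = read($fh, $buf, $block_size)) {
        $total += $n;       # process $buf here; it is overwritten each pass
    }
    close($fh);
    return $total;
}

# Build a small test file with CR-only line endings (no line feeds).
my ($tmp, $path) = tempfile(UNLINK => 1);
binmode($tmp);
print $tmp "line one\rline two\rline three\r";
close($tmp);

# Slurp: perl splits on "\n", so the whole file lands in one element.
open(my $in, '<', $path) or die "Can't open $path: $!";
my @data = <$in>;
close($in);

print scalar(@data), " line(s) when slurped\n";
print count_bytes_in_blocks($path, 8), " bytes read in 8-byte blocks\n";
```

With a real page you'd use a much bigger block, but the point is the
same: the buffer never grows past the size you asked for.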

| Of course having dual CPUs on there and a lot of RAM helps :-)

Yes, and my code is single-threaded, so it shouldn't  ever  use  more
than one cpu. It spends most of its time waiting for a TCP connection
to go through.  This typically takes longer than reading the data.  A
web  search  program  that makes only one connection at a time really
can't use much cpu time.  Most of its time will be spent  waiting  on
network events.

OTOH, I've been contemplating stuffing some info into a database ...

To subscribe/unsubscribe, point your browser to: http://www.tullochgorm.com/lists.html
