Ian Greenway wrote the following about [netconnect] Re: Voyager idea:
> How does a search engine work then? It holds (allegedly) millions of
> page references, usually at least a 256-character keyword string for
> each, often a more descriptive text. The storage space alone must be
> phenomenal, and even with advanced hash-tables and whatever it seems
> weird that it can search it all and create the answers within a few
> seconds - and do it for thousands of requests repeatedly. Do they use
> some other scheme, or is it really as brute-force as it sounds?
It's hard to talk about search engines without swearing. Somewhere in
the bowels of Yahoo! is a section on all the computer power and
storage it uses, and a couple of years ago it was terrifying, so I
can't imagine what it's like now. Back then they had about 30TB of
storage, 40 parallel servers and enough bandwidth to kill a horse.
You have to remember that www.yahoo.com is the most visited web page
in the world, and it's pretty quick, so not only does it need to
search and generate the HTML, it also has to pump it across an already
clogged Internet.
As for the technicalities of searching, a well-ordered (sorted or
indexed) database of 10,000,000 entries can be searched in less than a
second on a workstation. On a parallel-processing super-server, it is
presumably thousands of times faster than that.
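
To see why a sorted index makes this less brute-force than it sounds,
here is a rough Python sketch of a binary search over 10,000,000
ordered keys. It is only an illustration, not how Yahoo! or any real
engine does it: the names lookup, max_probes and keys are made up for
the example, and the keys are plain even integers standing in for page
references.

```python
from bisect import bisect_left


def lookup(sorted_keys, key):
    """Return the position of key in sorted_keys, or -1 if it is absent."""
    i = bisect_left(sorted_keys, key)  # binary search: halves the range each step
    if i < len(sorted_keys) and sorted_keys[i] == key:
        return i
    return -1


def max_probes(n):
    """Worst-case number of comparisons binary search needs for n sorted entries."""
    return n.bit_length()  # equals floor(log2(n)) + 1 for n >= 1


# Hypothetical index: 10,000,000 sorted keys (even numbers only).
# A range is used instead of a list so the sketch needs almost no memory;
# bisect works on any sorted sequence that supports indexing and len().
keys = range(0, 20_000_000, 2)

print(len(keys))                 # number of entries in the index
print(max_probes(len(keys)))     # comparisons needed in the worst case
print(lookup(keys, 1_234_568))   # present: its position in the index
print(lookup(keys, 1_234_567))   # odd number, so not in the index
```

The point is that each lookup touches only a couple of dozen entries
rather than all ten million, so the cost grows with the logarithm of
the index size rather than with the size itself.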
I'll try to find the tech stuff on Yahoo! later.
Totty <8^)
--
Totty has an Amiga A1200, with 68060/50 and 603e/200 PPC.
32Mb RAM. 8x ATAPI CD. 1.7Gb HD. ShapeShifter V3.10 + OS 7.5.5