Skip Montanaro <[EMAIL PROTECTED]> writes:
> It's more than a bit unfair to compare Wikipedia with Ebay or
> Google. Even though Wikipedia may be running on high-performance
> hardware, it's unlikely that they have anything like the underlying
> network structure (replication, connection speed, etc), total number
> of cpus or monetary resources to throw at the problem that both Ebay
> and Google have. I suspect money trumps LAMP every time.

I certainly agree about the money and hardware resource comparison,
which is why I thought the comparison with 1960s mainframes was
possibly more interesting. You could not get anywhere near the
performance of today's servers back then, no matter how much money
you spent.

Re connectivity, I wonder what kind of network speed is available to
sites like Ebay that's not available to Jane Webmaster with a colo
rack at some random big ISP. Also, you and Tim Danieliuk both
mentioned caching in the network (e.g. Akamai). I'd be interested to
know exactly how that works and how much difference it makes.

But the problems I'm thinking of are quite obviously in the server
itself. This is clear when you try to load a page and your browser
immediately gets the static text on the page, followed by a pause
while the server waits for the dynamic stuff to come back from the
database. Serving a Slashdotting-level load of pure static pages on a
small box with Apache isn't too terrible ("Slashdotting" = the spike
in hits that a web site gets when Slashdot's front page links to it).
Doing that with dynamic pages seems to be much harder.

Something is just bugging me about this. SQL servers provide a lot of
capability (ACID for complicated queries and transactions, etc.) that
most web sites don't really need. They pay the price in performance
anyway.

> We also know Google has thousands of CPUs (I heard 5,000 at one point and
> that was a couple years ago).

It's at least 100,000 and probably several times that ;-). I've heard
that every search query does billions of cpu operations and crunches
through hundreds of megabytes of data (search on "apple banana" and
there are hundreds of millions of pages with each word, so two lists
of that size must be intersected). 100,000 was the published number
of servers several years ago, and there were reasons to believe that
they were purposely understating the real number.
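
To make the "two lists must be intersected" point concrete, here is a
minimal sketch of that step for a query like "apple banana". It
assumes each word maps to a sorted, duplicate-free list of integer
document IDs. The function name and the sample data are made up for
illustration, and I'm not claiming this is how Google actually does it.

```python
def intersect_postings(a, b):
    """Return the doc IDs found in both sorted, duplicate-free lists."""
    result = []
    i = j = 0
    # Walk both lists in step, always advancing the smaller head.
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            result.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return result


# Hypothetical posting lists: doc IDs of pages containing each word.
apple = [2, 5, 9, 14, 21, 30]
banana = [1, 5, 8, 14, 30, 42]

print(intersect_postings(apple, banana))  # [5, 14, 30]
```

The merge touches each entry at most once, so the work grows with the
combined length of the two lists. With hundreds of millions of entries
per word, that alone is a lot of memory traffic for a single query.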