I think that there are developers who are not familiar or comfortable with SQL and database management systems. This can lead to a number of bad choices if you are trying to develop a system that needs to store, select and maintain data. Also, using a DBMS never precludes creating an in-memory array of semi-static data, writing semi-static HTML files, or using any other type of cache. Until traffic develops, it is hard to predict where the slow points will show up, but putting data into a key-value system from the beginning, when you need to provide the usual insert/update/delete functions, will probably lengthen development time.
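
To make the "DBMS plus cache" point concrete, here is a minimal sketch in Python (used purely for illustration; nothing here is AOLserver or Tcl specific). The `country` table, the `CachedLookup` class and the `make_demo_db` helper are all hypothetical names invented for this example. The table stays the source of truth and handles updates, while a plain in-memory dictionary serves repeated reads of semi-static rows.

```python
import sqlite3


class CachedLookup:
    """Read-through cache in front of a SQL table (hypothetical example)."""

    def __init__(self, conn):
        self.conn = conn
        self.cache = {}      # in-memory copy of semi-static rows
        self.db_reads = 0    # counts lookups that actually reached the database

    def get(self, code):
        # Only query the database on a cache miss.
        if code not in self.cache:
            self.db_reads += 1
            row = self.conn.execute(
                "SELECT name FROM country WHERE code = ?", (code,)
            ).fetchone()
            self.cache[code] = row[0] if row else None
        return self.cache[code]

    def update(self, code, name):
        # The DBMS handles the write; the stale cache entry is dropped.
        with self.conn:
            self.conn.execute(
                "UPDATE country SET name = ? WHERE code = ?", (name, code)
            )
        self.cache.pop(code, None)


def make_demo_db():
    """Build a throwaway in-memory database with assumed sample data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE country (code TEXT PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO country VALUES (?, ?)",
        [("NL", "Netherlands"), ("FR", "France")],
    )
    conn.commit()
    return conn
```

Calling `get("NL")` twice on a fresh `CachedLookup(make_demo_db())` reaches the database only once; after `update("NL", ...)` the next `get` goes back to the table.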

Of course, if scalability really is the main problem, that would suggest accepting a higher initial investment in equipment and development to ensure success. Scalability probably needs to include the possibility of scaling beyond one (reasonably sized) machine.

I'm not really sure this is an AOLserver topic, other than as it relates to AOLserver scalability and the ns_bdb module (which works very well).

As to scalability in general, in a decade of web interface programming, I've found three major areas of poor performance:

1) scripting overhead, especially as the application matures and more and more dependent code gets included on each page load
2) SQL database speed, especially in high-concurrency situations, but also once tables reach millions of rows
3) full text searching

As for scripting overhead, my own "hello world" benchmarks find AOLserver/Tcl to be by far the fastest scripted web development platform: about 1/2 the speed of serving 1k GIFs, but about 10x faster than PHP and 3x faster than lighttpd-fastcgi. So AOLserver is a good platform as far as scripting scalability is concerned, as long as the developer takes care not to load too much dependent code per page.

SQL, in high-concurrency situations, tends not to do well on the web, especially when lots of writes and reads are occurring at the end of the table, which is a common scenario. In my experience, many applications that use SQL actually only need key-lookup capability (i.e., "your membership settings, your purchase history"), and if the db supports multiple matching keys, much can be done with that. What's nice about Berkeley DB is that it runs in-process, with an aggressive cache, so that during db reads it delivers performance close to that of in-memory databases. Also, Berkeley DB can scale to running on several synced databases, which is a highly unusual feature, and if you've ever tried to make that work on Oracle or MS SQL, you know that it's tricky to get right. Google uses Berkeley DB for their universal login, for just this reason.
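
As an illustration of what "key lookup with multiple matching keys" can cover, here is a small Python sketch. It is an in-memory stand-in, not Berkeley DB and not the ns_bdb API; the `MultiKeyStore` class and the `purchases:<user id>` key scheme are assumptions made up for the example. The idea is only that one key may hold several values, which is enough for "your purchase history" style pages without any SQL.

```python
from collections import defaultdict


class MultiKeyStore:
    """In-memory stand-in for a key-lookup db that allows duplicate keys."""

    def __init__(self):
        self._rows = defaultdict(list)   # key -> values, in insertion order

    def put(self, key, value):
        # Adding under an existing key keeps the earlier values.
        self._rows[key].append(value)

    def get_all(self, key):
        # Return a copy so callers cannot mutate the stored list.
        return list(self._rows.get(key, []))


# Hypothetical usage: one key per user, many purchases per key.
store = MultiKeyStore()
store.put("purchases:42", "2006-01-03 book")
store.put("purchases:42", "2006-02-11 dvd")
store.put("settings:42", "newsletter=off")
history = store.get_all("purchases:42")
```

Here `history` holds both purchases for user 42 in the order they were stored, and a lookup on an unknown key returns an empty list.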

Thirdly, you'll notice many sites have very poor full text searching performance. Lucene, a recently popularized full text search engine, appears to finally solve this problem. However, in my case I wanted both fast full text searching and grouped-by-type search results, such as Amazon returns. In the late 90s I ran a directory web site that handled about 300,000 full text search requests per day on a 5 million page corpus. I tried all the commercial and open source solutions, and nothing could keep up (i.e., deliver results in under 5 seconds at peak load). Back then, I wrote a full text engine (a simple inverted index) using AOLserver and dbm under Solaris/Intel (running as an AOLserver C extension) and managed peak load search times under 1/3rd of a second, so AOLserver + key-lookup-db definitely could scale to very high levels.
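
For readers who have not built one, a simple inverted index with grouped-by-type results can be sketched in a few lines of Python. This is not the original C extension, whose details are not given here; the `InvertedIndex` class, the whitespace tokenizer and the sample documents are assumptions for illustration. Each term maps to the set of documents containing it (the part a key-lookup db would store), and a query intersects those sets and then groups the hits by document type.

```python
from collections import defaultdict


class InvertedIndex:
    """Toy inverted index: term -> set of doc ids, plus a type per doc."""

    def __init__(self):
        self.postings = defaultdict(set)   # term -> ids of docs containing it
        self.doc_type = {}                 # doc id -> type label

    def add(self, doc_id, doc_type, text):
        self.doc_type[doc_id] = doc_type
        # Naive tokenizer: lowercase and split on whitespace.
        for term in text.lower().split():
            self.postings[term].add(doc_id)

    def search(self, query):
        terms = query.lower().split()
        if not terms:
            return {}
        # A document matches only if it contains every query term.
        hits = set.intersection(*(self.postings.get(t, set()) for t in terms))
        grouped = defaultdict(list)
        for doc_id in sorted(hits):
            grouped[self.doc_type[doc_id]].append(doc_id)
        return dict(grouped)


# Hypothetical sample corpus.
idx = InvertedIndex()
idx.add(1, "book", "Tcl programming guide")
idx.add(2, "article", "Tcl web programming")
idx.add(3, "book", "Berkeley DB guide")
results = idx.search("tcl programming")
```

With this corpus, `results` groups document 1 under "book" and document 2 under "article". Each term lookup is a single key fetch, which is why this structure maps well onto a key-lookup database.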

I know that going with Berkeley DB is controversial, but in my opinion it's extremely difficult to scale up a SQL-backed application that's failing in the field. That's generally why Oracle gets all those dot-coms -- they promise that you'll scale if you go huge, if you spend enough money. In the OSS world, I've put hundreds of hours into trying to scale PostgreSQL and MySQL, and it's very difficult to do on a completed product. My experience is that both of those OSS dbs can scale to huge heights, but only if you're an expert in that db platform and do extensive up-front design work to that effect, which few people do.

Just my opinion... all I can say is that AOLserver + Berkeley DB, if you can live with a key-lookup database only, is incredibly fast, at least 3x faster than anything else I've benchmarked.

-john

