Randy Johnson - CFConcepts wrote:
> "If I had to make a site like Twitter that was scalable, how would I do
> that?"
> But really that's only part of it. My coldfusion site needs to be
> coded to be scalable too..
> So writing code to be scalable doesn't seem to be all that difficult,
> writing good efficient code in a modular way seems to be a good start.

Modular is good; decoupling is an important part of scalability. Static websites scale because one webserver does not need to know anything about another webserver to answer a request for static content. Doubling the number of webservers quite literally doubles your capacity. Transactional databases behave very differently: because of their transactional nature, one server needs to know what the others are doing, so for each request it has to communicate with its peers. More peers means more communication, and in the end the number of deadlocks goes up with the third power of the number of active nodes:
ftp://ftp.research.microsoft.com/pub/tr/tr-96-17.pdf

> The next component is databases. I have read with Mysql that
> replication is how you can use multiple databases. I haven't done to
> much research on this, my initial questions would you use a db server
> for users, a db server for messages, db for each component??

If you were to do that, you would need to join the users in one database to the messages in another one. If those databases are on different servers, that is incredibly slow. Even a 1 Gbit/s network is 50 times slower than the connection between a CPU and RAM and has 100 (1000?) times more latency.

A good way to make your database scale is to make sure it remains small and local. Stick X users on each database server, together with everything the database needs to answer any request for those users that may arrive from the application. User X+1 then goes on a new database on the next piece of hardware.
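To make the "X users per database server" idea concrete, here is a minimal sketch of how the application could work out which server holds a given user. It assumes sequential user ids starting at 1; the value of `USERS_PER_SHARD`, the host names and both function names are made up for illustration and are not taken from any real setup.

```python
# Hypothetical stand-in for the "X users per database server" above.
USERS_PER_SHARD = 100_000

# Hypothetical host names; in practice these would come from configuration.
SHARD_HOSTS = [
    "db0.example.internal",
    "db1.example.internal",
    "db2.example.internal",
]


def shard_index(user_id: int, users_per_shard: int = USERS_PER_SHARD) -> int:
    """Return the index of the shard that holds a 1-based sequential user id."""
    if user_id < 1:
        raise ValueError("user ids are assumed to start at 1")
    # Users 1..X land on shard 0, user X+1 opens shard 1, and so on.
    return (user_id - 1) // users_per_shard


def shard_host(user_id: int) -> str:
    """Map a user id to the server that stores all of that user's data."""
    index = shard_index(user_id)
    if index >= len(SHARD_HOSTS):
        raise LookupError(f"no database server provisioned for shard {index}")
    return SHARD_HOSTS[index]
```

With a fixed range per server the application never has to ask a database where a user lives, so the lookup itself adds no extra network round trip.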
I don't know much more about Twitter than what I can see when I click on some page, but I guess for your Twitter example that would mean each database would have:
- users table
- friends table
- followers table

Whenever user X adds friend Y, you set up your application code so that there are actually 2 inserts: in the friends table on the database server of user X you add the name, image and URL of user Y; in the followers table on the database server of user Y you add the name, image and URL of user X. This is denormalized, double storage: you pay the price of having to run twice the updates (or hundreds of times the updates when somebody changes his image URL), but your thousands of gets for the latest update of a stream can use a simple, fast, in-memory local database. Utility tables like languages, a list of countries etc. are present on all servers and are completely mirrored.

Obviously what you lose here, if you look at the system as a whole, are the A and C from ACID. For a site like Twitter that would seem to be a reasonable price to pay, but nobody would want his bank to work this way.

> I know a few people on this list have setup Scalable websites, clusters,
> load balancing etc.
>
> Where did you all learn how to do such things?

Read a lot (I second http://highscalability.com/), ask yourself "how" or "why" often enough and at some point you start seeing patterns.

Jochem

Archive: http://www.houseoffusion.com/groups/CF-Talk/message.cfm/messageid:306769
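P.S. The two-insert scheme for adding a friend, described above, could be sketched roughly as follows. This is an illustration only: two in-memory SQLite databases stand in for the two database servers, and the `owner` column, the table layout and the function names are my own assumptions, not anything Twitter is known to use.

```python
import sqlite3


def new_shard() -> sqlite3.Connection:
    """Create an in-memory database standing in for one database server."""
    db = sqlite3.connect(":memory:")
    # The 'owner' column is a hypothetical way to record whose list a row is in.
    db.execute("CREATE TABLE friends (owner TEXT, name TEXT, image TEXT, url TEXT)")
    db.execute("CREATE TABLE followers (owner TEXT, name TEXT, image TEXT, url TEXT)")
    return db


def add_friend(shard_x: sqlite3.Connection, user_x: dict,
               shard_y: sqlite3.Connection, user_y: dict) -> None:
    """Record that user X added user Y as a friend: one insert per shard."""
    # Insert 1: copy Y's display data into the friends table on X's server.
    with shard_x:
        shard_x.execute(
            "INSERT INTO friends VALUES (?, ?, ?, ?)",
            (user_x["name"], user_y["name"], user_y["image"], user_y["url"]),
        )
    # Insert 2: copy X's display data into the followers table on Y's server.
    # These are two separate transactions, so a failure between them leaves
    # the shards inconsistent: that is the A and C from ACID being given up.
    with shard_y:
        shard_y.execute(
            "INSERT INTO followers VALUES (?, ?, ?, ?)",
            (user_y["name"], user_x["name"], user_x["image"], user_x["url"]),
        )


if __name__ == "__main__":
    shard_a, shard_b = new_shard(), new_shard()
    alice = {"name": "alice", "image": "/img/alice.png", "url": "/alice"}
    bob = {"name": "bob", "image": "/img/bob.png", "url": "/bob"}
    add_friend(shard_a, alice, shard_b, bob)
    # Rendering alice's friend list now only touches her own shard.
    print(shard_a.execute("SELECT name, image, url FROM friends").fetchall())
```

The read side is the payoff: showing a user's friends or followers is a single local query with no cross-server join.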

