[TurboGears] Re: Open Question: Turbogears and scaling...
I guess the confusion comes from my original post, where I said:

* Application servers scale easy as pie, database servers scale like hernias

I should have been more specific: by application server I meant TG, PHP, or wherever the business logic of the code lives. Maybe "easy as pie" was the wrong analogy to use.

The m/cluster technology is what I usually call a database proxy. CJDBC was the first time I thought about using a DB proxy. It is a really great solution for failover but not for scalability. If I am doing more database transactions per second I can't just add another database server; I have to buy a bigger database server, and if I am using a database proxy with 3 database servers, that means I have to buy 3 bigger database servers. You are technically correct that database writes don't end up in exactly one place, but those writes are copied, not split up. So now every write ends up on every single database node, which only scales if your writes stay the same and your reads increase. In all the applications I have worked on, reads and writes both increase as load increases; sometimes you get a spike in the reads and not the writes, it just depends on the application.

To write in more than one place you need the technology to load balance the writes. Then reads need to know which server to go to for which data. I think this would be a blast to code!

Fortunately reads are the majority of the load in most applications, especially web applications. That is why some simple caching is easier and more effective than replicating data.

On 3/17/06, Robin Haswell [EMAIL PROTECTED] wrote:
>>> my time has a cost and optimisation often buys less performance than, say, a Dell SC1425
>>
>> Unfortunately my time is not worth an IBM 64-way mainframe (or I would be one happy hacker). Bigger machines help, but as my comment said before, this will give you only linear optimization; at some point you will need _exponential_ optimizations. This also depends on the complexity of the data relationships that your application needs. You need a machine that is 64 times faster buy
>
> Nah mate, you miss my point! Not bigger machines, *more* machines. A Dell SC1425 is a pretty low-end piece of kit; the idea is you use multiple machines.
>
> Let's say you have an application that is currently running at 100% above acceptable capacity. You can solve this problem in basically four ways:
>
> 1. Buy hardware that is twice as powerful
> 2. Perform optimisation, caching, etc.
> 3. A combination of 1) and 2)
> 4. Buy another similar server and run them both
>
> In my experience, 4) is always the cheapest option, and requires less hassle than 2) and 3) (and less hassle is the TG way!). The trick is to make option 4 possible by asking questions like "What will happen if I use two app or database servers - or both?" early on in the build process. I do this for everything and it's served me well so far :-)
>
> Part of my personal PHP standard library is some wrappers around session management and database handling, which means:
>
> 1) All my session data is stored in the database, which means from then on I can implement *all* my persistent storage in an RDBMS.
> 2) My database reads and my database writes are separated and controllable, so if we need to add replication it's possible to direct all writes to the master server and balance reads between the slaves. (Yes, I said there are alternatives to the master/slave setup, but in web apps, which are mostly read-heavy, it's a pretty good solution anyway.)
>
> -Rob
>
> PS. If you're interested in the writing-to-one-place problem, you should look at m/cluster (http://www.continuent.com/index.php?option=com_contenttask=viewid=211Itemid=168). We have our own solution, but in general it's a pretty awesome setup for database scaling through multiple servers.

--~--~-~--~~~---~--~~ You received this message because you are subscribed to the Google Groups TurboGears group.
To post to this group, send email to turbogears@googlegroups.com To unsubscribe from this group, send email to [EMAIL PROTECTED] For more options, visit this group at http://groups.google.com/group/turbogears -~--~~~~--~~--~--~---
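The replication argument in the message above (writes are copied, not split) comes down to simple arithmetic. The sketch below is only an illustration of that reasoning: the function name and the figures of 900 writes per second and 3 nodes are made up for the example and do not come from the thread.

```python
def writes_per_node(total_writes_per_sec, nodes, replicated):
    """Return how many writes each database node must apply per second."""
    if replicated:
        # Full replication: every write is copied to every node,
        # so adding nodes does not reduce per-node write load.
        return total_writes_per_sec
    # Partitioned writes: each write lands on exactly one node
    # (an even split across nodes is assumed here).
    return total_writes_per_sec / nodes


# Hypothetical load of 900 writes/sec on a 3-node setup.
replicated_load = writes_per_node(900, 3, replicated=True)
partitioned_load = writes_per_node(900, 3, replicated=False)
```

With replication every node still applies all 900 writes, which is why it helps read capacity and failover but not write capacity; only splitting the writes brings the per-node figure down.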
[TurboGears] Re: Open Question: Turbogears and scaling...
LiveJournal is an excellent example; it is where I stole the idea of using memcached: http://www.linuxjournal.com/article/7451

My vision for scaling up is to use URL partitioning so that similar data all goes to the same host(s), with no shared cache. For example, on a news site all the tech articles could go to one server (or group of servers) that has all the tech articles in its cache. This means invalidation has to iterate over a list of hosts, but them's the breaks.

On 3/17/06, Bob Ippolito [EMAIL PROTECTED] wrote:
> On Mar 17, 2006, at 3:17 PM, Robin Haswell wrote:
>>>> my time has a cost and optimisation often buys less performance than, say, a Dell SC1425
>>>
>>> Unfortunately my time is not worth an IBM 64-way mainframe (or I would be one happy hacker). Bigger machines help, but as my comment said before, this will give you only linear optimization; at some point you will need _exponential_ optimizations. This also depends on the complexity of the data relationships that your application needs. You need a machine that is 64 times faster buy
>>
>> Nah mate, you miss my point! Not bigger machines, *more* machines. A Dell SC1425 is a pretty low-end piece of kit; the idea is you use multiple machines.
>>
>> Let's say you have an application that is currently running at 100% above acceptable capacity. You can solve this problem in basically four ways:
>>
>> 1. Buy hardware that is twice as powerful
>> 2. Perform optimisation, caching, etc.
>> 3. A combination of 1) and 2)
>> 4. Buy another similar server and run them both
>>
>> In my experience, 4) is always the cheapest option, and requires less hassle than 2) and 3) (and less hassle is the TG way!). The trick is to make option 4 possible by asking questions like "What will happen if I use two app or database servers - or both?" early on in the build process. I do this for everything and it's served me well so far :-)
>>
>> Part of my personal PHP standard library is some wrappers around session management and database handling, which means:
>
> Scaling horizontally, what you list as 4, is the only real option. There's plenty of public record that shows that all the successful guys (Google and LiveJournal come to mind) are using lots of relatively cheap servers, rather than small numbers of giant servers. If you design for that, you'll never have a problem so long as you can afford to operate, and that's not so tough of a problem because the costs are at worst linear. With any other option, the price to upgrade grows exponentially and there's a ceiling on what kind of power you can even buy to run an app that is mostly serial.
>
> Good optimizations can do wonders in the short term, e.g. cut immediate hardware costs in half... but you get that anyway if you wait about a year. It's typically better to expand your service such that it maximizes profits, rather than optimize your service to minimize your overhead. There's only so low you can go with cutting your overhead.. but there's no well defined ceiling for maximum profits (look at Google!).
>
> -bob
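The URL partitioning idea at the top of this message can be sketched roughly as follows. This is a minimal illustration, not anyone's production code: the host names are hypothetical, and choosing the first path segment as the "section" and CRC32 as the hash are assumptions made for the example.

```python
import zlib
from urllib.parse import urlsplit

# Hypothetical cache hosts; each keeps its own unshared cache.
CACHE_HOSTS = ["cache1.example.com", "cache2.example.com", "cache3.example.com"]


def section_of(url):
    """Return the first path segment, e.g. 'tech' for '/tech/article/42'."""
    segments = [s for s in urlsplit(url).path.split("/") if s]
    return segments[0] if segments else ""


def host_for(url, hosts=CACHE_HOSTS):
    """Map a URL to a host so that every URL in one section shares a host."""
    checksum = zlib.crc32(section_of(url).encode("utf-8"))
    return hosts[checksum % len(hosts)]
```

Because the host is derived only from the section name, all tech articles land on the same host and warm the same cache. The cost mentioned in the message still applies: invalidating something that appears under several sections means visiting each host involved.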
[TurboGears] Re: Open Question: Turbogears and scaling...
> I've still got some decisions to make such as FastCGI vs mod_python - anybody have the pros and cons between these two?

My experience with setting up shared hosts tells me this: FastCGI is slow but works with suEXEC, so it is secure. mod_python is fast, but all processes run as the Apache user - draw your own conclusions from this. What really solves the problem is the Apache perchild MPM: mod_* speed with suEXEC capabilities. Unfortunately, for some unfathomable reason perchild isn't finished and isn't being developed :'(

Is this accurate?

-Rob
[TurboGears] Re: Open Question: Turbogears and scaling...
Gerhard Häring wrote:
> Robin Haswell wrote:
>>> I've still got some decisions to make such as FastCGI vs mod_python - anybody have the pros and cons between these two?
>>
>> My experience with setting up shared hosts tells me this: FastCGI is slow but works with suEXEC, so it is secure. mod_python is fast, but all processes run as the Apache user - draw your own conclusions from this.
>
> Not considering benchmarks of hello-world style apps, is FastCGI/SCGI/mod_proxy really noticeably slower than using mod_python for real applications?

My experience is really only with PHP, where the startup times are quite high (PHP doesn't have the concept of runtime module selection as such). However, I can't imagine any situation where FCGID would be quicker than mod_python, as mod_python's Python interpreter runs within the Apache process itself.

>> What really solves the problem is the Apache perchild MPM: mod_* speed with suEXEC capabilities. Unfortunately, for some unfathomable reason perchild isn't finished and isn't being developed :'(
>
> [...] Apparently developing something like the Apache perchild MPM is hard. Because it didn't get fixed and it had several problems, it finally got scrapped in Apache 2.2. There were/are several attempts to develop a replacement, none of them ready for production yet, according to their developers:
>
> metuxmpm: http://www.sannes.org/metuxmpm/
> itk mpm: http://home.samfundet.no/~sesse/mpm-itk/
> peruser mpm: http://www.telana.com/peruser.php
>
> -- Gerhard
[TurboGears] Re: Open Question: Turbogears and scaling...
Most things seem to be covered except for caching, so I will share my witless drivel about caching.

First you need to decide how dirty your data can get. A realtime stock quote system probably can't live with a lot of dirt, whereas blog comments probably don't need to be instant (well, at least the ones I write probably don't need to be read at all). The second problem is that the web is stateless, so you can't send updates down the socket to the web browser. Thanks to TG's widgets and awesome AJAX support this can be minimized (and there is much rejoicing, thanks!).

It doesn't matter how you architect the system, writes end up in one place. Sure, you can do replication, but there is a world of setup, configuration, schema changes, transactions and data synchronization, oh my! You are still only writing in one place, plus replication is just a linear optimization; caching can get you exponential optimization without all the lions, tigers and bears.

Cache only after you have tried some happy hacking to make things faster; my experience from a couple of years in EJB (oh the humanity!) was "cache it and forget". Most of the performance issues with EJB have to do with locking data so there is no dirty data, plus the threading architecture is... Sorry, I digress often. Just cache as little as possible.

In the big project I work on we cache at a couple of different levels. The highest level is caching the page, thus skipping the template engine, controller code and database lookup. In this project we use memcached because it is very flexible in how we set up the caching system. No matter what, you should create a wrapper around your caching implementation so you can move from one type of cache to another without changing any controller code. memcached supports a number of different languages, which is nice for integration.

There are many ways to invalidate cached data. The easiest way is to just set an expiration date.
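To make the wrapper advice concrete, here is a minimal sketch of a cache interface with expiration-based invalidation. It is not the code from the project described: the class and function names are hypothetical, and an in-process dictionary stands in for memcached so the example runs on its own. A memcached-backed class would implement the same three methods.

```python
import time


class Cache:
    """Backend-neutral interface; controller code depends only on this."""

    def get(self, key):
        raise NotImplementedError

    def set(self, key, value, ttl=None):
        raise NotImplementedError

    def delete(self, key):
        raise NotImplementedError


class LocalMemoryCache(Cache):
    """In-process stand-in for a real cache server such as memcached."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._items = {}  # key -> (value, expires_at or None)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and self._clock() >= expires_at:
            del self._items[key]  # expired: treat as a miss
            return None
        return value

    def set(self, key, value, ttl=None):
        expires_at = None if ttl is None else self._clock() + ttl
        self._items[key] = (value, expires_at)

    def delete(self, key):
        self._items.pop(key, None)


def cached_page(cache, url, render, ttl=60):
    """Return the cached page for url, rendering and storing it on a miss."""
    page = cache.get(url)
    if page is None:
        page = render(url)
        cache.set(url, page, ttl=ttl)
    return page
```

A controller would call something like `cached_page(cache, url, render_function)` and never touch the backend directly, so swapping `LocalMemoryCache` for another implementation leaves controller code unchanged.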
Since we are happy PostgreSQL users we use a trigger system (sure, features like triggers make the database slower, but I want the entire system to run faster, not just the database). There are two ways that we could do this. The first is to write a Python function that gets called when an update or delete happens on cached data; in our case the Python function invalidates all the caches that held that data. The second way is to have a trigger insert records into an invalidation table. Every n seconds an external process consumes the records by invalidating your caches (which, by the way, is basically how Slony works for replication).

* Optimize code and queries first
* Keep an abstraction layer or two away from the cache implementation
* Application servers scale easy as pie, database servers scale like hernias
* TurboGears rules!

Good luck
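The second approach (a trigger that queues keys in an invalidation table, drained by an external process) might look roughly like the sketch below. Several things here are assumptions made so the example is self-contained: SQLite stands in for PostgreSQL, a plain dict stands in for memcached, and the table, trigger and key names are hypothetical.

```python
import sqlite3

SCHEMA = """
CREATE TABLE article (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE cache_invalidation (
    seq INTEGER PRIMARY KEY AUTOINCREMENT,
    cache_key TEXT NOT NULL
);
CREATE TRIGGER article_updated AFTER UPDATE ON article
BEGIN
    INSERT INTO cache_invalidation (cache_key) VALUES ('article:' || NEW.id);
END;
CREATE TRIGGER article_deleted AFTER DELETE ON article
BEGIN
    INSERT INTO cache_invalidation (cache_key) VALUES ('article:' || OLD.id);
END;
"""


def consume_invalidations(conn, cache):
    """Drop every queued key from the cache, then clear the processed rows."""
    rows = conn.execute(
        "SELECT seq, cache_key FROM cache_invalidation ORDER BY seq"
    ).fetchall()
    for _seq, key in rows:
        cache.pop(key, None)
    if rows:
        last_seq = rows[-1][0]
        conn.execute("DELETE FROM cache_invalidation WHERE seq <= ?", (last_seq,))
        conn.commit()
    return [key for _seq, key in rows]


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO article (id, title) VALUES (1, 'Scaling')")
cache = {"article:1": "<cached page>"}  # dict standing in for the real cache

# The update fires the trigger, which queues the key for invalidation.
conn.execute("UPDATE article SET title = 'Scaling TG' WHERE id = 1")
invalidated = consume_invalidations(conn, cache)
```

In a real deployment `consume_invalidations` would run in a separate process on a timer (the "every n seconds" in the message), so cached data can be stale for at most one polling interval.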
[TurboGears] Re: Open Question: Turbogears and scaling...
On 3/17/06, Lateef [EMAIL PROTECTED] wrote:
> The second problem is the web is stateless so you can't send updates down the socket to the web browser.

Half true. HTTP is stateless, but you can send updates down the socket. http://alex.dojotoolkit.org/?p=545

Cool, eh?
[TurboGears] Re: Open Question: Turbogears and scaling...
My $0.02 (approx £0.012 where I come from) on this:

In my company, an application that scales is an application that you can throw hardware at without having to think about it. We generally don't bother with intricate caching and optimisations, because my time has a cost and optimisation often buys less performance than, say, a Dell SC1425.

I guess what I want to know about this is, are there any parts of TG's standard setup (with a separate DB server) that store changes on the application machine? Stuff like sessions - are sessions stored on disk or in memory? And if they are... why?

Also, I'm a firm believer that any app has the potential to become really popular, and will potentially need more hardware. In other words, optimisation is fighting a losing battle. I'm not saying don't optimise - by all means, make sure your indexes are correct, make sure you're running the right SQL statements and make sure you're not doing needless work, but after that I don't believe in investing time and effort in caching mechanisms when an expansion path will give you more bang for your buck. Sure, there are applications where caching is obvious, but those aren't the applications I'd usually choose TurboGears for.

-Rob

PS. You don't always write to one place :-) Most people do, but there are solutions that are fast and safe, you just need to think outside the box (but I don't think I can say much more than that).

Lateef wrote:
> Most things seem to be covered except for caching, so I will share my witless drivel about caching.
>
> First you need to decide how dirty your data can get. A realtime stock quote system probably can't live with a lot of dirt, whereas blog comments probably don't need to be instant (well, at least the ones I write probably don't need to be read at all). The second problem is that the web is stateless, so you can't send updates down the socket to the web browser.
[TurboGears] Re: Open Question: Turbogears and scaling...
Yeah, I knew I would get roasted like a pig for not pointing out the exceptions. I hadn't seen this stateful HTTP exception before; it is cool!
[TurboGears] Re: Open Question: Turbogears and scaling...
I am not a TG expert, but I believe the options are file, memory, database and roll your own.

> my time has a cost and optimisation often buys less performance than, say, a Dell SC1425

Unfortunately my time is not worth an IBM 64-way mainframe (or I would be one happy hacker). Bigger machines help, but as my comment said before, this will give you only linear optimization; at some point you will need _exponential_ optimizations. This also depends on the complexity of the data relationships that your application needs. If you need a machine that is 64 times faster, buy a 64 proc machine, but you need a machine 1x then start hacking.

> You don't always write to one place

My experience, although limited, did include some trading systems; over on this side of the pond I have not seen any applications that wrote to more than one place. I wish I got to work on a project that required this kind of technical scaling, but alas I am a bottom feeder :) Data is stored in an RDBMS or a mainframe, so I cannot comment on such fancy stuff.

Cheers,
Lateef
[TurboGears] Re: Open Question: Turbogears and scaling...
>> my time has a cost and optimisation often buys less performance than, say, a Dell SC1425
>
> Unfortunately my time is not worth an IBM 64-way mainframe (or I would be one happy hacker). Bigger machines help, but as my comment said before, this will give you only linear optimization; at some point you will need _exponential_ optimizations. This also depends on the complexity of the data relationships that your application needs. You need a machine that is 64 times faster buy

Nah mate, you miss my point! Not bigger machines, *more* machines. A Dell SC1425 is a pretty low-end piece of kit; the idea is you use multiple machines.

Let's say you have an application that is currently running at 100% above acceptable capacity. You can solve this problem in basically four ways:

1. Buy hardware that is twice as powerful
2. Perform optimisation, caching, etc.
3. A combination of 1) and 2)
4. Buy another similar server and run them both

In my experience, 4) is always the cheapest option, and requires less hassle than 2) and 3) (and less hassle is the TG way!). The trick is to make option 4 possible by asking questions like "What will happen if I use two app or database servers - or both?" early on in the build process. I do this for everything and it's served me well so far :-)

Part of my personal PHP standard library is some wrappers around session management and database handling, which means:

1) All my session data is stored in the database, which means from then on I can implement *all* my persistent storage in an RDBMS.
2) My database reads and my database writes are separated and controllable, so if we need to add replication it's possible to direct all writes to the master server and balance reads between the slaves. (Yes, I said there are alternatives to the master/slave setup, but in web apps, which are mostly read-heavy, it's a pretty good solution anyway.)

-Rob

PS. If you're interested in the writing-to-one-place problem, you should look at m/cluster (http://www.continuent.com/index.php?option=com_contenttask=viewid=211Itemid=168). We have our own solution, but in general it's a pretty awesome setup for database scaling through multiple servers.
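The read/write separation described in point 2) could be sketched like this. The message describes PHP wrappers that are not shown in the thread, so this Python version is only an assumption about the general shape: the class name, the verb list, and the use of plain strings in place of real connection objects are all hypothetical.

```python
import itertools


class ReplicatedConnections:
    """Send writes to the master and spread reads across the replicas."""

    WRITE_VERBS = ("INSERT", "UPDATE", "DELETE", "REPLACE", "CREATE", "ALTER", "DROP")

    def __init__(self, master, replicas):
        self.master = master
        # With no replicas configured, reads simply fall back to the master.
        self._read_cycle = itertools.cycle(list(replicas) or [master])

    def for_statement(self, sql):
        """Pick the connection that should run this SQL statement."""
        parts = sql.split(None, 1)
        verb = parts[0].upper() if parts else ""
        if verb in self.WRITE_VERBS:
            return self.master
        return next(self._read_cycle)


# Strings stand in for real database connections in this sketch.
pool = ReplicatedConnections("master", ["replica1", "replica2"])
write_target = pool.for_statement("UPDATE users SET name = 'x' WHERE id = 1")
read_targets = [pool.for_statement("SELECT * FROM users") for _ in range(3)]
```

Because the application only ever asks the wrapper for a connection, replication can be added later by changing the configuration rather than the calling code. A real wrapper would also need to keep reads that follow a write inside the same transaction on the master, since replicas can lag behind.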
[TurboGears] Re: Open Question: Turbogears and scaling...
On Mar 17, 2006, at 3:17 PM, Robin Haswell wrote:
>>> my time has a cost and optimisation often buys less performance than, say, a Dell SC1425
>>
>> Unfortunately my time is not worth an IBM 64-way mainframe (or I would be one happy hacker). Bigger machines help, but as my comment said before, this will give you only linear optimization; at some point you will need _exponential_ optimizations. This also depends on the complexity of the data relationships that your application needs. You need a machine that is 64 times faster buy
>
> Nah mate, you miss my point! Not bigger machines, *more* machines. A Dell SC1425 is a pretty low-end piece of kit; the idea is you use multiple machines.
>
> Let's say you have an application that is currently running at 100% above acceptable capacity. You can solve this problem in basically four ways:
>
> 1. Buy hardware that is twice as powerful
> 2. Perform optimisation, caching, etc.
> 3. A combination of 1) and 2)
> 4. Buy another similar server and run them both
>
> In my experience, 4) is always the cheapest option, and requires less hassle than 2) and 3) (and less hassle is the TG way!). The trick is to make option 4 possible by asking questions like "What will happen if I use two app or database servers - or both?" early on in the build process. I do this for everything and it's served me well so far :-)
>
> Part of my personal PHP standard library is some wrappers around session management and database handling, which means:

Scaling horizontally, what you list as 4, is the only real option. There's plenty of public record that shows that all the successful guys (Google and LiveJournal come to mind) are using lots of relatively cheap servers, rather than small numbers of giant servers. If you design for that, you'll never have a problem so long as you can afford to operate, and that's not so tough of a problem because the costs are at worst linear. With any other option, the price to upgrade grows exponentially and there's a ceiling on what kind of power you can even buy to run an app that is mostly serial.

Good optimizations can do wonders in the short term, e.g. cut immediate hardware costs in half... but you get that anyway if you wait about a year. It's typically better to expand your service such that it maximizes profits, rather than optimize your service to minimize your overhead. There's only so low you can go with cutting your overhead.. but there's no well defined ceiling for maximum profits (look at Google!).

-bob
[TurboGears] Re: Open Question: Turbogears and scaling...
On 3/16/06, ajones [EMAIL PROTECTED] wrote:
> I was wondering how well turbogears scales? Obviously interesting caching tricks can be done on the web face, and other strange voodoo can be done in the database, but is turbogears itself designed to scale?

Performance is not a core priority for TG. Almost any technology can scale with the correct server architecture and proper caching tricks. TG is no exception here.

> Can I host the controller, for instance, on multiple servers for better response time?

Nothing in TG prevents you from doing this. The biggest restriction would probably be using the Identity framework with the default SQLObject provider, as that hits the database on every pageview. I'm pretty sure you can write your own provider that doesn't do this, but I don't know how you would go about it.

> If you had to design a truly big system, or a lot of little interconnected remote sites, and wanted TG to do it what are the options?

I've never done this, only read about it, so I'll leave that to someone else. Word on the street is that the magic Google phrase is 'shared nothing'.

> Can I do this entire post entirely with questions? I think not.

So close!
[TurboGears] Re: Open Question: Turbogears and scaling...
> I was wondering how well turbogears scales? Obviously interesting caching tricks can be done on the web face, and other strange voodoo can be done in the database, but is turbogears itself designed to scale?

I don't see any reason why it shouldn't scale. A lot of it boils down to the nature of your app. Python as a language is easily good enough to power large systems. Ruby, Perl and PHP have all been used to implement large systems and they're arguably slower than Python. So a lot depends on your webserver configuration and your system architecture. Cache what you can, cut down database hits, etc.

> Can I host the controller, for instance, on multiple servers for better response time? If not, would it even be a good idea to implement that kind of functionality?

Yes - here's how I'm approaching this:

I'm planning to run Apache with either FastCGI or mod_python. CherryPy will sit behind Apache and handle my controllers. I'm not going to use sessions, although I'll use minimal cookie data to store state. I'm going to follow the mantra of sharing nothing. The controllers will access the database server using SQLObject. The database server is running Postgres. Another server will be used for storing user images and files using the native flat file system.

Scaling then: when I need to scale I'll add another web server running Apache and TurboGears/CherryPy. I'll load balance between these web servers, probably in a round-robin way to begin with. There are hardware and software solutions to load balancing. So, different web servers will be able to handle different requests from the same user. This is why sharing nothing is a good idea. Try not to store state on your web servers.

As the number of web servers increases they'll begin to stress the database. The database server will become the bottleneck. Cache if you can. Next on the cards then is some form of database replication, although that's far enough into the future to not worry about just now. :-)

Bottom line: I think the framework is good enough to trigger your controllers and deliver a templated response in a timely fashion. The rest is down to external factors that apply to any other web system out there.

I've still got some decisions to make such as FastCGI vs mod_python - anybody have the pros and cons between these two?

> I was wondering how well turbogears scales? Obviously interesting caching tricks can be done on the web face, and other strange voodoo can be done in the database, but is turbogears itself designed to scale?
>
> Can I host the controller, for instance, on multiple servers for better response time? If not, would it even be a good idea to implement that kind of functionality?
>
> If you had to design a truly big system, or a lot of little interconnected remote sites, and wanted TG to do it what are the options?
>
> Can I do this entire post entirely with questions? I think not.
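The round-robin, share-nothing setup described above can be illustrated with a small sketch. It is a toy model rather than a real load balancer: the server names, the class, and the `visits` cookie are hypothetical, and the handler is a plain function rather than a CherryPy controller.

```python
import itertools


class RoundRobinBalancer:
    """Hand out servers in rotation, one per incoming request."""

    def __init__(self, servers):
        servers = list(servers)
        if not servers:
            raise ValueError("at least one server is required")
        self._cycle = itertools.cycle(servers)

    def next_server(self):
        return next(self._cycle)


def handle_request(server_name, cookies):
    """Share-nothing handler: the result depends only on the request.

    server_name is deliberately unused, because no state lives on the server.
    """
    visits = int(cookies.get("visits", "0")) + 1
    return {"visits": str(visits)}


balancer = RoundRobinBalancer(["web1", "web2"])
cookies = {}
served_by = []
for _ in range(3):
    server = balancer.next_server()
    served_by.append(server)
    cookies = handle_request(server, cookies)  # state travels with the client
```

Since the visit counter lives in the cookie, it does not matter that consecutive requests from the same user hit different servers, which is the point of not storing state on the web servers.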
[TurboGears] Re: Open Question: Turbogears and scaling...
I wrote previously:
> I've still got some decisions to make such as FastCGI vs mod_python - anybody have the pros and cons between these two?

Sorry, after doing some Googling and looking around it appears I should be considering SCGI rather than FastCGI.

>> I was wondering how well turbogears scales? Obviously interesting caching tricks can be done on the web face, and other strange voodoo can be done in the database, but is turbogears itself designed to scale?
>
> I don't see any reason why it shouldn't scale. A lot of it boils down to the nature of your app. Python as a language is easily good enough to power large systems. Ruby, Perl and PHP have all been used to implement large systems and they're arguably slower than Python. So a lot depends on your webserver configuration and your system architecture. Cache what you can, cut down database hits, etc.
>
>> Can I host the controller, for instance, on multiple servers for better response time? If not, would it even be a good idea to implement that kind of functionality?
>
> Yes - here's how I'm approaching this:
>
> I'm planning to run Apache with either FastCGI or mod_python. CherryPy will sit behind Apache and handle my controllers. I'm not going to use sessions, although I'll use minimal cookie data to store state. I'm going to follow the mantra of sharing nothing. The controllers will access the database server using SQLObject. The database server is running Postgres. Another server will be used for storing user images and files using the native flat file system.
>
> Scaling then: when I need to scale I'll add another web server running Apache and TurboGears/CherryPy. I'll load balance between these web servers, probably in a round-robin way to begin with. There are hardware and software solutions to load balancing. So, different web servers will be able to handle different requests from the same user. This is why sharing nothing is a good idea. Try not to store state on your web servers.
As the number of web servers increases, they'll begin to stress the database, and the database server will become the bottleneck. Cache if you can. Next on the cards then is some form of database replication, although that's far enough into the future not to worry about just now. :-)

Bottom line: I think the framework is good enough to trigger your controllers and deliver templated responses in a timely fashion. The rest is down to external factors that apply to any other web system out there.

> I've still got some decisions to make, such as FastCGI vs mod_python. Does anybody have the pros and cons of these two?

> I was wondering how well turbogears scales? Obviously interesting caching tricks can be done on the web face, and other strange voodoo can be done in the database, but is turbogears itself designed to scale?
>
> Can I host the controller, for instance, on multiple servers for better response time? If not, would it even be a good idea to implement that kind of functionality?
>
> If you had to design a truly big system, or a lot of little interconnected remote sites, and wanted TG to do it, what are the options?
>
> Can I do this entire post entirely with questions? I think not.
[TurboGears] Re: Open Question: Turbogears and scaling...
Justin Johnson [EMAIL PROTECTED] writes:

> I wrote previously:
>
> > I've still got some decisions to make, such as FastCGI vs mod_python. Does anybody have the pros and cons of these two?
>
> Sorry, after doing some Googling and looking around, it appears I should be considering SCGI rather than FastCGI.

I dunno how all of that will play with FirstClass (TG + WSGI), since that looks like it will be the future...

Anyway, the best thing you can do is benchmark *your* application. You should also think about using mod_proxy: CherryPy is pretty good and fast, but combined with Apache's cache it goes even further than one would expect.

One advantage of such a configuration is not having to take Apache down to upgrade your TG website. (I usually run more than one application here, so instead of taking all of them down I just restart the TG app and it is updated ;-))

-- Jorge Godoy [EMAIL PROTECTED]
[TurboGears] Re: Open Question: Turbogears and scaling...
> I was wondering how well turbogears scales? Obviously interesting caching tricks can be done on the web face, and other strange voodoo can be done in the database, but is turbogears itself designed to scale?

TurboGears itself is definitely designed to scale, in exactly the same way you scale most LAMP-type systems. Incidentally, it is also the same way you would scale Rails applications. There is certainly some additional work that could be done with caching, etc., but overall it is definitely designed to scale if you do it right. The real question is: is your TurboGears application designed to scale? I will get to this below.

> Can I host the controller, for instance, on multiple servers for better response time? If not, would it even be a good idea to implement that kind of functionality?

If you properly design your application to be as stateless as possible, load balancing TurboGears is actually quite easy. Here is the short list of how to make this happen:

1. Don't use sessions, use cookies. If you absolutely *have* to use sessions, don't use in-memory sessions. Use database sessions if you have to.
2. Don't store any state inside your controllers, or in memory.

If you do those things, then you can fire up as many processes as you like, on as many servers as you like, and use lighttpd's built-in load balancing to point at your application processes. I have done this myself with SCGI and lighttpd, and it's crazy fast.

A lot of people tend to focus on caching and minimizing trips to the database when it comes to scaling, but I think the above is much more important at an architectural level. This is how you should structure your application; you can get to optimizing with caching and minimizing database hits later. Both caching and database optimization take considerably more time and effort than throwing a few extra processes at the problem.
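The "cookies, not sessions" rule can be sketched with nothing but the standard library. This is an illustrative sketch under assumptions, not TurboGears or CherryPy API: SECRET_KEY, sign and verify are hypothetical names, and it assumes every application process is configured with the same secret. Because the value and its signature travel with the request, any process on any server can check it without a shared session store.

```python
import hashlib
import hmac

# Assumption: every application process is deployed with the same secret.
SECRET_KEY = b"replace-with-a-real-secret"


def _signature(value):
    """Hex HMAC-SHA256 of the value under the shared secret."""
    return hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()


def sign(value):
    """Return 'value|signature', suitable for storing in a cookie."""
    return value + "|" + _signature(value)


def verify(cookie):
    """Return the original value if the signature checks out, else None."""
    value, sep, digest = cookie.rpartition("|")
    if not sep:
        return None  # no signature present at all
    expected = _signature(value)
    # Constant-time comparison; bytes are used so odd input cannot raise.
    if hmac.compare_digest(digest.encode("utf-8"), expected.encode("utf-8")):
        return value
    return None


cookie = sign("user_id=42")
```

Note that the value is signed, not encrypted: the client can still read it, so this only suits small, non-secret state such as a user id.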
> If you had to design a truly big system, or a lot of little interconnected remote sites, and wanted TG to do it, what are the options?

Too many options to speak of, but the WSGI future of TurboGears will make this much easier than it is today (not that the situation is all that bad today!).

Best of luck scaling! I don't suspect you will have too many problems if you follow some of the simple rules already mentioned in this thread.

-- Jonathan LaCour http://cleverdevil.org
[TurboGears] Re: Open Question: Turbogears and scaling...
On 3/16/06, Jorge Godoy [EMAIL PROTECTED] wrote:

> Justin Johnson [EMAIL PROTECTED] writes:
>
> > Sorry, after doing some Googling and looking around, it appears I should be considering SCGI rather than FastCGI.
>
> I dunno how all of that will play with FirstClass (TG + WSGI), since that looks like it will be the future...

CherryPy already uses WSGI on the front end, so First Class will not actually change web server deployment. (What First Class changes is how you can compose your applications and reuse bits from elsewhere.)

Kevin
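For anyone unfamiliar with what "WSGI on the front end" means in practice, here is a minimal sketch of the WSGI callable interface. It is plain Python and deliberately independent of CherryPy and TurboGears; the names application and call_app are hypothetical and exist only for this example. The point is that a server only needs a callable of this shape, which is why the deployment side stays the same whatever sits behind it.

```python
def application(environ, start_response):
    """A minimal WSGI application: the callable a WSGI server invokes."""
    body = ("Hello from " + environ.get("PATH_INFO", "/")).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def call_app(app, path):
    """Invoke a WSGI app directly, roughly as a server would, and collect the result."""
    captured = {}

    def start_response(status, headers, exc_info=None):
        # A real server would send these to the client; here we just record them.
        captured["status"] = status
        captured["headers"] = headers

    chunks = app({"REQUEST_METHOD": "GET", "PATH_INFO": path}, start_response)
    return captured["status"], b"".join(chunks)


status, body = call_app(application, "/scaling")
```

The environ passed here is far smaller than what a real server supplies; it is only enough to exercise the callable.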