Thanks Charlie - It's really interesting to see all the different perspectives on this!
Clarke

From: [email protected] [mailto:[email protected]] On Behalf Of Charlie Hubbard
Sent: Friday, May 14, 2010 11:26 PM
To: [email protected]
Subject: Re: [AFFUG Discuss] Ruby on Rails vs MVC Spring / Hibernate

Last time I cared to check, Twitter couldn't split their database, not so much because of Rails but mostly because the nature of social networks makes them hard to split. Alex Payne, one of Twitter's engineers, lamented on his blog (can't find the link), back when they were having so many problems, that this would be the last time he built an application that couldn't split its data. Here's an interview where he talks about how the Rails architecture made it hard to scale after a certain point.

http://www.radicalbehavior.com/5-question-interview-with-twitter-developer-alex-payne/

Even 37 Signals, the Rails masters, had a blog post about scaling Basecamp on Rails, and they punted. 37 Signals said they scaled Basecamp vertically by using static RAM drives, which had dropped in price by quite a bit at the time. They claimed that scaling vertically was fine and that it was much simpler to keep it all on one database (true to a degree). This seemed like a flip-flop from the "just throw CPUs at it" horizontal scaling marketing Rails purported when it first came out. In some ways, when they really had to scale, they realized how hard it was.

Alex discusses using multiple read-only databases for caching, but that's a completely different thing than what I was talking about: multiple read/write databases (no single master). They might have changed it by now with Gizzard, which is NOT Rails based anymore (Scala). But once you realize the single database approach is not working, it's too late. Twitter has one of the hardest schemas to split. Friend graphs are a very complex persistence problem because it's really hard to break the graph so portions of it can be stored outside a single database.
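
A minimal sketch of why a friend graph resists splitting, for illustration only. It assumes a hypothetical modulo sharding scheme and made-up user ids; none of this is Twitter's actual design. It shows which shards a "list my friends" query has to touch before and after a single cross-boundary friendship.

```python
NUM_SHARDS = 2  # hypothetical shard count, chosen only for the example


def shard_for(user_id: int) -> int:
    """Assign a user to a shard by id (simple modulo scheme, an assumption)."""
    return user_id % NUM_SHARDS


def shards_needed(user_id: int, friendships: list) -> set:
    """Return the set of shards a 'list my friends' query must touch."""
    shards = {shard_for(user_id)}
    for a, b in friendships:
        if a == user_id:
            shards.add(shard_for(b))
        elif b == user_id:
            shards.add(shard_for(a))
    return shards


# Two clusters of friends that split cleanly: even ids on shard 0, odd ids on shard 1.
friendships = [(2, 4), (4, 6), (1, 3), (3, 5)]
clean = shards_needed(4, friendships)

# One friendship across the boundary and user 4's query now spans both shards.
friendships.append((4, 5))
crossed = shards_needed(4, friendships)
```

Here `clean` contains only user 4's home shard, while `crossed` contains both shards, so a query that used to be local becomes a cross-database query.
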
Even if you can split a series of clusters of friends, it only takes one person friending someone across the database boundary to ruin your split. Once you have splits across databases, querying gets much, much harder. So while they might have figured something out by now, it's not easy using a relational DB. Something like Neo4j would be better (although Neo4j wasn't available when Twitter was first written). Eventually it becomes an issue of "I can't add more servers because my DB becomes a bottleneck", as noted in the post.

The other thing that makes the Rails architecture harder to change is the singleton database connection in it (noted in the article). Unfortunately, changing that architectural decision is hard (a great example of why singletons are very damaging: they're hard to undo without major changes), but maybe Rails 3 will change this. They'll have to make a serious change to it for it to work, most likely an incompatible change API-wise.

As for the global lock, this problem is well known in Rails. The workaround is to handle multiple requests using OS processes running multiple copies of Rails. Processes are heavyweight and memory intensive, so you'll reach a limit to those as well. Typically you run somewhere between 10 and 20 Mongrel processes, so that's how many clients you can handle at once. 10-20 threads is nothing for a JVM. You can run 200 or more without much trouble, which means you need fewer servers. Since CF is written on top of Java, it can take advantage of the very solid threading model Java designed for concurrency.

If you use JRuby, most of these issues aren't the same, but I still think there's a global lock in ActiveRecord for the singleton database connection (you can't share the DB connection). Maybe that's been fixed, but I don't think JRuby has forked Rails for this yet. I do find it funny that, since the Ruby VM hasn't been that strong, most people are now retreating back to the JVM and using Scala, JRuby, or CF on top of it.
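
To illustrate the global lock point above, here is a rough sketch that compares request handling with and without a process-wide lock. The lock, the `handle_request` function, and the 50 ms simulated wait are all stand-ins invented for the example; this is not how Rails or Mongrel is implemented.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def peak_concurrency(num_requests: int, global_lock: bool) -> int:
    """Run simulated requests on a thread pool; return the peak number in flight."""
    lock = threading.Lock()           # stand-in for a framework-wide lock
    counter_guard = threading.Lock()  # protects the bookkeeping counters only
    in_flight = 0
    peak = 0

    def handle_request(_):
        nonlocal in_flight, peak
        if global_lock:
            lock.acquire()            # only one request may proceed at a time
        try:
            with counter_guard:
                in_flight += 1
                peak = max(peak, in_flight)
            time.sleep(0.05)          # simulated database or I/O wait
            with counter_guard:
                in_flight -= 1
        finally:
            if global_lock:
                lock.release()

    with ThreadPoolExecutor(max_workers=num_requests) as pool:
        list(pool.map(handle_request, range(num_requests)))
    return peak


locked_peak = peak_concurrency(8, global_lock=True)     # requests are serialized
unlocked_peak = peak_concurrency(8, global_lock=False)  # requests overlap
```

With the lock held for the whole request, adding threads buys nothing, which is why the workaround described above is to run more OS processes instead.
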
I'm all in favor of using other languages on the JVM. It's just another language that runs on the JVM, and the JVM has turned into the universal VM for other languages.

By the way, Starling is one of those background job frameworks with poor performance I mentioned earlier. They rewrote it in Scala. Figures.

Charlie

On Fri, May 14, 2010 at 4:17 PM, Clarke Bishop <[email protected]> wrote:

Charlie's comment got me curious about how Rails and CFWheels compare on some of these issues, so I asked the question over on the CFWheels email list. That provoked a really interesting discussion that I learned from. Here is one of the best comments:

I hope this helps forward the conversation over here!

Clarke

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of joshua clingenpeel
Sent: Friday, May 14, 2010 11:47 AM
To: [email protected]
Subject: Re: [cfwheels] Re: ActiveRecord Limitations

Twitter really only uses Rails for their frontend anymore. They rewrote their Starling gem in Scala to improve the resource management and crash recovery. I haven't heard that Twitter is only using one 'massive database' - that sounds a little absurd, and the Rails community is still recovering from a ton of bad press from the 'it can't scale' stories that circulated with Twitter's issues. I think you might want to look into Gizzard (http://engineering.twitter.com/2010/04/introducing-gizzard-framework-for.html).

From what I understand, Amazon's Rails projects have a similar separation of web and data tiers, where Rails handles the former and they've written custom JRuby modules to interface with the data tier.

Anyway, more to the point - Wheels is a framework that sits on top of CF, so we're really only limited to what CF gives us. Rails is a framework on top of Ruby and is again limited to what components of Ruby you tell it to use. ActiveRecord2 is little more than a SQL generator with a bunch of helper methods.
A lot of people have been stripping it out in favor of some other, more robust ORMs in their Rails projects. ActiveRecord3 is supposed to be much better, but we'll see - one of the nice bits of Rails3 is that it makes it much easier to NOT use ActiveRecord.

You can have more than one datasource in Rails, but Rails was built to cover 80% of all application needs - one of the big assumptions was that an application typically only needs one database, and changing that behavior is a pain (I have an app I took over that ended up talking to a MySQL database and two remote Oracle databases - not much fun). Wheels was built with this same assumption, but the nature of CF is to allow multiple datasources with little issue, so Wheels lets you, too.

Now the question about 'restrictions in design that only allow ONE client to be processed at a time' - that sounds like the issue that Mongrel by default is a single-threaded process. If ActiveRecord were bottlenecking behind the scenes, that would be bad, and I'm quite sure it isn't. Phusion's Passenger is emerging as the standard for Rails servers, and it's got a managed worker thread pool. The alternative is to create a Mongrel pack and then a proxy pass in Apache with a load balancer - yuck. But if you look at what CF is doing, you get multi-threading instantly, and CF9 has a ton of tools for distributing that load and handling performance issues.

To be honest, if you're experiencing race conditions in dealing with the database, you need to look at your code first.

On Fri, May 14, 2010 at 7:56 AM, raulriera <[email protected]> wrote:
> You can set up different datasources in Wheels; you do that in the
> model files.
>
> On May 14, 10:50 am, "Clarke Bishop" <[email protected]> wrote:
>> A question came up this morning on the Atlanta Flex Users Group (AFFUG.com)
>> list regarding Rails versus Hibernate.
>>
>> One of the responses said:
>>
>> I still like the ActiveRecord simple philosophy over Hibernate, but
>> ActiveRecord has one Achilles heel. It assumes a single database, and while
>> you can play tricks on it to make it move between them it's still kinda
>> tough to do that. Java, on the other hand, doesn't restrict you to just one
>> database out of the box and so it's more straightforward to do database
>> sharding which is very important if you plan to scale (and continue to use a
>> database for storage). Twitter still runs on a single database because of
>> the difficulty in changing after you've already designed your application
>> to be single DB focused. Furthermore, ActiveRecord has restrictions in its
>> design that only allow ONE client to be processed at a time. Rails still
>> has the global lock to prevent multiple clients through at once. So you
>> have to use OS processes to service multiple clients at once.
>>
>> So, I'm wondering how much of this applies to CFWheels? We probably are
>> limited to a single database. But, what about the locking to prevent
>> multiple clients? Is that a CFWheels limitation, too?
>>
>> Thanks,
>>
>> Clarke

---------- Forwarded message ----------
From: Charlie Hubbard <[email protected]>
Date: Fri, May 14, 2010 at 9:17 AM
Subject: Re: [AFFUG Discuss] Ruby on Rails vs MVC Spring / Hibernate
To: [email protected]

I've been a long time user of Rails, but I've drifted back towards Java recently because I think I'm more productive in Java. When Rails came out, Java frameworks were pretty bad, so using Rails had huge advantages. However, as time has gone on, most of the Java frameworks have been inspired by Rails' simplicity in one way or another and have mostly closed the gap, so Rails is not as much more productive than Java as it once was.

I still like the ActiveRecord simple philosophy over Hibernate, but ActiveRecord has one Achilles heel.
It assumes a single database, and while you can play tricks on it to make it move between them, it's still kinda tough to do that. Java, on the other hand, doesn't restrict you to just one database out of the box, so it's more straightforward to do database sharding, which is very important if you plan to scale (and continue to use a database for storage). Twitter still runs on a single database because of the difficulty in changing after you've already designed your application to be single DB focused.

Furthermore, ActiveRecord has restrictions in its design that only allow ONE client to be processed at a time. Rails still has the global lock to prevent multiple clients through at once. So you have to use OS processes to service multiple clients at once.

Then there's the background task item. Rails doesn't have any support for running background jobs. There are many solutions out there for doing this in Rails applications, but I found most of them to be pretty horrible performance-wise. And cumbersome to set up - cough DRb.

So after I weighed all of that and looked at Java again, I started to really like Java again. So the stack I like to use now is:

Jetty or Tomcat
Spring MVC
Spring JDBC Templates for DB
MongoDB if I can get away from Relational DB
JST4J to replace JSP
ActiveMQ or Quartz for BG Jobs

Charlie

On Fri, May 14, 2010 at 7:13 AM, Eric DeCoff <[email protected]> wrote:

Hey all,

What is everyone's opinion on Ruby on Rails vs MVC Spring / Hibernate?

--
Eric R. DeCoff
Changing the world, 1 line of code at a time

-------------------------------------------------------------
To unsubscribe from this list, simply email the list with unsubscribe in the subject line
For more info, see http://www.affug.com
Archive @ http://www.mail-archive.com/discussion%40affug.com/
List hosted by FusionLink <http://www.fusionlink.com>
-------------------------------------------------------------
