What's interesting is that if you read up on how many of the very large sites have scaled (eBay, Digg, Facebook), they have all gone through a similar pattern over the course of their organic growth. You can find some of the write-ups at http://highscalability.com. The general pattern seems to be: normalized database -> clustered databases -> denormalization and 'sharding'.
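To make that last step concrete, here's a minimal sketch of id-based shard routing (the shard count and naming are invented here, and a real system would also need a plan for resharding):

    # Hypothetical id-based routing: a stable key always maps to the
    # same shard, so reads and writes know which database to hit.
    SHARD_COUNT = 4

    def shard_for(user_id)
      user_id % SHARD_COUNT
    end

    shard_for(42)  # => 2, i.e. this user's rows live on "shard_2"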
Of course, it would be somewhat silly to plan for this type of architecture before you see where the load is actually hitting. It's a never-ending battle where you clear one bottleneck only to discover another. (A couple of rough sketches of the caching ideas discussed below are at the end of this mail.)

On 9/4/07, John Bresnik <[EMAIL PROTECTED]> wrote:
>
> Nick,
>
> Denormalization is traditionally found in data marts [data warehouses, OLAP] and is the moral opposite of everything you learn in DBA school. That said, it makes a lot of sense for performance reasons and is really nothing more than a lot of redundancy for the purpose of speeding up queries [normalization itself has much more to do with the disk space limitations of the 1980s than anything else and, like all dogma, isn't questioned anymore even though it may not be as important/relevant as it once was].
>
> The idea of caching relationships is tricky but interesting, and sounds a lot like an inference engine concept called 'backward chaining' - the idea that all associations are worked out already [as opposed to forward chaining, where they are determined on the fly]. I say it's tricky because things can get complicated *real fast* [especially when it comes to refreshing this cache with new relationships]. But that's a start..
>
> Another viable idea would be, as you hinted, to keep a data structure in memory that represents the associations - this is of course a 'graph' in classical terms, and not an easily optimized data structure. There are some interesting implementations using hash maps / trees, i.e. you could probably develop a sufficient data structure by digging deep into Ruby and taking the time to understand how it implements each data structure [or revive/connect an old C lib that provides a graph] - then just use the database as metadata for an existing relationship [the graph would only contain IDs to keep it lightweight [2-4 bytes a node on most hardware], expanded once the determination is made]. But yeah, this is some heavy s*** man..
>
> brez
>
> On 9/4/07, Nick Zadrozny <[EMAIL PROTECTED]> wrote:
> > Hey all,
> >
> > I've been thinking about an optimization problem that I'm vaguely familiar with and not quite sure how to get started on.
> >
> > I've got an application in which each record might be associated with thousands of others through a join model - a has_many :through situation. The join models are important in and of themselves, but often I just want to grab all the associated objects, and this is starting to get a bit burdensome on the database.
> >
> > I'm tentatively thinking that denormalization would help me out here, but that sort of thing is approaching the limits of my database knowledge. The question comes down to this: say you want to cache the primary keys of thousands of associated objects. What would that look like in your schema? In your queries?
> >
> > Like I said, database noob here, so let's have the noob explanation. Also, pointers to books or tutorials are welcome. I'd welcome some looks at alternate caching strategies — this information doesn't necessarily have to persist — but denormalization is something I would like to know more about in general.
> >
> > --
> > Nick Zadrozny • beyondthepath.com
>
> --
> John Bresnik
> (619) 228-6254
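To give Nick's schema question a concrete shape, here is a rough sketch of one common denormalization: cache the associated primary keys in a text column on the parent row, so the everyday "grab everything" read skips the join entirely. All the names here (Person, Friendship, friend_ids_cache) are invented for illustration, and the refresh method is exactly the tricky cache-invalidation part brez mentions:

    # Assumed schema: people(id, ..., friend_ids_cache TEXT) and a
    # friendships(person_id, friend_id) join table.
    class Friendship < ActiveRecord::Base
      belongs_to :person
      belongs_to :friend, :class_name => 'Person'
    end

    class Person < ActiveRecord::Base
      has_many :friendships
      has_many :friends, :through => :friendships

      # Write side: rebuild the cache whenever join rows change.
      # Keeping this current is the hard part of the whole scheme.
      def refresh_friend_ids_cache!
        update_attribute(:friend_ids_cache, friends.map { |f| f.id }.join(','))
      end

      # Read side: one lookup against the cached ids, no join.
      def cached_friends
        ids = (friend_ids_cache || '').split(',')
        ids.empty? ? [] : Person.find(ids)
      end
    end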
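And a sketch of the in-memory graph idea, assuming plain Ruby Hash + Set holding ids only (class and method names are made up); the database stays the source of truth and is only hit to expand ids into full rows:

    require 'set'

    # Adjacency sets keyed by id: each node costs only its integer ids,
    # which keeps the whole structure small enough to hold in memory.
    class AssociationGraph
      def initialize
        @edges = Hash.new { |h, k| h[k] = Set.new }
      end

      def link(a_id, b_id)
        @edges[a_id] << b_id
        @edges[b_id] << a_id  # undirected: record both directions
      end

      def neighbor_ids(id)
        @edges.fetch(id) { Set.new }.to_a
      end
    end

    # graph.neighbor_ids(42) answers "who is 42 associated with?" from RAM;
    # expand to full rows only when needed, e.g. Person.find(graph.neighbor_ids(42))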
