Nick,

Denormalization is traditionally found in data marts [data warehouses, OLAP] and is the moral opposite of everything you learn in DBA school. That said, it makes a lot of sense for performance reasons and is really nothing more than deliberate redundancy for the purpose of speeding up queries [normalization itself has much more to do with the disk space limitations of the 1980s than anything else and, like all dogma, isn't questioned much anymore even though it may not be as important/relevant as it once was].
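Concretely, the denormalized version could look something like the sketch below - just a rough cut, and the model names [Post, Tag, Tagging] and the cached_tag_ids column are invented for illustration, not taken from your app. The parent row carries a serialized copy of its associated IDs, and the join model refreshes that copy whenever it changes.

# Sketch only - Post / Tag / Tagging and cached_tag_ids are made-up names.
# cached_tag_ids is a TEXT column on posts holding a YAML-serialized array
# of tag IDs, i.e. a denormalized copy of the join table, so the common
# read path never touches taggings at all.

class Post < ActiveRecord::Base
  has_many :taggings
  has_many :tags, :through => :taggings

  serialize :cached_tag_ids, Array

  # Cheap read path: IDs come straight off the posts row, no join.
  def cached_tags
    Tag.find(cached_tag_ids || [])
  end

  # The price of denormalization: the copy has to be refreshed whenever
  # the real associations change.
  def refresh_tag_cache!
    update_attribute(:cached_tag_ids, taggings.map { |t| t.tag_id })
  end
end

class Tagging < ActiveRecord::Base
  belongs_to :post
  belongs_to :tag

  # Keep the denormalized copy in sync with the join table.
  after_save    { |t| t.post.refresh_tag_cache! }
  after_destroy { |t| t.post.refresh_tag_cache! }
end

The win is that the hot path becomes one single-row read plus one IN query; the cost is exactly the refresh bookkeeping you'd expect from denormalization [which is where things go wrong first].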
The idea of caching relationships is tricky but interesting, and it sounds a lot like the inference-engine concept of 'forward chaining' - the idea that all the associations are worked out ahead of time [as opposed to backward chaining, where they are determined on the fly, starting from the query]. I say it's tricky because things can get complicated real fast [especially when it comes to refreshing that cache with new relationships]. But that's a start.

Another viable idea would be, as you hinted, to keep a data structure in memory that represents the associations - this is of course a 'graph' in classical terms, and not an easily optimized data structure. There are some interesting implementations using hash maps / trees, i.e. you could probably build a good-enough structure by digging into Ruby and taking the time to understand how it implements each data structure [or revive/connect an old C lib that provides a graph] - then really just use the database as metadata for an existing relationship [the graph would only contain IDs to keep it lightweight - 2-4 bytes a node on most hardware - with the IDs expanded into full records only once the determination is made]. I've pasted a rough sketch of that ID-graph idea at the bottom of this mail. But yeah, this is some heavy s***, man..

brez

On 9/4/07, Nick Zadrozny <[EMAIL PROTECTED]> wrote:
> Hey all,
>
> I've been thinking about an optimization problem that I'm vaguely familiar
> with and not quite sure how to get started on.
>
> I've got an application in which each record might be associated with
> thousands of others through a join model. A has_many :through situation. The
> join models are important in and of themselves, but often I want to just
> grab all the associated objects, and this is starting to get a bit
> burdensome on the database.
>
> I'm tentatively thinking that denormalization would help me out here. But
> that sort of thing is approaching the limits of my database knowledge. The
> question comes down to this: say you want to cache the primary keys of
> thousands of associated objects. What would that look like in your schema?
> In your queries?
>
> Like I said, database noob here, so let's have the noob explanation. Also,
> pointers to books or tutorials are welcome. I'd welcome some looks at
> alternate caching strategies — this information doesn't necessarily have to
> persist — but denormalization is something I would like to know more about
> in general.
>
> --
> Nick Zadrozny • beyondthepath.com
> _______________________________________________
> Sdruby mailing list
> [EMAIL PROTECTED]
> http://lists.sdruby.com/mailman/listinfo/sdruby
>
>

--
John Bresnik
(619) 228-6254
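P.S. Here's the rough shape of that in-memory ID graph - again just a sketch, not an existing library, and the Tagging / post_id / tag_id names are made up. The graph is an adjacency list of bare IDs kept in a Hash of Sets, and the database only gets involved at the very end, to expand a handful of IDs into real records.

require 'set'

# Adjacency list of bare node IDs: node => Set of neighbor nodes.
# Nothing but small keys live in here, so it stays light in memory.
class IdGraph
  def initialize
    @edges = Hash.new { |h, k| h[k] = Set.new }
  end

  # Undirected edge between two nodes.
  def add_edge(a, b)
    @edges[a] << b
    @edges[b] << a
  end

  def neighbors(node)
    @edges[node].to_a
  end

  # Everything reachable within `depth` hops - the "work out the
  # associations ahead of time" part, done entirely in RAM.
  def reachable(node, depth = 1)
    seen     = Set.new([node])
    frontier = [node]
    depth.times do
      frontier = frontier.map { |n| neighbors(n) }.flatten.reject { |n| seen.include?(n) }
      frontier.each { |n| seen << n }
    end
    seen.delete(node)
    seen.to_a
  end
end

# Build the graph once from the join table, then answer lookups from memory;
# only the final expansion of IDs into rows touches the database.
#
#   graph = IdGraph.new
#   Tagging.find(:all).each { |t| graph.add_edge("post_#{t.post_id}", "tag_#{t.tag_id}") }
#
#   tag_ids = graph.neighbors("post_42").grep(/^tag_/).map { |n| n.sub("tag_", "").to_i }
#   Tag.find(tag_ids)

Prefixing the nodes with "post_" / "tag_" just keeps the two tables' keyspaces from colliding in one hash; the hairy part, same as with the serialized column, is refreshing the graph when join rows change.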
