Nick,

Denormalization is traditionally found in data marts [data warehouses,
OLAP] and is the moral opposite of everything you learn in DBA school.
That said, it makes a lot of sense for performance reasons and is
really nothing more than deliberate redundancy for the purpose of
speeding up queries [normalization itself has much more to do with
the disk space limitations of the 1980s than anything else and, like
all dogma, isn't questioned much anymore even though it may not be as
important/relevant as it once was].
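
To make that concrete [the model and column names here are made up, so
treat this as a rough sketch rather than your actual schema]: if
listing posts always joins to users just to show the author's name,
you can copy that name onto posts and pay the cost at write time
instead of read time.

  # Hypothetical migration: copy users.name onto posts as author_name so
  # the common "list posts with author" query never has to join users.
  class AddAuthorNameToPosts < ActiveRecord::Migration
    def self.up
      add_column :posts, :author_name, :string
      execute "UPDATE posts SET author_name =
               (SELECT name FROM users WHERE users.id = posts.user_id)"
    end

    def self.down
      remove_column :posts, :author_name
    end
  end

  # The cost of the redundancy: you keep the copy in sync yourself.
  class User < ActiveRecord::Base
    has_many :posts
    after_save do |user|
      Post.update_all(["author_name = ?", user.name],
                      ["user_id = ?", user.id])
    end
  end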

The idea of caching relationships is tricky but interesting, and
sounds a lot like the inference-engine concept of 'forward chaining' -
the idea that all associations are worked out ahead of time [as
opposed to backward chaining, where they are chased down on demand]. I
say it's tricky because things can get complicated *real fast* [esp
when it comes to refreshing that cache with new relationships].  But
that's a start..
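
In Rails terms the 'worked out ahead of time' version could be as dumb
as a text column holding the associated IDs, rebuilt whenever a join
row changes [the Article/Tagging/Topic names and the cached_topic_ids
column are invented here, purely to show the shape of it]:

  # Hypothetical models: Article has_many :topics, :through => :taggings,
  # plus a cached_topic_ids text column holding a comma-separated ID list.
  class Article < ActiveRecord::Base
    has_many :taggings
    has_many :topics, :through => :taggings

    # Cheap read path: no join, just split the cached column.
    def cached_topic_id_list
      (cached_topic_ids || "").split(",").map { |id| id.to_i }
    end

    # The tricky part: refreshing the cache when relationships change.
    def refresh_topic_cache!
      update_attribute(:cached_topic_ids,
                       taggings.map { |t| t.topic_id }.join(","))
    end
  end

  class Tagging < ActiveRecord::Base
    belongs_to :article
    belongs_to :topic
    after_save    { |t| t.article.refresh_topic_cache! }
    after_destroy { |t| t.article.refresh_topic_cache! }
  end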

Another viable idea would be, as you hinted, to keep a data structure
in memory that represents the associations - this is of course a
'graph' in classical terms and not an easily optimized data structure.
There are some interesting implementations using hash maps / trees,
i.e. you could probably develop a sufficient data structure by digging
into Ruby and taking the time to understand how it implements each
data structure [or revive/wrap an old C lib that provides a graph] -
then really just use the database as metadata for an existing
relationship [the graph would only contain IDs to keep it lightweight
[2-4 bytes a node on most hardware], then get expanded into real rows
once the determination is made - see the sketch below]. But yeah, this
is some heavy s*** man..
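
Something in this direction - pure Ruby, hash-of-sets adjacency list
[the class and the Tagging/Topic names below are invented; it also
assumes everything lives in one ID space, so if you're relating
different models you'd want to key on [class, id] pairs instead]:

  require 'set'

  # In-memory adjacency list keyed by record ID - only integers live in
  # here, so thousands of edges stay cheap; the database is only touched
  # when you actually need the rows.
  class AssociationGraph
    def initialize
      @edges = Hash.new { |hash, key| hash[key] = Set.new }
    end

    # Record an undirected association between two record IDs.
    def connect(a_id, b_id)
      @edges[a_id] << b_id
      @edges[b_id] << a_id
    end

    def neighbor_ids(id)
      @edges[id].to_a
    end

    # Expand the IDs into real records only at the last moment.
    def neighbors(id, model_class)
      ids = neighbor_ids(id)
      ids.empty? ? [] : model_class.find(ids)
    end
  end

  # Usage sketch: load the edges once from the join table, then answer
  # "what is related to record 42?" without hitting the join table again.
  #   graph = AssociationGraph.new
  #   Tagging.find(:all, :select => "article_id, topic_id").each do |t|
  #     graph.connect(t.article_id, t.topic_id)
  #   end
  #   related_ids = graph.neighbor_ids(42)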

brez




On 9/4/07, Nick Zadrozny <[EMAIL PROTECTED]> wrote:
> Hey all,
>
> I've been thinking about an optimization problem that I'm vaguely familiar
> with and not quite sure how to get started on.
>
> I've got an application in which each record might be associated with
> thousands of others through a join model. A has_many :through situation. The
> join models are important in and of themselves, but often I want to just
> grab all the associated objects, and this is starting to get a bit
> burdensome on the database.
>
> I'm tentatively thinking that denormalization would help me out here. But
> that sort of thing is approaching the limits of my database knowledge. The
> question comes down to this: say you want to cache the primary keys of
> thousands of associated objects. What would that look like in your schema?
> In your queries?
>
> Like I said, database noob here, so let's have the noob explanation. Also,
> pointers to books or tutorials are welcome. I'd welcome some looks at
> alternate caching strategies — this information doesn't necessarily have to
> persist — but denormalization is something I would like to know more about
> in general.
>
> --
> Nick Zadrozny • beyondthepath.com
>


-- 
John Bresnik
(619) 228-6254
_______________________________________________
Sdruby mailing list
[email protected]
http://lists.sdruby.com/mailman/listinfo/sdruby
