On May 5, 2011, at 6:31 AM, Ted Han wrote:
> I just wanted to drop you a line, since i haven't seen any other replies (i
> figure people are probably busy). I think this is a great idea.
Thanks, Ted, I appreciate your thoughtful feedback!
> The only concerns i have (and i'm not sure this is something that would be a
> serious problem) is that what you're suggesting runs very much contrary to
> the way that NoSQL datastores function, which, possessing the capability of
> storing arbitrary attribute sets don't need blob based attribute
> serialization. So what i'd do is probably recommend as a roadmap phase
> (since most of the discussion you've jotted down appears to be RDBMS
> oriented, which is a good thing, imo), you might note what sort of
> architectural changes you would need to make your gem abstract enough that
> someone down the way can use the native facilities that NoSQL stores possess.
My thoughts were definitely RDBMS-based, as that's my current need. Thanks for
bringing up the different considerations inherent in different types of
data-stores.
> Actually upon further thought, paper_trail's design would probably work as
> well for NoSQL stores. You might look at how people deal with this on Google
> AppEngine's BigTable store too. :)
Good point. I've looked largely at solutions born in the RDBMS world. I'll cast
the net wider.
> What i would want to avoid, is basically taking a Resource and turning it
> into an entity-mapping storage system, which are incredibly slow for large
> numbers of records or fields for each ruby object.
True. The serialized blob / entity-mapping approach is lousy for the purposes
of reporting or aggregate access in general. My current project doesn't require
reporting against the versions' attributes, only individual access and
reporting on the metadata (additional columns alongside the serialized blob).
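A minimal sketch of that trade-off in plain Ruby, without DataMapper. The `VersionRecord` struct, the `snapshot` helper, and the field names are hypothetical and chosen only for illustration; the serialized attributes are opaque to queries, while the metadata fields alongside them stay directly reportable.

```ruby
require 'json'

# Hypothetical version record: metadata lives in ordinary fields, the
# versioned attributes live in a single serialized blob.
VersionRecord = Struct.new(:resource_type, :resource_id, :event, :created_at, :blob) do
  # Deserialize the blob back into an attribute hash (individual access).
  def attributes
    JSON.parse(blob)
  end
end

# Build a version record from a plain attribute hash.
def snapshot(resource_type, resource_id, event, attributes, at:)
  VersionRecord.new(resource_type, resource_id, event, at, JSON.generate(attributes))
end

versions = [
  snapshot('Taco', 1, 'create', { 'foo' => 1, 'bar' => 'carnitas' },  at: Time.utc(2011, 1, 10)),
  snapshot('Taco', 1, 'update', { 'foo' => 1, 'bar' => 'al pastor' }, at: Time.utc(2011, 3, 2)),
  snapshot('Taco', 2, 'create', { 'foo' => 2, 'bar' => 'pescado' },   at: Time.utc(2011, 3, 5))
]

# Reporting on metadata never touches the blob.
versions_per_resource = versions.group_by(&:resource_id).transform_values(&:size)

# Individual access deserializes exactly one blob.
latest_for_one = versions.select { |v| v.resource_id == 1 }.max_by(&:created_at).attributes
```

Aggregating on anything inside `blob` (say, counting versions by `bar`) would require deserializing every record, which is the cost described above.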
> Another mechanism you could use is to attach version numbers to properties,
> so that when the class is loaded, it knows which properties are current,
> and which ones are deprecated. That way you have a semantic for knowing
> which fields to create for new objects (most recent record set), and the ruby
> class is also aware of what deprecated properties it *might* encounter on
> older records.
Interesting. And using DataMapper's declarative property system seems like a
perfect fit, since properties could be conditionally present.
> Then you've got 3 explicit semantics, unversioned properties (current/always
> present), versioned properties (deprecated, but still around), and deleted
> properties (once all older records are migrated up, or deleted or whatever).
I'm sure there are cases where having the full record of a Resource's evolution
in one place would be very useful, but in mine this is not necessary. It's
double-edged, though: this information is present either way; the only question
is how explicitly.
> DataMapper errs on the side of being explicit, and i think that's usually a
> good thing. You can do some fun api stuff with this too:
>
> class Taco
>   include DataMapper::Resource
>
>   property :foo, Serial
>   property :bar, String, :version => 4 # or maybe a range? 4..8
>
>   version do
>     property :baz, Integer, :default => 0, :version => 8 # from version 0..8
>     version 4 do
>       property :blerp, Integer, :version => 12 # from version 4..12
>       property :bleep, String # only in version 4
>     end
>   end
> end
One thing I like about dm-is-versioned and paper_trail is that they don't
burden the versioned model with many versioning-related responsibilities. I'm
inclined to pursue that as a goal.
Also, this would work pretty well for adding/removing columns, but wouldn't it
get pretty complicated with changes to a column's data-type or length or the
like (same property name, but different definition)?
That said, you've got me thinking about the inherent problem of 'reifying' (aka
'reconstituting') Resource versions with an outdated schema (i.e., from a
previous 'generation' of a Resource). As you pointed out, it's not just the
data-store schema that drifts out of sync; the *code* that represents the
Resource will necessarily fall out of sync, too.
This is perhaps excessively complex, but I would consider splitting out the
iterations of the Taco Resource into separate models ('generations', if you
will). So perhaps:
# the current, 'live' Resource
class Taco
  include DataMapper::Resource
  property :foo, Serial
end
# the first iteration of the Resource
class Taco::Generation::Zero
  include DataMapper::Resource
  valid_between DateTime.new(2010, 6, 14)..DateTime.new(2011, 2, 6)
  property :foo, Serial
  property :bar, String # no longer present on current Taco Resource
end
# the second iteration of the Resource
class Taco::Generation::One
  include DataMapper::Resource
  valid_between DateTime.new(2011, 2, 7)..DateTime.new(2011, 4, 14)
  property :foo, Serial
  property :bar, String # no longer present on current Taco Resource
  property :quux, Integer # added and removed in this generation
end
This could become a nightmare very quickly, *but* it would make it very clear
how to deal with and represent the Resource at any given point in its
evolution. Also, I think I would prefer to use timestamps instead of integers
to identify generations, but that's a minor detail.
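As a rough illustration of identifying generations by time rather than by integer, here is a plain-Ruby sketch with no DataMapper involved. The `Generation` struct, the `GENERATIONS` registry, and `generation_for` are hypothetical names, and the sketch uses `Date` rather than `DateTime` so that the last day of each range is included.

```ruby
require 'date'

# Hypothetical registry entry: a label, the date range in which that
# generation of the Resource was live, and the properties it declared.
Generation = Struct.new(:name, :valid_between, :properties)

GENERATIONS = [
  Generation.new('Zero', Date.new(2010, 6, 14)..Date.new(2011, 2, 6),  %i[foo bar]),
  Generation.new('One',  Date.new(2011, 2, 7)..Date.new(2011, 4, 14), %i[foo bar quux])
].freeze

# Return the generation that was live on the given date, or nil when the
# date falls outside every recorded range (e.g. the current, live Resource).
def generation_for(date)
  GENERATIONS.find { |g| g.valid_between.cover?(date) }
end

generation_for(Date.new(2010, 12, 25)).name  # the 'Zero' generation
generation_for(Date.new(2011, 5, 5))         # nil: no archived generation applies
```

A version's timestamp is then enough to decide which set of properties it was written with, without keeping a separate generation counter.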
Hmm... I'm thinking about limiting the scope to only allow reifying Resources
of the current generation. Alternatively (and probably the most pragmatic
option), reifying a previous version could always coerce the previous version
into the current generation's properties (this is what paper_trail does).
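A minimal sketch of the coercion option in plain Ruby. This is not paper_trail's implementation; `CURRENT_PROPERTIES` and `reify` are hypothetical, and the property names and defaults are assumptions for illustration. Stored attributes that the current generation no longer declares are dropped, and properties added since are filled by their coercion.

```ruby
# Hypothetical description of the current generation: property name mapped to
# a lambda that coerces a stored value (possibly nil) to the current type.
CURRENT_PROPERTIES = {
  'foo' => ->(v) { Integer(v) },
  'baz' => ->(v) { v.nil? ? 0 : Integer(v) } # added later, defaults to 0
}.freeze

# Coerce a stored attribute hash into the current property set. Keys unknown
# to the current generation are dropped; missing keys are passed as nil, so
# each coercion decides its own default.
def reify(stored_attributes)
  CURRENT_PROPERTIES.each_with_object({}) do |(name, coerce), result|
    result[name] = coerce.call(stored_attributes[name])
  end
end

# A version written by an older generation: 'foo' was stored as a string,
# 'bar' and 'quux' no longer exist, and 'baz' did not exist yet.
old_version = { 'foo' => '7', 'bar' => 'carnitas', 'quux' => 3 }
reified = reify(old_version)
```

The obvious cost is that dropped attributes such as `bar` and `quux` are invisible on the reified object, even though they remain in the stored version.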
Hmm... I'm not quite sure of the details, but I bet @solnic's EmbeddedValue
work could be used to great effect here.
> Hopefully some food for thought,
Absolutely. Thanks again for your feedback, and any further comments would be
appreciated, too.
- Emmanuel