[OpenBD] Re: GAE - Datastore serialization and changes to model

Vince Bonfanti Fri, 05 Jun 2009 11:20:42 -0700

I think you're getting too hung up on "serialization" as an implementation
detail of the CFC persistence layer. The real questions are: What does it
mean to save/restore a CFC to/from a persistent datastore? and, How does the
CFML developer deal with the CFC persistence layer?


For Java objects, persistence means saving just the data members, not the
"behavior" members (methods). This might make sense for a static language,
but even then, consider the complex annotation "language" (JDO) that's been
developed to manage persistence. Go take a look at the GAE group and you'll
see that most of the questions revolve around using JDO.

CFML isn't Java, CFCs aren't Java objects, and CFML developers aren't Java
programmers. CFML is a dynamic language: you can add/remove behaviors
(functions) from CFCs at will, you can change the types of variables, you
can have one CFC instance that has some variables defined and another CFC
instance of the same type that doesn't. None of this is true for Java, which
is a static language. So the model for dealing with CFC persistence can't be
the same as the model for dealing with Java object persistence.

Forget about serialization as an implementation detail, and step back to a
higher level: as a CFML programmer, how do you want CFC persistence to work?
Here are my (initial) answers to this question:

  1) I want it to be simple. I want to write a "plain old" CFC (without any
persistence annotations) to the datastore and then read it back out again,
and get back the exact same CFC that I wrote out.

  2) I want to be able to query the datastore based on any variable I place
in the CFC "this" or "variables" scope, again, without having to do any
persistence annotations.

  3) I want it to be fast so that I don't have to worry (too much) about
performance.

The current implementation reflects the above.

So now we have the issue of CFC versioning (an important issue). Again,
without referring to any implementation details, as a CFML programmer, how
do you want this to work? If we can answer that question, then we can worry
about the implementation details.
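
For reference, the manual upgrade suggested earlier in this thread depends
on an init() function that the thread never spells out. Here is a minimal
sketch of what it might look like, not a tested implementation. The
component name MyCFC and the "version" variable come from the earlier
example; the loop body is an assumption. It copies only the "this" scope,
since another instance's "variables" scope isn't reachable from outside
without some kind of accessor, and it assumes that isCustomFunction() and
looping over a CFC instance as a collection behave in BD the way they do in
other CFML engines.

```cfml
<!--- MyCFC.cfc: hypothetical version 2.0 of the component --->
<cfcomponent>

    <!--- version marker, incremented whenever the CFC is modified --->
    <cfset this.version = "2.0">

    <cffunction name="init" access="public" returntype="any" output="false">
        <cfargument name="oldCFC" type="any" required="false">
        <cfset var key = "">

        <cfif structKeyExists( arguments, "oldCFC" )>
            <!--- copy public data from the old instance; skip functions
                  (so the new code wins) and the old version marker --->
            <cfloop collection="#arguments.oldCFC#" item="key">
                <cfif key neq "version"
                      and not isCustomFunction( arguments.oldCFC[ key ] )>
                    <cfset this[ key ] = arguments.oldCFC[ key ]>
                </cfif>
            </cfloop>
        </cfif>

        <cfreturn this>
    </cffunction>

</cfcomponent>
```

With something like this in place, newCFC.init( oldCFC ) in the quoted
upgrade example would carry the old data over while the new instance keeps
its own current functions and version number.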

Vince
On Fri, Jun 5, 2009 at 1:26 PM, Baz <[email protected]> wrote:

> I don't know what the *right* way is, but allow me to play devil's advocate
> for a moment.
>
>> Consider "method injection" techniques that are used in some frameworks,
>> where a method is dynamically added to a CFC after it's created--don't you
>> want that method to still be there after reading from the datastore?
>
>
> I'm a big fan of mixins (when used appropriately) and use them in my own
> code, so I definitely see the coolness of this.
>
>> There would be considerably more overhead in "deconstructing" a CFC
>> instance and writing the constituent parts to the datastore, and then
>> rebuilding a new CFC instance "from scratch" when reading from the
>> datastore, compared to simply serializing and deserializing the CFC
>> instance.
>>
>
> Isn't this the same problem that any datastore has whether it is GAE or
> relational or a text file? As cfml coders we always have the option of
> serializing cfcs, and we can then choose to store them in a db or perhaps a
> caching layer if we wanted to. Why change the paradigm completely in GAE and
> force serialization on the data layer thereby complicating/limiting other
> options? It really seems like that should be the job of a framework or a
> caching layer. Plus won't this make it much more difficult for people to
> port their apps, especially if the apps are data-centric rather than
> object-based?
>
> On my current GAE test site (http://www.thinkloop.com) each artifact on
> the page represents a user that is saved in the datastore (if you mouse-over
> an artifact you will see the ip address and some other info about each
> user). To fill the page, I need 1000 records, and therefore 1000
> full-blown objects. Even with serialization it is still relatively heavy and
> slightly slower than the equivalent site using a straight recordset.
>
>> the CFC persistence layer isn't built on top of JDO, but is written
>> directly on top of the GAE datastore "low-level API."
>>
>
> You know a lot more about this than I do, but doesn't the low-level api get
> closer to treating the datastore like a flat text file and less like
> objects? It would seem that this would make it more natural to treat the
> recordset like a query than an array of objects. I really don't know much
> about this, just asking.
>
>> Serialization also simplifies the code within the BD persistence layer
>> (simpler code means fewer bugs)
>>
>
> I don't want to sound like a bastard but you and the team are very skilled
> programmers and I don't think ease of implementation should be heavily
> weighted if it comes at the cost of a better end-user experience -
> especially at this stage of the game. Perhaps by investigating other
> possibilities we might find that an equally clean/easy implementation exists
> with a better end-user experience. Again, I want to re-emphasize that I am
> being devil's advocate, and I appreciate all that you do, and that I would
> not be able to do better myself! :)
>
>> googleRead( oldKey, newCFC );
>
>
> Having said all that, your work-around is quite clean, especially if you
> write it in the other notation making it completely invisible:
> newCFC.googleRead('key');
>
> What is it about this datastore that makes it more suitable for built-in
> serialization than other db's?
>
> Baz
>
>
>
>   On Fri, Jun 5, 2009 at 9:30 AM, Vince Bonfanti <[email protected]> wrote:
>
>> Serialization guarantees we always restore the CFC to the exact state it
>> was in prior to being written to the datastore. Consider "method injection"
>> techniques that are used in some frameworks, where a method is dynamically
>> added to a CFC after it's created--don't you want that method to still be
>> there after reading from the datastore?
>>
>> Serialization also simplifies the code within the BD persistence layer
>> (simpler code means fewer bugs). There are also performance considerations,
>> which you raised previously. There would be considerably more overhead in
>> "deconstructing" a CFC instance and writing the constituent parts to the
>> datastore, and then rebuilding a new CFC instance "from scratch" when
>> reading from the datastore, compared to simply serializing and deserializing
>> the CFC instance.
>>
>> BTW, you said something in a previous message about "proxying to the Java
>> layer." This isn't what BD does at all; the CFC persistence layer isn't
>> built on top of JDO, but is written directly on top of the GAE datastore
>> "low-level API." Building a CFC persistence layer on top of JDO would be a
>> big mistake in terms of complexity and performance.
>>
>> Having said all that, it might be useful for BD to build in some sort of
>> versioning support to help automate this. For example, maybe we could do
>> something like this (adding a second parameter to the GoogleRead function):
>>
>>     <cfscript>
>>     newCFC = createObject( "component", "MyCFC" );
>>     googleRead( oldKey, newCFC );
>>     </cfscript>
>>
>> The above code would retrieve the old CFC represented by "oldKey" from the
>> datastore and use it to populate the "newCFC" instance by copying all
>> variables (except for functions) from the "this" and "variables" scopes of
>> the stored CFC to "newCFC".
>>
>> It's exactly this kind of feedback that we need to make sure the product
>> is useful. Thanks.
>>
>> Vince
>>   On Fri, Jun 5, 2009 at 12:01 PM, Baz <[email protected]> wrote:
>>
>>> Why are we serializing cfc's again? Why not stick to the common workflow
>>> of returning data from the datastore, then using that to initialize a new
>>> cfc and then using memcached or some other caching layer if serialization is
>>> needed? The synchronization logic is non-trivial and will ALWAYS arise as
>>> people adjust their cfc's all the time.
>>>
>>> Baz
>>>
>>>
>>>
>>> On Fri, Jun 5, 2009 at 6:44 AM, Vince Bonfanti <[email protected]> wrote:
>>>
>>>> Yes, I've run into this problem myself while working on my BlogCFC port.
>>>> BD essentially treats CFC functions as data within the CFC "this" and
>>>> "variables" scopes, which means they get serialized with the CFC when
>>>> writing to the GAE datastore. If you modify your CFC (change the
>>>> implementation of a function or add new functions), those changes don't
>>>> affect CFCs that are already in the datastore. You'll have to deal with
>>>> this manually. Two suggestions to make this easier:
>>>>
>>>>   1) Store a "version number" within your CFC "this" or "variables"
>>>> scope and increment it whenever you modify the CFC.
>>>>
>>>>   2) Create an init() function for your CFC that takes another CFC as
>>>> the argument; initialize the CFC by looping through the "this" and
>>>> "variables" scopes and copy all the relevant variables. You can use this
>>>> init() function to upgrade a CFC.
>>>>
>>>> Something like this:
>>>>
>>>>     <cfscript>
>>>>     oldCFC = googleRead( oldKey );
>>>>     if ( oldCFC.version eq "1.0" ) { // need to upgrade to version 2.0
>>>>         newCFC = createObject( "component", "MyCFC" );
>>>>         newCFC.init( oldCFC );
>>>>         googleWrite( newCFC );
>>>>         googleDelete( oldCFC );
>>>>     }
>>>>     </cfscript>
>>>>
>>>> Vince
>>>>  On Fri, Jun 5, 2009 at 2:15 AM, Baz <[email protected]> wrote:
>>>>
>>>>> If someone were persisting their cfc's to the datastore, then needed to
>>>>> update the code, a problem arises. The cfc's that were stored in the past
>>>>> will be de-serialized to old components that are missing new methods and
>>>>> code updates that may be needed by the updated app. Does the application
>>>>> have to manage the synchronization of cfc's in the datastore?
>>>>>
>>>>> Baz
>>>>
>>>>
>>>>
>>>>
>>>>

--~--~---------~--~----~------------~-------~--~----~
Open BlueDragon Public Mailing List
 http://groups.google.com/group/openbd?hl=en
 official site @ http://www.openbluedragon.org/

!! save a network - trim replies before posting !!
-~----------~----~----~----~------~----~------~--~---
