Re: User demand and Issue 150. [was Re: Issue 150: Consistency requirements for relationships with mapped-by]

David Bullock Fri, 30 Dec 2005 07:07:43 -0800

Craig L Russell wrote:

First, let me say that I admire your passion. I wish that all expert group members were thus.

I figure that if it's worth mentioning in the first place, then it's worth pursuing until it's clear to me that it's a flawed idea, or clear to others that it's a sound one. It is, admittedly, costing me much more time and effort to get to either state than I would like. But as the British say, 'in for a penny, in for a pound'.

I do have a quibble with your counter example below. Your code ignores the return boolean value from this.lines.add(line). What value would you return if the collection were not loaded?

OK, interesting point. The JDO impl would at least have to do a single SELECT to verify whether the collection already contains() the added item. It still doesn't *have* to fault in the entire collection. If there were going to be *repeated* inserts to the collection in this manner (say for a dozen line items being attached to an invoice), then it might be more efficient to fault in at least the PKs of the collection. This to my mind is just one more piece of information to be added to fault groups/fetch plans.

So it seems that whenever Set.add() or Map.put() is invoked (regardless of how 15.3 reads), the price of an immediate datastore access is incurred, because the contract of these methods promises to tell if the collection was substantially modified EACH time.
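To make the contract in question concrete, here is a minimal sketch using a plain java.util.HashSet (no JDO involved; the class name AddContractDemo and the helper addTwice are made up for illustration). It only shows that Set.add() must report whether the element was already present, which is the piece of information a not-yet-loaded persistent collection would have to obtain from the datastore.

```java
import java.util.HashSet;
import java.util.Set;

// Illustration only: a plain in-memory set, not a JDO-managed collection.
final class AddContractDemo {

    // Adds the same item twice and returns both results of Set.add().
    static boolean[] addTwice(Set<String> set, String item) {
        boolean first = set.add(item);   // true only if the set did not yet contain the item
        boolean second = set.add(item);  // false: the item is already present
        return new boolean[] { first, second };
    }

    public static void main(String[] args) {
        boolean[] results = addTwice(new HashSet<String>(), "line-1");
        System.out.println(results[0] + " " + results[1]); // prints "true false"
    }
}
```

An implementation that has not loaded the collection cannot answer either call without knowing what is already stored, hence the datastore access.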

Thus I concede there is some inherent performance advantage to be gained by avoiding Collection.add() in user code, when the collection is in fact transparently persisted. (A point I hadn't appreciated until now... thanks for asking a good question).

I can also appreciate that RDBMSs present an opportunity, whereby a value flushed to the backing column of a <mapped-by/> field effectively updates both sides anyway, so why not let the user have it as soon as practicable? The timing you propose - when DetachAllOnCommit occurs - is even laudable given that the JDO impl apparently can't be relied on, of late, to intercept mutators to bring it about immediately. In view of the performance savings attained by avoiding unnecessary calls involving Collection.contains() (and only for such savings), this seems a desirable hack. (Of course, when we finally get a JSR for managed relationships, the hack won't be required any further).

I guess my main beef is that while this (performance motivated) optimization benefits me performance-wise when I care to use it, and doesn't cost me performance-wise when I don't care to use it, it comes at the price of a cognitive burden - whether I happen to need it or not.

That burden is that I have to "watch my step", and not use the object at the as-yet-unsynchronized end of the relationship until after the point at which 15.3 guarantees it will be synchronized. I dislike this prospect (even assuming it is always possible to keep track of the necessary state, which is by no means obvious to me), because I already have my hands full with programming obligations. Despite repeated attempts, I was not successful in getting my fellow user Bin to un-captiously remark "I love this burden - this burden is everything I dreamed of", so I will assume for now I am not the only person in the world to dislike it.

So far I have argued that the burden of keeping track of which objects I can and can't use is always unnecessary. But for the sake of outsmoking EJB3, I am willing to admit that it sometimes might be worth bearing. So let's concentrate on making the burden habitable.


There are 3 strategies for dealing with this burden:

#1 The SyncRelationshipsAfterCommit behaviour happens or not, but I declare to the PM that I am studiously not relying on it, and wish to be notified by a runtime exception if my code (or 3rd party code) fails to update both sides of a relationship by the time commit occurs. There is no cognitive burden. I forego the performance benefits of avoiding Collection.add(). My code works fine with non-managed objects in different contexts.

#2 The SyncRelationshipsAfterCommit behaviour always happens, but I choose as a matter of policy not to rely on it, and I always manually update both sides of the relationship. I forego the performance benefits of avoiding Collection.add(). There is a small chance that my code won't work with non-managed objects in different contexts, because JDO doesn't tell me if I unintentionally violate my own policy. (Although it allows me to selectively and intentionally violate it, which might sometimes be beneficial). So some cognitive burden remains.

#3 The SyncRelationshipsAfterCommit behaviour always happens, and I choose to exploit it by judicious and minimal use of the model before commit. I live with the burden. I attain the performance benefits, I win the Petstore 'benchmark'. I never update both sides of the relationship unless I can't help it. Sometimes I will have to, because the model objects are used by 3rd-party code I have no control over, and it will expect the relationship to be completely mutual even before commit. When trying to use my code in contexts where the objects are non-managed, I may have to rewrite my minimal pre-commit code, since the absence of the synchronization in the non-managed environment will mean that my post-commit code won't be receiving the model in the expected consistent state. In the best case, because I partitioned my code according to the principles of OO, and not according to the time when it gets executed, I'll just have to touch the internals of every second setter involved in setting up a bi-directional relationship. If I've relied so heavily on the synch behaviour that I omitted accessors like 'Set getChildren()', mutators like 'void add(DomainEntity)' or factory methods like 'DomainEntity newChild()' on some of my interfaces, it'll break existing clients of my code.
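For reference, the manual both-sides update that #1 and #2 assume could look roughly like the following in plain Java. This is only a sketch: the Invoice and Line classes and their method names are invented for illustration, and nothing here is JDO API. Because the mutators keep the relationship mutual themselves, the same code behaves identically with non-managed objects.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical domain model; the names are assumptions, not taken from any spec.
final class BothSidesDemo {

    static final class Invoice {
        private final Set<Line> lines = new HashSet<Line>();

        // Adds the line and fixes up its back-reference in one step.
        // Returns true if the line was not already attached to this invoice.
        boolean addLine(Line line) {
            boolean wasAbsent = !lines.contains(line);
            line.setInvoice(this);
            return wasAbsent;
        }

        // Read-only view, so callers cannot bypass the mutators.
        Set<Line> getLines() {
            return Collections.unmodifiableSet(lines);
        }
    }

    static final class Line {
        private Invoice invoice;

        // Moves the line to newInvoice, updating the old and the new collection.
        void setInvoice(Invoice newInvoice) {
            if (invoice != null) {
                invoice.lines.remove(this);
            }
            invoice = newInvoice;
            if (newInvoice != null) {
                newInvoice.lines.add(this);
            }
        }

        Invoice getInvoice() {
            return invoice;
        }
    }

    public static void main(String[] args) {
        Invoice first = new Invoice();
        Invoice second = new Invoice();
        Line line = new Line();
        first.addLine(line);
        line.setInvoice(second);
        System.out.println(first.getLines().contains(line));  // prints "false"
        System.out.println(second.getLines().contains(line)); // prints "true"
    }
}
```

Under #3, the body of setInvoice() or of addLine() would shrink to a single assignment, which is exactly the "every second setter" that would need touching again in a non-managed context.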

I hope I have established that #3 might not be every developer's cup of tea, and that they might prefer to accept some performance hit to avoid both the cognitive burden and the potential implications for their code. Certainly, they should be given the choice. They sort of have the choice with #2, but it is a bit hit-and-miss and they are not receiving a lot of help from JDO in enforcing their policy. This would be adequately, cheaply, and neatly addressed by #1.

I contend that providing for #1, and letting the user request that partial updates to relationships at commit be regarded as an 'inconsistent update', is no more of a burden on vendors than it is for them to synchronize the memory model - they have to perform the detection in any case. So there is no good reason why you shouldn't allow this strategy (in addition to the others) in the JDO 2.0 spec.
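As a rough sketch of the detection I mean (purely hypothetical: the Parent and Child classes, the verifyMutual method and the choice of IllegalStateException are my own assumptions, and none of this is JDO API), a commit-time check for #1 only has to notice that one side of a relationship was set without the other:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model and checker; a real implementation would work from its own metadata.
final class ConsistencyCheckDemo {

    static final class Parent {
        final Set<Child> children = new HashSet<Child>();
    }

    static final class Child {
        Parent parent;
    }

    // Throws if any relationship among the given objects was updated on one side only.
    static void verifyMutual(List<Parent> parents, List<Child> children) {
        for (Child child : children) {
            if (child.parent != null && !child.parent.children.contains(child)) {
                throw new IllegalStateException("child refers to a parent that does not list it");
            }
        }
        for (Parent parent : parents) {
            for (Child child : parent.children) {
                if (child.parent != parent) {
                    throw new IllegalStateException("parent lists a child that does not refer back");
                }
            }
        }
    }

    public static void main(String[] args) {
        Parent parent = new Parent();
        Child child = new Child();
        child.parent = parent; // only one side updated
        try {
            verifyMutual(Arrays.asList(parent), Arrays.asList(child));
        } catch (IllegalStateException e) {
            System.out.println("inconsistent update: " + e.getMessage());
        }
    }
}
```

The same two loops are what a synchronizing implementation would need in order to know which side to repair; the only difference under #1 is that it reports the mismatch instead of fixing it.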


in conclusion,
David.
