Craig L Russell wrote:

First, let me say that I admire your passion. I wish that all expert group members were thus.


I figure that if it's worth mentioning in the first place, then it's worth pursuing until it's clear to me that it's a flawed idea, or clear to others that it's a sound one. It is, admittedly, costing me much more time and effort to get to either state than I would like. But as the British say, 'in for a penny, in for a pound'.

I do have a quibble with your counter example below. Your code ignores the return boolean value from this.lines.add(line). What value would you return if the collection were not loaded?


OK, interesting point. The JDO impl would at least have to do a single SELECT to check whether the collection already contains() the added item. It still doesn't *have* to fault in the entire collection. If there were going to be *repeated* inserts to the collection in this manner (say for a dozen line items being attached to an invoice), then it might be more efficient to fault in at least the PKs of the collection. This to my mind is just one more piece of information to be added to fault groups/fetch plans.

So it seems that whenever Set.add() or Map.put() is invoked (regardless of how 15.3 reads), the price of an immediate datastore access is incurred, because the contract of these methods promises to tell if the collection was substantially modified EACH time.
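
To make that contract concrete, here is a minimal plain-Java sketch with no JDO involved. The class and method names are made up for illustration. It only shows that the boolean returned by Set.add() depends on what the set already holds, which is why a persistent, not-yet-loaded collection cannot answer without consulting the datastore.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo class; an in-memory HashSet stands in for the collection.
class AddContractDemo {

    // Adds the same element twice and reports what Set.add() returned each time.
    static boolean[] addTwice(Set<String> lines, String line) {
        boolean first = lines.add(line);   // true only if the set did not contain it
        boolean second = lines.add(line);  // false: the set is unchanged
        return new boolean[] { first, second };
    }

    public static void main(String[] args) {
        boolean[] result = addTwice(new HashSet<String>(), "line-1");
        System.out.println(result[0] + " " + result[1]); // prints "true false"
    }
}
```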

Thus I concede there is some inherent performance advantage to be gained by avoiding Collection.add() in user code, when the collection is in fact transparently persisted. (A point I hadn't appreciated until now .. thanks for asking a good question).

I can also appreciate that RDBMSs present an opportunity: a value flushed to the backing column of a <mapped-by/> field effectively updates both sides anyway, so why not let the user have it as soon as practicable? The timing you propose - when DetachAllOnCommit occurs - is even laudable, given that the JDO impl apparently can't, of late, be relied on to intercept mutators and bring it about immediately. In view of the performance savings attained by avoiding unnecessary calls involving Collection.contains() (and only for such savings), this seems a desirable hack. (Of course, when we finally get a JSR for managed relationships, the hack won't be required any further).

I guess my main beef is that while this (performance motivated) optimization benefits me performance-wise when I care to use it, and doesn't cost me performance-wise when I don't care to use it, it comes at the price of a cognitive burden - whether I happen to need it or not.

That burden is that I have to "watch my step", and not use the object at the as-yet-unsynchronized end of the relationship until the point at which 15.3 guarantees it will be synchronized. I dislike this prospect (even assuming it is always possible to keep track of the necessary state, which is by no means obvious to me), because I already have my hands full with programming obligations. Despite repeated attempts, I was not successful in getting my fellow user Bin to un-captiously remark "I love this burden - this burden is everything I dreamed of", so I will assume for now I am not the only person in the world to dislike it.

So far I have argued that the burden of keeping track which objects I can and can't use is always unnecessary. But for the sake of outsmoking EJB3, I am willing to admit that it sometimes might be worth bearing. So let's concentrate on making the burden habitable.

There are 3 strategies for dealing with this burden:

#1 The SyncRelationshipsAfterCommit behaviour happens or not, but I declare to the PM that I am studiously not relying on it, and wish to be notified by a runtime exception if my code (or 3rd party code) fails to update both sides of a relationship by the time commit occurs. There is no cognitive burden. I forego the performance benefits of avoiding Collection.add(). My code works fine with non-managed objects in different contexts.

#2 The SyncRelationshipsAfterCommit behaviour always happens, but I choose as a matter of policy not to rely on it, and I always manually update both sides of the relationship. I forego the performance benefits of avoiding Collection.add(). There is a small chance that my code won't work with non-managed objects in different contexts, because JDO doesn't tell me if I unintentionally violate my own policy. (Although it allows me to selectively and intentionally violate it, which might sometimes be beneficial). So some cognitive burden remains.

#3 The SyncRelationshipsAfterCommit behaviour always happens, and I choose to exploit it by judicious and minimal use of the model before commit. I live with the burden. I attain the performance benefits, I win the Petstore 'benchmark'. I never update both sides of the relationship unless I can't help it. Sometimes I will have to, because the model objects are used by 3rd-party code I have no control over, and it will expect the relationship to be completely mutual even before commit. When trying to use my code in contexts where the objects are non-managed, I may have to rewrite my minimal pre-commit code, since the absence of the synchronization in the non-managed environment will mean that my post-commit code won't be receiving the model in the expected consistent state. In the best case, because I partitioned my code according to the principles of OO, and not according to the time when it gets executed, I'll just have to touch the internals of every second setter involved in setting up a bi-directional relationship. If I've relied so heavily on the synch behaviour that I omitted accessors like 'Set getChildren()', mutators like 'void add(DomainEntity)' or factory methods like 'DomainEntity newChild()' on some of my interfaces, it'll break existing clients of my code.
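
The difference between #2 and #3 can be sketched in plain Java. This is an illustration only: Invoice, Line, addLine() and setInvoiceOnly() are hypothetical names, nothing here is persisted, and no JDO API is used. Re-parenting a line that already belongs to another invoice is deliberately left out to keep the sketch short.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical domain classes; plain objects with no persistence behind them.
class Invoice {
    private final Set<Line> lines = new HashSet<Line>();

    // Strategy #2: update both sides by hand and pay for Set.add().
    boolean addLine(Line line) {
        boolean changed = lines.add(line); // collection side
        line.setInvoiceOnly(this);         // back-reference side
        return changed;
    }

    Set<Line> getLines() {
        return lines;
    }
}

class Line {
    private Invoice invoice;

    // Strategy #3: touch only this side and rely on synchronization at commit.
    // With non-managed objects nothing ever fills in Invoice.getLines().
    void setInvoiceOnly(Invoice invoice) {
        this.invoice = invoice;
    }

    Invoice getInvoice() {
        return invoice;
    }
}
```

With addLine() both ends agree immediately, managed or not. With setInvoiceOnly() alone, invoice.getLines() stays stale until something else synchronizes it, which is exactly the "watch my step" period described above.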

I hope I have established that #3 might not be every developer's cup of tea, and that they might prefer to accept some performance hit to avoid both the cognitive burden and the potential implications for their code. Certainly, they should be given the choice. They sort of have the choice with #2, but it is a bit hit-and-miss and they are not receiving a lot of help from JDO in enforcing their policy. This would be adequately, cheaply, and neatly addressed by #1.

I contend that providing for #1, that is, letting the user request that partial updates to relationships at commit be regarded as an 'inconsistent update', is no more of a burden on vendors than synchronizing the memory model - they have to perform the detection in any case. So there is no good reason why you shouldn't allow this strategy (in addition to the others) in the JDO 2.0 spec.
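
As a rough illustration of the detection I mean, here is a plain-Java sketch. It assumes the instances touched in the transaction are available as two sets; Parent, Child and findHalfUpdated() are hypothetical names and not part of any JDO API. A vendor would presumably work from its own change tracking, and under #1 would throw instead of returning a list.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical classes for illustration only.
class Parent {
    final Set<Child> children = new HashSet<Child>();
}

class Child {
    Parent parent;
}

class RelationshipCheck {

    // Returns every child whose two sides disagree: the child points at a
    // parent that does not list it, or a parent lists a child that points
    // somewhere else.
    static List<Child> findHalfUpdated(Set<Parent> parents, Set<Child> children) {
        List<Child> inconsistent = new ArrayList<Child>();
        for (Child c : children) {
            if (c.parent != null && !c.parent.children.contains(c)) {
                inconsistent.add(c);
            }
        }
        for (Parent p : parents) {
            for (Child c : p.children) {
                if (c.parent != p && !inconsistent.contains(c)) {
                    inconsistent.add(c);
                }
            }
        }
        return inconsistent;
    }
}
```

The same walk that finds the half-updated children is what a synchronizing implementation needs in order to know which side to patch, so reporting them instead is the cheaper of the two outcomes.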

in conclusion,
David.
