Re: This should be easy and obvious, but it's not.

dan benanav Sat, 12 Feb 2000 08:16:34 -0800

This is a long babbling letter, so if you want to hear my conclusions and not my babbling I will state them first.

1. Design your entity beans to only contain update methods. Rely on pessimistic concurrency. In fact the ability to achieve serialization of calls using pessimistic concurrency is a major programming advantage and a major reason for using entity beans. Otherwise I would just use a stateless session bean.

2. For methods that are read only use a stateless session bean.

If that entices you, read on to see how I am led to these conclusions.

Note: in the following, "pessimistic concurrency" refers to using only one bean instance per entity object. That is, if entity objects A and B are identical (same primary key) then all business calls are delegated to the same bean instance.

I am beginning to conclude that when it comes to methods that update the bean, pessimistic concurrency is most often what is required and that it will not affect efficiency. The problems discussed below regarding incorrect results go away if pessimistic concurrency can be enforced. So the spec should allow that to be specified in the deployment descriptor.
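To make the problem concrete, here is a minimal sketch in plain Java (no EJB classes) of the interleaving from the original post quoted further down, next to the serialized ordering that pessimistic concurrency would give. FakeAccountTable, AccountInstance and LostUpdateDemo are hypothetical names; load() and store() only stand in for ejbLoad and ejbStore, and the fake table is assumed to behave like a database where the last write wins with no conflict detection.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the account table: primary key -> balance. */
class FakeAccountTable {
    private final Map<String, Integer> rows = new HashMap<>();
    void put(String key, int balance) { rows.put(key, balance); }
    int get(String key) { return rows.get(key); }
}

/** Hypothetical bean instance; load/store mimic ejbLoad/ejbStore. */
class AccountInstance {
    private final FakeAccountTable table;
    private final String key;
    private int balance;

    AccountInstance(FakeAccountTable table, String key) {
        this.table = table;
        this.key = key;
    }

    void load() { balance = table.get(key); }
    void increment(int x) { balance += x; }
    void store() { table.put(key, balance); }
}

class LostUpdateDemo {
    /** Two instances for the same primary key, calls interleaved. */
    static int interleaved() {
        FakeAccountTable table = new FakeAccountTable();
        table.put("acct-1", 3);
        AccountInstance a = new AccountInstance(table, "acct-1");
        AccountInstance b = new AccountInstance(table, "acct-1");
        a.load();        // A sees 3
        b.load();        // B sees 3
        a.increment(1);  // A holds 4
        b.increment(2);  // B holds 5
        a.store();       // table = 4
        b.store();       // table = 5, A's update is lost
        return table.get("acct-1");
    }

    /** One instance per primary key, each call runs to completion before the next. */
    static int serialized() {
        FakeAccountTable table = new FakeAccountTable();
        table.put("acct-1", 3);
        AccountInstance a = new AccountInstance(table, "acct-1");
        a.load(); a.increment(1); a.store();  // table = 4
        a.load(); a.increment(2); a.store();  // table = 6
        return table.get("acct-1");
    }

    public static void main(String[] args) {
        System.out.println("interleaved: " + interleaved()); // interleaved: 5
        System.out.println("serialized:  " + serialized());  // serialized:  6
    }
}
```

The only difference between the two methods is the ordering of the calls, which is exactly what I would like to be able to request in the deployment descriptor.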

For methods that update the beans, optimistic concurrency might be useful if the following 2 conditions are met.

1. Multiple clients will often make update calls to the same entity object.
2. The calls are independent of one another, meaning for example that they depend on and change different parts of the bean.

One could argue that condition 2 should rarely occur. It seems like a bad design to me: if the calls are independent then the bean should be redesigned into two classes.

So actually I am pretty happy right now with using pessimistic concurrency beans, except for 2 things. One is that there is no way to enforce it according to the spec, meaning that one should not depend on it. I think however that in practice all EJB servers use pessimistic concurrency. Also, if I am not developing beans for resale this is probably not a big deal. The other, and the real problem, is that pessimistic concurrency is not what I want for methods that do not change the bean data. It is quite reasonable that many clients will often want to read the data simultaneously, and it is in that case that you don't want blocking. (In contrast, in cases where multiple clients are updating the bean concurrently you often want blocking even though it slows things down. That is because without blocking the data would get screwed up.)

So now we have a problem. Pessimistic concurrency seems best for updates and optimistic concurrency seems best for read-only operations. If I am willing to take the performance hit, and in practice that may not be so bad, I still have one problem. For read-only operations the ejbLoad -> call method -> ejbStore paradigm is not what is wanted. The problem is the ejbStore at the end. For read-only operations why update the database? I guess this can be resolved somewhat by using a dirty flag: in ejbStore, only update the database if the dirty flag is set. I believe WLS has some solution to this but it doesn't look very clean to me.
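A minimal sketch of the dirty flag idea, in plain Java rather than a real entity bean. FakeRow and DirtyFlagAccount are hypothetical; a real bean would implement javax.ejb.EntityBean and do its JDBC work in ejbLoad and ejbStore. This is not the WLS mechanism, just the simplest hand-written version I can think of.

```java
/** Hypothetical stand-in for one database row; counts writes so skipped stores are visible. */
class FakeRow {
    int balance;
    int writes;
    FakeRow(int balance) { this.balance = balance; }
}

/** Hypothetical bean-like class that only writes back when something changed. */
class DirtyFlagAccount {
    private final FakeRow row;
    private int balance;
    private boolean dirty;

    DirtyFlagAccount(FakeRow row) { this.row = row; }

    /** Mimics ejbLoad: refresh state from the row and clear the flag. */
    void ejbLoad() {
        balance = row.balance;
        dirty = false;
    }

    /** Read-only method: does not touch the flag. */
    int getBalance() { return balance; }

    /** Update method: every mutator must remember to set the flag. */
    void increment(int x) {
        balance += x;
        dirty = true;
    }

    /** Mimics ejbStore: skip the write when nothing changed. */
    void ejbStore() {
        if (!dirty) {
            return;
        }
        row.balance = balance;
        row.writes++;
        dirty = false;
    }
}

class DirtyFlagDemo {
    public static void main(String[] args) {
        FakeRow row = new FakeRow(3);
        DirtyFlagAccount bean = new DirtyFlagAccount(row);

        // Read-only cycle: load -> read -> store, no write reaches the row.
        bean.ejbLoad();
        bean.getBalance();
        bean.ejbStore();
        System.out.println(row.writes); // 0

        // Update cycle: load -> increment -> store, exactly one write.
        bean.ejbLoad();
        bean.increment(2);
        bean.ejbStore();
        System.out.println(row.balance + " " + row.writes); // 5 1
    }
}
```

The obvious weakness is that every method that changes a field has to remember to set the flag; forget it once and the update is silently dropped.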

So if we want multiple readers and a single writer we have a problem. The solution to this might be to use a stateless session bean for reads and the entity bean only for methods that update. This seems OK to me except for one thing: the benefits of a cached entity bean are lost. (Am I correct here? Is there a way to enforce caching of the entity bean? Meaning that on findByPrimaryKey the container looks for a bean that already exists with the same primary key, and does not call ejbLoad on that bean?)
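For reference, multiple readers with a single writer is what a read/write lock gives you in ordinary Java. The sketch below uses java.util.concurrent.locks.ReentrantReadWriteLock on a hypothetical in-memory ReadWriteAccount. It only illustrates the semantics being asked for; it is not offered as something to put inside a bean, since the container, not the bean, decides how calls are dispatched to instances.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical in-memory account: many concurrent readers, one writer at a time. */
class ReadWriteAccount {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private int balance;

    ReadWriteAccount(int initial) { balance = initial; }

    /** Readers share the read lock, so they do not block each other. */
    int getBalance() {
        lock.readLock().lock();
        try {
            return balance;
        } finally {
            lock.readLock().unlock();
        }
    }

    /** Writers take the exclusive write lock, so updates are serialized. */
    void increment(int x) {
        lock.writeLock().lock();
        try {
            balance += x;
        } finally {
            lock.writeLock().unlock();
        }
    }
}

class ReadWriteDemo {
    /** Runs increment(1) and increment(2) in two threads against a balance of 3. */
    static int run() {
        ReadWriteAccount acct = new ReadWriteAccount(3);
        Thread t1 = new Thread(() -> acct.increment(1));
        Thread t2 = new Thread(() -> acct.increment(2));
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return acct.getBalance();
    }

    public static void main(String[] args) {
        System.out.println(run()); // 6, whichever thread runs first
    }
}
```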

So now the question is whether it is better to use single writer, single reader with cached beans, or to use multiple reader, single writer without cached beans? Which is better from a performance point of view? Is it application dependent?

Now after all this babbling I think I should make some conclusions based on today's technology.

1. For most cases using entity beans with pessimistic concurrency is what you want. Furthermore you actually want to rely on pessimistic concurrency in terms of correctness of your program.
2. For methods that do not update the database try to write the bean so that ejbStore only makes updates if the fields have changed. (What is the best way to do that?) I looked at WLS examples and it seems ugly and dangerous to me.

If you can manage to do 1 and 2 then I think you can still achieve a pretty high performance system, because we have to remember that the above discussion applies specifically to the case where the same EJB object is called by multiple clients. We are still using multiple beans for different objects, and that is where you will have the greatest performance gains. I am still bothered by having to write extra code to prevent the ejbLoad -> method call -> ejbStore cycle in the case of read-only operations. Is there an easy way to do that? Looking at the WLS example it appears that you have to do things all over the place to get that to work. So now I have gained some benefits but sacrificed others. (Namely, I have to write more code to avoid the cycle.)

Maybe then I should make another conclusion (instead of the one above).

2. For methods that are read only use a stateless session bean.

The reason I have to do 2 is because otherwise I would have to do all that extra programming and thinking to avoid the ejbLoad -> call method -> ejbStore cycle. If I had a very easy way to do that without having to think and do extra coding I would use the previous approach. That is I would put read operations in the entity bean as well.

Using this approach I lose the efficiency of caching but the database may take care of that to some extent.

This whole discussion brings us back to another conclusion. "This should be easy and obvious, but it's not"! And that is what I don't like about ejb. It is a love hate relationship I have with ejb. Can't live with it and can't live without it. :-)

dan

Chris Raber wrote:

Assaf,
Some fine points. Comments:
> -----Original Message-----
> From: Assaf Arkin [SMTP:[EMAIL PROTECTED]]
> Sent: Wednesday, February 09, 2000 4:23 PM
> To:   [EMAIL PROTECTED]
> Subject:      Re: This should be easy an obvious, but it's not.
>
> Chris, one clarification (I'm now discussing the same issue with RMH on a
> separate thread).
>
> Assume we're always talking about the same identity, because that's
> where the problem is. You have two transactions T1 and T2 finding and
> then calling methods on instances of that identity.
>
> I agree that T1 and T2 should not attempt to access the same instance,
> that would just kill performance.
>
Check.
> My question is, if T1 attempts to find the identity twice, will it
> always get the same identity A, or will it get two identities in the
> same transaction?
>
It should find the same physical instance A as long as it is in the same
address space. If T1 is distributed over more than one server, another
identical A could get instantiated, using yet another resource... In
GemStone/J we have a "co-location" policy so that once we cross over into
the server, activation of one component by another forces a load in the same
server process.
I can't think through the detail on short notice, but perhaps if you had a
long transaction started way out at the client, and the client looked up
components in multiple app server instances, you could get a distributed
transaction where the same identity was accessed with two/more different XA
resources. Seems far fetched, and I would never architect that way, but you
never know what people will dream up!
> Also, what happens if T1 is using a remote reference Ra, T2 is using a
> remote reference Rb, and now T1 is using Rb as well?
>
Luckily the remote reference and the instance that gets invoked can be
separate in EJBs. Remember it is the XA resource behind the beans that gets
registered with the transaction manager, so as long as the EJB server
properly propagates transaction resources and keeps its resource
book-keeping straight, everything should be ok.
> arkin
>
>
> Chris Raber wrote:
> >
> > Dan,
> >
> > I think the bottom line is that the EJB server has to provide transaction
> > isolation to the simultaneous operations on our beans. In GemStone/J we do
> > this by having separate instances of the bean per transaction, and the
> > datasource handles the concurrency issues. This works as long as we have
> > overlapping transactions.
> >
> > If you pull data out of the beans, copy it to the client, update the data,
> > and write it back to the bean in a different transaction from the original
> > read, then you must use optimistic concurrency control (i.e. "dirty
> > detection"...).
> >
> > Other servers will handle this by synchronizing access to a single bean
> > instance at the Java level. This will perform very poorly for update-intense
> > applications.
> >
> > Regards,
> >
> > -Chris.
> >
> > > -----Original Message-----
> > > From: dan benanav [SMTP:[EMAIL PROTECTED]]
> > > Sent: Wednesday, February 09, 2000 7:50 AM
> > > To:   [EMAIL PROTECTED]
> > > Subject:      This should be easy an obvious, but it's not.
> > >
> > > I have a question about how to implement something using EJB. I am sure
> > > that this problem is very common and that it has come up in discussions
> > > in various forms on this list, but I don't think the right approach has
> > > been clarified. This is surprising since the spec should be able to
> > > handle such a situation rather easily. It appears to me that many
> > > people have a misunderstanding about EJB that leads to an incorrect way
> > > to solve this problem. Included in that misunderstanding is the Sun
> > > BluePrints guide. In the following I will assume we are talking about the
> > > EJB 1.1 spec and I am interested in a solution that conforms to that spec.
> > >
> > > Problem: Suppose you have an Account class that is the remote interface
> > > to an Account bean. AccountBean is the entity bean implementation
> > > class. You want to write a method increment(int x) that adds x to the
> > > balance on the account represented by the Account instance. There are 2
> > > approaches that come to mind about how to do this. The first is the
> > > most obvious but I believe there are problems with it.
> > >
> > > 1) public void increment(int x) { balance += x; }
> > >
> > > 2) public void increment(int x) { // use JDBC to increment the balance in
> > > the db using SQL like "update account_table set balance = balance + ?
> > > where account_key = ?". The first question mark is set to x and the
> > > second question mark is set to the primary key of the bean. }
> > >
> > > Why isn't 1) good? Containers are free to use multiple bean instances
> > > to handle concurrent calls to an entity object as long as those calls
> > > are occurring under different transactions. This means that you can
> > > have bean instance A and bean instance B both handling calls to
> > > increment, where both A and B represent the same entity object (same
> > > primary key). Some containers may choose not to do this, however there
> > > is no guarantee that the container provider won't change their approach
> > > in the future. Furthermore if you rely on the container not doing that
> > > your code will not work on every EJB server. This would make it difficult
> > > for EJB bean providers (like the Theory Center) to provide server- or
> > > container-independent beans. So if you agree now that your code should
> > > work if the container provider uses multiple bean instances then I think
> > > there is a problem. Suppose that in two separate threads calls are
> > > made to an entity object. One thread calls increment(1) and the other
> > > increment(2). These calls occur concurrently. Assume also that the
> > > initial balance before the calls is 3. After the calls the new balance
> > > should be 6. Here is what can occur:
> > >
> > > T1: Container calls ejbLoad on Bean A: balance gets set to 3.
> > > T2: Container calls ejbLoad on Bean B: balance is still 3. (This
> > > happens in a separate thread and separate transaction context.)
> > > T3: Container calls increment(1) on Bean A: balance gets set to 4 in Bean A.
> > > T4: Container calls increment(2) on Bean B: balance gets set to 5 in Bean B.
> > > T5: Container calls ejbStore on Bean A: balance in database is updated to 4.
> > > T6: Container calls ejbStore on Bean B: what happens??
> > >
> > > What happens may depend on what the transaction isolation level is and
> > > perhaps on the underlying database. If you use Oracle and the default
> > > transaction isolation level (read committed) then the balance gets set
> > > to 5 in the database. (The wrong answer!) If you use the serializable
> > > transaction isolation level then an exception is thrown and the client
> > > will have to react appropriately to the exception. So it would appear
> > > that the only way 1) could possibly work (in some sense) is to use
> > > the serializable isolation level. So 1) with serializable isolation is
> > > an approach but it is incompletely specified since we haven't said how
> > > the client should handle this. I imagine there are ways to handle it.
> > > (Catch the exception and try again, for example.) Furthermore I have
> > > been told by WLS not to use transaction serializable due to a bug in
> > > Oracle so 1) does not currently work for Oracle. Even if this would
> > > work it seems like it would be better to just force the container, or to
> > > specify in the deployment descriptor, that calls to increment should be
> > > serialized. Currently there is no way to do that in the spec.
> > >
> > > The second solution would work however there are also problems with it.
> > > You need to be careful to make sure that when ejbStore is called the
> > > correct value is stored in the database. For example, you cannot just
> > > increment the value in the db. You would also need to set balance in the
> > > bean appropriately.
> > >
> > > So the question is, how should one implement this? Surely we should be
> > > able to reach a consensus on this rather simple question?
> > >
> > > dan
> > >
> > >
> > > ===========================================================================
> > > To unsubscribe, send email to [EMAIL PROTECTED] and include in the body
> > > of the message "signoff EJB-INTEREST". For general help, send email to
> > > [EMAIL PROTECTED] and include in the body of the message "help".
> >
>
> --
> ----------------------------------------------------------------------
> Assaf Arkin                                           www.exoffice.com
> CTO, Exoffice Technologies, Inc.                        www.exolab.org
>