Okay, let's start over from the beginning. This will be lengthy...
Around 2010 I was trying various data modelling scenarios and the "ant
load-demo" task was taking to long to execute, slowing down progress.
So, I started experimenting with a multi-threaded data loader. I was
able to get the ant task to execute in 1/10th the time using
multi-threading. I mentioned my experiment on the dev mailing list and
Adam Heath got interested. He said my design was similar to SEDA.
Anyone wanting more information on SEDA can look it up, but the main
point of that project that needs to be mentioned here is the concept of
making multi-threading SMARTER. That confirmed a discovery I made while
working on my multi-threaded data loader - more threads does not equal
better performance. There was a cutoff point where more threads slowed
things down.
In that dev list conversation with Adam, I mentioned the possibility of
bringing the SEDA design concepts into OFBiz. He replied it can't be
done because of the current transaction API (specifically, ThreadLocal
variables tying a transaction to a thread).
Eventually, Adam brought my multi-threaded data loader idea into the
project, and thanks to his work, my laptop creates the 800+ tables in
less than a second, and "ant load-demo" takes a little over a minute. I
think that is pretty cool and it was worth the effort.
But we still can't multi-thread other parts of the project - because of
the use of ThreadLocal variables.
Since that discussion in 2010, others in the Java community are
beginning to discover the insight SEDA delivered. Log4J 2 uses the
Disruptor design pattern - an evolution of SEDA.
Okay, why does OFBiz use ThreadLocal variables? It's a quick-and-dirty
way to implement an execution context - you don't have to pass a
Transaction instance to every method call.
The funny thing is, we use the Delegator - and THAT needs to be passed
to every method call. So why not have the Delegator reference the
transaction it is running in and eliminate the ThreadLocal variables?
If we can pass a Delegator instance off to other threads, then we are
one step closer to a SEDA or Disruptor design pattern for OFBiz.
Even if those design patterns aren't attractive to you, then the code
simplification I demonstrated earlier should be.
Solving the invalid entity cache problem is a side effect, not the main
motivator.
Adrian Crum
Sandglass Software
www.sandglass-software.com
On 8/27/2014 9:43 AM, Scott Gray wrote:
Okay whatever, thanks for wasting my time with this thread.
On 27/08/2014, at 9:39 am, Adrian Crum <[email protected]>
wrote:
I have, but no one is paying attention. Perhaps if I worked for Hotwax, I could
get someone to pay attention.
I'm done.
Adrian Crum
Sandglass Software
www.sandglass-software.com
On 8/27/2014 9:30 AM, Scott Gray wrote:
You're yet to describe the problem with using ThreadLocal variables.
If you read the Jira issue, I point out another problem with this clunky implementation - calling
"commit" doesn't really commit the transaction. That is why we end up with invalid data
in the entity cache - because developers are fooled into thinking the "commit" calls in
Delegator code actually commit the data, but they don't.
This is the only problem you've described so far and I say it has nothing to do
with our transaction management, it's because the cache isn't transaction
aware. I can't understand what changes you could be suggesting that would
solve that problem for the entity cache?
I've read the ticket now and realize I'm just playing Adam's role. Please
reconsider using Synchronization or XAResource.
Regards
Scott
On 27/08/2014, at 2:01 am, Adrian Crum <[email protected]>
wrote:
That doesn't solve the problem with using ThreadLocal variables.
Adrian Crum
Sandglass Software
www.sandglass-software.com
On 8/26/2014 11:06 PM, Scott Gray wrote:
I don't think that's an issue with our transaction handling, it's simply a
problem that the cache isn't transaction aware.
If the cache were to be transaction aware it would need to implement the
XAResource interface or perhaps even the simpler Synchronization interface and
push cache entries to the global cache only upon commit or discard them on
rollback. I'm loathe to suggest XAResource because we don't implement it
properly in GenericXaResource/ServiceXaWrapper/DebugXaWrapper and it breaks
some transaction managers (Atomikos is the only one I've tried). I have a
strong feeling it could be implemented using the Synchronization interface
without too much trouble though.
Regards
Scott
On 26/08/2014, at 9:46 pm, Adrian Crum <[email protected]>
wrote:
The concepts of "suspend" and "resume" are implemented by a ThreadLocal stack. A
"suspend" pushes the current transaction on the stack, and a "resume" pops a transaction off the
stack.
If you read the Jira issue, I point out another problem with this clunky implementation - calling
"commit" doesn't really commit the transaction. That is why we end up with invalid data
in the entity cache - because developers are fooled into thinking the "commit" calls in
Delegator code actually commit the data, but they don't. The transaction is committed by the first
bit of code that began the transaction - either a request event or a service invocation.
This is an arcane problem and it is difficult to describe, but I will try to
diagram it:
Request Event
Service Dispatcher
Begin Transaction (actual begin)
Begin Service
Some service logic
Delegator calls "commit" - nothing happens
Delegator puts uncommitted values in cache
More service logic
Delegator calls "commit" - nothing happens
Delegator puts uncommitted values in cache
End Service
Commit Transaction (actual commit)
Return service results to event handler
If something goes wrong in the service and the transaction is rolled back, the
uncommitted values in the cache are still there!
You really have to spend time in Entity Engine code to fully appreciate how
awful the transaction implementation really is.
My approach keeps a Transaction reference in the Delegator. Instead of calling the fake
"commit", the Delegator notifies the Transaction about changed values. The
Transaction saves the changed values locally. After the transaction is committed, the
Transaction instance copies the saved values to the cache.
If you look at my previous code fragment, there will be no more "suspend" or
"resume" - if you want a new transaction, you just get another instance and use it.
Adrian Crum
Sandglass Software
www.sandglass-software.com
On 8/26/2014 9:02 PM, Scott Gray wrote:
Okay so I guess I don't really understand what you're suggesting, or how it
really differs much from what we have now. It's also not clear what your
suggested API changes have to do with the ThreadLocal usages?
Thanks
Scott
On 26/08/2014, at 3:22 pm, Adrian Crum <[email protected]>
wrote:
Just use the Delegator factory.
Adrian Crum
Sandglass Software
www.sandglass-software.com
On 8/26/2014 2:43 PM, Scott Gray wrote:
Hi Adrian,
I'll probably have plenty of questions, but the first that comes to mind is:
how would you use a delegator outside of a transaction with this approach?
Thanks
Scott
On 25/08/2014, at 10:51 am, Adrian Crum <[email protected]>
wrote:
One persistent problem with the current Entity Engine implementation is the use
of ThreadLocal variables in the Delegator and Transactions. Their use makes it
difficult (and sometimes impossible) to fix Entity Engine bugs. They also make
it impossible to multi-thread a Delegator instance.
Here is what I have had percolating in my head the last few months:
Transaction tx = TransactionFactory.newTransaction();
Delegator delegator = tx.getDelegator("default");
// Do stuff with delegator
Transaction nestedTx = TransactionFactory.newTransaction();
Delegator nestedDelegator = nestedTx.getDelegator("default");
// Do stuff with nestedDelegator
nestedTx.commit();
tx.commit();
A Delegator instance always references the transaction it is running in.
The advantage to this approach is we gain the ability to hand off Delegator
instances to other threads. Other threads can even commit/rollback a
transaction:
Transaction tx = delegator.getTransaction();
tx.commit();
After a commit, the Delegator instance is discarded. Any attempt to use it
after a commit throws an exception (the same is true with the Transaction
instance).
Another problem is Delegator localization - which also uses ThreadLocal
variables. We can localize Delegator instances like this:
Transaction tx = TransactionFactory.newTransaction();
Delegator delegator = tx.getDelegator("default", locale);
Finally, the current implementation has a caching problem:
https://issues.apache.org/jira/browse/OFBIZ-5534
With the new design, the Delegator instance, Transaction instance, and entity
cache are tightly coupled - so that problem is easy to solve.
What do you think?
--
Adrian Crum
Sandglass Software
www.sandglass-software.com