Hi Bruno,
Bruno CROS wrote:
(to Armin) OK. I really do appreciate all your advice. Thanks again.
Here's my situation, resulting from a migration from 1.0.1 to 1.0.4 (within ojb-blank):
- I tried desperately to use the TwoLevelCache and cannot run the first batch (note that this batch looks like one big transaction). With 1.0.1, it executes in a few seconds, creating 420 * 2 records (not so big). From what I saw, it seems that the two-level cache reads all materialized objects all the time, as if it wanted to check every record against every other (following relations)! These reads seem useless to me. It's definitely not possible to hold such a quantity of instances in memory!
You have to distinguish the caching levels. The L2 cache (by default ObjectCacheDefaultImpl) normally will not cause memory issues, because SoftReferences are used for caching objects.
http://db.apache.org/ojb/docu/guides/objectcache.html#ObjectCacheDefaultImpl
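For reference, the cache implementation is selected via the ObjectCacheClass property in OJB.properties; a minimal sketch, assuming the 1.0.x property name:

# shared two-level cache (its application level uses ObjectCacheDefaultImpl by default)
ObjectCacheClass=org.apache.ojb.broker.cache.ObjectCacheTwoLevelImpl
# or the plain one-level default cache
#ObjectCacheClass=org.apache.ojb.broker.cache.ObjectCacheDefaultImpl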
On the other hand, to avoid endless loops while materializing circular object graphs, OJB always caches the whole object graph during materialization. This is done by a "MaterializationCache" (MC), which uses hard references. Once an object is fully materialized (e.g. a ProductGroup with a 1:n reference to 500 Article objects), the MC is cleared and the objects are pushed to the L1 cache (which uses SoftReferences again).
So if all 420 objects belong to the same object graph (and there is no proxy that breaks the materialization of the circular object graph), then indeed you can run into memory issues. But this behavior is the only way to avoid endless loops while materializing circular object graphs.
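If your 420 objects do form one connected graph, a proxy in the mapping is the usual way to break the eager walk. A sketch with assumed class/field names (borrowed from the OJB tutorial model, not from your mapping):

<class-descriptor class="ProductGroup" table="PRODUCT_GROUP">
  <field-descriptor name="groupId" column="GROUP_ID"
      jdbc-type="INTEGER" primarykey="true"/>
  <!-- proxy="true" defers loading of the 1:n reference until first access -->
  <collection-descriptor name="allArticlesInGroup"
      element-class-ref="Article"
      proxy="true"
      auto-retrieve="true">
    <inverse-foreignkey field-ref="productGroupId"/>
  </collection-descriptor>
</class-descriptor>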
Of course, I tried to break the loading mechanism. First I saw that setting auto-retrieve to "false" can produce fine results, but given its incompatibility/disadvantages with ODMG transactions (following your advice) I tried to keep auto-retrieve and set up CGLIB proxies instead. The batch never ended; it froze.
I don't think I understand when the issue arises: does it occur while creating/inserting the 420 objects, or when you read these objects back? Does the test pass when you decrease the object count or increase the memory of the JVM? Or does the issue always arise when you use the two-level cache?
The only solution that worked was to first go back to my old cache settings with ObjectCacheDefaultImpl. And then, yes, the first batch runs again (faster!). Phew.
Strange. I think we should verify whether or not the problem is caused by a bug in OJB (e.g. an endless loop during object materialization). Please try to run your test with different parameters (see above).
But afterwards, my problem was with the second batch. There seemed to be problems reading objects by identity. I got rid of the CGLIB proxy attributes (proxy="true" on the reference and proxy="dynamic" on the class-descriptor) and it worked.
So, I'm asking myself:
- Has anyone really used ObjectCacheTwoLevelImpl successfully with a small transaction such as the creation/update of 800 records?
The OJB perf-test
http://db.apache.org/ojb/docu/guides/performance.html#OJB+performance+in+multi-threaded+environments
runs 12 threads handling 500 objects (flat objects without references) per thread. Running this test with 1000 objects per thread isn't a problem. This test sets the JVM with <jvmarg value="-Xmx256m"/>, so handling a high object count isn't a problem (unless each object itself requires a lot of memory).
- Is the two-level cache (ObjectCacheTwoLevelImpl) really needed (to avoid dirty reads)?
Yep, unless you use repeatable-read as the locking isolation level and lock (at least read-lock) all objects read and used by the user:
http://db.apache.org/ojb/docu/guides/lockmanager.html#Supported+Isolation+Levels
But this has the disadvantage that many more LockNotGrantedExceptions will occur when concurrent threads operate on the same persistent objects.
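For illustration, a sketch of explicit read-locking under the ODMG API; Article and articleId are placeholder names, the lock() call and constants are standard ODMG, and the isolation level would be declared per class in the mapping (isolation-level="repeatable-read" on the class-descriptor). Checked exceptions are omitted for brevity:

import java.util.List;
import org.apache.ojb.odmg.OJB;
import org.odmg.*;

Implementation odmg = OJB.getInstance();
Database db = odmg.newDatabase();
db.open("repository.xml", Database.OPEN_READ_WRITE);

Transaction tx = odmg.newTransaction();
tx.begin();
OQLQuery query = odmg.newOQLQuery();
query.create("select a from " + Article.class.getName() + " where articleId = $1");
query.bind(new Integer(42));
Article article = (Article) ((List) query.execute()).get(0);
// read-lock before using the object; may throw LockNotGrantedException
tx.lock(article, Transaction.READ);
// ... read and use the object ...
tx.commit();
db.close();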
We used to work with ObjectCacheDefaultImpl, (re)reading all objects before update, and performance is not so bad.
If you re-read the object after the write lock is established, it is nevertheless possible for other threads to read the same object instance (from the cache) while the tx is running, and on rollback the other threads hold an object with invalid values.
It depends on your application whether this can happen and whether it would be a problem. If the user re-reads the object before the update, data integrity is guaranteed, because the correct object version will be read and the user will notice it (alternatively, optimistic locking can be used; this throws an exception on commit of dirty objects).
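For completeness, optimistic locking is enabled in the mapping by marking a version field; the field and column names below are assumptions:

<field-descriptor
    name="version"
    column="VERSION"
    jdbc-type="INTEGER"
    locking="true"/>

On commit of a stale object OJB then throws an OptimisticLockException.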
- Did I miss something in setting up the CGLib support? I set proxy="true" on the reference-descriptor and proxy="dynamic" on the referenced class.
Using a dynamic proxy for the persistent object and a proxy for the 1:1 reference to this object is redundant. The dynamic proxy replaces the persistent object with a proxy, and the reference proxy replaces the referenced object with a proxy instance, so this results in a "proxied proxy object".
In this case you can discard one kind of proxy.
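A sketch of the two alternatives (pick one; all names are assumed):

<!-- alternative 1: dynamic proxy on the referenced class -->
<class-descriptor class="Address" table="ADDRESS" proxy="dynamic">
  <!-- field descriptors as usual -->
</class-descriptor>

<!-- alternative 2: proxy only on the 1:1 reference -->
<reference-descriptor name="address" class-ref="Address" proxy="true">
  <foreignkey field-ref="addressId"/>
</reference-descriptor>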
- I didn't see the loading chain break significantly when using CGLib with the two-level cache. Tell me I'm wrong and that this is solely an effect of the cache manager.
Sorry, I don't understand your question; could you describe it in more detail?
regards,
Armin
Regards
On 2/6/06, Armin Waibel <[EMAIL PROTECTED]> wrote:
Hi Bruno,
Bruno CROS wrote:
About my previous batch troubles: in fact, I saw my model being loaded from everywhere with all the current auto-retrieve="true" settings, and I do mean everywhere! This also means that "back" relations are read, circling through the graph far too much. This was the cause of my OutOfMemoryError.
My model is a big one with a lot of relations, as complex as you can imagine.
So I'm asking myself whether to get rid of those auto-retrieve settings, to make all object reads faster (and avoid clogging). But I read that for ODMG development the recommended settings are auto-retrieve="true", auto-update="none", auto-delete="none". Do I absolutely have to follow this? If yes, why?
Generally, auto-retrieve="true" is mandatory when using the ODMG API. With it, OJB takes a snapshot (copying all fields and references to other objects) of each object when it is locked. On commit, OJB compares the snapshot with the current state of the object. This way OJB can detect changed fields and new or deleted objects in references (1:1, 1:n, m:n).
If auto-retrieve is disabled and the object is locked, OJB assumes that no references exist, or that the existing ones were deleted, even though the references do exist and were not deleted. So this can cause unexpected behavior, particularly with 1:1 references.
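To make the snapshot mechanism concrete, a minimal sketch (reusing the ODMG setup from the locking sketch above; Article and setPrice are placeholders):

Transaction tx = odmg.newTransaction();
tx.begin();
// ... query the Article as usual ...
tx.lock(article, Transaction.WRITE); // OJB snapshots fields and references here
article.setPrice(article.getPrice() * 1.1); // change is not reported to OJB explicitly
tx.commit(); // snapshot vs. current state decides what is written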
The easiest way to solve your problem is to use proxy-references. For
1:n and m:n you can use a collection-proxy:
http://db.apache.org/ojb/docu/guides/basic-technique.html#Using+a+Single+Proxy+for+a+Whole+Collection
For 1:1 references you can use proxies too.
http://db.apache.org/ojb/docu/guides/basic-technique.html#Using+a+Proxy+for+a+Reference
Normally this requires the use of an interface as the persistent object's reference field. But when using CGLib-based proxies it's not required, and you can simply set proxy="true" without any changes in your source code (see the sketch after the link below).
http://db.apache.org/ojb/docu/guides/basic-technique.html#Customizing+the+proxy+mechanism
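The switch to CGLib proxies is done in OJB.properties; the property below is the one documented for 1.0.4:

# use CGLib-based dynamic proxies, no interface needed on the persistent class
ProxyFactoryClass=org.apache.ojb.broker.core.proxy.ProxyFactoryCGLIBImpl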
If you can't use proxies (e.g. in a 3-tier application) you can disable auto-retrieve, provided you take care to:
- disable implicit locking in general
- carefully lock all objects before changing them (new objects too)
- before you lock an object (for update, delete, ...), retrieve the references of that object using PersistenceBroker.retrieveAllReferences(obj)/retrieveReference(...), as in the sketch below.
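A sketch of that manual pattern (obj stands for any persistent instance; getBroker, retrieveAllReferences and retrieveReference exist as shown in 1.0.x):

TransactionExt tx = (TransactionExt) odmg.currentTransaction();
PersistenceBroker broker = tx.getBroker();
// with auto-retrieve off, fill the references by hand before locking
broker.retrieveAllReferences(obj);
// or selectively, e.g.: broker.retrieveReference(obj, "allArticlesInGroup");
tx.lock(obj, Transaction.WRITE);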
At the start, I saw that setting auto-retrieve to "true" everywhere wasn't a solution, but all transaction and batch processes were working fine (until 1.0.4) with auto-retrieve on all n relations (yes!). Lucky us!
But with a little doubt, I told the whole dev team to avoid, as far as possible, reading by iterating collections without good reason, preferring ReportQuery (single shot) and direct object queries.
Is that the right way to get a fast and robust application?
...yep. The fastest way to look up a single object by PK is to use PB.getObjectByIdentity(...). If a cache is used, this method doesn't require a DB round trip in most cases.
http://db.apache.org/ojb/docu/tutorials/pb-tutorial.html#Find+object+by+primary+key
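A minimal PB-api sketch of such a lookup, assuming a single-field primary key; Article and the PK value are placeholders:

PersistenceBroker broker = PersistenceBrokerFactory.defaultPersistenceBroker();
try {
    Identity oid = broker.serviceIdentity().buildIdentity(Article.class, new Integer(42));
    // served from the cache when possible, no DB round trip
    Article a = (Article) broker.getObjectByIdentity(oid);
} finally {
    broker.close();
}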
Another thing: does defaultPersistenceBroker always return the tx broker when a tx has begun (in 1.0.4)? That's very important.
I hope I don't have to reimplement things with the TransactionExt.getBroker method. All our reads use defaultPersistenceBroker.
If you call PBF.defaultPB(...), OJB always returns a separate PB instance (for the "default" connection defined in the jdbc-connection-descriptor) from the PB pool, using its own connection.
TransactionExt.getBroker always returns the currently used PB instance (of the tx). This behavior never changed.
If you only do read operations with the separate PB instance you will not run into problems, except if you call tx.flush(). In that case the objects written to the database will not be noticed by the separate PB instance (until tx.commit, with the default DB isolation level).
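In code, the distinction looks like this (sketch; both calls exist in 1.0.4):

// separate instance from the PB pool, with its own connection
PersistenceBroker readBroker = PersistenceBrokerFactory.defaultPersistenceBroker();

// the broker bound to the running ODMG tx
TransactionExt tx = (TransactionExt) odmg.currentTransaction();
PersistenceBroker txBroker = tx.getBroker();

// changes flushed via tx.flush() are visible to txBroker,
// but not to readBroker until tx.commit (default DB isolation level)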
regards,
Armin
Thank you all in advance.
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]