Re: CMP FBPK Yields an individual SELECT per column

Matt Hogstrom Thu, 08 Sep 2005 11:34:11 -0700

Sounds like a good strategy. Is there some documentation that goes along with this? My opinion is that the default behaviour should be that all columns are faulted in when first referenced. My concern is that for databases like Oracle that run at READ-COMMITTED, multiple trips to the database can return a fragmented view of the data, as other Txs could have updated the row in between gets. As a consequence a transaction may act inappropriately based on the fragmented data.
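
To make the fragmented-read concern concrete, here is a minimal sketch. It is not Geronimo or TranQL code: it uses Python's sqlite3 module in autocommit mode as a stand-in for statement-level read consistency, and the `quote` table, its columns, and the values are made up for illustration.

```python
import os
import sqlite3
import tempfile

def read_fragmented():
    """Read two columns in two SELECTs while another connection commits in between."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "quotes.db")
        # isolation_level=None means autocommit: each statement sees the latest
        # committed data (an assumption-level stand-in for READ COMMITTED).
        reader = sqlite3.connect(path, isolation_level=None)
        writer = sqlite3.connect(path, isolation_level=None)
        try:
            writer.execute(
                "CREATE TABLE quote (symbol TEXT PRIMARY KEY, low REAL, high REAL)"
            )
            writer.execute("INSERT INTO quote VALUES ('s:153', 10.0, 12.0)")

            # First trip: fault in one column.
            low = reader.execute(
                "SELECT low FROM quote WHERE symbol = ?", ("s:153",)
            ).fetchall()[0][0]

            # Another transaction rewrites the row and commits.
            writer.execute(
                "UPDATE quote SET low = 20.0, high = 22.0 WHERE symbol = ?",
                ("s:153",),
            )

            # Second trip: fault in another column of the "same" row.
            high = reader.execute(
                "SELECT high FROM quote WHERE symbol = ?", ("s:153",)
            ).fetchall()[0][0]
            return low, high
        finally:
            reader.close()
            writer.close()

low, high = read_fragmented()
print(low, high)
```

The reader ends up holding `low` from before the update and `high` from after it, a combination the row never had at any point in time. Fetching both columns in one SELECT would avoid this.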

This isn't a problem for other databases like DB2 which runs at Read Stability by default (this has its own set of annoying issues but at least data corruption isn't one). Firebird would probably work OK with this configuration, but given the popularity of Oracle a conservative approach makes sense.

Can you provide the configuration information and I'll put a recommendation on the list for a change in the default behaviour.

I'm still chasing the CMR problem but I'm confirming it's not a configuration problem.


- Matt

Jeremy Boynes wrote:

Matt asked a couple of questions on IRC related to this:

 > SELECT Q.symbol FROM QuoteEJB Q WHERE Q.symbol = s:153
This query is checking whether an entity exists - I thought for that we actually issued
SELECT 1 FROM QuoteEJB Q WHERE Q.symbol = s:153
but the effect is similar.
The columns are probably being fetched individually because there is no pre-fetch information defined. The challenge here is to pick the right default set for the work that is being performed in the transaction - not too much, not too little.
If we simply fetch all, then we may load way more information than is needed and that gets problematic when the table has a couple hundred columns or large objects like attachments or images.
If we fetch too little then we see this behaviour and do way more trips to the database than desirable.
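
The two extremes can be compared with a small sketch. This is an illustration only, not what TranQL generates: it uses Python's sqlite3 with an in-memory table, counts executed SELECT statements as a stand-in for database round trips, and takes the column names from the log in the JIRA report. The column types and row values are assumptions.

```python
import sqlite3

# Column names as they appear in the logged SELECTs; types below are assumed.
COLUMNS = ["symbol", "companyName", "volume", "price",
           "open1", "low", "high", "change1"]

def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE QuoteEJB (symbol TEXT PRIMARY KEY, companyName TEXT, "
        "volume REAL, price REAL, open1 REAL, low REAL, high REAL, change1 REAL)"
    )
    conn.execute(
        "INSERT INTO QuoteEJB VALUES "
        "('s:153', 'S153 Incorporated', 1000, 10.5, 10.0, 9.5, 11.0, 0.5)"
    )
    conn.commit()
    return conn

def count_selects(conn, work):
    """Run work(conn) and return (result, number of SELECTs executed)."""
    statements = []
    conn.set_trace_callback(statements.append)
    try:
        result = work(conn)
    finally:
        conn.set_trace_callback(None)
    selects = [s for s in statements if s.lstrip().upper().startswith("SELECT")]
    return result, len(selects)

def load_per_column(conn):
    """Fetch too little: one SELECT per field."""
    row = {}
    for col in COLUMNS:
        # col comes from the fixed list above, never from user input.
        sql = f"SELECT Q.{col} FROM QuoteEJB Q WHERE Q.symbol = ?"
        row[col] = conn.execute(sql, ("s:153",)).fetchone()[0]
    return row

def load_all(conn):
    """Fetch all: one SELECT for the whole entity."""
    cols = ", ".join(f"Q.{c}" for c in COLUMNS)
    sql = f"SELECT {cols} FROM QuoteEJB Q WHERE Q.symbol = ?"
    return dict(zip(COLUMNS, conn.execute(sql, ("s:153",)).fetchone()))

conn = make_db()
per_column_row, per_column_selects = count_selects(conn, load_per_column)
all_row, all_selects = count_selects(conn, load_all)
print(per_column_selects, all_selects)
```

Both loaders return the same data; the difference is eight statements versus one. With a couple hundred columns or large objects, the single wide SELECT is where the cost moves instead.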
Complicating things further is that there is often little correlation between the finder invoked and the fields that get accessed during the transaction. For example, when an application is displaying data, it may find a bean and then need to read every field so adding all fields to the query is desirable; when updating data, it may use the same finder but then overwrite every field and so optimal behaviour would be to add no fields to the query.
Also, a lot of this behaviour is influenced by the caching strategy in place. This lack of prefetch information does not cause problems if all the data being accessed is held in a local cache; it will be faulted in once and then reused. Of course, if caching is disabled then this won't perform well.
Where the appropriate hook points are also depends on the front-end persistence model. For example:
* CMP1 allows access to fields and has no relationships so you really
  need to load all fields for the bean and can't prefetch children
* Hibernate allows access to fields but does have relationships so
  you need to load all fields but can prefetch (this is without field
  access interception, if you do that you can lazy load fields)
* CMP2 intercepts all access so you can choose which fields to load
  and can prefetch relationships
TranQL supports all these different models through the concept of query events (e.g. when a finder runs) and through cache-miss events. The basic strategy is "do-something-when-a-cache-miss-occurs" where "something" is defined by the front-end depending on the access model it supports; the "something" may have side effects such as loading other values into the cache (which is how prefetch works).
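
The cache-miss idea can be sketched in a few lines. None of the names below are TranQL API: `Cache`, `load_one_field`, `load_all_fields`, and the in-memory `TABLE` are hypothetical, and the `trips` list merely records what a real handler would send to the database.

```python
# Hypothetical backing store standing in for the database.
TABLE = {
    "s:153": {"symbol": "s:153", "companyName": "S153 Incorporated", "price": 10.5},
}

class Cache:
    """Tiny (key, field) -> value cache that delegates misses to a handler."""

    def __init__(self, on_miss):
        self._values = {}
        self._on_miss = on_miss

    def put(self, key, field, value):
        self._values[(key, field)] = value

    def get(self, key, field):
        if (key, field) not in self._values:
            # "Do something when a cache miss occurs": the handler must put()
            # the requested field and may put() others as a side effect.
            self._on_miss(self, key, field)
        return self._values[(key, field)]

def make_handlers(trips):
    def load_one_field(cache, key, field):
        # No prefetch: every miss costs its own trip.
        trips.append(f"SELECT {field}")
        cache.put(key, field, TABLE[key][field])

    def load_all_fields(cache, key, field):
        # Prefetch: one trip loads every field of the bean into the cache.
        trips.append("SELECT all fields")
        for name, value in TABLE[key].items():
            cache.put(key, name, value)

    return load_one_field, load_all_fields

def read_bean(cache, key):
    return {name: cache.get(key, name) for name in TABLE[key]}

lazy_trips, prefetch_trips = [], []
lazy_handler, _ = make_handlers(lazy_trips)
_, prefetch_handler = make_handlers(prefetch_trips)

lazy_bean = read_bean(Cache(lazy_handler), "s:153")
prefetch_bean = read_bean(Cache(prefetch_handler), "s:153")
print(len(lazy_trips), len(prefetch_trips))
```

The cache itself does not know which policy is in effect; swapping the handler is all it takes to go from one trip per field to one trip per bean, which is the shape of the fix discussed in this thread.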
So, the simple fix here is to set up prefetch associated with the finder or with the ejbLoad event which will load all the columns for the bean.
In the longer term, OpenEJB should be extended to associate cache pre-load operations with transaction initiation so that the entire data graph can be loaded up front in one query.
--
Jeremy

Matt Hogstrom (JIRA) wrote:
CMP FBPK Yields an individual SELECT per column
-----------------------------------------------

         Key: GERONIMO-985
         URL: http://issues.apache.org/jira/browse/GERONIMO-985
     Project: Geronimo
        Type: Bug
    Versions: 1.0-M5
 Environment: Geronimo w/Derby
    Reporter: Matt Hogstrom
I'm testing the DayTrader Application and it appears that FBPK finds are executing a single SELECT per CMP field. Here are the SELECTs making up a single OrderEJB.findByPrimaryKey(). This should be broken down into a single SELECT for the entity.
SELECT Q.symbol FROM QuoteEJB Q WHERE Q.symbol = s:153
SELECT Q.symbol FROM QuoteEJB Q WHERE Q.symbol = s:153
SELECT Q.companyName FROM QuoteEJB Q WHERE Q.symbol = s:153
SELECT Q.volume FROM QuoteEJB Q WHERE Q.symbol = s:153
SELECT Q.price FROM QuoteEJB Q WHERE Q.symbol = s:153
SELECT Q.open1 FROM QuoteEJB Q WHERE Q.symbol = s:153
SELECT Q.low FROM QuoteEJB Q WHERE Q.symbol = s:153
SELECT Q.high FROM QuoteEJB Q WHERE Q.symbol = s:153
SELECT Q.change1 FROM QuoteEJB Q WHERE Q.symbol = s:153
