[jira] Created: (IBATIS-503) Optimised groupBy implementation with row handler support

William Shields (JIRA) Fri, 18 Apr 2008 08:35:41 -0700

Optimised groupBy implementation with row handler support
---------------------------------------------------------


                 Key: IBATIS-503
                 URL: https://issues.apache.org/jira/browse/IBATIS-503
             Project: iBatis for Java
          Issue Type: Improvement
          Components: SQL Maps
    Affects Versions: 2.3.0
            Reporter: William Shields


I previously raised a bug against groupBy usage with row handlers.  Basically 
if you had data like this:

USER_NUMBER,ACCOUNT_NUMBER,SECURITY_CODE,UNITS
1,123,GOOG,100
1,123,MSFT,200
1,123,ORCL,300
1,456,AAPL,150
2,789,GOOG,120

and an abbreviated result map of:

<resultMap id="holding" resultClass="com.myclass.Holding">
  <result proeprty="securityCode" column="SECURITY_CODE"/>
  <result property="units" column="UNITS"/>
</resultMap>

<resultMap id="account" resultClass="com.myclass.Account" groupBy="userNumber, 
accountNumber">
  <result property="userNumber" column="USER_NUMBER"/>
  <result property="accountNumber" column="ACCOUNT_NUMBER"/>
  <result property="holdings" resultMap="holding"/>
</resultMap>

And yes this could be nested 3 levels (or more).  In the original bug report if 
you used queryWithRowHandler() this was the result:

1,123,GOOG,100
--> Fire for (1,123) Note: at this point holdings is of size 1 with only GOOG 
in there.
1,123,MSFT,200
1,123,ORCL,300
1,456,AAPL,150
--> Fire for (1,456)
2,789,GOOG,120
--> Fire for (2,789)

Now eventually it was decided this wasn't a bug because ibatis groupBy does 
implicit ordering (in fact I believe it stores the entire result set in a map 
using an identity key or something similar based on the groupBy expression).  
The reason for this is of course so the query still returns the same result no 
matter the ordering of the result set.  This has several problems:

1. The entire result set must be loaded before you can safely return any 
results (which goes against why you're using row handlers in the first place);

2. The row handler behaviour is misleading and arguably wrong; and

3. It is probably more efficient to sort the result with SQL (assuming its 
appropriately indexed, which it should be anyway).  This makes the internal 
sorting unnecessary.

My first proposal is to make the use of queryWithRowHandler() for a groupBy 
query to throw an error.

My second proposal is to have some kind of optimised groupBy alternative that 
does no internal sorting.  Perhaps this could take the form of another 
attribute (eg groupBySort="false").  The groupBy expression basically forms an 
identity key.  For each unique groupBy tuple there will be one row in the 
result set.  The optimised groupBy will basically "break" (in the DB/reporting 
sense) on a change in the identity expression.  Those of you who have done 
reporting languages like Natural, Jasper/Crystal Reports and the like will be 
familiar with the concept of break conditions.

The optimised/unsorted gorupBy alternative will allow queryWithRowHandler() and 
it will behave "correctly".  By this i Mean that if, as in the above dataset, 
the row handler for the first account (3 holdings) will be called after the 
third holding is added to the Account object.

I can (and have) coded a row handler that basically does this but its tedious 
and it has the disadvantage that the sort is still happening and the entire 
data set is getting loaded into memory.  IMHO ibatis needs a better solution.


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Created: (IBATIS-503) Optimised groupBy implementation with row handler support

Reply via email to