Optimised groupBy implementation with row handler support ---------------------------------------------------------
Key: IBATIS-503 URL: https://issues.apache.org/jira/browse/IBATIS-503 Project: iBatis for Java Issue Type: Improvement Components: SQL Maps Affects Versions: 2.3.0 Reporter: William Shields I previously raised a bug against groupBy usage with row handlers. Basically if you had data like this: USER_NUMBER,ACCOUNT_NUMBER,SECURITY_CODE,UNITS 1,123,GOOG,100 1,123,MSFT,200 1,123,ORCL,300 1,456,AAPL,150 2,789,GOOG,120 and an abbreviated result map of: <resultMap id="holding" resultClass="com.myclass.Holding"> <result proeprty="securityCode" column="SECURITY_CODE"/> <result property="units" column="UNITS"/> </resultMap> <resultMap id="account" resultClass="com.myclass.Account" groupBy="userNumber, accountNumber"> <result property="userNumber" column="USER_NUMBER"/> <result property="accountNumber" column="ACCOUNT_NUMBER"/> <result property="holdings" resultMap="holding"/> </resultMap> And yes this could be nested 3 levels (or more). In the original bug report if you used queryWithRowHandler() this was the result: 1,123,GOOG,100 --> Fire for (1,123) Note: at this point holdings is of size 1 with only GOOG in there. 1,123,MSFT,200 1,123,ORCL,300 1,456,AAPL,150 --> Fire for (1,456) 2,789,GOOG,120 --> Fire for (2,789) Now eventually it was decided this wasn't a bug because ibatis groupBy does implicit ordering (in fact I believe it stores the entire result set in a map using an identity key or something similar based on the groupBy expression). The reason for this is of course so the query still returns the same result no matter the ordering of the result set. This has several problems: 1. The entire result set must be loaded before you can safely return any results (which goes against why you're using row handlers in the first place); 2. The row handler behaviour is misleading and arguably wrong; and 3. It is probably more efficient to sort the result with SQL (assuming its appropriately indexed, which it should be anyway). This makes the internal sorting unnecessary. My first proposal is to make the use of queryWithRowHandler() for a groupBy query to throw an error. My second proposal is to have some kind of optimised groupBy alternative that does no internal sorting. Perhaps this could take the form of another attribute (eg groupBySort="false"). The groupBy expression basically forms an identity key. For each unique groupBy tuple there will be one row in the result set. The optimised groupBy will basically "break" (in the DB/reporting sense) on a change in the identity expression. Those of you who have done reporting languages like Natural, Jasper/Crystal Reports and the like will be familiar with the concept of break conditions. The optimised/unsorted gorupBy alternative will allow queryWithRowHandler() and it will behave "correctly". By this i Mean that if, as in the above dataset, the row handler for the first account (3 holdings) will be called after the third holding is added to the Account object. I can (and have) coded a row handler that basically does this but its tedious and it has the disadvantage that the sort is still happening and the entire data set is getting loaded into memory. IMHO ibatis needs a better solution. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.