comnetwork commented on a change in pull request #444: PHOENIX-5148
URL: https://github.com/apache/phoenix/pull/444#discussion_r265831803
 
 

 ##########
 File path: 
phoenix-core/src/it/java/org/apache/phoenix/end2end/SortMergeJoinMoreIT.java
 ##########
 @@ -423,7 +424,7 @@ public void testBug2894() throws Exception {
                         "         ) L\n" +
                         "     ON L.BUCKET = E.BUCKET AND L.TIMESTAMP = 
E.TIMESTAMP\n" +
                         " ) C\n" +
-                        " GROUP BY C.BUCKET, C.TIMESTAMP";
+                        " GROUP BY C.BUCKET, C.TIMESTAMP ORDER BY C.BUCKET, 
C.TIMESTAMP";
 
 Review comment:
   I added the order by here mainly for:
   
   1.Make the query result stable,because the original sql is only have  group 
by clause, and the group by is not `orderPreserving` before this patch, so must 
add a additional `SORTED BY [E.TIMESTAMP, E.BUCKET]`, but after this patch, the 
group by is  `orderPreserving`, so the query result is not same as without this 
patch.
   
   2.To test if the GroupBy and OrderBy could both be optimized by this patch, 
from the explain we can see the 
   `GROUP BY C.BUCKET, C.TIMESTAMP` is  `orderPreserving` and 
   `ORDER BY C.BUCKET, C.TIMESTAMP` is compiled out, which are meeting the 
expectation.
   
   If I do not add the order by, the explain assertions and query result 
assertions are same as the order by is added. 
   
   Another problem  I want to explain is  why the order of query result is 
reversed after this patch:
   
   the query result before this patch is :
                   assertEquals("5SEC", rs.getString(1));
                   assertEquals(1462993520000000000L, rs.getLong(2));
                   assertTrue(rs.next());
                   assertEquals("5SEC", rs.getString(1));
                   assertEquals(1462993515000000000L, rs.getLong(2));
                   assertTrue(rs.next());
                   assertEquals("5SEC", rs.getString(1));
                   assertEquals(1462993510000000000L, rs.getLong(2));
                   assertTrue(rs.next());
                   assertEquals("5SEC", rs.getString(1));
                   assertEquals(1462993505000000000L, rs.getLong(2));
                   assertTrue(rs.next());
                   assertEquals("5SEC", rs.getString(1));
                   assertEquals(1462993490000000000L, rs.getLong(2));
                   assertTrue(rs.next());
                   assertEquals("5SEC", rs.getString(1));
                   assertEquals(1462993485000000000L, rs.getLong(2));
                   assertTrue(rs.next());
                   assertEquals("5SEC", rs.getString(1));
                   assertEquals(1462993480000000000L, rs.getLong(2));
                   assertTrue(rs.next());
                   assertEquals("5SEC", rs.getString(1));
                   assertEquals(1462993475000000000L, rs.getLong(2));
                   assertTrue(rs.next());
                   assertEquals("5SEC", rs.getString(1));
                   assertEquals(1462993470000000000L, rs.getLong(2));
                   assertTrue(rs.next());
                   assertEquals("5SEC", rs.getString(1));
                   assertEquals(1462993430000000000L, rs.getLong(2));
   
   the query result after this patch is :
                  assertEquals("5SEC", rs.getString(1));
                   assertEquals(1462993430000000000L, rs.getLong(2));
                   assertTrue(rs.next());
                   assertEquals("5SEC", rs.getString(1));
                   assertEquals(1462993470000000000L, rs.getLong(2));
                   assertTrue(rs.next());
                   assertEquals("5SEC", rs.getString(1));
                   assertEquals(1462993475000000000L, rs.getLong(2));
                   assertTrue(rs.next());
                   assertEquals("5SEC", rs.getString(1));
                   assertEquals(1462993480000000000L, rs.getLong(2));
                   assertTrue(rs.next());
                   assertEquals("5SEC", rs.getString(1));
                   assertEquals(1462993485000000000L, rs.getLong(2));
                   assertTrue(rs.next());
                   assertEquals("5SEC", rs.getString(1));
                   assertEquals(1462993490000000000L, rs.getLong(2));
                   assertTrue(rs.next());
                   assertEquals("5SEC", rs.getString(1));
                   assertEquals(1462993505000000000L, rs.getLong(2));
                   assertTrue(rs.next());
                   assertEquals("5SEC", rs.getString(1));
                   assertEquals(1462993510000000000L, rs.getLong(2));
                   assertTrue(rs.next());
                   assertEquals("5SEC", rs.getString(1));
                   assertEquals(1462993515000000000L, rs.getLong(2));
                   assertTrue(rs.next());
                   assertEquals("5SEC", rs.getString(1));
                   assertEquals(1462993520000000000L, rs.getLong(2));
   
   We can see that the query result order is reversed after the patch, that is 
because without this patch , `CLIENT SORTED BY [E.TIMESTAMP, E.BUCKET]` 
(`E.TIMESTAMP` is moved first because is fixed width and not null) is added in 
order to complete the aggregate, and we have to note is  that the `SortOrder` 
of `E.TIMESTAMP` is `DESC` in table `eventCountTableName`'s DDL,  meanwhile 
when we add `CLIENT SORTED BY [E.TIMESTAMP, E.BUCKET]` in 
`ClientAggregatePlan.iterator`, the `isAscending` parameter in following line 
159 is true, so after` OrderedResultIterator` in line 167, the result tuple is 
sort by the "DESC of E.TIMESTAMP",  that is to say , `order by E.TIMESTAMP 
DESC,E.BUCKET ASC`
   
   With this patch, the unnecessary sort  for completing GroupBy is avoid, 
after the SortMergeJoin, the ordering is `order by E.BUCKET ASC, E.TIMESTAMP 
ASC`, so the order of query result is reversed after this patch when the 
`E.BUCKET` is same.
    
   ```
   157              List<OrderByExpression> keyExpressionOrderBy = 
Lists.newArrayListWithExpectedSize(keyExpressions.size());
   158                for (Expression keyExpression : keyExpressions) {
   159                    keyExpressionOrderBy.add(new 
OrderByExpression(keyExpression, false, true));
   160               }
   161
   162             if (useHashAgg) {
   163                   // Pass in orderBy to apply any sort that has been 
optimized away
   164                   aggResultIterator = new 
ClientHashAggregatingResultIterator(context, iterator, serverAggregators, 
keyExpressions, orderBy);
   165                } else {
   166                    iterator =
   167                            new OrderedResultIterator(iterator, 
keyExpressionOrderBy,
   168                                    spoolingEnabled, thresholdBytes, 
null, null,
   169                                    projector.getEstimatedRowByteSize());
   170                    aggResultIterator = new 
ClientGroupedAggregatingResultIterator(LookAheadResultIterator.wrap(iterator), 
serverAggregators, keyExpressions);
   171                }
   172            }
   ```

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services

Reply via email to