[ https://issues.apache.org/jira/browse/PHOENIX-1006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14013270#comment-14013270 ]
jay wong commented on PHOENIX-1006:
-----------------------------------

[~jamestaylor] Yes, I saw that it sorts per region. But it sorts the regions one by one, not in parallel. So as the data and the number of regions grow, performance degrades sharply, since the sort is not parallel.

> 8x Performance enhancements in my group by query case.
> ------------------------------------------------------
>
>                 Key: PHOENIX-1006
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-1006
>             Project: Phoenix
>          Issue Type: Improvement
>    Affects Versions: 3.0.0
>            Reporter: jay wong
>         Attachments: PHOENIX-1006.patch, PHOENIX-1006v2.patch
>
>
> The query SQL is:
> SELECT count(1) as count,SUM(int_column) as sum_column, MAX(int_column) as
> max_column2,MIN(int_column) as min_column,AVG(int_column) as avg_column FROM
> table1 WHERE int_column IS NOT NULL GROUP BY int_column2 ORDER BY int_column
> DESC LIMIT 200;
> NOTE: The result has 55000 groups.
> AggregatePlan shows that when a query has a GROUP BY, the result iterator is
> set to MergeSortRowKeyResultIterator.
> And MergeSortRowKeyResultIterator needs an OrderedResultIterator.
> So whether or not a GROUP BY query has an ORDER BY, it ALWAYS sorts first.
> I think that is entirely unnecessary.
> So I fetch all results from the RegionServers and aggregate them by key.
> With the above SQL, the execution time (RT) drops from 30s to 4s.
> ==============================================
> But I found that for a GROUP BY without an ORDER BY, the result order could
> not be guaranteed.
> So I sort the results by key with a TreeSet.
> Now the result is the same as the query with MergeSortRowKeyResultIterator.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
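
As a rough illustration of the approach in the issue description (fetch every region's partial aggregates, merge them by group key, and keep the keys sorted), here is a minimal Java sketch. It is not the attached patch and not Phoenix code: `GroupMergeSketch`, `Partial`, and `mergeRegions` are hypothetical names, group keys are plain strings rather than row-key bytes, and a `TreeMap` stands in for the `TreeSet` mentioned in the description.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of client-side merging of per-region GROUP BY results. */
final class GroupMergeSketch {

    /** Partial aggregate for one group, as a single region might return it (assumed shape). */
    static final class Partial {
        final long count;
        final long sum;
        final int min;
        final int max;

        Partial(long count, long sum, int min, int max) {
            this.count = count;
            this.sum = sum;
            this.min = min;
            this.max = max;
        }

        /** Combines two partial aggregates that belong to the same group key. */
        Partial merge(Partial other) {
            return new Partial(
                    count + other.count,
                    sum + other.sum,
                    Math.min(min, other.min),
                    Math.max(max, other.max));
        }

        /** AVG is derived from the merged SUM and COUNT, not averaged per region. */
        double avg() {
            return (double) sum / count;
        }
    }

    /**
     * Merges the unsorted per-region results by group key. The TreeMap keeps
     * keys in sorted order, so no per-region sort is needed beforehand.
     */
    static TreeMap<String, Partial> mergeRegions(List<Map<String, Partial>> regionResults) {
        TreeMap<String, Partial> merged = new TreeMap<>();
        for (Map<String, Partial> region : regionResults) {
            for (Map.Entry<String, Partial> entry : region.entrySet()) {
                merged.merge(entry.getKey(), entry.getValue(), Partial::merge);
            }
        }
        return merged;
    }
}
```

Iterating the returned map visits groups in ascending key order, which is how the sketch keeps the result order stable for a GROUP BY without an ORDER BY. Each insertion costs O(log n) in the number of groups.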