[ https://issues.apache.org/jira/browse/PHOENIX-3452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15649794#comment-15649794 ]
chenglei edited comment on PHOENIX-3452 at 11/10/16 8:43 AM: ------------------------------------------------------------- I think the problem is cause by the GroupByCompiler, when GroupBy.compile method called the OrderPreservingTracker.track method to track the groupBy expression's order,just as following(in GroupByCompiler.java): {code:borderStyle=solid} 144 if (isOrderPreserving) { 145 OrderPreservingTracker tracker = new OrderPreservingTracker(context, GroupBy.EMPTY_GROUP_BY, Ordering.UNORDERED, expressions.size(), tupleProjector); 146 for (int i = 0; i < expressions.size(); i++) { 147 Expression expression = expressions.get(i); 148 tracker.track(expression); 149 } {code} The track method inappropriately used the sortOrder != SortOrder.getDefault() as the thrid "isNullsLast" parameter as following(in OrderPreservingTracker.java): {code:borderStyle=solid} 101 public void track(Expression node) { 102 SortOrder sortOrder = node.getSortOrder(); 103 track(node, sortOrder, sortOrder != SortOrder.getDefault()); 104 } 105 106 public void track(Expression node, SortOrder sortOrder, boolean isNullsLast) { {code} Once the node's SortOrder is SortOrder.DESC, the "isNullsLast" is true. it affected the GroupBy 's isOrderPreserving as following(in OrderPreservingTracker.java) : {code:borderStyle=solid} 141 if (node.isNullable()) { 142 if (!Boolean.valueOf(isNullsLast).equals(isReverse)) { 143 isOrderPreserving = false; 144 isReverse = false; 145 return; 146 } 147 } {code} As far as the sql "SELECT DISTINCT entity_id,score FROM test.test WHERE organization_id = 'org1' AND container_id = 'container1' ORDER BY score DESC" is concerned, GroupBy 's "isOrderPreserving" property is false,but the OrderByComilper thinks the OrderBy is OrderBy.FWD_ROW_KEY_ORDER_BY. Because the GroupBy 's "isOrderPreserving" property is false,so the aggregated results returned by the RegionServer's GroupedAggregateRegionObserver is unordered, and after the AggregatePlan of client side gets the aggregated results, the results only be sorted by the aggregated key through MergeSortRowKeyResultIterator, so the final results is not as expected.If the GroupBy 's "isOrderPreserving" property is true,the final results is ok. Actually, the "isNullsLast" parameter is just related to orderBy ,it should only affected the display order of "Null " in the sorted results , groupBy should not be affetced by "isNullsLast". I wrote a simple unit test to reproduce this problem in my patch: {code:borderStyle=solid} @Test public void testGroupByDesc() throws Exception { Connection conn = DriverManager.getConnection(getUrl()); try { conn.createStatement().execute("DROP TABLE IF EXISTS GROUPBYDESC_TEST"); String sql="CREATE TABLE IF NOT EXISTS GROUPBYDESC_TEST ( "+ "ORGANIZATION_ID VARCHAR,"+ "CONTAINER_ID VARCHAR,"+ "CONSTRAINT TEST_PK PRIMARY KEY ( "+ "ORGANIZATION_ID DESC,"+ "CONTAINER_ID DESC"+ "))"; conn.createStatement().execute(sql); sql="SELECT ORGANIZATION_ID, CONTAINER_ID,count(*) FROM GROUPBYDESC_TEST group by ORGANIZATION_ID, CONTAINER_ID"; PhoenixPreparedStatement statement = conn.prepareStatement(sql).unwrap(PhoenixPreparedStatement.class); QueryPlan queryPlan = statement.optimizeQuery(sql); queryPlan.iterator(); assertTrue(queryPlan.getGroupBy().isOrderPreserving()); } finally { conn.close(); } } {code} I uploaded my patch, [~jamestaylor], please review. was (Author: comnetwork): I think the problem is cause by the GroupByCompiler, when GroupBy.compile method called the OrderPreservingTracker.track method to track the groupBy expression's order,just as following(in GroupByCompiler.java): {code:borderStyle=solid} 144 if (isOrderPreserving) { 145 OrderPreservingTracker tracker = new OrderPreservingTracker(context, GroupBy.EMPTY_GROUP_BY, Ordering.UNORDERED, expressions.size(), tupleProjector); 146 for (int i = 0; i < expressions.size(); i++) { 147 Expression expression = expressions.get(i); 148 tracker.track(expression); 149 } {code} The track method inappropriately used the sortOrder != SortOrder.getDefault() as the thrid "isNullsLast" parameter as following(in OrderPreservingTracker.java): {code:borderStyle=solid} 101 public void track(Expression node) { 102 SortOrder sortOrder = node.getSortOrder(); 103 track(node, sortOrder, sortOrder != SortOrder.getDefault()); 104 } 105 106 public void track(Expression node, SortOrder sortOrder, boolean isNullsLast) { {code} Once the node's SortOrder is SortOrder.DESC, the "isNullsLast" is true. it affected the GroupBy 's isOrderPreserving as following(in OrderPreservingTracker.java) : {code:borderStyle=solid} 141 if (node.isNullable()) { 142 if (!Boolean.valueOf(isNullsLast).equals(isReverse)) { 143 isOrderPreserving = false; 144 isReverse = false; 145 return; 146 } 147 } {code} As far as the sql "SELECT DISTINCT entity_id,score FROM test.test WHERE organization_id = 'org1' AND container_id = 'container1' ORDER BY score DESC" concerned, GroupBy 's "isOrderPreserving" property is false,but the OrderByComilper thinks the OrderBy is OrderBy.FWD_ROW_KEY_ORDER_BY. Because GroupByComilper 's "isOrderPreserving" property is false,so the aggregated results returned by the RegionServer's GroupedAggregateRegionObserver is unordered, and after the AggregatePlan of client side get the aggregated results,the results just be sorted by the aggregatedKey through MergeSortRowKeyResultIterator, so the final results is not as expected. Actually, the "isNullsLast" parameter is just related to orderBy ,it should just affected the display order of "Null " in the sorted results , groupBy should not be affetced by "isNullsLast". I wrote a simple unit test to reproduce this problem in my patch: {code:borderStyle=solid} @Test public void testGroupByDesc() throws Exception { Connection conn = DriverManager.getConnection(getUrl()); try { conn.createStatement().execute("DROP TABLE IF EXISTS GROUPBYDESC_TEST"); String sql="CREATE TABLE IF NOT EXISTS GROUPBYDESC_TEST ( "+ "ORGANIZATION_ID VARCHAR,"+ "CONTAINER_ID VARCHAR,"+ "CONSTRAINT TEST_PK PRIMARY KEY ( "+ "ORGANIZATION_ID DESC,"+ "CONTAINER_ID DESC"+ "))"; conn.createStatement().execute(sql); sql="SELECT ORGANIZATION_ID, CONTAINER_ID,count(*) FROM GROUPBYDESC_TEST group by ORGANIZATION_ID, CONTAINER_ID"; PhoenixPreparedStatement statement = conn.prepareStatement(sql).unwrap(PhoenixPreparedStatement.class); QueryPlan queryPlan = statement.optimizeQuery(sql); queryPlan.iterator(); assertTrue(queryPlan.getGroupBy().isOrderPreserving()); } finally { conn.close(); } } {code} I uploaded my patch, [~jamestaylor], please review. > Secondary index and query using distinct: ORDER BY doesn't work correctly > ------------------------------------------------------------------------- > > Key: PHOENIX-3452 > URL: https://issues.apache.org/jira/browse/PHOENIX-3452 > Project: Phoenix > Issue Type: Bug > Affects Versions: 4.8.0 > Reporter: Joel Palmert > Assignee: chenglei > Attachments: PHOENIX-3452_v2.patch > > > This may be related to PHOENIX-3451 but the behavior is different so filing > it separately. > Steps to repro: > CREATE TABLE IF NOT EXISTS TEST.TEST ( > ORGANIZATION_ID CHAR(15) NOT NULL, > CONTAINER_ID CHAR(15) NOT NULL, > ENTITY_ID CHAR(15) NOT NULL, > SCORE DOUBLE, > CONSTRAINT TEST_PK PRIMARY KEY ( > ORGANIZATION_ID, > CONTAINER_ID, > ENTITY_ID > ) > ) VERSIONS=1, MULTI_TENANT=TRUE, REPLICATION_SCOPE=1, TTL=31536000; > CREATE INDEX IF NOT EXISTS TEST_SCORE ON TEST.TEST (CONTAINER_ID, SCORE DESC, > ENTITY_ID DESC); > UPSERT INTO test.test VALUES ('org1','container1','entityId6',1.1); > UPSERT INTO test.test VALUES ('org1','container1','entityId5',1.2); > UPSERT INTO test.test VALUES ('org1','container1','entityId4',1.3); > UPSERT INTO test.test VALUES ('org1','container1','entityId3',1.4); > UPSERT INTO test.test VALUES ('org1','container1','entityId2',1.5); > UPSERT INTO test.test VALUES ('org1','container1','entityId1',1.6); > SELECT DISTINCT entity_id, score > FROM test.test > WHERE organization_id = 'org1' > AND container_id = 'container1' > ORDER BY score DESC > Notice that the returned results are not returned in descending score order. > Instead they are returned in descending entity_id order. If I remove the > DISTINCT or remove the secondary index the result is correct. -- This message was sent by Atlassian JIRA (v6.3.4#6332)