[jira] [Comment Edited] (PHOENIX-3452) Secondary index and query using distinct: ORDER BY doesn't work correctly

chenglei (JIRA) Thu, 10 Nov 2016 00:44:45 -0800

    [ 
https://issues.apache.org/jira/browse/PHOENIX-3452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15649794#comment-15649794
 ]


chenglei edited comment on PHOENIX-3452 at 11/10/16 8:43 AM:
-------------------------------------------------------------

I think the problem is cause by the GroupByCompiler, when GroupBy.compile 
method called the OrderPreservingTracker.track method to track the groupBy 
expression's order,just as following(in GroupByCompiler.java):
{code:borderStyle=solid} 
144    if (isOrderPreserving) {
145         OrderPreservingTracker tracker = new 
OrderPreservingTracker(context, GroupBy.EMPTY_GROUP_BY, Ordering.UNORDERED, 
expressions.size(), tupleProjector);
146          for (int i = 0; i < expressions.size(); i++) {
147                    Expression expression = expressions.get(i);
148                    tracker.track(expression);
149           }
{code}


The track method inappropriately used the sortOrder != SortOrder.getDefault() 
as the thrid "isNullsLast" parameter as following(in 
OrderPreservingTracker.java):

{code:borderStyle=solid} 
  101     public void track(Expression node) {
  102           SortOrder sortOrder = node.getSortOrder();
  103           track(node, sortOrder, sortOrder != SortOrder.getDefault());
  104     }
  105    
  106    public void track(Expression node, SortOrder sortOrder, boolean 
isNullsLast) {
{code}

Once the node's SortOrder is SortOrder.DESC,  the "isNullsLast" is true. it 
affected the GroupBy 's isOrderPreserving  as following(in 
OrderPreservingTracker.java) :

{code:borderStyle=solid}
  141    if (node.isNullable()) {
  142        if (!Boolean.valueOf(isNullsLast).equals(isReverse)) {
  143              isOrderPreserving = false;
  144              isReverse = false;
  145              return;
  146        }
  147  }
{code}

As far as the sql "SELECT DISTINCT entity_id,score FROM test.test WHERE 
organization_id = 'org1' AND container_id = 'container1' ORDER BY score DESC" 
is concerned, GroupBy 's "isOrderPreserving" property is false,but the 
OrderByComilper thinks the OrderBy is OrderBy.FWD_ROW_KEY_ORDER_BY. 
Because the GroupBy 's "isOrderPreserving" property is false,so the aggregated 
results returned by the RegionServer's GroupedAggregateRegionObserver is 
unordered, and after the AggregatePlan of client side gets the aggregated 
results, the results only be sorted by the aggregated key through 
MergeSortRowKeyResultIterator, so the final results is not as expected.If the 
GroupBy 's "isOrderPreserving" property is true,the final results is ok.

Actually, the "isNullsLast"  parameter is just related to orderBy ,it  should 
only affected the display order of "Null " in the sorted results , groupBy 
should not be affetced by "isNullsLast". I wrote a simple unit test to 
reproduce this problem in my patch:

{code:borderStyle=solid}
    @Test
    public void testGroupByDesc() throws Exception {
        Connection conn = DriverManager.getConnection(getUrl());
        try {
            conn.createStatement().execute("DROP TABLE IF EXISTS 
GROUPBYDESC_TEST");

            String sql="CREATE TABLE IF NOT EXISTS GROUPBYDESC_TEST ( "+
                    "ORGANIZATION_ID VARCHAR,"+
                    "CONTAINER_ID VARCHAR,"+
                    "CONSTRAINT TEST_PK PRIMARY KEY ( "+
                    "ORGANIZATION_ID DESC,"+
                    "CONTAINER_ID DESC"+
                    "))";
            conn.createStatement().execute(sql);


            sql="SELECT ORGANIZATION_ID, CONTAINER_ID,count(*) FROM 
GROUPBYDESC_TEST group by ORGANIZATION_ID, CONTAINER_ID";
            PhoenixPreparedStatement statement = 
conn.prepareStatement(sql).unwrap(PhoenixPreparedStatement.class);

            QueryPlan queryPlan = statement.optimizeQuery(sql);
            queryPlan.iterator();
            assertTrue(queryPlan.getGroupBy().isOrderPreserving());

        } finally {
            conn.close();
        }
    }
{code}

I uploaded my patch, [~jamestaylor], please review.



was (Author: comnetwork):
I think the problem is cause by the GroupByCompiler, when GroupBy.compile 
method called the OrderPreservingTracker.track method to track the groupBy 
expression's order,just as following(in GroupByCompiler.java):
{code:borderStyle=solid} 
144    if (isOrderPreserving) {
145         OrderPreservingTracker tracker = new 
OrderPreservingTracker(context, GroupBy.EMPTY_GROUP_BY, Ordering.UNORDERED, 
expressions.size(), tupleProjector);
146          for (int i = 0; i < expressions.size(); i++) {
147                    Expression expression = expressions.get(i);
148                    tracker.track(expression);
149           }
{code}


The track method inappropriately used the sortOrder != SortOrder.getDefault() 
as the thrid "isNullsLast" parameter as following(in 
OrderPreservingTracker.java):

{code:borderStyle=solid} 
  101     public void track(Expression node) {
  102           SortOrder sortOrder = node.getSortOrder();
  103           track(node, sortOrder, sortOrder != SortOrder.getDefault());
  104     }
  105    
  106    public void track(Expression node, SortOrder sortOrder, boolean 
isNullsLast) {
{code}

Once the node's SortOrder is SortOrder.DESC,  the "isNullsLast" is true. it 
affected the GroupBy 's isOrderPreserving  as following(in 
OrderPreservingTracker.java) :

{code:borderStyle=solid}
  141    if (node.isNullable()) {
  142        if (!Boolean.valueOf(isNullsLast).equals(isReverse)) {
  143              isOrderPreserving = false;
  144              isReverse = false;
  145              return;
  146        }
  147  }
{code}

As far as the sql "SELECT DISTINCT entity_id,score FROM test.test WHERE 
organization_id = 'org1' AND container_id = 'container1' ORDER BY score DESC" 
concerned, GroupBy 's "isOrderPreserving" property is false,but the 
OrderByComilper thinks the OrderBy is OrderBy.FWD_ROW_KEY_ORDER_BY. Because 
GroupByComilper 's "isOrderPreserving" property is false,so the  aggregated 
results returned by the RegionServer's GroupedAggregateRegionObserver is 
unordered, and after the AggregatePlan of client side get the aggregated 
results,the results just be sorted by the aggregatedKey through 
MergeSortRowKeyResultIterator, so the final results is not as expected.

Actually, the "isNullsLast"  parameter is just related to orderBy ,it  should 
just affected the display order of "Null " in the sorted results , groupBy 
should not be affetced by "isNullsLast". I wrote a simple unit test to 
reproduce this problem in my patch:

{code:borderStyle=solid}
    @Test
    public void testGroupByDesc() throws Exception {
        Connection conn = DriverManager.getConnection(getUrl());
        try {
            conn.createStatement().execute("DROP TABLE IF EXISTS 
GROUPBYDESC_TEST");

            String sql="CREATE TABLE IF NOT EXISTS GROUPBYDESC_TEST ( "+
                    "ORGANIZATION_ID VARCHAR,"+
                    "CONTAINER_ID VARCHAR,"+
                    "CONSTRAINT TEST_PK PRIMARY KEY ( "+
                    "ORGANIZATION_ID DESC,"+
                    "CONTAINER_ID DESC"+
                    "))";
            conn.createStatement().execute(sql);


            sql="SELECT ORGANIZATION_ID, CONTAINER_ID,count(*) FROM 
GROUPBYDESC_TEST group by ORGANIZATION_ID, CONTAINER_ID";
            PhoenixPreparedStatement statement = 
conn.prepareStatement(sql).unwrap(PhoenixPreparedStatement.class);

            QueryPlan queryPlan = statement.optimizeQuery(sql);
            queryPlan.iterator();
            assertTrue(queryPlan.getGroupBy().isOrderPreserving());

        } finally {
            conn.close();
        }
    }
{code}

I uploaded my patch, [~jamestaylor], please review.


> Secondary index and query using distinct: ORDER BY doesn't work correctly
> -------------------------------------------------------------------------
>
>                 Key: PHOENIX-3452
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3452
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.8.0
>            Reporter: Joel Palmert
>            Assignee: chenglei
>         Attachments: PHOENIX-3452_v2.patch
>
>
> This may be related to PHOENIX-3451 but the behavior is different so filing 
> it separately.
> Steps to repro:
> CREATE TABLE IF NOT EXISTS TEST.TEST (
>     ORGANIZATION_ID CHAR(15) NOT NULL,
>     CONTAINER_ID CHAR(15) NOT NULL,
>     ENTITY_ID CHAR(15) NOT NULL,
>     SCORE DOUBLE,
>     CONSTRAINT TEST_PK PRIMARY KEY (
>         ORGANIZATION_ID,
>         CONTAINER_ID,
>         ENTITY_ID
>     )
> ) VERSIONS=1, MULTI_TENANT=TRUE, REPLICATION_SCOPE=1, TTL=31536000;
> CREATE INDEX IF NOT EXISTS TEST_SCORE ON TEST.TEST (CONTAINER_ID, SCORE DESC, 
> ENTITY_ID DESC);
> UPSERT INTO test.test VALUES ('org1','container1','entityId6',1.1);
> UPSERT INTO test.test VALUES ('org1','container1','entityId5',1.2);
> UPSERT INTO test.test VALUES ('org1','container1','entityId4',1.3);
> UPSERT INTO test.test VALUES ('org1','container1','entityId3',1.4);
> UPSERT INTO test.test VALUES ('org1','container1','entityId2',1.5);
> UPSERT INTO test.test VALUES ('org1','container1','entityId1',1.6);
> SELECT DISTINCT entity_id, score
> FROM test.test
> WHERE organization_id = 'org1'
> AND container_id = 'container1'
> ORDER BY score DESC
> Notice that the returned results are not returned in descending score order. 
> Instead they are returned in descending entity_id order. If I remove the 
> DISTINCT or remove the secondary index the result is correct.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Comment Edited] (PHOENIX-3452) Secondary index and query using distinct: ORDER BY doesn't work correctly

Reply via email to