[jira] [Comment Edited] (PHOENIX-3452) Secondary index and query using distinct: ORDER BY doesn't work correctly

chenglei (JIRA) Thu, 10 Nov 2016 00:37:42 -0800

    [ 
https://issues.apache.org/jira/browse/PHOENIX-3452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15649794#comment-15649794
 ]


chenglei edited comment on PHOENIX-3452 at 11/10/16 8:36 AM:
-------------------------------------------------------------

I think the problem is cause by the GroupByCompiler, when GroupBy.compile 
method called the OrderPreservingTracker.track method to track the groupBy 
expression's order,just as following(in GroupByCompiler.java):
{code:borderStyle=solid} 
144    if (isOrderPreserving) {
145         OrderPreservingTracker tracker = new 
OrderPreservingTracker(context, GroupBy.EMPTY_GROUP_BY, Ordering.UNORDERED, 
expressions.size(), tupleProjector);
146          for (int i = 0; i < expressions.size(); i++) {
147                    Expression expression = expressions.get(i);
148                    tracker.track(expression);
149           }
{code}


The track method inappropriately used the sortOrder != SortOrder.getDefault() 
as the thrid "isNullsLast" parameter as following(in 
OrderPreservingTracker.java):

{code:borderStyle=solid} 
  101     public void track(Expression node) {
  102           SortOrder sortOrder = node.getSortOrder();
  103           track(node, sortOrder, sortOrder != SortOrder.getDefault());
  104     }
  105    
  106    public void track(Expression node, SortOrder sortOrder, boolean 
isNullsLast) {
{code}

Once the node's SortOrder is SortOrder.DESC,  the "isNullsLast" is true. it 
affected the GroupBy 's isOrderPreserving  as following(in 
OrderPreservingTracker.java) :

{code:borderStyle=solid}
  141    if (node.isNullable()) {
  142        if (!Boolean.valueOf(isNullsLast).equals(isReverse)) {
  143              isOrderPreserving = false;
  144              isReverse = false;
  145              return;
  146        }
  147  }
{code}

As far as the sql concerned, GroupBy 's "isOrderPreserving" property is 
false,but the OrderByComilper thinks the OrderBy is 
OrderBy.FWD_ROW_KEY_ORDER_BY. Because GroupByComilper 's "isOrderPreserving" 
property is false,so the  aggregated results returned by the RegionServer's 
GroupedAggregateRegionObserver is unordered, and after the AggregatePlan of 
client side get the aggregated results,the results just be sorted by the 
aggregatedKey through MergeSortRowKeyResultIterator, so the final results is 
not as expected.

Actually, the "isNullsLast"  parameter is just related to orderBy ,it  should 
just affected the display order of "Null " in the sorted results , groupBy 
should not be affetced by "isNullsLast". I wrote a simple unit test to 
reproduce this problem in my patch:

{code:borderStyle=solid}
    @Test
    public void testGroupByDesc() throws Exception {
        Connection conn = DriverManager.getConnection(getUrl());
        try {
            conn.createStatement().execute("DROP TABLE IF EXISTS 
GROUPBYDESC_TEST");

            String sql="CREATE TABLE IF NOT EXISTS GROUPBYDESC_TEST ( "+
                    "ORGANIZATION_ID VARCHAR,"+
                    "CONTAINER_ID VARCHAR,"+
                    "CONSTRAINT TEST_PK PRIMARY KEY ( "+
                    "ORGANIZATION_ID DESC,"+
                    "CONTAINER_ID DESC"+
                    "))";
            conn.createStatement().execute(sql);


            sql="SELECT ORGANIZATION_ID, CONTAINER_ID,count(*) FROM 
GROUPBYDESC_TEST group by ORGANIZATION_ID, CONTAINER_ID";
            PhoenixPreparedStatement statement = 
conn.prepareStatement(sql).unwrap(PhoenixPreparedStatement.class);

            QueryPlan queryPlan = statement.optimizeQuery(sql);
            queryPlan.iterator();
            assertTrue(queryPlan.getGroupBy().isOrderPreserving());

        } finally {
            conn.close();
        }
    }
{code}

I uploaded my patch, [~jamestaylor], please review.



was (Author: comnetwork):
I think the problem is cause by the GroupByCompiler, when GroupBy.compile 
method called the OrderPreservingTracker.track method to track the groupBy 
expression's order,just as following(in GroupByCompiler.java):
{code:borderStyle=solid} 
144    if (isOrderPreserving) {
145         OrderPreservingTracker tracker = new 
OrderPreservingTracker(context, GroupBy.EMPTY_GROUP_BY, Ordering.UNORDERED, 
expressions.size(), tupleProjector);
146          for (int i = 0; i < expressions.size(); i++) {
147                    Expression expression = expressions.get(i);
148                    tracker.track(expression);
149           }
{code}


The track method inappropriately used the sortOrder != SortOrder.getDefault() 
as the thrid "isNullsLast" parameter as following(in 
OrderPreservingTracker.java):

{code:borderStyle=solid} 
  101     public void track(Expression node) {
  102           SortOrder sortOrder = node.getSortOrder();
  103           track(node, sortOrder, sortOrder != SortOrder.getDefault());
  104     }
  105    
  106    public void track(Expression node, SortOrder sortOrder, boolean 
isNullsLast) {
{code}

Once the node's SortOrder is SortOrder.DESC,  the "isNullsLast" is true. it 
affected the GroupBy 's isOrderPreserving  as following(in 
OrderPreservingTracker.java) :

{code:borderStyle=solid}
  141    if (node.isNullable()) {
  142        if (!Boolean.valueOf(isNullsLast).equals(isReverse)) {
  143              isOrderPreserving = false;
  144              isReverse = false;
  145              return;
  146        }
  147  }
{code}

Actually, the "isNullsLast"  parameter is just related to orderBy ,it  should 
just affected the display order of "Null " in the sorted results , groupBy 
should not be affetced by "isNullsLast". I wrote a simple unit test to 
reproduce this problem in my patch:

{code:borderStyle=solid}
    @Test
    public void testGroupByDesc() throws Exception {
        Connection conn = DriverManager.getConnection(getUrl());
        try {
            conn.createStatement().execute("DROP TABLE IF EXISTS 
GROUPBYDESC_TEST");

            String sql="CREATE TABLE IF NOT EXISTS GROUPBYDESC_TEST ( "+
                    "ORGANIZATION_ID VARCHAR,"+
                    "CONTAINER_ID VARCHAR,"+
                    "CONSTRAINT TEST_PK PRIMARY KEY ( "+
                    "ORGANIZATION_ID DESC,"+
                    "CONTAINER_ID DESC"+
                    "))";
            conn.createStatement().execute(sql);


            sql="SELECT ORGANIZATION_ID, CONTAINER_ID,count(*) FROM 
GROUPBYDESC_TEST group by ORGANIZATION_ID, CONTAINER_ID";
            PhoenixPreparedStatement statement = 
conn.prepareStatement(sql).unwrap(PhoenixPreparedStatement.class);

            QueryPlan queryPlan = statement.optimizeQuery(sql);
            queryPlan.iterator();
            assertTrue(queryPlan.getGroupBy().isOrderPreserving());

        } finally {
            conn.close();
        }
    }
{code}

I uploaded my patch, [~jamestaylor], please review.


> Secondary index and query using distinct: ORDER BY doesn't work correctly
> -------------------------------------------------------------------------
>
>                 Key: PHOENIX-3452
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3452
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.8.0
>            Reporter: Joel Palmert
>            Assignee: chenglei
>         Attachments: PHOENIX-3452_v2.patch
>
>
> This may be related to PHOENIX-3451 but the behavior is different so filing 
> it separately.
> Steps to repro:
> CREATE TABLE IF NOT EXISTS TEST.TEST (
>     ORGANIZATION_ID CHAR(15) NOT NULL,
>     CONTAINER_ID CHAR(15) NOT NULL,
>     ENTITY_ID CHAR(15) NOT NULL,
>     SCORE DOUBLE,
>     CONSTRAINT TEST_PK PRIMARY KEY (
>         ORGANIZATION_ID,
>         CONTAINER_ID,
>         ENTITY_ID
>     )
> ) VERSIONS=1, MULTI_TENANT=TRUE, REPLICATION_SCOPE=1, TTL=31536000;
> CREATE INDEX IF NOT EXISTS TEST_SCORE ON TEST.TEST (CONTAINER_ID, SCORE DESC, 
> ENTITY_ID DESC);
> UPSERT INTO test.test VALUES ('org1','container1','entityId6',1.1);
> UPSERT INTO test.test VALUES ('org1','container1','entityId5',1.2);
> UPSERT INTO test.test VALUES ('org1','container1','entityId4',1.3);
> UPSERT INTO test.test VALUES ('org1','container1','entityId3',1.4);
> UPSERT INTO test.test VALUES ('org1','container1','entityId2',1.5);
> UPSERT INTO test.test VALUES ('org1','container1','entityId1',1.6);
> SELECT DISTINCT entity_id, score
> FROM test.test
> WHERE organization_id = 'org1'
> AND container_id = 'container1'
> ORDER BY score DESC
> Notice that the returned results are not returned in descending score order. 
> Instead they are returned in descending entity_id order. If I remove the 
> DISTINCT or remove the secondary index the result is correct.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Comment Edited] (PHOENIX-3452) Secondary index and query using distinct: ORDER BY doesn't work correctly

Reply via email to