[ 
https://issues.apache.org/jira/browse/PHOENIX-2067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Taylor updated PHOENIX-2067:
----------------------------------
    Attachment: PHOENIX-2067_v2.patch

[~samarthjain] - please review. Here's an overview:
- No change in behavior for existing tables. Queries that ORDER BY a 
variable-length, descending row key column will now sort correctly, but at the 
cost of always executing the ORDER BY: since the rows aren't stored in the 
correct order, we can't optimize it out.
- New tables (or indexes) will use the correct separator for DESC variable 
length row keys, so they won't be hit with the ORDER BY cost. See 
SchemaUtil.getSeparatorByte() for an overview of the logic to determine the 
separator byte.
- A new utility (psql.py -u option) is available to 1) display the physical 
tables affected by this bug and 2) optionally rewrite them so that they 
sort correctly.
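
To illustrate why the separator matters, here is a rough sketch in Python. It 
is not Phoenix code, and it rests on two assumptions that are not stated in 
this message: that DESC values are stored with every byte inverted, and that 
the corrected separator for DESC variable-length columns is 0xFF instead of 
0x00. SchemaUtil.getSeparatorByte() remains the authority on the real logic.

```python
# Illustrative model only; not Phoenix code.
# Assumption: a DESC value is stored with every byte inverted.
# Assumption: the old separator is 0x00 and the corrected DESC separator is 0xFF.
OLD_SEPARATOR = b"\x00"
DESC_SEPARATOR = b"\xff"


def invert(data: bytes) -> bytes:
    """Flip every bit so that ascending byte order means descending value order."""
    return bytes(b ^ 0xFF for b in data)


def encode_desc(value: bytes, separator: bytes) -> bytes:
    """Encode one variable-length DESC key part followed by its separator."""
    return invert(value) + separator


def scan_order(values, separator):
    """Return values in the order a byte-wise ascending scan would see them."""
    return sorted(values, key=lambda v: encode_desc(v, separator))


values = [b"a", b"ab", b"abc"]
expected = sorted(values, reverse=True)  # true descending order

old_order = scan_order(values, OLD_SEPARATOR)
new_order = scan_order(values, DESC_SEPARATOR)

print("expected:", expected)
print("0x00 separator:", old_order)
print("0xFF separator:", new_order)
```

In this model, a value that is a prefix of another ends in the separator where 
the longer value has an inverted data byte. With 0x00 the shorter value always 
sorts first, which is wrong for DESC; with 0xFF it sorts last.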

> Sort order incorrect for variable length DESC columns
> -----------------------------------------------------
>
>                 Key: PHOENIX-2067
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2067
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.4.0
>         Environment: HBase 0.98.6-cdh5.3.0
> jdk1.7.0_67 x64
> CentOS release 6.4 (2.6.32-358.el6.x86_64)
>            Reporter: Mykola Komarnytskyy
>            Assignee: James Taylor
>         Attachments: PHOENIX-2067_v1.patch, PHOENIX-2067_v2.patch
>
>
> Steps to reproduce:
> 1. Create a table: 
> CREATE TABLE mytable (id BIGINT not null PRIMARY KEY, timestamp BIGINT, 
> log_message varchar) IMMUTABLE_ROWS=true, SALT_BUCKETS=16;
> 2. Create two indexes:
> CREATE INDEX mytable_index_search ON mytable(timestamp,id) INCLUDE 
> (log_message) SALT_BUCKETS=16;
> CREATE INDEX mytable_index_search_desc ON mytable(timestamp DESC,id DESC) 
> INCLUDE (log_message) SALT_BUCKETS=16;
> 3. Upsert values:
> UPSERT INTO mytable VALUES(1, 1434983826018, 'message1');
> UPSERT INTO mytable VALUES(2, 1434983826100, 'message2');
> UPSERT INTO mytable VALUES(3, 1434983826101, 'message3');
> UPSERT INTO mytable VALUES(4, 1434983826202, 'message4');
> 4. Sort DESC by timestamp:
> select timestamp,id,log_message from mytable ORDER BY timestamp DESC;
> Failure: the data is sorted incorrectly. When two longs differ only in their 
> last two digits (e.g. 1434983826155, 1434983826100) and one of them ends 
> with '00', they are returned in the wrong order.
> Sorting result:
> 1434983826202
> 1434983826100
> 1434983826101
> 1434983826018



