I'm using 4.4.0-HBase-0.98 2015-07-01 22:31 GMT-07:00 James Taylor <jamestay...@apache.org>:
> Yufan, > What version of Phoenix are you using? > Thanks, > James > > On Wed, Jul 1, 2015 at 2:34 PM, Yufan Liu <yli...@kent.edu> wrote: > >> When I made more tests, I find that this problem happens after table got >> split. >> >> Here is the DDL I use to create table and index: >> CREATE TABLE IF NOT EXISTS t1 ( >> uid BIGINT NOT NULL, >> timestamp BIGINT NOT NULL, >> eventName VARCHAR >> CONSTRAINT my_pk PRIMARY KEY (uid, timestamp)) COMPRESSION='SNAPPY'; >> >> CREATE INDEX timestamp_index ON t1 (timestamp) INCLUDE (eventName) >> >> Attach is the sample data I used for test. It has about 4000 rows, when >> the timestamp_index table has one region, the query returns correct result: >> 1433334048443, but when I manually split it into 4 regions (use hbase >> tool), it returns 1433333024961. >> >> Let know if you find anything. Thanks! >> >> >> 2015-07-01 11:27 GMT-07:00 James Taylor <jamestay...@apache.org>: >> >>> If you could put a complete test (including your DDL and upsert of >>> data), that would be much appreciated. >>> Thanks, >>> James >>> >>> On Wed, Jul 1, 2015 at 11:20 AM, Yufan Liu <yli...@kent.edu> wrote: >>> >>>> I have tried to use query "SELECT timestamp FROM t1 ORDER BY timestamp >>>> DESC NULLS LAST LIMIT 1". But it still returns the same unexpected result. >>>> There seems to be some internal problems related. >>>> >>>> 2015-06-30 18:03 GMT-07:00 James Taylor <jamestay...@apache.org>: >>>> >>>>> Yes, reverse scan will be leveraged when possible. Make you use NULLS >>>>> LAST in your ORDER BY as rows are ordered with nulls first. >>>>> >>>>> On Tue, Jun 30, 2015 at 5:25 PM, Yufan Liu <yli...@kent.edu> wrote: >>>>> >>>>>> I used the HBase reverse scan to find the last row on the index >>>>>> table. It returned the expected result. I would like to know is Phoenix's >>>>>> "ORDER BY" >>>>>> and "DESC" implemented based on HBase reverse scan? >>>>>> >>>>>> 2015-06-26 17:25 GMT-07:00 Yufan Liu <yli...@kent.edu>: >>>>>> >>>>>>> Thank you anyway, Michael! >>>>>>> >>>>>>> 2015-06-26 17:21 GMT-07:00 Michael McAllister < >>>>>>> mmcallis...@homeaway.com>: >>>>>>> >>>>>>>> OK, I’m a Phoenix newbie, so that was the extent of the advice I >>>>>>>> could give you. There are people here far more experienced than I am >>>>>>>> who >>>>>>>> should be able to give you deeper advice. Have a great weekend! >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Mike >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> *From:* Yufan Liu [mailto:yli...@kent.edu] >>>>>>>> *Sent:* Friday, June 26, 2015 7:19 PM >>>>>>>> *To:* user@phoenix.apache.org >>>>>>>> *Subject:* Re: Problem in finding the largest value of an indexed >>>>>>>> column >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Hi Michael, >>>>>>>> >>>>>>>> Thanks for the advice, for the first one, it's "CLIENT 67-CHUNK >>>>>>>> PARALLEL 1-WAY FULL SCAN OVER TIMESTAMP_INDEX; SERVER FILTER BY FIRST >>>>>>>> KEY >>>>>>>> ONLY; SERVER AGGREGATE INTO SINGLE ROW" which is as expected. For the >>>>>>>> second one, it's "CLIENT 67-CHUNK SERIAL 1-WAY REVERSE FULL SCAN OVER >>>>>>>> TIMESTAMP_INDEX; SERVER FILTER BY FIRST KEY ONLY; SERVER 1 ROW LIMIT" >>>>>>>> which >>>>>>>> looks correct, but still returns the unexpected result. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> 2015-06-26 16:59 GMT-07:00 Michael McAllister < >>>>>>>> mmcallis...@homeaway.com>: >>>>>>>> >>>>>>>> Yufan >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Have you tried using the EXPLAIN command to see what plan is being >>>>>>>> used to access the data? >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Michael McAllister >>>>>>>> >>>>>>>> Staff Data Warehouse Engineer | Decision Systems >>>>>>>> >>>>>>>> mmcallis...@homeaway.com | C: 512.423.7447 | skype: >>>>>>>> michael.mcallister.ha <zimmk...@hotmail.com> | webex: >>>>>>>> https://h.a/mikewebex >>>>>>>> >>>>>>>> [image: Description: Description: cid:3410354473_30269081] >>>>>>>> >>>>>>>> This electronic communication (including any attachment) is >>>>>>>> confidential. If you are not an intended recipient of this >>>>>>>> communication, >>>>>>>> please be advised that any disclosure, dissemination, distribution, >>>>>>>> copying >>>>>>>> or other use of this communication or any attachment is strictly >>>>>>>> prohibited. If you have received this communication in error, please >>>>>>>> notify the sender immediately by reply e-mail and promptly destroy all >>>>>>>> electronic and printed copies of this communication and any attachment. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> *From:* Yufan Liu [mailto:yli...@kent.edu] >>>>>>>> *Sent:* Friday, June 26, 2015 6:31 PM >>>>>>>> *To:* user@phoenix.apache.org >>>>>>>> *Subject:* Problem in finding the largest value of an indexed >>>>>>>> column >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Hi, >>>>>>>> >>>>>>>> We have created a table (eg, t1), and a global index of one numeric >>>>>>>> column of t1 (eg, timestamp). Now we want to find the largest value of >>>>>>>> timestamp, we have tried two approaches: >>>>>>>> >>>>>>>> >>>>>>>> 1. select max(timestamp) from t1; This query takes forever to >>>>>>>> finish, so I think it maybe doing a full table scan/comparison . >>>>>>>> >>>>>>>> 2. select timestamp from t1 order by timestamp desc limit 1; This >>>>>>>> query finished fast, but the result it returns is far from the largest >>>>>>>> value. It seems it just return the largest value for a certain range of >>>>>>>> data. >>>>>>>> >>>>>>>> Did anyone else encounter this issue/have any suggestion? >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Yufan >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> >>>>>>>> best, >>>>>>>> Yufan >>>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> best, >>>>>>> Yufan >>>>>>> >>>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> best, >>>>>> Yufan >>>>>> >>>>>> >>>>> >>>> >>>> >>>> -- >>>> best, >>>> Yufan >>>> >>>> >>> >> >> >> -- >> best, >> Yufan >> >> > -- best, Yufan