Sounds like something else is going wrong. Can you adapt your test by
setting the MAX_FILESIZE very low for your table (so that it splits after 4
or 5 rows are added) and package it up as a unit test?
On Thu, Jul 9, 2015 at 1:44 PM, Yufan Liu yli...@kent.edu wrote:
Just got a chance to revisit this issue: I have rebuilt the index and it
still returns the unexpected result. By using the test case, I tried to
insert enough rows to make it auto-split and it reproduces the problem too.
It seems it still has trouble returning last row sorted by first component
of primary key on split tables. Maybe there is another issue than
PHOENIX-2096? The phoenix I am using is pulled from latest 4.x-HBase-0.98
branch which includes the patch of PHOENIX-2096.
2015-07-02 19:55 GMT-07:00 James Taylor jamestay...@apache.org:
On further investigation, I believe it should have been working before. I
did a bit of cleanup and attached a new patch to PHOENIX-2096, but this
would only prevent a merge sort when one is not required (basically
improving performance).
Maybe your index is invalid? You can try rebuilding with this command:
https://phoenix.apache.org/language/index.html#alter_index
On Thu, Jul 2, 2015 at 5:26 PM, Yufan Liu yli...@kent.edu wrote:
The query on test dataset is returning the expected result with the
patch. But on the original dataset (10million rows, 6 regions), it still
return the same unexpected result, I will dig more into this. Thank you,
James!
2015-07-02 9:58 GMT-07:00 Yufan Liu yli...@kent.edu:
Sure, let me have a try
2015-07-02 9:46 GMT-07:00 James Taylor jamestay...@apache.org:
Thanks, Yufan. I found an issue and filed PHOENIX-2096 with a patch.
Would you mind confirming that this fixes the issue you're seeing?
James
On Thu, Jul 2, 2015 at 9:45 AM, Yufan Liu yli...@kent.edu wrote:
I'm using 4.4.0-HBase-0.98
2015-07-01 22:31 GMT-07:00 James Taylor jamestay...@apache.org:
Yufan,
What version of Phoenix are you using?
Thanks,
James
On Wed, Jul 1, 2015 at 2:34 PM, Yufan Liu yli...@kent.edu wrote:
When I made more tests, I find that this problem happens after
table got split.
Here is the DDL I use to create table and index:
CREATE TABLE IF NOT EXISTS t1 (
uid BIGINT NOT NULL,
timestamp BIGINT NOT NULL,
eventName VARCHAR
CONSTRAINT my_pk PRIMARY KEY (uid, timestamp))
COMPRESSION='SNAPPY';
CREATE INDEX timestamp_index ON t1 (timestamp) INCLUDE (eventName)
Attach is the sample data I used for test. It has about 4000 rows,
when the timestamp_index table has one region, the query returns
correct
result: 144048443, but when I manually split it into 4 regions (use
hbase tool), it returns 143024961.
Let know if you find anything. Thanks!
2015-07-01 11:27 GMT-07:00 James Taylor jamestay...@apache.org:
If you could put a complete test (including your DDL and upsert of
data), that would be much appreciated.
Thanks,
James
On Wed, Jul 1, 2015 at 11:20 AM, Yufan Liu yli...@kent.edu
wrote:
I have tried to use query SELECT timestamp FROM t1 ORDER BY
timestamp DESC NULLS LAST LIMIT 1. But it still returns the same
unexpected result. There seems to be some internal problems related.
2015-06-30 18:03 GMT-07:00 James Taylor jamestay...@apache.org:
Yes, reverse scan will be leveraged when possible. Make you use
NULLS LAST in your ORDER BY as rows are ordered with nulls first.
On Tue, Jun 30, 2015 at 5:25 PM, Yufan Liu yli...@kent.edu
wrote:
I used the HBase reverse scan to find the last row on the index
table. It returned the expected result. I would like to know is
Phoenix's
ORDER BY
and DESC implemented based on HBase reverse scan?
2015-06-26 17:25 GMT-07:00 Yufan Liu yli...@kent.edu:
Thank you anyway, Michael!
2015-06-26 17:21 GMT-07:00 Michael McAllister
mmcallis...@homeaway.com:
OK, I’m a Phoenix newbie, so that was the extent of the
advice I could give you. There are people here far more
experienced than I
am who should be able to give you deeper advice. Have a great
weekend!
Mike
*From:* Yufan Liu [mailto:yli...@kent.edu]
*Sent:* Friday, June 26, 2015 7:19 PM
*To:* user@phoenix.apache.org
*Subject:* Re: Problem in finding the largest value of an
indexed column
Hi Michael,
Thanks for the advice, for the first one, it's CLIENT
67-CHUNK PARALLEL 1-WAY FULL SCAN OVER TIMESTAMP_INDEX; SERVER
FILTER BY
FIRST KEY ONLY; SERVER AGGREGATE INTO SINGLE ROW which is as
expected. For
the second one, it's CLIENT 67-CHUNK SERIAL 1-WAY REVERSE FULL
SCAN OVER
TIMESTAMP_INDEX; SERVER FILTER BY FIRST KEY ONLY; SERVER 1 ROW
LIMIT which
looks correct, but still returns the unexpected result.
2015-06-26 16:59 GMT-07:00 Michael McAllister
mmcallis...@homeaway.com:
Yufan
Have you tried using the EXPLAIN command to see what plan is
being used to access the data?
Michael McAllister
Staff Data Warehouse Engineer | Decision Systems
mmcallis...@homeaway.com | C: