A couple more interesting things:
If I change the original query to use a different data source id (leading
column in the PK) but use the 2099 date, it returns after 200 seconds.
Also, if I use the HBase shell to count the number of rows: count
'DIM_HOUSEHOLD_ATTRIBUTE', it hangs after 632000. I do happen to know that the
number of rows for data source id 112000002 is 650k.
If I run the hbck command against this table, it says everything is ok:
Summary:
DIM_HOUSEHOLD_ATTRIBUTE is okay.
Number of regions: 25
Deployed on: ip-10-10-0-215.ec2.internal,60020,1447267050696
ip-10-10-0-216.ec2.internal,60020,1447267050678
ip-10-10-0-218.ec2.internal,60020,1447267050433
ip-10-10-0-219.ec2.internal,60020,1447267050748
ip-10-10-0-220.ec2.internal,60020,1447267050565
ip-10-10-0-221.ec2.internal,60020,1447267050898
ip-10-10-0-222.ec2.internal,60020,1447267050967
ip-10-10-0-223.ec2.internal,60020,1447267050538
ip-10-10-0-224.ec2.internal,60020,1447267050660
ip-10-10-0-225.ec2.internal,60020,1447267050819
ip-10-10-0-227.ec2.internal,60020,1447267050803
ip-10-10-0-229.ec2.internal,60020,1447267050921
hbase:meta is okay.
Number of regions: 1
Deployed on: ip-10-10-0-226.ec2.internal,60020,1447267050899
0 inconsistencies detected.
Status: OK
I am wondering if the table is somehow corrupted. Any way to find and fix? Any
other thoughts?
From: Kamran Saiyed
Date: Wednesday, November 11, 2015 at 4:15 PM
To: "[email protected]"
Subject: OutOfOrderScannerNextException
Hi.
I am running CDH 5.3 with HBase version 0.98.6+cdh5.3.0+73 and Phoenix v 4.2.2.
I have a table defined as follows:
create table if not exists dim_household_attribute(
data_source_id unsigned_int not null
,effective_end_date unsigned_date not null
,rpd_id unsigned_long not null
,provider_id varchar
,effective_start_date unsigned_date
,provider varchar
,product varchar
,product_version varchar
,originator varchar
,originator_locale varchar
,originator_sublocale varchar
,raw_file varchar
,creation_time unsigned_date
,batch_load_time unsigned_date
,guids varchar
,household_rpd_id unsigned_long
,data_source_household_id varchar
constraint pk primary key (data_source_id, effective_end_date, rpd_id)
) COMPRESSION = 'SNAPPY';
The table has several million rows in it. I am having an issue with what would
seem to be a simple query on the first 2 columns of the primary key. If I run
the following select statement:
select data_source_id, effective_end_date
from dim_household_attribute
where data_source_id = 112000003
and effective_end_date = to_date('2099-12-31', 'yyyy-MM-dd')
limit 1;
I get the following error:
java.lang.RuntimeException: org.apache.phoenix.exception.PhoenixIOException:
Failed after retry of OutOfOrderScannerNextException: was there a rpc timeout?
at sqlline.SqlLine$IncrementalRows.hasNext(SqlLine.java:2440)
at sqlline.SqlLine$TableOutputFormat.print(SqlLine.java:2074)
at sqlline.SqlLine.print(SqlLine.java:1735)
at sqlline.SqlLine$Commands.execute(SqlLine.java:3683)
at sqlline.SqlLine$Commands.sql(SqlLine.java:3584)
at sqlline.SqlLine.dispatch(SqlLine.java:821)
at sqlline.SqlLine.begin(SqlLine.java:699)
at sqlline.SqlLine.mainWithInputRedirection(SqlLine.java:441)
at sqlline.SqlLine.main(SqlLine.java:424)
However, if I only use the first column of the key, I get a result back very
quickly:
select data_source_id, effective_end_date
from dim_household_attribute
where data_source_id = 112000003
limit 1;
+-----------------+---------------------+
| DATA_SOURCE_ID  | EFFECTIVE_END_DATE  |
+-----------------+---------------------+
| 112000003       | 2015-10-17          |
+-----------------+---------------------+
1 row selected (1.232 seconds)
Note that the date returned is 2015-10-17. If I change my original query to
use that date instead of 2099-12-31 (which is a valid date in the system, used
to indicate that the row is still valid), the query does return quickly.
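One way to compare the slow and fast variants is to look at their query plans.
The sketch below applies Phoenix's EXPLAIN statement to the two queries from
this message. No plan output is shown because none was captured here, and
whether the plans actually differ is an assumption to be checked, not a finding.

```sql
-- Plan for the slow query (both leading PK columns constrained).
explain
select data_source_id, effective_end_date
from dim_household_attribute
where data_source_id = 112000003
and effective_end_date = to_date('2099-12-31', 'yyyy-MM-dd')
limit 1;

-- Plan for the fast query (only the first PK column constrained).
explain
select data_source_id, effective_end_date
from dim_household_attribute
where data_source_id = 112000003
limit 1;
```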
Is there some configuration I am missing? Any help would be appreciated.
Thanks.
Kamran.