Thanks Aditya,
I now see more information in the log file. The count is far too low: there
are 10000000 rows in the table. Here is what I see in the drillbit.log file
when running the following query: select count(*) from `hbase`.`test6c`:
Assignment Map: {0=[HBaseScanSpec [tableName=test6c, startRow=,
stopRow=row5760331, filter=null, regionServer=dev01]], 1=[HBaseScanSpec
[tableName=test6c, startRow=row5760331, stopRow=, filter=null,
regionServer=dev01]]}
…
2016-03-16 20:41:02,010 [29163b26-b8e4-758b-867f-b45d68edeaee:frag:1:0] DEBUG
o.a.d.e.s.hbase.HBaseRecordReader - Took 644 ms to get 3362 records
2016-03-16 20:41:02,013 [29163b26-b8e4-758b-867f-b45d68edeaee:frag:1:1] DEBUG
o.a.d.e.s.hbase.HBaseRecordReader - Took 662 ms to get 3362 records
…
2016-03-16 20:41:03,054 [29163b26-b8e4-758b-867f-b45d68edeaee:frag:1:1] DEBUG
o.a.d.e.s.hbase.HBaseRecordReader - Took 1 ms to get 0 records
2016-03-16 20:41:03,062 [29163b26-b8e4-758b-867f-b45d68edeaee:frag:1:0] DEBUG
o.a.d.e.s.hbase.HBaseRecordReader - Took 1 ms to get 0 records
…
In HBase there are two regions for this table. A count from the HBase shell
gives me the following: 10000000 row(s) in 483.3140 seconds
A scan for one row of this test data returns the following:
row1 column=cf:c1, timestamp=1452736216785, value=1
row1 column=cf:c2, timestamp=1452736216785, value=6
row1 column=cf:c3, timestamp=1452736216785, value=b
row1 column=cf:c4, timestamp=1452736216785, value=5
row1 column=cf:c5, timestamp=1452736216785, value=b
row1 column=cf:c6, timestamp=1452736216785, value=1
Thanks,
Kevin
From: Aditya [mailto:[email protected]]
Sent: Wednesday, March 16, 2016 1:33 PM
To: Kevin Verhoeven <[email protected]>
Cc: [email protected]
Subject: Re: Drill Storage Plugin for HBase 1.0
My apologies, I should have been more explicit.
To change any log-related settings, you need to modify
"$DRILL_HOME/conf/logback.xml" on all Drillbit nodes and add the following
snippet:
<logger name="org.apache.drill.exec.store.hbase" additivity="false">
  <level value="debug" />
  <appender-ref ref="FILE" />
</logger>
On Wed, Mar 16, 2016 at 1:05 PM, Kevin Verhoeven <[email protected]> wrote:
Thanks Aditya,
Where do I set the "org.apache.drill.exec.store.hbase.HBaseRecordReader"
parameter? I set it in the drill-override.conf file and restarted Drill, but
I do not see it appear in the log.
My drill-override.conf looks something like this (I abbreviated the file):
drill.exec: {
  cluster-id: "drillbits1"
  rpc: {
    user: {
      server: {
        port: 31010
        threads: 1
      }
      client: {
        threads: 1
      }
    },
    bit: {
      server: {
        port: 31011,
        retry: {
          count: 7200,
          delay: 500
        },
        threads: 1
      }
    },
    use.ip: false
  },
  debug.error_on_leak: true,
  store.hbase.HBaseRecordReader: true
}
Thanks!
Kevin
From: Aditya [mailto:[email protected]]
Sent: Wednesday, March 16, 2016 11:11 AM
To: Kevin Verhoeven <[email protected]>
Cc: [email protected]
Subject: Re: Drill Storage Plugin for HBase 1.0
Could you please enable debug logging for
"org.apache.drill.exec.store.hbase.HBaseRecordReader" and look for the
following log message from this logger in drillbit.log?
> Took xxxx ms to get yyyyy records
Does sum(yyyyy) for a query match the expected number of records?
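The sum can be computed with a short script rather than by hand. The following is only a minimal sketch in Python, assuming the messages appear in drillbit.log in the "Took xxxx ms to get yyyyy records" form quoted above and that the thread name contains ":frag:major:minor". The function name sum_records and the per-fragment grouping are illustrative and not part of Drill.

```python
import re
from collections import defaultdict

# Message format quoted in this thread: "Took 644 ms to get 3362 records"
RECORDS_RE = re.compile(r"Took (\d+) ms to get (\d+) records")
# Bracketed thread name, e.g. "[29163b26-...:frag:1:0]" (assumed format)
FRAGMENT_RE = re.compile(r"\[([0-9a-f-]+):frag:(\d+):(\d+)\]")


def sum_records(lines):
    """Return (total, per_fragment) record counts from HBaseRecordReader messages.

    The fragment id is remembered from the most recent line that carried one,
    so the function works whether or not the message is wrapped onto its own line.
    """
    total = 0
    per_fragment = defaultdict(int)
    fragment = "unknown"
    for line in lines:
        frag_match = FRAGMENT_RE.search(line)
        if frag_match:
            fragment = "{}:{}".format(frag_match.group(2), frag_match.group(3))
        rec_match = RECORDS_RE.search(line)
        if rec_match:
            count = int(rec_match.group(2))
            total += count
            per_fragment[fragment] += count
    return total, dict(per_fragment)
```

To use it, open drillbit.log and pass the file object to sum_records, then compare the total with the row count reported by the HBase shell. The sketch does not filter by query id, so a log that covers several queries would need to be trimmed to one query first.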
On Wed, Mar 16, 2016 at 10:58 AM, Kevin Verhoeven <[email protected]> wrote:
Thanks Aditya,
We see a problem when we query HBase 1.0 tables with more than five column
qualifiers: only a small fraction of the data is returned. We are able to
reproduce the problem with Drill 1.5 against HBase 1.0 tables of just about
any size. For example, a query against a table with 1,000,000 rows may return
fewer than 1,300 rows.
We use Drill 1.5 against CDH 5.4.2, running HBase 1.0.0-cdh5.4.2.
Do you have any recommendations?
Kevin
-----Original Message-----
From: Aditya [mailto:[email protected]]
Sent: Wednesday, March 16, 2016 9:41 AM
To: user <[email protected]>
Subject: Re: Drill Storage Plugin for HBase 1.0
Hi Kevin,
HBase 0.98.x and 1.x are wire compatible[1], and hence the HBase 0.98 client
included in the Drill distribution can access an HBase 1.x cluster.
[1] https://blogs.apache.org/hbase/entry/start_of_a_new_era
On Tue, Mar 15, 2016 at 10:18 AM, Kevin Verhoeven <[email protected]> wrote:
> Currently the HBase Storage plugin supports HBase 0.98. Is there a
> plan to update the storage plugin to support HBase 1.0?
>
> Thanks!
>
> Kevin
>