Thanks Aditya,

I now see more information in the log file. The count is far too low: there 
are 10000000 rows in the table. Here is what I see in the drillbit.log file 
when running the following query: select count(*) from `hbase`.`test6c`:

Assignment Map: {0=[HBaseScanSpec [tableName=test6c, startRow=, 
stopRow=row5760331, filter=null, regionServer=dev01]], 1=[HBaseScanSpec 
[tableName=test6c, startRow=row5760331, stopRow=, filter=null, 
regionServer=dev01]]}
…
2016-03-16 20:41:02,010 [29163b26-b8e4-758b-867f-b45d68edeaee:frag:1:0] DEBUG 
o.a.d.e.s.hbase.HBaseRecordReader - Took 644 ms to get 3362 records
2016-03-16 20:41:02,013 [29163b26-b8e4-758b-867f-b45d68edeaee:frag:1:1] DEBUG 
o.a.d.e.s.hbase.HBaseRecordReader - Took 662 ms to get 3362 records
…
2016-03-16 20:41:03,054 [29163b26-b8e4-758b-867f-b45d68edeaee:frag:1:1] DEBUG 
o.a.d.e.s.hbase.HBaseRecordReader - Took 1 ms to get 0 records
2016-03-16 20:41:03,062 [29163b26-b8e4-758b-867f-b45d68edeaee:frag:1:0] DEBUG 
o.a.d.e.s.hbase.HBaseRecordReader - Took 1 ms to get 0 records
…

In HBase there are two regions for this table. A count from the HBase shell 
gives me the following: 10000000 row(s) in 483.3140 seconds

A scan for one row returns the following (this is test data):

row1  column=cf:c1, timestamp=1452736216785, value=1
row1  column=cf:c2, timestamp=1452736216785, value=6
row1  column=cf:c3, timestamp=1452736216785, value=b
row1  column=cf:c4, timestamp=1452736216785, value=5
row1  column=cf:c5, timestamp=1452736216785, value=b
row1  column=cf:c6, timestamp=1452736216785, value=1

Thanks,

Kevin

From: Aditya [mailto:[email protected]]
Sent: Wednesday, March 16, 2016 1:33 PM
To: Kevin Verhoeven <[email protected]>
Cc: [email protected]
Subject: Re: Drill Storage Plugin for HBase 1.0

My apologies, I should have been more explicit.
To change any log-related settings, you need to modify 
"$DRILL_HOME/conf/logback.xml" on all Drillbit nodes and add the following 
snippet:

  <logger name="org.apache.drill.exec.store.hbase" additivity="false">
    <level value="debug" />
    <appender-ref ref="FILE" />
  </logger>


On Wed, Mar 16, 2016 at 1:05 PM, Kevin Verhoeven <[email protected]> wrote:
Thanks Aditya,

Where do I set the "org.apache.drill.exec.store.hbase.HBaseRecordReader" 
parameter? I set it in the drill-override.conf file and restarted Drill, but 
I do not see the message appear in the log.

My drill-override.conf looks something like this (I abbreviated the file):

drill.exec: {
  cluster-id: "drillbits1"
  rpc: {
    user: {
      server: {
        port: 31010
        threads: 1
      }
      client: {
        threads: 1
      }
    },
    bit: {
      server: {
        port : 31011,
        retry:{
          count: 7200,
          delay: 500
        },
        threads: 1
      }
    },
    use.ip : false
  },
  debug.error_on_leak: true,
  store.hbase.HBaseRecordReader: true
}

Thanks!

Kevin

From: Aditya [mailto:[email protected]]
Sent: Wednesday, March 16, 2016 11:11 AM
To: Kevin Verhoeven <[email protected]>
Cc: [email protected]
Subject: Re: Drill Storage Plugin for HBase 1.0

Could you please enable debug logging for 
"org.apache.drill.exec.store.hbase.HBaseRecordReader" and look for the 
following log message in drillbit.log from this logger:

> Took xxxx ms to get yyyyy records
Does sum(yyyyy) for a query match the expected number of records?
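
One way to compute that sum is a small script over drillbit.log. The following 
is only a sketch: the function names are hypothetical, the regular expressions 
assume the message and the "[<query id>:frag:<major>:<minor>]" thread-name 
format quoted in this thread, and the log path is whatever your installation 
uses. It remembers the most recent fragment name, so it also works when a log 
line has been wrapped onto two lines, as in an email.

```python
import re
from collections import defaultdict

# Matches the HBaseRecordReader debug message quoted in this thread.
RECORDS_RE = re.compile(r"Took (\d+) ms to get (\d+) records")
# Matches the "[<query id>:frag:<major>:<minor>]" thread name (assumed format).
FRAGMENT_RE = re.compile(r"\[([0-9a-f-]+):frag:(\d+):(\d+)\]")


def sum_records(lines):
    """Return {query_id: total records} summed over all reader messages."""
    totals = defaultdict(int)
    current_query = None
    for line in lines:
        frag = FRAGMENT_RE.search(line)
        if frag:
            # Remember the query id of the most recent fragment thread seen.
            current_query = frag.group(1)
        took = RECORDS_RE.search(line)
        if took and current_query is not None:
            totals[current_query] += int(took.group(2))
    return dict(totals)


def sum_records_in_file(path):
    """Convenience wrapper: run sum_records over a log file at `path`."""
    with open(path, encoding="utf-8", errors="replace") as log:
        return sum_records(log)
```

Calling sum_records_in_file() with the path to drillbit.log returns one total 
per query id, which can then be compared with the row count reported by the 
HBase shell.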

On Wed, Mar 16, 2016 at 10:58 AM, Kevin Verhoeven <[email protected]> wrote:
Thanks Aditya,

We see a problem when we query HBase 1.0 tables with more than five column 
qualifiers: only a small fraction of the data is returned. We are able to 
reproduce the problem with Drill 1.5 against HBase 1.0 tables of just about 
any size. For example, a query against a table with 1,000,000 rows may return 
fewer than 1,300 rows.

We use Drill 1.5 against CDH 5.4.2, running HBase 1.0.0-cdh5.4.2.

Do you have any recommendations?

Kevin

-----Original Message-----
From: Aditya [mailto:[email protected]]
Sent: Wednesday, March 16, 2016 9:41 AM
To: user <[email protected]>
Subject: Re: Drill Storage Plugin for HBase 1.0

Hi Kevin,

HBase 0.98.x and 1.x are wire compatible[1], and hence the HBase 0.98 client 
included in the Drill distribution can access an HBase 1.x cluster.

[1] https://blogs.apache.org/hbase/entry/start_of_a_new_era

On Tue, Mar 15, 2016 at 10:18 AM, Kevin Verhoeven <[email protected]> wrote:

> Currently the HBase Storage plugin supports HBase 0.98. Is there a
> plan to update the storage plugin to support HBase 1.0?
>
> Thanks!
>
> Kevin
>

