I have just submitted a pull request[1] to add support for HBase 1.x.

[1] https://github.com/apache/drill/pull/443

On Sun, Mar 20, 2016 at 10:44 PM, Aditya <[email protected]> wrote:

> I finally managed to reproduce it with the CDH distribution (so far I had
> been testing with HBase 1.1 distributed with MapR, which does not have this
> bug).
>
> This is essentially an HBase bug, HBASE-13262[1], which has been fixed in
> 1.0.1 and 1.1.0.
>
> Please update your HBase distribution.
>
> [1] https://issues.apache.org/jira/browse/HBASE-13262
>
> On Wed, Mar 16, 2016 at 5:01 PM, Kevin Verhoeven <
> [email protected]> wrote:
>
>> Thanks Aditya,
>>
>>
>>
>> There were no warnings/errors in the log, and the result of the count was
>> 6724 (3362 + 3362).
>>
>>
>>
>> Kevin
>>
>>
>>
>> *From:* Aditya [mailto:[email protected]]
>> *Sent:* Wednesday, March 16, 2016 4:48 PM
>>
>> *To:* Kevin Verhoeven <[email protected]>
>> *Cc:* [email protected]
>> *Subject:* Re: Drill Storage Plugin for HBase 1.0
>>
>>
>>
>> Thanks Kevin, that is useful information. I have a couple of follow-up
>> questions.
>>
>> 1. Was there any warning/error message in the log?
>>
>> 2. Was the result of count query in this instance == (3362 + 3362)?
>>
>>
>>
>> On Wed, Mar 16, 2016 at 1:55 PM, Kevin Verhoeven <
>> [email protected]> wrote:
>>
>> Thanks Aditya,
>>
>>
>>
>> I now see more information in the log file. The count is far too low, as
>> there are 10000000 rows in the table. Here is what I see in the
>> drillbit.log file when running the following query: select count(*) from
>> `hbase`.`test6c`:
>>
>>
>>
>> Assignment Map: {0=[HBaseScanSpec [tableName=test6c, startRow=,
>> stopRow=row5760331, filter=null, regionServer=dev01]], 1=[HBaseScanSpec
>> [tableName=test6c, startRow=row5760331, stopRow=, filter=null,
>> regionServer=dev01]]}
>>
>> …
>>
>> 2016-03-16 20:41:02,010 [29163b26-b8e4-758b-867f-b45d68edeaee:frag:1:0]
>> DEBUG o.a.d.e.s.hbase.HBaseRecordReader - Took 644 ms to get 3362 records
>>
>> 2016-03-16 20:41:02,013 [29163b26-b8e4-758b-867f-b45d68edeaee:frag:1:1]
>> DEBUG o.a.d.e.s.hbase.HBaseRecordReader - Took 662 ms to get 3362 records
>>
>> …
>>
>> 2016-03-16 20:41:03,054 [29163b26-b8e4-758b-867f-b45d68edeaee:frag:1:1]
>> DEBUG o.a.d.e.s.hbase.HBaseRecordReader - Took 1 ms to get 0 records
>>
>> 2016-03-16 20:41:03,062 [29163b26-b8e4-758b-867f-b45d68edeaee:frag:1:0]
>> DEBUG o.a.d.e.s.hbase.HBaseRecordReader - Took 1 ms to get 0 records
>>
>> …
>>
>>
>>
>> In HBase there are two regions for this table. A count from the HBase
>> shell gives me the following: 10000000 row(s) in 483.3140 seconds
>>
>>
>>
>> A scan for one row returns the following (this is test data):
>>
>>
>>
>> row1    column=cf:c1, timestamp=1452736216785, value=1
>> row1    column=cf:c2, timestamp=1452736216785, value=6
>> row1    column=cf:c3, timestamp=1452736216785, value=b
>> row1    column=cf:c4, timestamp=1452736216785, value=5
>> row1    column=cf:c5, timestamp=1452736216785, value=b
>> row1    column=cf:c6, timestamp=1452736216785, value=1
>>
>>
>>
>> Thanks,
>>
>>
>>
>> Kevin
>>
>>
>>
>> *From:* Aditya [mailto:[email protected]]
>> *Sent:* Wednesday, March 16, 2016 1:33 PM
>>
>>
>> *To:* Kevin Verhoeven <[email protected]>
>> *Cc:* [email protected]
>> *Subject:* Re: Drill Storage Plugin for HBase 1.0
>>
>>
>>
>> My apologies, I should have been more explicit.
>>
>> To change any log-related settings, you need to modify
>> "$DRILL_HOME/conf/logback.xml" on all Drillbit nodes and add the following
>> snippet:
>>
>>   <logger name="org.apache.drill.exec.store.hbase" additivity="false">
>>     <level value="debug" />
>>     <appender-ref ref="FILE" />
>>   </logger>
>>
>>
>>
>> On Wed, Mar 16, 2016 at 1:05 PM, Kevin Verhoeven <
>> [email protected]> wrote:
>>
>> Thanks Aditya,
>>
>>
>>
>> Where do I set the "org.apache.drill.exec.store.hbase.HBaseRecordReader"
>> parameter? I set it in the drill-override.conf file and restarted Drill,
>> but I do not see it appear in the log.
>>
>>
>>
>> My drill-override.conf looks something like this (I abbreviated the file):
>>
>>
>>
>> drill.exec: {
>>   cluster-id: "drillbits1"
>>   rpc: {
>>     user: {
>>       server: {
>>         port: 31010
>>         threads: 1
>>       }
>>       client: {
>>         threads: 1
>>       }
>>     },
>>     bit: {
>>       server: {
>>         port : 31011,
>>         retry:{
>>           count: 7200,
>>           delay: 500
>>         },
>>         threads: 1
>>       }
>>     },
>>     use.ip : false
>>   },
>>   debug.error_on_leak: true,
>>   store.hbase.HBaseRecordReader: true
>> }
>>
>>
>>
>> Thanks!
>>
>>
>>
>> Kevin
>>
>>
>>
>> *From:* Aditya [mailto:[email protected]]
>> *Sent:* Wednesday, March 16, 2016 11:11 AM
>> *To:* Kevin Verhoeven <[email protected]>
>> *Cc:* [email protected]
>> *Subject:* Re: Drill Storage Plugin for HBase 1.0
>>
>>
>>
>> Could you please enable debug logging for
>> "org.apache.drill.exec.store.hbase.HBaseRecordReader" and look for the
>> following log message from this logger in drillbit.log?
>>
>> > Took xxxx ms to get yyyyy records
>>
>> Does sum(yyyyy) for a query match the expected number of records?
>>
>>
>>
>> On Wed, Mar 16, 2016 at 10:58 AM, Kevin Verhoeven <
>> [email protected]> wrote:
>>
>> Thanks Aditya,
>>
>> We see a problem when we query HBase 1.0 tables with more than five
>> column qualifiers: only a small fraction of the data is returned. We are
>> able to repro the problem with Drill 1.5 against HBase 1.0 tables of just
>> about any size, for example with 1,000,000 rows. A query against the table
>> may return fewer than 1,300 rows.
>>
>> We use Drill 1.5 against CDH 5.4.2, running HBase 1.0.0-cdh5.4.2.
>>
>> Do you have any recommendations?
>>
>> Kevin
>>
>>
>> -----Original Message-----
>> From: Aditya [mailto:[email protected]]
>> Sent: Wednesday, March 16, 2016 9:41 AM
>> To: user <[email protected]>
>> Subject: Re: Drill Storage Plugin for HBase 1.0
>>
>> Hi Kevin,
>>
>> HBase 0.98.x and 1.x are wire compatible[1] and hence the HBase 0.98
>> client included in the Drill distribution can access an HBase 1.x cluster.
>>
>> [1] https://blogs.apache.org/hbase/entry/start_of_a_new_era
>>
>> On Tue, Mar 15, 2016 at 10:18 AM, Kevin Verhoeven <
>> [email protected]
>> > wrote:
>>
>> > Currently the HBase Storage plugin supports HBase 0.98. Is there a
>> > plan to update the storage plugin to support HBase 1.0?
>> >
>> > Thanks!
>> >
>> > Kevin
>> >
>>
>>
>>
>>
>>
>>
>>
>
>
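
The check requested earlier in the thread (summing the record counts from the
"Took xxxx ms to get yyyyy records" debug lines for one query) can be sketched
in Python as follows. This is only an illustration: the regular expression is
inferred from the drillbit.log lines quoted in this thread, and the function
name sum_records_per_query is hypothetical, not part of Drill.

```python
import re
from collections import defaultdict

# Assumed line shape, taken from the drillbit.log excerpts quoted in the thread:
#   <timestamp> [<query-id>:frag:<major>:<minor>] DEBUG ...HBaseRecordReader
#   - Took <ms> ms to get <n> records
LINE_RE = re.compile(
    r"\[(?P<query>[0-9a-f-]+):frag:\d+:\d+\]\s+"
    r"DEBUG\s+\S*HBaseRecordReader - Took \d+ ms to get (?P<records>\d+) records"
)


def sum_records_per_query(log_text):
    """Return {query_id: total records reported by HBaseRecordReader}."""
    totals = defaultdict(int)
    for match in LINE_RE.finditer(log_text):
        totals[match.group("query")] += int(match.group("records"))
    return dict(totals)


if __name__ == "__main__":
    # The four reader lines quoted in the thread, one log line each.
    qid = "29163b26-b8e4-758b-867f-b45d68edeaee"
    reader = "DEBUG o.a.d.e.s.hbase.HBaseRecordReader"
    sample = "\n".join([
        f"2016-03-16 20:41:02,010 [{qid}:frag:1:0] {reader} - Took 644 ms to get 3362 records",
        f"2016-03-16 20:41:02,013 [{qid}:frag:1:1] {reader} - Took 662 ms to get 3362 records",
        f"2016-03-16 20:41:03,054 [{qid}:frag:1:1] {reader} - Took 1 ms to get 0 records",
        f"2016-03-16 20:41:03,062 [{qid}:frag:1:0] {reader} - Took 1 ms to get 0 records",
    ])
    print(sum_records_per_query(sample))
```

On the lines quoted above, the total for the query is 6724 (3362 + 3362),
which matches the count Kevin reported and is far below the 10000000 rows in
the table.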
