Thanks Aditya,

There were no warnings/errors in the log, and the result of the count was 6724 (3362 + 3362).
Kevin

From: Aditya [mailto:[email protected]]
Sent: Wednesday, March 16, 2016 4:48 PM
To: Kevin Verhoeven <[email protected]>
Cc: [email protected]
Subject: Re: Drill Storage Plugin for HBase 1.0

Thanks Kevin, that is useful information. I have a couple of follow-up questions.

1. Was there any warning/error message in the log?
2. Was the result of the count query in this instance == (3362 + 3362)?

On Wed, Mar 16, 2016 at 1:55 PM, Kevin Verhoeven <[email protected]> wrote:

Thanks Aditya,

I now see more information in the log file. The count is way too low, as there are 10000000 rows in the table. Here is what I see in the drillbit.log file when running the following query:

select count(*) from `hbase`.`test6c`

Assignment Map: {0=[HBaseScanSpec [tableName=test6c, startRow=, stopRow=row5760331, filter=null, regionServer=dev01]], 1=[HBaseScanSpec [tableName=test6c, startRow=row5760331, stopRow=, filter=null, regionServer=dev01]]}
…
2016-03-16 20:41:02,010 [29163b26-b8e4-758b-867f-b45d68edeaee:frag:1:0] DEBUG o.a.d.e.s.hbase.HBaseRecordReader - Took 644 ms to get 3362 records
2016-03-16 20:41:02,013 [29163b26-b8e4-758b-867f-b45d68edeaee:frag:1:1] DEBUG o.a.d.e.s.hbase.HBaseRecordReader - Took 662 ms to get 3362 records
…
2016-03-16 20:41:03,054 [29163b26-b8e4-758b-867f-b45d68edeaee:frag:1:1] DEBUG o.a.d.e.s.hbase.HBaseRecordReader - Took 1 ms to get 0 records
2016-03-16 20:41:03,062 [29163b26-b8e4-758b-867f-b45d68edeaee:frag:1:0] DEBUG o.a.d.e.s.hbase.HBaseRecordReader - Took 1 ms to get 0 records
…

In HBase there are two regions for this table. A count from the HBase shell gives me the following:

10000000 row(s) in 483.3140 seconds

A scan for one row returns the following (this is test data):

row1 column=cf:c1, timestamp=1452736216785, value=1
row1 column=cf:c2, timestamp=1452736216785, value=6
row1 column=cf:c3, timestamp=1452736216785, value=b
row1 column=cf:c4, timestamp=1452736216785, value=5
row1 column=cf:c5, timestamp=1452736216785, value=b
row1 column=cf:c6, timestamp=1452736216785, value=1

Thanks,
Kevin

From: Aditya [mailto:[email protected]]
Sent: Wednesday, March 16, 2016 1:33 PM
To: Kevin Verhoeven <[email protected]>
Cc: [email protected]
Subject: Re: Drill Storage Plugin for HBase 1.0

My apologies, I should have been more explicit.

To change any log-related settings, you'd need to modify "$DRILL_HOME/conf/logback.xml" on all Drillbit nodes and add the following snippet:

<logger name="org.apache.drill.exec.store.hbase" additivity="false">
  <level value="debug" />
  <appender-ref ref="FILE" />
</logger>

On Wed, Mar 16, 2016 at 1:05 PM, Kevin Verhoeven <[email protected]> wrote:

Thanks Aditya,

Where do I set the "org.apache.drill.exec.store.hbase.HBaseRecordReader" parameter? I set it in the drill-override.conf file and restarted Drill, but I do not see it appear in the log. My drill-override.conf looks something like this (I abbreviated the file):

drill.exec: {
  cluster-id: "drillbits1"
  rpc: {
    user: {
      server: {
        port: 31010
        threads: 1
      }
      client: {
        threads: 1
      }
    },
    bit: {
      server: {
        port : 31011,
        retry:{
          count: 7200,
          delay: 500
        },
        threads: 1
      }
    },
    use.ip : false
  },
  debug.error_on_leak: true,
  store.hbase.HBaseRecordReader: true
}

Thanks!
Kevin

From: Aditya [mailto:[email protected]]
Sent: Wednesday, March 16, 2016 11:11 AM
To: Kevin Verhoeven <[email protected]>
Cc: [email protected]
Subject: Re: Drill Storage Plugin for HBase 1.0

Could you please enable debug logging for "org.apache.drill.exec.store.hbase.HBaseRecordReader" and look for the following log message from this logger in drillbit.log?

> Took xxxx ms to get yyyyy records

Does sum(yyyyy) for a query match the expected number of records?
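
The sum asked for above can be computed directly from drillbit.log rather than by eye. The following is a minimal Python sketch, not part of Drill. It assumes the DEBUG line format quoted in this thread ("[<query-id>:frag:<major>:<minor>] ... HBaseRecordReader - Took N ms to get M records"); the names `LINE_RE`, `sum_records`, and `SAMPLE` are hypothetical and introduced only for illustration.

```python
import re
from collections import defaultdict

# Assumed line format, taken from the drillbit.log excerpts in this thread:
# 2016-03-16 20:41:02,010 [<query-id>:frag:1:0] DEBUG o.a.d.e.s.hbase.HBaseRecordReader - Took 644 ms to get 3362 records
LINE_RE = re.compile(
    r"\[(?P<query>[0-9a-f-]+):frag:(?P<major>\d+):(?P<minor>\d+)\]"
    r".*HBaseRecordReader - Took (?P<ms>\d+) ms to get (?P<records>\d+) records"
)


def sum_records(lines):
    """Return {query_id: {"major:minor": records}} for HBaseRecordReader lines."""
    totals = defaultdict(lambda: defaultdict(int))
    for line in lines:
        m = LINE_RE.search(line)
        if m:  # lines from other loggers are skipped
            frag = f"{m['major']}:{m['minor']}"
            totals[m["query"]][frag] += int(m["records"])
    return {query: dict(frags) for query, frags in totals.items()}


# The four log lines quoted in this thread, used here as sample input.
SAMPLE = [
    "2016-03-16 20:41:02,010 [29163b26-b8e4-758b-867f-b45d68edeaee:frag:1:0] DEBUG o.a.d.e.s.hbase.HBaseRecordReader - Took 644 ms to get 3362 records",
    "2016-03-16 20:41:02,013 [29163b26-b8e4-758b-867f-b45d68edeaee:frag:1:1] DEBUG o.a.d.e.s.hbase.HBaseRecordReader - Took 662 ms to get 3362 records",
    "2016-03-16 20:41:03,054 [29163b26-b8e4-758b-867f-b45d68edeaee:frag:1:1] DEBUG o.a.d.e.s.hbase.HBaseRecordReader - Took 1 ms to get 0 records",
    "2016-03-16 20:41:03,062 [29163b26-b8e4-758b-867f-b45d68edeaee:frag:1:0] DEBUG o.a.d.e.s.hbase.HBaseRecordReader - Took 1 ms to get 0 records",
]

if __name__ == "__main__":
    for query, frags in sum_records(SAMPLE).items():
        print(query, frags, "total =", sum(frags.values()))
```

To run it against a real log, pass an open file instead of `SAMPLE`, for example `sum_records(open("drillbit.log"))`. Grouping by query id and fragment keeps concurrent queries apart and shows how many records each scan fragment contributed.
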
On Wed, Mar 16, 2016 at 10:58 AM, Kevin Verhoeven <[email protected]> wrote:

Thanks Aditya,

We see a problem when we query HBase 1.0 tables with more than five column qualifiers: only a small amount of data is returned. We are able to reproduce the problem with Drill 1.5 against HBase 1.0 tables of just about any size, for example with 1,000,000 rows. A query against the table may return fewer than 1,300 rows. We use Drill 1.5 against CDH 5.4.2, running HBase 1.0.0-cdh5.4.2.

Do you have any recommendations?

Kevin

-----Original Message-----
From: Aditya [mailto:[email protected]]
Sent: Wednesday, March 16, 2016 9:41 AM
To: user <[email protected]>
Subject: Re: Drill Storage Plugin for HBase 1.0

Hi Kevin,

HBase 0.98.x and 1.x are wire compatible [1], and hence the HBase 0.98 client included in the Drill distribution can access an HBase 1.x cluster.

[1] https://blogs.apache.org/hbase/entry/start_of_a_new_era

On Tue, Mar 15, 2016 at 10:18 AM, Kevin Verhoeven <[email protected]> wrote:

> Currently the HBase Storage plugin supports HBase 0.98. Is there a
> plan to update the storage plugin to support HBase 1.0?
>
> Thanks!
>
> Kevin
