Thanks for filing the issue. I haven't worked much with HBase, but this is a critical wrong-results issue, so I will take a look at it soon if no one else raises their hand.

On Wed, Jan 13, 2016 at 6:20 PM, Kumiko Yada <[email protected]> wrote:

> I opened the bug on this. The drill is returning the correct rows when
> the hbase contains 5 or less columns, but not 6 or more columns.
>
> https://issues.apache.org/jira/browse/DRILL-4271
>
> Thanks
> Kumiko
>
> -----Original Message-----
> From: Kumiko Yada [mailto:[email protected]]
> Sent: Wednesday, January 13, 2016 4:52 PM
> To: [email protected]
> Cc: Aditya Kishore <[email protected]>; Kevin Verhoeven <[email protected]>
> Subject: RE: Drill query does not return all results from HBase
>
> We are using HBase 1.0.0 & CDH 5.4. I found out the correct row
> count is returned when the Hbase table contains only 1 column family, 1
> column, but the incorrect row count is returned when the Hbase table
> contains 1 column family, 6 columns.
>
> This looks like the Drill issue. Has anyone found any workaround?
>
> Thanks
> Kumiko
>
> -----Original Message-----
> From: Abhishek Girish [mailto:[email protected]]
> Sent: Tuesday, January 12, 2016 6:51 PM
> To: user <[email protected]>
> Cc: Aditya Kishore <[email protected]>
> Subject: Re: Drill query does not return all results from HBase
>
> Well, the major version didn't change if I remember it right, hence did not
> share the info in my previous mail. I'm on HBase 1.1.1 right now and don't
> see the issue. Also, I am on a MapR setup, which might not be comparable
> with their CDH setups.
>
> On Tue, Jan 12, 2016 at 5:50 PM, Jason Altekruse <[email protected]>
> wrote:
>
> > Abhishek,
> >
> > What version of HBase did you have the problem with, and what version
> > did you upgrade to that solved the problem? I assume this would be
> > useful information to compare your setup with Kevin's and Kumiko's.
> >
> > - Jason
> >
> > On Tue, Jan 12, 2016 at 10:41 AM, Abhishek Girish <[email protected]>
> > wrote:
> >
> > > I hit a very similar issue recently. Via HBase shell, I was able to
> > > fetch all records, whereas I was only able to see a small subset of
> > > records when queried from Drill. Each time I inserted 1000 records,
> > > only about 50 of those would show up.
> > >
> > > Although I could repro' the problem consistently, it was resolved
> > > once I updated my Hadoop setup. My guess is that it was a HBase bug
> > > which got resolved. Although strange as it seems, it might not have
> > > to do with Drill itself.
> > >
> > > -Abhishek
> > >
> > > On Tue, Jan 12, 2016 at 7:52 AM, Jason Altekruse <[email protected]>
> > > wrote:
> > >
> > > > I'm not sure why this is happening, we have tests in our automated
> > > > suite that I believe run some pretty large queries against Hbase
> > > > and verify the results.
> > > >
> > > > Aditya, do you have some time available to try to reproduce this
> > > > and diagnose the problem?
> > > >
> > > > On Wed, Jan 6, 2016 at 2:03 PM, Kumiko Yada <[email protected]>
> > > > wrote:
> > > >
> > > > > I'm having the same issue. Is there any workaround for this?
> > > > >
> > > > > Thanks
> > > > > Kumiko
> > > > >
> > > > > -----Original Message-----
> > > > > From: Kevin Verhoeven [mailto:[email protected]]
> > > > > Sent: Monday, December 21, 2015 10:37 AM
> > > > > To: [email protected]
> > > > > Subject: Drill query does not return all results from HBase
> > > > >
> > > > > We have a problem where a Drill query against HBase does not
> > > > > return all results. The following query should return over
> > > > > 100,000 rows, but we only get about 1,030 back.
> > > > >
> > > > > SELECT row_key FROM `hbase`.`customer_staged` WHERE customer_number = 800
> > > > >
> > > > > If we scan directly using the hbase shell we see over 100,000
> > > > > rows, but the same Drill query does not return a fraction of the
> > > > > expected results. We have also run a count against the table and
> > > > > Drill returns the same 1,030 number, which is far less than
> > > > > expected. What could be going wrong?
> > > > >
> > > > > We are running Drill 1.2 on Ubuntu 14.04 against CDH 5.4.3
> > > > > (HBase 1.0). We run HBase on six RegionServers, the table has
> > > > > about 1.3 billion rows.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Kevin
