I am curious whether the current Hive query support against HBase can handle your use case (as a way to bypass the export to a relational store)?
-b

On Wed, May 5, 2010 at 12:22 AM, Michelan Arendse <miche...@addynamo.com> wrote:

> I don't know what the row start and end keys are; they are GUID keys
> (this improves writes across the cluster - I had help with this from
> this user-group before).
> I need to export data written between "startDate" and "endDate" into a
> relational database so I can interrogate the data (SUM/AVG, etc.).
>
> That is why I am using:
> scan.setTimeRange(fromDate.getTime(), toDate.getTime());
> In my test with live data, I only took data between 2010-03-26 00:00:00
> and 2010-03-26 01:00:00 - there should only be a few thousand rows
> between those dates.
>
> Will HBase still take forever to find the data I am looking for unless
> I use startRow/endRow?
>
> -----Original Message-----
> From: TuX RaceR [mailto:tuxrace...@gmail.com]
> Sent: 04 May 2010 05:52 PM
> To: hbase-user@hadoop.apache.org
> Subject: Re: Improving HBase scanner
>
> Michelan Arendse wrote:
> > Is there a way to speed up the fetching of data from HBase?
> >
>
> Divide your key space into smaller chunks,
> using a closer startRow and stopRow?
> cf. Scan(byte[] startRow, byte[] stopRow):
> http://hadoop.apache.org/hbase/docs/current/api/org/apache/hadoop/hbase/client/Scan.html#Scan%28byte%5B%5D,%20byte%5B%5D%29
>
> TuX
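
To make the startRow/stopRow question in the quoted thread concrete, here is a rough sketch of why a time-range scan over GUID row keys tends to examine every row, while a row-range scan only touches the slice between its two keys. It is a toy in-memory model and not the HBase client API: the class ScanSketch, its methods, and the zero-padded "timestamp-GUID" key layout are all hypothetical, and the time range is assumed to be half-open (start inclusive, end exclusive). Note that time-prefixed keys give up the write distribution that GUID keys were chosen for, so a layout like this may only make sense for a secondary index table.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.UUID;

// Toy stand-in for a table whose rows are kept sorted by row key.
// This is NOT the HBase API; all names here are hypothetical.
final class ScanSketch {
    // row key -> write timestamp in milliseconds, sorted by key
    private final TreeMap<String, Long> rows = new TreeMap<>();
    int rowsVisited; // rows examined by the most recent scan

    void put(String rowKey, long timestamp) {
        rows.put(rowKey, timestamp);
    }

    // Filter on timestamp only: with random keys, matching rows can sit
    // anywhere in the key space, so every row has to be examined.
    List<String> scanByTimeRange(long minStamp, long maxStamp) {
        rowsVisited = 0;
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, Long> e : rows.entrySet()) {
            rowsVisited++;
            if (e.getValue() >= minStamp && e.getValue() < maxStamp) {
                hits.add(e.getKey());
            }
        }
        return hits;
    }

    // Scan [startRow, stopRow): only the keys inside the range are visited.
    List<String> scanByRowRange(String startRow, String stopRow) {
        rowsVisited = 0;
        List<String> hits = new ArrayList<>();
        for (String key : rows.subMap(startRow, true, stopRow, false).keySet()) {
            rowsVisited++;
            hits.add(key);
        }
        return hits;
    }

    // Hypothetical key layout: zero-padded timestamp, then the GUID.
    static String timeKey(long timestamp, String guid) {
        return String.format("%013d-%s", timestamp, guid);
    }

    public static void main(String[] args) {
        ScanSketch byGuid = new ScanSketch(); // keyed by GUID, as in the thread
        ScanSketch byTime = new ScanSketch(); // keyed by timestamp + GUID
        for (int i = 0; i < 10_000; i++) {
            long ts = i * 1000L;
            String guid = UUID.randomUUID().toString();
            byGuid.put(guid, ts);
            byTime.put(timeKey(ts, guid), ts);
        }
        long from = 2_000_000L;
        long to = 2_100_000L;

        int hits = byGuid.scanByTimeRange(from, to).size();
        System.out.println("time range: " + hits + " hits, "
                + byGuid.rowsVisited + " rows visited");

        hits = byTime.scanByRowRange(String.format("%013d", from),
                String.format("%013d", to)).size();
        System.out.println("row range:  " + hits + " hits, "
                + byTime.rowsVisited + " rows visited");
    }
}
```

In this toy run the time-range scan examines all 10,000 rows to return 100, while the row-range scan examines only the 100 rows it returns. How closely a real cluster follows this depends on table layout and version, so treat it as an illustration of the access pattern rather than a measurement.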