What does your client call look like? Get? Scan? Filters? Is 3000/sec client-side calls, or the number of rows per sec? If you measure in MB/sec, how much read throughput do you get? Where is your client located? Same router as the cluster? Have you activated dfs read short-circuit? If not, try it. Compression: try switching to Snappy, it should be faster. What else is running on the cluster in parallel to your reading client?
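For reference, the two tuning suggestions above (short-circuit reads and Snappy) are set roughly like this. This is a minimal sketch for Hadoop/HBase of that era; the socket path and the table/column-family names ('mytable', 'cf') are placeholder assumptions, not from the thread:

```xml
<!-- hdfs-site.xml (on the DataNodes and the HBase RegionServer hosts):
     let the HBase client read local HDFS blocks directly, bypassing
     the DataNode RPC path. Socket path below is an assumed example. -->
<property>
  <name>dfs.client.read.shortcircuit</name>
  <value>true</value>
</property>
<property>
  <name>dfs.domain.socket.path</name>
  <value>/var/lib/hadoop-hdfs/dn_socket</value>
</property>
```

Switching an existing table's column family to Snappy is done from the HBase shell (table and family names here are hypothetical), followed by a major compaction so existing files get rewritten:

```shell
# hbase shell
disable 'mytable'
alter 'mytable', {NAME => 'cf', COMPRESSION => 'SNAPPY'}
enable 'mytable'
major_compact 'mytable'
```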
On Monday, April 1, 2013, Vibhav Mundra wrote:
> What is the general read throughput that one gets when using HBase?
>
> I am not able to achieve more than 3000/sec with a timeout of 50
> millisecs.
> In this case also, 10% of them are timing out.
>
> -Vibhav
>
>
> On Mon, Apr 1, 2013 at 11:20 PM, Vibhav Mundra <[email protected]> wrote:
>
> > Yes, I have changed the BLOCK CACHE % to 0.35.
> >
> > -Vibhav
> >
> >
> > On Mon, Apr 1, 2013 at 10:20 PM, Ted Yu <[email protected]> wrote:
> >
> >> I was aware of that discussion, which was about MAX_FILESIZE and
> >> BLOCKSIZE.
> >>
> >> My suggestion was about block cache percentage.
> >>
> >> Cheers
> >>
> >>
> >> On Mon, Apr 1, 2013 at 4:57 AM, Vibhav Mundra <[email protected]> wrote:
> >>
> >> > I have used the following site:
> >> > http://grokbase.com/t/hbase/user/11bat80x7m/row-get-very-slow
> >> >
> >> > to lessen the value of block cache.
> >> >
> >> > -Vibhav
> >> >
> >> >
> >> > On Mon, Apr 1, 2013 at 4:23 PM, Ted <[email protected]> wrote:
> >> >
> >> > > Can you increase the block cache size?
> >> > >
> >> > > What version of HBase are you using?
> >> > >
> >> > > Thanks
> >> > >
> >> > > On Apr 1, 2013, at 3:47 AM, Vibhav Mundra <[email protected]> wrote:
> >> > >
> >> > > > The typical size of each of my rows is less than 1 KB.
> >> > > >
> >> > > > Regarding the memory, I have used 8 GB for HBase regionservers
> >> > > > and 4 GB for datanodes, and I don't see them completely used.
> >> > > > So I ruled out the GC aspect.
> >> > > >
> >> > > > In case you still believe that GC is an issue, I will upload
> >> > > > the GC logs.
> >> > > >
> >> > > > -Vibhav
> >> > > >
> >> > > >
> >> > > > On Mon, Apr 1, 2013 at 3:46 PM, ramkrishna vasudevan <
> >> > > > [email protected]> wrote:
> >> > > >
> >> > > >> Hi,
> >> > > >>
> >> > > >> How big is your row? Are they wider rows, and what would be
> >> > > >> the size of every cell?
> >> > > >> How many read threads are getting used?
> >> > > >> Were you able to take a thread dump when this was happening?
> >> > > >> Have you seen the GC log?
> >> > > >> Maybe we need some more info before we can think of the problem.
> >> > > >>
> >> > > >> Regards
> >> > > >> Ram
> >> > > >>
> >> > > >>
> >> > > >> On Mon, Apr 1, 2013 at 3:39 PM, Vibhav Mundra <[email protected]>
> >> > > >> wrote:
> >> > > >>
> >> > > >>> Hi All,
> >> > > >>>
> >> > > >>> I am trying to use HBase for real-time data retrieval with a
> >> > > >>> timeout of 50 ms.
> >> > > >>>
> >> > > >>> I am using 2 machines as datanodes and regionservers,
> >> > > >>> and one machine as a master for Hadoop and HBase.
> >> > > >>>
> >> > > >>> But I am able to fire only 3000 queries per sec, and 10% of
> >> > > >>> them are timing out.
> >> > > >>> The database has 60 million rows.
> >> > > >>>
> >> > > >>> Are these figures okay, or am I missing something?
> >> > > >>> I have used scanner caching equal to one, because for each time
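For context on the numbers in the thread: with rows under 1 KB, 3000 rows/sec is only a few MB/sec, well below what the hardware should sustain, and a scanner caching of 1 costs one RPC round trip per row. A rough back-of-the-envelope model, using the figures reported above (1 KB rows, 3000/sec are assumptions taken from the thread):

```python
# Back-of-the-envelope model of the figures discussed in the thread.
row_size_bytes = 1024          # "typical size of each row is less than 1 KB"
rows_per_sec = 3000            # reported query rate

mb_per_sec = rows_per_sec * row_size_bytes / 1e6
print(f"~{mb_per_sec:.1f} MB/sec")   # ~3.1 MB/sec

# Scanner caching = 1 means one RPC round trip per row fetched.
def scan_rpcs(total_rows, caching):
    """Approximate number of RPC round trips a scan needs."""
    return -(-total_rows // caching)   # ceiling division

print(scan_rpcs(10_000, 1))     # 10000 RPCs with caching=1
print(scan_rpcs(10_000, 100))   # 100 RPCs with caching=100
```

The point of the model: at ~3 MB/sec the cluster is nowhere near disk or network limits, so per-request overhead (RPC round trips, timeouts) is the more likely bottleneck than raw throughput.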
