What does your client call look like? Get? Scan? Filters? Is 3000/sec client-side calls, or the number of rows per sec? If you measure in MB/sec, how much read throughput do you get? Where is your client located? Same router as the cluster? Have you activated dfs read short-circuit? If not, try it. Compression: try switching to Snappy, it should be faster. What else is running on the cluster in parallel to your reading client?
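For reference, the two tuning suggestions above (short-circuit reads and Snappy) are set roughly like this. This is a minimal sketch for Hadoop/HBase of that era; the socket path and the table/column-family names ('mytable', 'cf') are placeholder assumptions, not from the thread:

```xml
<!-- hdfs-site.xml (on the DataNodes and the HBase RegionServer hosts):
     let the HBase client read local HDFS blocks directly, bypassing
     the DataNode RPC path. Socket path below is an assumed example. -->
<property>
  <name>dfs.client.read.shortcircuit</name>
  <value>true</value>
</property>
<property>
  <name>dfs.domain.socket.path</name>
  <value>/var/lib/hadoop-hdfs/dn_socket</value>
</property>
```

Switching an existing table's column family to Snappy is done from the HBase shell (table and family names here are hypothetical), followed by a major compaction so existing files get rewritten:

```shell
# hbase shell
disable 'mytable'
alter 'mytable', {NAME => 'cf', COMPRESSION => 'SNAPPY'}
enable 'mytable'
major_compact 'mytable'
```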
On Monday, April 1, 2013, Vibhav Mundra wrote:
> What is the general read throughput that one gets when using HBase?
>
> I am not able to achieve more than 3000/sec with a timeout of 50
> millisecs.
> In this case also, 10% of them are timing out.
>
> -Vibhav
>
>
> On Mon, Apr 1, 2013 at 11:20 PM, Vibhav Mundra <[email protected]> wrote:
>
> > Yes, I have changed the BLOCK CACHE % to 0.35.
> >
> > -Vibhav
> >
> >
> > On Mon, Apr 1, 2013 at 10:20 PM, Ted Yu <[email protected]> wrote:
> >
> >> I was aware of that discussion, which was about MAX_FILESIZE and
> >> BLOCKSIZE.
> >>
> >> My suggestion was about block cache percentage.
> >>
> >> Cheers
> >>
> >>
> >> On Mon, Apr 1, 2013 at 4:57 AM, Vibhav Mundra <[email protected]> wrote:
> >>
> >> > I have used the following site:
> >> > http://grokbase.com/t/hbase/user/11bat80x7m/row-get-very-slow
> >> >
> >> > to lessen the value of block cache.
> >> >
> >> > -Vibhav
> >> >
> >> >
> >> > On Mon, Apr 1, 2013 at 4:23 PM, Ted <[email protected]> wrote:
> >> >
> >> > > Can you increase the block cache size?
> >> > >
> >> > > What version of HBase are you using?
> >> > >
> >> > > Thanks
> >> > >
> >> > > On Apr 1, 2013, at 3:47 AM, Vibhav Mundra <[email protected]> wrote:
> >> > >
> >> > > > The typical size of each of my rows is less than 1 KB.
> >> > > >
> >> > > > Regarding the memory, I have used 8 GB for HBase regionservers
> >> > > > and 4 GB for datanodes, and I don't see them completely used.
> >> > > > So I ruled out the GC aspect.
> >> > > >
> >> > > > In case you still believe that GC is an issue, I will upload
> >> > > > the GC logs.
> >> > > >
> >> > > > -Vibhav
> >> > > >
> >> > > >
> >> > > > On Mon, Apr 1, 2013 at 3:46 PM, ramkrishna vasudevan <
> >> > > > [email protected]> wrote:
> >> > > >
> >> > > >> Hi,
> >> > > >>
> >> > > >> How big is your row? Are they wider rows, and what would be
> >> > > >> the size of every cell?
> >> > > >> How many read threads are getting used?
> >> > > >> Were you able to take a thread dump when this was happening?
> >> > > >> Have you seen the GC log?
> >> > > >> Maybe we need some more info before we can think of the problem.
> >> > > >>
> >> > > >> Regards
> >> > > >> Ram
> >> > > >>
> >> > > >>
> >> > > >> On Mon, Apr 1, 2013 at 3:39 PM, Vibhav Mundra <[email protected]>
> >> > > >> wrote:
> >> > > >>
> >> > > >>> Hi All,
> >> > > >>>
> >> > > >>> I am trying to use HBase for real-time data retrieval with a
> >> > > >>> timeout of 50 ms.
> >> > > >>>
> >> > > >>> I am using 2 machines as datanodes and regionservers,
> >> > > >>> and one machine as a master for Hadoop and HBase.
> >> > > >>>
> >> > > >>> But I am able to fire only 3000 queries per sec, and 10% of
> >> > > >>> them are timing out.
> >> > > >>> The database has 60 million rows.
> >> > > >>>
> >> > > >>> Are these figures okay, or am I missing something?
> >> > > >>> I have used scanner caching equal to one, because for each time
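For context on the numbers in the thread: with rows under 1 KB, 3000 rows/sec is only a few MB/sec, well below what the hardware should sustain, and a scanner caching of 1 costs one RPC round trip per row. A rough back-of-the-envelope model, using the figures reported above (1 KB rows, 3000/sec are assumptions taken from the thread):

```python
# Back-of-the-envelope model of the figures discussed in the thread.
row_size_bytes = 1024          # "typical size of each row is less than 1 KB"
rows_per_sec = 3000            # reported query rate

mb_per_sec = rows_per_sec * row_size_bytes / 1e6
print(f"~{mb_per_sec:.1f} MB/sec")   # ~3.1 MB/sec

# Scanner caching = 1 means one RPC round trip per row fetched.
def scan_rpcs(total_rows, caching):
    """Approximate number of RPC round trips a scan needs."""
    return -(-total_rows // caching)   # ceiling division

print(scan_rpcs(10_000, 1))     # 10000 RPCs with caching=1
print(scan_rpcs(10_000, 100))   # 100 RPCs with caching=100
```

The point of the model: at ~3 MB/sec the cluster is nowhere near disk or network limits, so per-request overhead (RPC round trips, timeouts) is the more likely bottleneck than raw throughput.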
