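What is your scanner caching set to? I haven't worked with Phoenix, so I'm
not sure what defaults, if any, it uses. In HBase 0.94, I believe the default
scanner caching is 1, which means each row fetched by a scanner costs a
separate round trip to the region server. That could be exacerbating your
problem, since any extra network latency is then paid once per row.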
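
To put a rough number on that, here is a back-of-the-envelope sketch in
Python. It is only a model, not a measurement: it assumes the client pulls
every row of a region through one sequential scan, which may not hold for
Phoenix since it aggregates on the region servers. The function name and the
caching values tried are my own illustration; the row count, region count and
ping figures are taken from the thread.

```python
import math


def extra_scan_time_s(rows, caching, extra_rtt_ms):
    """Extra seconds spent on round trips for one sequential scan.

    rows         -- rows the scan returns to the client (assumption: all rows)
    caching      -- rows fetched per RPC (scanner caching)
    extra_rtt_ms -- additional latency per round trip, in milliseconds
    """
    round_trips = math.ceil(rows / caching)
    return round_trips * extra_rtt_ms / 1000.0


# Figures from the thread: 10.5 million rows over 24 salt buckets,
# and pings of 0.130ms (slow RS) vs 0.050ms (others).
rows_per_region = 10_500_000 // 24
extra_rtt_ms = 0.130 - 0.050

for caching in (1, 100, 1000):  # hypothetical settings to compare
    seconds = extra_scan_time_s(rows_per_region, caching, extra_rtt_ms)
    print(f"caching={caching}: ~{seconds:.3f}s extra per region scan")
```

If the model's numbers come out far above the observed 2sec vs 4sec gap, the
assumption that rows reach the client one RPC at a time probably doesn't hold
for these queries, and the ping difference alone is unlikely to be the whole
story.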


On Sat, Dec 21, 2013 at 7:52 PM, Kristoffer Sjögren <sto...@gmail.com> wrote:

> Yes, I'm waiting on a response from them. It's just that the ping difference
> is tiny while the scan difference is huge, 2sec vs 4sec.
>
> Note the ping I mentioned is within the cluster. Pings from outside into the
> cluster show hardly any (if any) noticeable difference.
>
>
> On Sat, Dec 21, 2013 at 8:37 PM, Pradeep Gollakota <pradeep...@gmail.com
> >wrote:
>
> > Do you know if machines 19-23 are on a different rack? It seems to me that
> > your problem might be a networking problem. Whether it is hardware,
> > configuration or something else entirely, I'm not sure. It might be
> > worthwhile to talk to your systems administrator to see why pings to these
> > machines are slow. What are the pings like from a bad RS to another bad RS?
> >
> >
> > On Sat, Dec 21, 2013 at 7:17 PM, Kristoffer Sjögren <sto...@gmail.com
> > >wrote:
> >
> > > Hi
> > >
> > > I have been performance tuning HBase 0.94.6 running Phoenix 2.2.0 for the
> > > last couple of days and need some help.
> > >
> > > Background.
> > >
> > > - 23 machine cluster, 32 cores, 4GB heap per RS.
> > > - Table t_24 has 24 online regions (24 salt buckets).
> > > - Table t_96 has 96 online regions (96 salt buckets).
> > > - 10.5 million rows per table.
> > > - Count query - select count(*) from ...
> > > - Group by query - select A, B, C, sum(D) from ... where (A = 1 and
> > >   T >= 0 and T <= 2147482800) group by A, B, C;
> > >
> > > What I found ultimately is that region servers 19, 20, 21, 22 and 23
> > > are consistently 2-3x slower than the others. This hurts overall latency
> > > pretty badly since queries are executed in parallel on the RSs and then
> > > aggregated at the client (through Phoenix). In Hannibal, regions are
> > > spread out evenly over region servers, according to salt buckets (a
> > > Phoenix feature: pre-created regions and a rowkey prefix).
> > >
> > > As far as I can tell, there is no network or hardware configuration
> > > divergence between the machines. No CPU, network or other notable
> > > divergence in Ganglia. No RS metric differences in HBase master console.
> > >
> > > The only thing that may be of interest is that pings (within the cluster)
> > > to the bad RSs are about 2-3x slower, around 0.130ms vs 0.050ms for the
> > > others. Not sure if this is significant, but I get a bad feeling about it
> > > since it matches exactly the RSs that stood out in my performance tests.
> > >
> > > Any ideas of how I might find the source of this problem?
> > >
> > > Cheers,
> > > -Kristoffer
> > >
> >
>
