>
> > we did swapoff -a and then updated fstab to permanently turn it off.
>
> You might not want to turn it off completely.  One of the lads was
> recently talking about the horrors that can happen when there is no swap.
>
> But it sounds like you were doing over-eager swapping up to this point?
>
>
http://wiki.apache.org/hadoop/PerformanceTuning recommends removing swap. We
had swap off on part of the cluster, and those machines were doing well in
terms of RS crashes while the other machines were swapping heavily, so we
decided to turn it off for all RS machines.
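
For the record, the change itself was just the usual two steps, roughly as
below (the device name is only an example):

    # turn swap off on the running box
    swapoff -a
    # then comment out the swap entry in /etc/fstab so it stays off across
    # reboots, e.g.:
    # /dev/sda2   none   swap   sw   0   0

If a middle ground is preferable to turning it off entirely, lowering
vm.swappiness (e.g. sysctl -w vm.swappiness=0) would presumably be the
softer option.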

Can you give more input on the drawbacks or risks of permanently turning
swap off, or on what the observed horror was?



> > we observed swap was actually happening on RSs, and after we turned it
> > off we have much more stable RSs.
> >
> > I can tell you what we have; not sure it is optimal, and in fact I'm
> > looking for comments/suggestions from folks who have used it more:
> > 64GB RAM ==> 85% given to HBASE HEAP (30% memstore, 60% block cache),
> > 512MB DN and 512MB TT
> >
>
> So, I'm bad at math, but that's a heap of 50+GB?  How's that working out
> for you?  Have you played with GC tuning at all?  You might give more to
> the DN and the TT since you have plenty -- and more to the OS...
> perhaps less to hbase?
>
> How many disks?
>

We played with GC. What has worked well so far is starting CMS a little
early, at 40% occupancy. We also removed the 6m newgen restriction and
observed that newgen does not grow beyond 18MB, with minor GCs coming every
second instead of every 200ms in steady state (we might cap max newgen if
things go bad). So far all pauses have been small, less than a second, and
no full GC has kicked in.
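
In hbase-env.sh terms this boils down to something like the following
(exact flag spelling from memory, so treat it as a sketch rather than our
literal line):

    export HBASE_OPTS="$HBASE_OPTS -XX:+UseConcMarkSweepGC -XX:+UseParNewGC \
      -XX:CMSInitiatingOccupancyFraction=40 -XX:+UseCMSInitiatingOccupancyOnly"
    # no -Xmn any more since we removed the newgen cap; if minor GC behaviour
    # degrades we may re-add something like -Xmn512m (that value is just a
    # placeholder, not something we have tested)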

We have given more to HBase (and specifically to the block cache) because
we want 95th-percentile read latencies below 20ms, and our load is
random-read heavy with light read-modify-writes.
The rationale was to go for small HBase blocks (8KB); an HDFS block size
(64KB) that is larger than the HBase block but much smaller than the
default; and a large block cache (~37GB) to improve the hit rate.
We did very limited experiments with different block sizes before going
with this configuration.
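
To be concrete about which knobs these are (property names from memory; the
table and family names are just examples):

    <!-- hbase-site.xml: fraction of the RS heap given to the block cache -->
    <property>
      <name>hfile.block.cache.size</name>
      <value>0.6</value>
    </property>
    <!-- and the memstore share mentioned above -->
    <property>
      <name>hbase.regionserver.global.memstore.upperLimit</name>
      <value>0.3</value>
    </property>

The 8KB HBase block size is set per column family at table creation time,
i.e. in the hbase shell something like:

    create 'mytable', {NAME => 'f', BLOCKSIZE => '8192'}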

We have 1GB for the DN. We don't run map-reduce much on this cluster, so we
have given 512MB to the TT. We have a separate Hadoop cluster for all our
MR and analytics needs.
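
Those heaps are just the usual -Xmx settings in hadoop-env.sh, something
along these lines:

    export HADOOP_DATANODE_OPTS="-Xmx1024m $HADOOP_DATANODE_OPTS"
    export HADOOP_TASKTRACKER_OPTS="-Xmx512m $HADOOP_TASKTRACKER_OPTS"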

We have 6x1TB disks per machine.


> > we have 64KB HDFS block size
>
> Do you mean 64MB?
>
>
>
It's 64KB. Our keys are random enough that there is very little chance of
exploiting block locality, so every miss in the block cache will read one
or more random HDFS blocks anyway, and hence it makes sense to go for a
smaller HDFS block size. After getting HBASE-3006 in, things improved a lot
for us.
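
For clarity, that is the dfs.block.size property in hdfs-site.xml on the
HBase cluster (value in bytes; newer Hadoop versions spell it
dfs.blocksize):

    <property>
      <name>dfs.block.size</name>
      <value>65536</value>
    </property>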

We use large 128MB blocks for our analytics Hadoop cluster as it has more
sequential reads. Do you think a smaller size like 64KB might actually be
hurting us?



> You've done the other stuff -- ulimits and xceivers?
>

We have a 64k ulimit on all our Hadoop cluster machines, and xceivers is
set to 2048 on the HBase cluster.
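
Concretely, that is roughly the following (the user name in limits.conf is
just an example for whichever user runs the daemons):

    # /etc/security/limits.conf
    hadoop  -  nofile  65536

    <!-- hdfs-site.xml; the property name really is spelled "xcievers" -->
    <property>
      <name>dfs.datanode.max.xcievers</name>
      <value>2048</value>
    </property>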


>
> Hows it running for you?
>

I will post some real numbers next week once it has been running for 7 days
with the current config.

I won't say we have nailed everything down, but it's better than what we
started with.

Any input will be really helpful, as will anything you think we are doing
that is stupid or totally missing :-)
