>> In my opinion, 1M/s input data will result in only 70MByte/s write
Times 3 (the default HDFS replication factor). Plus, do not forget
about compaction read/write amplification: if you flush 10 MB and your
max region size is 10 GB, then with the default minimum number of
files to compact (3) your amplification is 6-7x.

That gives us 70 x 3 x 6 = 1260 MB/s of combined read/write traffic,
or 210 MB/s of reads and 210 MB/s of writes per RS across the 6 region
servers. This IO load is way above what is sustainable.

-Vlad
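The same back-of-envelope arithmetic as a runnable sketch (the 6x
amplification factor and the 70 MB/s flush rate are the estimates
above, not measurements):

    // Back-of-envelope IO math using the numbers from this thread.
    public class WriteAmplification {
        public static void main(String[] args) {
            double inputMBps = 70;  // ~1M cells/s at ~70 bytes per cell
            int replication = 3;    // default HDFS replication factor
            int compactionAmp = 6;  // estimate, min files to compact = 3
            int regionServers = 6;

            double clusterIO = inputMBps * replication * compactionAmp;
            double perRS = clusterIO / regionServers;
            // Prints: cluster IO 1260 MB/s; per RS 210 MB/s each way
            System.out.printf("cluster IO %.0f MB/s; per RS %.0f MB/s%n",
                    clusterIO, perRS);
        }
    }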
On Fri, Mar 17, 2017 at 2:14 PM, Kevin O'Dell <[email protected]> wrote:

> Hey Hef,
>
> What is the memstore size setting (how much heap is it allowed) on
> that cluster? What is your region count per node? Are you writing
> evenly across all those regions, or are only a few regions active per
> region server at a time? Can you paste the GC settings you are
> currently using?
>
> On Fri, Mar 17, 2017 at 3:30 PM, Stack <[email protected]> wrote:
>
> > On Fri, Mar 17, 2017 at 9:31 AM, Hef <[email protected]> wrote:
> >
> > > Hi group,
> > > I'm using HBase to store a large amount of time series data; the
> > > usage is heavy on writes rather than reads. My application tops
> > > out at 600k write requests per second and I can't tune it for a
> > > better rate.
> > >
> > > Hardware:
> > > I have 6 region servers, each with 128G memory, 12 HDDs, and 2
> > > CPUs with 24 threads.
> > >
> > > Schema:
> > > The schema for the time series data is similar to OpenTSDB's: the
> > > data points of the same metric within an hour are stored in one
> > > row, so there can be at most 3600 columns per row.
> > > Each cell is about 70 bytes including the rowkey, column
> > > qualifier, column family and value.
> > >
> > > HBase config:
> > > CDH 5.6, HBase 1.0.0
> >
> > Can you upgrade? There's a big diff between 1.2 and 1.0.
> >
> > > 100G memory for each RegionServer
> > > hbase.hstore.compactionThreshold = 50
> > > hbase.hstore.blockingStoreFiles = 100
> > > hbase.hregion.majorcompaction disabled
> > > hbase.client.write.buffer = 20MB
> > > hbase.regionserver.handler.count = 100
> >
> > Could try halving the handler count.
> >
> > > hbase.hregion.memstore.flush.size = 128MB
> >
> > Why are you flushing? If it is because you are hitting this flush
> > limit, can you try upping it?
> >
> > > HBase Client:
> > > Write with a BufferedMutator at 100000 mutations per batch (see
> > > the sketch after the thread).
> > >
> > > Input volume:
> > > The input data throughput is more than 2 million records/sec from
> > > Kafka.
> >
> > How is the distribution? Evenly over the keyspace?
> >
> > > My writer applications are distributed; however I scale them up,
> > > the total write throughput won't get larger than 600K/sec.
> >
> > Tell us more about this scaling up? How many writers?
> >
> > > The servers have 20% CPU usage and 5.6 wa.
> >
> > 5.6 is high enough. Is the i/o spread over the disks?
> >
> > > GC doesn't look good though; it shows a lot of 10s+ pauses.
> >
> > What settings do you have?
> >
> > > In my opinion, 1M/s input data will result in only 70MByte/s of
> > > write throughput to the cluster, which is quite small compared to
> > > the 6 region servers. The performance should not be this bad.
> > >
> > > Does anybody have an idea why the performance stops at 600K/s?
> > > Is there anything I have to tune to increase the HBase write
> > > throughput?
> >
> > If you double the clients writing, do you see an uptick in the
> > throughput?
> >
> > If you thread dump the servers, can you tell where they are held
> > up? Or whether they are doing any work at all?
> >
> > St.Ack
>
> --
> Kevin O'Dell
> Field Engineer
> 850-496-1298 | [email protected]
> @kevinrodell
> <http://www.rocana.com>
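For reference, a minimal sketch of the write path Hef describes above
(HBase 1.0 client API; the table name "tsdb", column family "t", and
rowkey layout are illustrative assumptions, not from the thread; the
20 MB buffer matches the hbase.client.write.buffer setting quoted
above):

    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.BufferedMutator;
    import org.apache.hadoop.hbase.client.BufferedMutatorParams;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.util.Bytes;

    public class TsWriter {
        public static void main(String[] args) throws IOException {
            Configuration conf = HBaseConfiguration.create();
            // 20 MB client-side buffer, as in the thread's config.
            BufferedMutatorParams params =
                new BufferedMutatorParams(TableName.valueOf("tsdb"))
                    .writeBufferSize(20L * 1024 * 1024);

            try (Connection conn = ConnectionFactory.createConnection(conf);
                 BufferedMutator mutator = conn.getBufferedMutator(params)) {
                // Batch 100000 mutations per mutate() call, as Hef does.
                List<Put> batch = new ArrayList<>(100000);
                for (int i = 0; i < 100000; i++) {
                    // Hypothetical rowkey: metric id + hour bucket; one
                    // column per second offset, OpenTSDB-style.
                    Put put = new Put(Bytes.toBytes("metric1-2017031714"));
                    put.addColumn(Bytes.toBytes("t"),
                            Bytes.toBytes((short) (i % 3600)),
                            Bytes.toBytes(System.nanoTime()));
                    batch.add(put);
                }
                mutator.mutate(batch); // buffered; auto-flushes at 20 MB
                mutator.flush();       // push out whatever remains
            }
        }
    }

Whether writes like these spread evenly over the keyspace is exactly
Stack's question: with an OpenTSDB-style metric+hour rowkey, all the
points for one metric in a given hour land in a single region.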
