Re: Replication not suited for intensive write applications?

Asaf Mesika Thu, 20 Jun 2013 13:39:45 -0700

Thanks for the answer!
My responses are inline.

On Thu, Jun 20, 2013 at 11:02 PM, lars hofhansl <[email protected]> wrote:


> First off, this is a pretty constructed case leading to a specious general
> conclusion.
>
> If you only have three RSs/DNs and the default replication factor of 3,
> each machine will get every single write.
> That is the first issue. Using HBase makes little sense with such a small
> cluster.
>
You are correct. Nonetheless, the network, as I measured it, was far from its
capacity and thus probably not the bottleneck.
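
For reference, the calculation behind that measurement (described in my original message below) boils down to the following. This is only a minimal sketch: the function name and the sample counter readings are hypothetical, not values from the benchmark, and it assumes the interface byte counters did not wrap or reset between the two readings.

```python
def throughput_mb_per_sec(bytes_before: int, bytes_after: int, duration_sec: float) -> float:
    """Average throughput in MB/sec (1 MB = 10**6 bytes) between two counter readings."""
    if duration_sec <= 0:
        raise ValueError("duration_sec must be positive")
    if bytes_after < bytes_before:
        # Assumption: counters only grow; a smaller value means a wrap or reset.
        raise ValueError("counter went backwards (wrapped or reset)")
    return (bytes_after - bytes_before) / duration_sec / 1e6


# Hypothetical RX byte counters read before and after a 600-second run.
rx_rate = throughput_mb_per_sec(1_000_000_000, 19_000_000_000, 600.0)
print(f"RX: {rx_rate:.1f} MB/sec")
```

Comparing the result with what netcat achieved over the same link gives a rough idea of how much headroom the network has.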

>
> Secondly, as you say yourself, there are only three regionservers writing
> to the replicated cluster using a single thread each in order to preserve
> ordering.
> With more region servers your scale will tip the other way. Again more
> regionservers will make this better.
>
I presume that, in production, I will add more region servers to accommodate
growing write demand on my cluster. Hence, my clients will write with more
threads. Thus, proportionally, I will always have a lot more client threads
than region servers (each of which has one replication thread). So I
don't see how adding more region servers will tip the scale to the other side.
The only way to avoid this is to design the cluster so that if
I can handle the events received at the client, which writes them to HBase,
with x threads, then x is the number of region servers I should have. If I
have a spike, it will even out eventually, but this means
under-utilizing my cluster hardware, no?
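
To put rough numbers on this, here is a back-of-the-envelope sketch using the figures from my benchmark (12 MB/sec replicated by 3 region servers, 17 MB/sec written by the client). The function name is hypothetical, and the sketch assumes that each region server's single replication thread keeps the same rate as servers are added and that the write load is spread evenly. Neither assumption has been verified.

```python
import math


def region_servers_needed(target_mb_per_sec: float,
                          replicated_mb_per_sec: float,
                          region_servers: int) -> int:
    """Region servers needed for replication to keep up with a target write rate.

    Assumes one replication thread per region server and linear scaling,
    which is a simplification rather than a measured property.
    """
    per_thread = replicated_mb_per_sec / region_servers
    return math.ceil(target_mb_per_sec / per_thread)


# Benchmark figures: 3 region servers replicated 12 MB/sec in total,
# while the client wrote 17 MB/sec to the master cluster.
print(region_servers_needed(17.0, 12.0, 3))
```

Under these assumptions the number of region servers is dictated by the write rate rather than by the number of client threads.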


> As for your other question, more threads can lead to better interleaving
> of CPU and IO, thus leading to better throughput (this relationship is not
> linear, though).
>
>
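A toy model may help to see why the relationship is not linear. It is only a sketch and not how HBase is implemented: it assumes each write consists of work that can overlap across threads (client-side processing, RPC) followed by a step that only one thread can perform at a time (the WAL write). The function name and both timings are hypothetical, chosen for illustration only.

```python
def max_throughput(threads: int, parallel_sec: float, serial_sec: float) -> float:
    """Upper bound on operations/sec for `threads` clients in a closed loop.

    Each operation spends `parallel_sec` in work that overlaps across threads
    and `serial_sec` in a step that only one thread can run at a time.
    """
    if threads < 1:
        raise ValueError("threads must be at least 1")
    per_thread_limit = threads / (parallel_sec + serial_sec)  # too few threads
    serial_limit = 1.0 / serial_sec                           # serial step saturated
    return min(per_thread_limit, serial_limit)


# Hypothetical timings per unit of data: 0.2 s overlappable, 0.05 s serialized.
for n in (1, 3, 10, 20):
    print(n, round(max_throughput(n, 0.2, 0.05), 1))
```

In this model throughput grows with the thread count until the serialized step is busy all the time, and then it flattens. That would be consistent with 10 threads beating 3 while more than 10 does not help.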

>
> -- Lars
>
>
>
> ----- Original Message -----
> From: Asaf Mesika <[email protected]>
> To: "[email protected]" <[email protected]>
> Cc:
> Sent: Thursday, June 20, 2013 3:46 AM
> Subject: Replication not suited for intensive write applications?
>
> Hi,
>
> I've been conducting lots of benchmarks to test the maximum throughput of
> replication in HBase.
>
> I've come to the conclusion that HBase replication is not suited for
> write-intensive applications. I hope that people here can show me where I'm wrong.
>
> *My setup*
> *Cluster* (master and slave are alike)
> 1 Master, NameNode
> 3 RS, Data Node
>
> All computers are the same: 8 cores x 3.4 GHz, 8 GB RAM, 1 Gigabit Ethernet
> card
>
> I insert data into HBase from a java process (client) reading files from
> disk, running on the machine running the HBase Master in the master
> cluster.
>
> *Benchmark Results*
> When the client writes with 10 threads, the master cluster writes at
> 17 MB/sec, while the replicated cluster writes at 12 MB/sec. The data size
> I wrote is 15 GB, all Puts, to two different tables.
> Both clusters when tested independently without replication, achieved write
> throughput of 17-19 MB/sec, so evidently the replication process is the
> bottleneck.
>
> I also tested connectivity between the two clusters using "netcat" and
> achieved 111 MB/sec.
> I've checked the usage of the network cards on the client, the master
> cluster region servers and the slave region servers. No computer went over
> 30 MB/sec in Receive or Transmit.
> The way I checked was rather crude but works: I ran "netstat -ie" before
> HBase in the master cluster started writing and after it finished. The same
> was done on the replicated cluster (when the replication started and
> finished). I can tell the number of bytes Received and Transmitted, and I
> know the duration each cluster worked, thus I can calculate the
> throughput.
>
> *The bottleneck in my opinion*
> Since we've excluded network capacity, and each cluster works at a faster
> rate independently, all that is left is the replication process.
> My client writes to the master cluster with 10 Threads, and manages to
> write at 17-18 MB/sec.
> Each region server has only 1 thread responsible for transmitting the data
> written to the WAL to the slave cluster. Thus in my setup I effectively
> have 3 threads writing to the slave cluster. This is the bottleneck,
> since this process cannot be parallelized: it must transmit the WAL
> in a certain order.
>
> *Conclusion*
> When writing intensively to HBase with more than 3 threads (in my setup),
> you can't use replication.
>
> *Master throughput without replication*
> On a different note, I have one thing I couldn't understand at all.
> When I turned off replication and wrote with my client using 3 threads, I got
> a throughput of 11.3 MB/sec. When I wrote with 10 threads (any more than that
> doesn't help) I achieved a maximum throughput of 19 MB/sec.
> The network cards showed 30MB/sec Receive and 20MB/sec Transmit on each RS,
> thus the network capacity was not the bottleneck.
> On the HBase master machine which ran the client, the network card
> showed Receive throughput of 0.5 MB/sec and Transmit throughput of 18.28
> MB/sec. Hence it's not the client machine's network card creating the bottleneck.
>
> The only explanation I have is the synchronized writes to the WAL. Those 10
> threads have to get in line, and one by one, write their batch of Puts to
> the WAL, which creates a bottleneck.
>
> *My question*:
> The one thing I couldn't understand is this: when I write with 3 threads,
> I have no more than 3 concurrent RPC write requests in each RS.
> They achieved 11.3 MB/sec.
> The write to the WAL is synchronized, so why did increasing the number of
> threads to 10 (about 3x more) actually increase the throughput to 19 MB/sec?
> They all get in line to write to the same location, so it seems that having
> concurrent writes shouldn't improve throughput at all.
>
>
> Thank you!
>
> Asaf
>
>
