Re: Storm/Hbase Bolt Performance

Jon Logan Mon, 09 Jun 2014 17:59:29 -0700

Are you sure you are looking at the right figure? Capacity should not be
greater than 1. High values indicate that you may want to increase
parallelism on that step. Low values indicate that something else is
probably bottlenecking your topology. If you could send a screenshot of the
Storm UI, that would be helpful.
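
For reference, the Storm UI describes capacity as (number executed * average
execute latency) / measurement time, taken over the last 10 minutes. A rough
sketch of that arithmetic follows; the function name and the sample figures
are made up for illustration and are not taken from your topology.

```python
def bolt_capacity(executed, execute_latency_ms, window_ms=600_000):
    """Fraction of the window an executor spent inside execute().

    executed           -- tuples executed during the window
    execute_latency_ms -- average execute latency in milliseconds
    window_ms          -- measurement window (the UI uses the last 10 minutes)
    """
    if window_ms <= 0:
        raise ValueError("window_ms must be positive")
    return executed * execute_latency_ms / window_ms


# Hypothetical figures: 120,000 tuples at 4 ms each over 10 minutes.
print(bolt_capacity(executed=120_000, execute_latency_ms=4.0))  # 0.8
```

A value near 1 would mean the executor is busy for essentially the whole
window, which is why it is read as a signal to add parallelism.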



I've had good luck with YourKit...just remotely attach to a running worker.


On Mon, Jun 9, 2014 at 8:53 PM, Justin Workman <[email protected]>
wrote:

> The capacity indicates they are being utilized. Capacity hovers around
> .800 and bursts to 1.6 or so when we see spikes of tuples or restart the
> topology.
>
> Recommendations on profilers?
>
> On Jun 9, 2014, at 6:50 PM, Jon Logan <[email protected]> wrote:
>
> Are your HBase bolts being saturated? If not, you may want to increase the
> number of pending tuples, as that could cause things to be artificially
> throttled.
>
> You should also try attaching a profiler to your bolt to see what's
> holding it up. Are you doing batched puts (or committing puts on a
> separate thread)? Either could also yield substantial improvements.
>
>
> On Mon, Jun 9, 2014 at 8:11 PM, Justin Workman <[email protected]>
> wrote:
>
>> In response to a comment from P. Taylor Goetz on another thread..."I can
>> personally verify that it is possible to process 1.2+ million (relatively
>> small) messages per second with a 10-15 node cluster — and that includes
>> writing to HBase, and other components (I don’t have the hardware specs
>> handy, but can probably dig them up)."
>>
>> I would like to know which knobs people are tuning in both Storm and
>> HBase to achieve this level of throughput. In particular, I am interested
>> in HBase cluster sizes, whether the cluster is shared with MapReduce load,
>> bolt parallelism, and any other settings people have adjusted to get this
>> level of write throughput to HBase from Storm.
>>
>> Maybe this isn't the right group, but we are struggling to get more than
>> about 2000 tuples/sec written to HBase. I think I know some of the
>> bottlenecks, but would love to know what others in the community are tuning
>> to get this level of performance.
>>
>> Our messages are roughly 300-500k and we are running on a 6 node Storm
>> cluster on virtual machines (our first bottleneck, which we will be
>> replacing with 10 relatively beefy physical nodes), with a parallelism of
>> 40 for our storage bolt.
>>
>> Any hints on HBase or Storm optimizations that would help increase
>> throughput to HBase would be greatly appreciated.
>>
>> Thanks
>> Justin
>>
>
>
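
On the batched puts suggestion quoted above: the idea is to buffer records
and write them in one call once the buffer is full or a time limit has
passed, instead of issuing one write per tuple. The sketch below shows only
the buffering logic. `BatchingWriter`, `flush_fn`, and the default limits are
hypothetical names and values chosen for illustration; `flush_fn` stands in
for whatever performs the actual batched write and is not an HBase or Storm
API.

```python
import time


class BatchingWriter:
    """Buffers records and hands them to flush_fn in batches."""

    def __init__(self, flush_fn, max_batch=500, max_wait_s=1.0,
                 clock=time.monotonic):
        self._flush_fn = flush_fn      # hypothetical batched-write callback
        self._max_batch = max_batch    # flush when this many records are buffered
        self._max_wait_s = max_wait_s  # ...or when this much time has passed
        self._clock = clock
        self._buffer = []
        self._last_flush = clock()

    def add(self, record):
        self._buffer.append(record)
        too_many = len(self._buffer) >= self._max_batch
        too_old = self._clock() - self._last_flush >= self._max_wait_s
        if too_many or too_old:
            self.flush()

    def flush(self):
        if self._buffer:
            batch, self._buffer = self._buffer, []
            self._flush_fn(batch)
        self._last_flush = self._clock()


# Usage: collect batches in a list instead of writing to a real store.
batches = []
writer = BatchingWriter(batches.append, max_batch=3, max_wait_s=60.0)
for i in range(7):
    writer.add(i)
writer.flush()  # write out whatever is left
print(batches)  # [[0, 1, 2], [3, 4, 5], [6]]
```

The time limit is only checked when a record arrives, so a quiet stream
would need a periodic flush from elsewhere. In a real bolt, tuples would
presumably be acked only after their batch has been written, so that a
failed write leads to a replay rather than silent loss.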

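On the pending-tuples point quoted above: Storm's `topology.max.spout.pending`
setting caps the number of un-acked tuples each spout task may have in
flight. By Little's law, that cap divided by the time a tuple takes to be
fully acked gives a rough ceiling on throughput per spout task. The function
name and the numbers below are hypothetical and only illustrate the
arithmetic.

```python
def max_tuples_per_sec(max_spout_pending, complete_latency_ms):
    """Rough throughput ceiling for one spout task (Little's law).

    max_spout_pending   -- cap on in-flight (un-acked) tuples for the task
    complete_latency_ms -- average time for a tuple to be fully acked
    """
    if complete_latency_ms <= 0:
        raise ValueError("complete_latency_ms must be positive")
    return max_spout_pending * 1000.0 / complete_latency_ms


# Hypothetical figures: 1000 pending tuples, 200 ms complete latency.
print(max_tuples_per_sec(1000, 200))  # 5000.0
```

If the bolts are not saturated and observed throughput sits near this
ceiling, the pending limit rather than the bolts may be what is throttling
the topology.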