That is my understanding of capacity as well. I am certain I am looking at the right metric. I will forward a screenshot of the UI as soon as I get in front of a computer.
I will also give YourKit a try.

On Jun 9, 2014, at 6:58 PM, Jon Logan wrote:

Are you sure you are looking at the right figure? Capacity should not be greater than 1. High values indicate that you may want to increase parallelism on that step. Low values indicate something else is probably bottlenecking your topology. If you could send a screenshot of the Storm UI, that could be helpful.

I've had good luck with YourKit: just remotely attach to a running worker.

On Mon, Jun 9, 2014 at 8:53 PM, Justin Workman wrote:

> The capacity indicates they are being utilized. Capacity hovers around
> 0.800 and bursts to 1.6 or so when we see spikes of tuples or restart the
> topology.
>
> Recommendations on profilers?
>
> On Jun 9, 2014, at 6:50 PM, Jon Logan wrote:
>
> Are your HBase bolts being saturated? If not, you may want to increase the
> number of pending tuples, as that could cause things to be artificially
> throttled.
>
> You should also try attaching a profiler to your bolt, and see what's
> holding it up. Are you doing batched puts (or puts being committed on a
> separate thread)? That could also yield substantial improvements.
>
> On Mon, Jun 9, 2014 at 8:11 PM, Justin Workman wrote:
>
>> In response to a comment from P. Taylor Goetz on another thread..."I can
>> personally verify that it is possible to process 1.2+ million (relatively
>> small) messages per second with a 10-15 node cluster -- and that includes
>> writing to HBase, and other components (I don’t have the hardware specs
>> handy, but can probably dig them up)."
>>
>> I would like to know what special knobs people are tuning in both Storm
>> and HBase to achieve this level of throughput. Things I would be interested
>> in would be HBase cluster sizes, whether the cluster is shared with
>> MapReduce load as well, bolt parallelism, and any other knobs people have
>> adjusted to get this level of write throughput to HBase from Storm.
>>
>> Maybe this isn't the right group, but we are struggling to get more than
>> about 2000 tuples/sec writing to HBase. I think I know some of the
>> bottlenecks, but would love to know what others in the community are tuning
>> to get this level of performance.
>>
>> Our messages are roughly 300-500k and we are running on a 6 node Storm
>> cluster running on virtual machines (our first bottleneck, which we will be
>> replacing with 10 relatively beefy physical nodes), with a parallelism of
>> 40 for our storage bolt.
>>
>> Any hints on HBase or Storm optimizations that can be done to help
>> increase the throughput to HBase would be greatly appreciated.
>>
>> Thanks
>> Justin
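
For anyone following the "batched puts" suggestion above, here is a minimal sketch of the buffering side in plain Java. It is not code from this thread and not part of Storm or HBase: `BatchingBuffer`, `batchSize`, and the `sink` callback are hypothetical names. The sink stands in for whatever actually writes the rows, for example a list-based put on the HBase client.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper (not a Storm or HBase class): collects items and
// hands them to a sink in groups instead of one write per item.
class BatchingBuffer<T> {
    private final int batchSize;
    private final Consumer<List<T>> sink; // assumed to perform the real write
    private final List<T> pending = new ArrayList<>();

    BatchingBuffer(int batchSize, Consumer<List<T>> sink) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        this.batchSize = batchSize;
        this.sink = sink;
    }

    // Buffer one item; flush automatically once the batch is full.
    void add(T item) {
        pending.add(item);
        if (pending.size() >= batchSize) {
            flush();
        }
    }

    // Hand a copy of the buffered items to the sink, then clear the buffer.
    // Does nothing when the buffer is empty.
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        sink.accept(new ArrayList<>(pending));
        pending.clear();
    }

    // Number of items waiting for the next flush.
    int pendingCount() {
        return pending.size();
    }
}
```

In a bolt, the idea would be to call `add` from `execute` and ack the buffered tuples only after the sink has returned. You would probably also want to call `flush` on a timer so that a quiet stream does not leave tuples waiting until they time out.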
