You are correct that we do not send a new batch of messages until the
current batch has been acked.  It should not be too difficult to switch to
pipelining the messages so more than one batch is in flight at any point
in time, but we wanted to get accuracy before digging more deeply into
performance.
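
To make that concrete, here is a rough sketch of what a bounded
pipelining window could look like.  This is not the current client code,
just an illustration, and the class and method names are made up:

    import java.util.List;
    import java.util.concurrent.Semaphore;

    // Illustration only.  Stop-and-wait allows exactly one batch in
    // flight; this sketch lets the sender keep writing until maxInFlight
    // batches are unacked, and only then blocks waiting for acks.
    public class PipelinedSender {
        private final Semaphore inFlight;

        public PipelinedSender(int maxInFlight) {
            this.inFlight = new Semaphore(maxInFlight);
        }

        // Called by the sending thread for each batch of serialized tuples.
        public void sendBatch(List<byte[]> batch) throws InterruptedException {
            inFlight.acquire();      // blocks only when maxInFlight batches are unacked
            writeToChannel(batch);   // hand the batch to the transport, e.g. a Netty channel
        }

        // Called from the response handler when the server acks a batch.
        public void onAck() {
            inFlight.release();
        }

        private void writeToChannel(List<byte[]> batch) {
            // transport-specific write omitted
        }
    }

With maxInFlight set to 1 this degenerates to the send-then-wait behavior
we have today.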

As for the fixed batch size, that is a latency vs. throughput question, and
the right value is likely to vary depending on the use case you have.
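
As an illustration of that tradeoff (again just a sketch, not what Storm
does today), a common pattern is to flush a batch either when it reaches a
target size or when a timeout expires, so small messages still end up in
reasonably large batches without sitting in the buffer for too long:

    // Illustration only: size-or-time flush policy.  A larger
    // targetBatchBytes favors throughput (fewer, bigger writes); a smaller
    // flushIntervalMillis bounds how long a message can wait in the buffer.
    public class BatchFlushPolicy {
        private final int targetBatchBytes;
        private final long flushIntervalMillis;
        private long lastFlushTime = System.currentTimeMillis();
        private int bufferedBytes = 0;

        public BatchFlushPolicy(int targetBatchBytes, long flushIntervalMillis) {
            this.targetBatchBytes = targetBatchBytes;
            this.flushIntervalMillis = flushIntervalMillis;
        }

        // Record an enqueued message; returns true when the batch should be flushed.
        public boolean offer(int messageBytes) {
            bufferedBytes += messageBytes;
            long now = System.currentTimeMillis();
            if (bufferedBytes >= targetBatchBytes
                    || now - lastFlushTime >= flushIntervalMillis) {
                bufferedBytes = 0;
                lastFlushTime = now;
                return true;
            }
            return false;
        }
    }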

The bigger problem that I have seen is with the number of threads that
Netty is using for larger topologies.  I think we have a fix for that, but
Andy and I have not had the time to put together a patch for the community
yet.  I will try to get to it this week.
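
The general shape of that kind of fix is to share one channel factory, and
therefore one bounded worker thread pool, across all of the worker's
outgoing connections, rather than creating a pool per connection.  A
sketch against the Netty 3.x API just to show the idea; the actual patch
may well look different:

    import java.util.concurrent.Executors;

    import org.jboss.netty.bootstrap.ClientBootstrap;
    import org.jboss.netty.channel.socket.nio.NioClientSocketChannelFactory;

    // Illustration only, not the actual patch.  Every outgoing connection
    // reuses the same factory, so the Netty client thread count stays
    // fixed no matter how many worker-to-worker connections a large
    // topology needs.
    public class SharedNettyClientFactory {
        private final NioClientSocketChannelFactory sharedFactory;

        public SharedNettyClientFactory(int maxWorkerThreads) {
            this.sharedFactory = new NioClientSocketChannelFactory(
                    Executors.newCachedThreadPool(),   // boss threads
                    Executors.newCachedThreadPool(),   // worker threads
                    maxWorkerThreads);                 // cap on worker thread count
        }

        public ClientBootstrap newClientBootstrap() {
            return new ClientBootstrap(sharedFactory);
        }

        public void shutdown() {
            sharedFactory.releaseExternalResources();
        }
    }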

-Bobby

On 3/26/14, 6:27 AM, "Sean Zhong" <[email protected]> wrote:

>When running the benchmark developed by Bobby (
>http://yahooeng.tumblr.com/post/64758709722/making-storm-fly-with-netty),
>I found that none of the CPU, memory, or network can be saturated when
>the message size is small (10 bytes - 100 bytes).
>
>
>message size (bytes)    spout throughput (MB/s)
>10                      3.207
>40                      16.25
>80                      32.88
>100                     43.13
>200                     80.13
>400                     138
>800                     186.38
>1000                    196.38
>10000                   234.75
>
>I have 4 nodes; each node has a very powerful CPU, an E5-2680 (32 cores).
>The throughput reaches its peak when only 30% of each machine's CPU is
>used and only 1/6 of the network bandwidth is used.
>
>So I guess this may be related to Netty performance.
>
>My questions:
>1. It seems we transfer messages synchronously in the Netty client
>worker: we send a message only after we receive the response to the last
>message request from the Netty server. Can this hurt performance?
>2. Although we batch messages when sending them through Netty's
>channel.send, the batch size varies. In my test, I found that it varies
>from tens of bytes to a few KB. Would a bigger and constant batch size
>help here?
>
>
>The following are the steps I took to troubleshoot the problem.
>----------------------------------------------------------------
>1. Since the CPU is not fully used, I tried to scale out by adding
>more workers or increasing the parallelism, but the throughput doesn't
>improve.
>
>2. Using a profiling tool like VisualVM, I found the spout/bolt threads
>spend 60% - 70% of their time waiting, blocked on the disruptor queue;
>the spout spends 70% of its time sleeping and the acker spends 40%
>waiting, while the Netty boss and worker threads and the ZooKeeper
>threads are busy.
>
>3. I have tried tuning all possible combinations of spout.max.pending,
>transfer.size, receiver.size, executor.input.size, and
>executor.output.size, but it didn't work out.
>
>
>Sean
