Re: Surprisingly high network traffic between kafka servers

2014-02-19 Thread Dong Zhong
Ok, I filed a jira for that. https://issues.apache.org/jira/browse/KAFKA-1273 -- Original -- From: "Jun Rao"; Date: Friday, February 14, 2014, 11:54 PM To: "users@kafka.apache.org"; Subject: Re: Surprisingly high network traffic between kafka servers

Re: Surprisingly high network traffic between kafka servers

2014-02-14 Thread Carl Lerche
Hey, thanks so much for pointing this out. I think that this is likely what is happening for us. I will attempt this fix. Cheers, Carl On Thu, Feb 13, 2014 at 8:01 PM, zhong dong wrote: > We encountered with this problem, too. > > And our problem is that we set the message.max.bytes larger than

Re: Surprisingly high network traffic between kafka servers

2014-02-14 Thread Jay Kreps
Yeah that is a bug. We should be giving an error here rather than retrying. -Jay On Fri, Feb 14, 2014 at 7:54 AM, Jun Rao wrote: > Hi, Zhong, > > Thanks for sharing this. We probably should add a sanity check in the > broker to make sure that replica.fetch.max.bytes >= message.max.bytes. > Cou

Re: Surprisingly high network traffic between kafka servers

2014-02-14 Thread Jun Rao
Hi, Zhong, Thanks for sharing this. We probably should add a sanity check in the broker to make sure that replica.fetch.max.bytes >= message.max.bytes. Could you file a jira for that? Jun On Thu, Feb 13, 2014 at 8:01 PM, zhong dong wrote: > We encountered with this problem, too. > > And our p

Surprisingly high network traffic between kafka servers

2014-02-13 Thread zhong dong
We encountered this problem, too. Our problem was that we had set message.max.bytes larger than replica.fetch.max.bytes. After we changed replica.fetch.max.bytes to a larger number, the problem was solved.
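
The sanity check suggested elsewhere in this thread (replica.fetch.max.bytes >= message.max.bytes) can also be run by hand against a broker config before starting it. The following is only a rough sketch in Python, not anything shipped with Kafka: the function names are made up for illustration, and the fallback values are what I believe the 0.8 defaults to be, so please verify them against the documentation for your version.

```python
# Hypothetical helper, not part of Kafka. Fallback values are assumed
# 0.8 broker defaults; check them against your version's documentation.
ASSUMED_DEFAULTS = {
    "message.max.bytes": 1000000,
    "replica.fetch.max.bytes": 1024 * 1024,
}


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def check_fetch_size(props):
    """Return (ok, message_max, fetch_max).

    ok is True when a replica fetch is large enough to hold the largest
    message the broker is configured to accept.
    """
    message_max = int(props.get("message.max.bytes",
                                ASSUMED_DEFAULTS["message.max.bytes"]))
    fetch_max = int(props.get("replica.fetch.max.bytes",
                              ASSUMED_DEFAULTS["replica.fetch.max.bytes"]))
    return fetch_max >= message_max, message_max, fetch_max


if __name__ == "__main__":
    # Example config resembling the misconfiguration described above.
    sample = "message.max.bytes=5000000\nreplica.fetch.max.bytes=1048576\n"
    ok, message_max, fetch_max = check_fetch_size(parse_properties(sample))
    if not ok:
        print("replica.fetch.max.bytes (%d) is smaller than "
              "message.max.bytes (%d)" % (fetch_max, message_max))
```

To check a real broker, read its server.properties file into a string and pass that to parse_properties instead of the sample text.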

Re: Surprisingly high network traffic between kafka servers

2014-02-07 Thread Carl Lerche
Hey Joe, Those periods with "no traffic" actually are periods of expected traffic between nodes. It's just that the off period is so high that the normal traffic is not visible. Also, once traffic goes crazy, the only way to reset it is to stop all kafka nodes (vs do a rolling restart). I have be

Re: Surprisingly high network traffic between kafka servers

2014-02-07 Thread Joe Stein
Carl, looking at the boundary chart it looks like you have periods of no traffic also... prior to the spikes. I also noticed you are using AWS from your logs, what instance types are you using? Do you have any network checks in place? The logs show underReplication=true which leads towards what

Re: Surprisingly high network traffic between kafka servers

2014-02-06 Thread Carl Lerche
One last thing, I have collected a snippet of the network traffic between Kafka instances using tcpdump. However, it contains some customer data and less than a minute's worth was over 1 GB, so I can't really post it here, but I could possibly share offline if it can help debug the issue. On Thu, F

Re: Surprisingly high network traffic between kafka servers

2014-02-06 Thread Carl Lerche
Re: > Could you also check if the on-disk data size/rate match the network > traffic? While I have not explicitly checked this, I would say that the answer is no. The network is over 1Gbps and I have set up monitoring for disk space and nothing out of the norm is happening there. The expected data

Re: Surprisingly high network traffic between kafka servers

2014-02-06 Thread Carl Lerche
Ok, sorry for the lack of concrete information to help debug this issue. I am not really an ops guy, so I am trying to keep up. First, I added boundary to our servers. Normal Kafka behavior should be resulting in 500 kbps or less on our cluster. Here you can see that it's peaking at over 1 Gbps:

Re: Surprisingly high network traffic between kafka servers

2014-02-06 Thread Jun Rao
Could you also check if the on-disk data size/rate match the network traffic? Thanks, Jun On Thu, Feb 6, 2014 at 7:48 PM, Carl Lerche wrote: > So, the "good news" is that the problem came back again. The bad news > is that I disabled debug logs as it was filling disk (and I had other > fires

Re: Surprisingly high network traffic between kafka servers

2014-02-06 Thread Neha Narkhede
So, if you start from scratch (new environment and download of the Kafka release), could you post the list of steps to reproduce this issue? On Thu, Feb 6, 2014 at 7:48 PM, Carl Lerche wrote: > So, the "good news" is that the problem came back again. The bad news > is that I disabled debug logs

Re: Surprisingly high network traffic between kafka servers

2014-02-06 Thread Carl Lerche
So, the "good news" is that the problem came back again. The bad news is that I disabled debug logs as it was filling disk (and I had other fires to put out). I will re-enable debug logs and wait for it to happen again. On Thu, Feb 6, 2014 at 4:05 AM, Neha Narkhede wrote: > Carl, > > It will help

Re: Surprisingly high network traffic between kafka servers

2014-02-06 Thread Neha Narkhede
Carl, It will help if you can list the steps to reproduce this issue starting from a fresh installation. Your setup, the way it stands, seems to have gone through some config and state changes. Thanks, Neha On Wed, Feb 5, 2014 at 5:17 PM, Joel Koshy wrote: > On Wed, Feb 05, 2014 at 04:51:16PM

Re: Surprisingly high network traffic between kafka servers

2014-02-05 Thread Joel Koshy
On Wed, Feb 05, 2014 at 04:51:16PM -0800, Carl Lerche wrote: > So, I tried enabling debug logging, I also made some tweaks to the > config (which I probably shouldn't have) and craziness happened. > > First, some more context. Besides the very high network traffic, we > were seeing some other issu

Re: Surprisingly high network traffic between kafka servers

2014-02-05 Thread Carl Lerche
So, I tried enabling debug logging, I also made some tweaks to the config (which I probably shouldn't have) and craziness happened. First, some more context. Besides the very high network traffic, we were seeing some other issues that we were not focusing on yet. * Even though the log retention w

Re: Surprisingly high network traffic between kafka servers

2014-02-05 Thread Jay Kreps
Can you enable DEBUG logging in log4j and see what requests are coming in? -Jay On Tue, Feb 4, 2014 at 9:51 PM, Carl Lerche wrote: > Hi Jay, > > I do not believe that I have changed the replica.fetch.wait.max.ms > setting. Here I have included the kafka config as well as a snapshot > of jnetto

Re: Surprisingly high network traffic between kafka servers

2014-02-05 Thread Carl Lerche
I'm not really an ops person either. I was using jnettop for this. On Wednesday, February 5, 2014, S Ahmed wrote: > Sorry I'm not a ops person, but what tools do you use to monitor traffic > between servers? > > > On Tue, Feb 4, 2014 at 11:46 PM, Carl Lerche > > > wrote: > > > Hello, > > > > I'

Re: Surprisingly high network traffic between kafka servers

2014-02-05 Thread S Ahmed
Sorry I'm not an ops person, but what tools do you use to monitor traffic between servers? On Tue, Feb 4, 2014 at 11:46 PM, Carl Lerche wrote: > Hello, > > I'm running a 0.8.0 Kafka cluster of 3 servers. The service that it is > for is not in full production yet, so the data written to cluster i

Re: Surprisingly high network traffic between kafka servers

2014-02-04 Thread Carl Lerche
Hi Jay, I do not believe that I have changed the replica.fetch.wait.max.ms setting. Here I have included the kafka config as well as a snapshot of jnettop from one of the servers. https://gist.github.com/carllerche/4f2cf0f0f6d1e891f482 The bottom row (89.9K/s) is the producer (it lives on a Kafk

Re: Surprisingly high network traffic between kafka servers

2014-02-04 Thread Jay Kreps
No this is not normal. Checking twice a second (using 500ms default) for new data shouldn't cause high network traffic (that should be like < 1KB of overhead). I don't think that explains things. Is it possible that setting has been overridden? -Jay On Tue, Feb 4, 2014 at 9:25 PM, Guozhang Wang

Re: Surprisingly high network traffic between kafka servers

2014-02-04 Thread Guozhang Wang
Hi Carl, For each partition the follower will also fetch data from the leader replica, even if there is no new data in the leader replicas. One thing you can try is to increase replica.fetch.wait.max.ms (default value 500ms) so that the followers' fetching request frequency to the leader can be red

Surprisingly high network traffic between kafka servers

2014-02-04 Thread Carl Lerche
Hello, I'm running a 0.8.0 Kafka cluster of 3 servers. The service that it is for is not in full production yet, so the data written to the cluster is minimal (seems to average between 100kb/s -> 300kb/s per server). I have configured Kafka to have 3 replicas. I am noticing that each Kafka server is