That makes sense. If the size of each fetch is small then compression won't
do much, and that could very well explain the increase in bandwidth.
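
Back of the envelope, that hypothesis fits our numbers: each broker takes in
roughly 25 Mbit of Snappy-compressed producer traffic for the partitions it
leads and, with 3x replication, has to serve that data to two followers. If
our compression ratio really is around 50% and the replication fetches
effectively go back out uncompressed (small fetches recompress poorly), that
is about 25 / 0.5 = 50 Mbit per follower, or ~100 Mbit in total, which is
what we see in iftop instead of the ~50 Mbit we expected.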

We will try to change these settings and see what happens.
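
Concretely, we're planning to try something like this in each broker's
server.properties (the values below are just a first guess on our end, not
tested recommendations):

  # Ask followers to wait for bigger batches instead of doing many tiny
  # fetches. The defaults are 1 and 500 respectively; these values are
  # just a starting point for us to experiment with.
  replica.fetch.min.bytes=65536
  replica.fetch.wait.max.ms=1000
  # Unchanged, this is what we already run today.
  num.replica.fetchers=4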

Thanks a lot for your help.

T#


On Tue, Sep 2, 2014 at 10:44 PM, Guozhang Wang <wangg...@gmail.com> wrote:

> Hi Theo,
>
> You can try to set replica.fetch.min.bytes to some large number (default to
> 1) and increase replica.fetch.wait.max.ms (default to 500) and see if that
> helps. In general, with 4 fetchers and min.bytes to 1 the replicas would
> effectively exchange many small packets over the wire.
>
> Guozhang
>
>
> On Mon, Sep 1, 2014 at 11:06 PM, Theo Hultberg <t...@iconara.net> wrote:
>
> > Hi Guozhang,
> >
> > We're using the default on all of those, except num.replica.fetchers
> > which is set to 4.
> >
> > T#
> >
> >
> > On Mon, Sep 1, 2014 at 9:41 PM, Guozhang Wang <wangg...@gmail.com>
> > wrote:
> >
> > > Hello Theo,
> > >
> > > What are the values for your "replica.fetch.max.bytes",
> > > "replica.fetch.min.bytes", "replica.fetch.wait.max.ms" and
> > > "num.replica.fetchers" configs?
> > >
> > > Guozhang
> > >
> > >
> > > On Mon, Sep 1, 2014 at 2:52 AM, Theo Hultberg <t...@iconara.net>
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > > We're evaluating Kafka, and have a problem with it using more
> > > > bandwidth than we can explain. From what we can tell the replication
> > > > uses at least twice the bandwidth it should.
> > > >
> > > > We have four producer nodes and three broker nodes. We have enabled
> > > > 3x replication, so each node will get a copy of all data in this
> > > > setup. The producers have Snappy compression enabled and send batches
> > > > of 200 messages. The messages are around 1 KiB each. The cluster runs
> > > > using mostly default configuration, and the Kafka version is 0.8.1.1.
> > > > When we run iftop on the broker nodes we see that each Kafka node
> > > > receives around 6-7 Mbit from each producer node (or around 25-30
> > > > Mbit in total), but then sends around 50 Mbit to each other Kafka
> > > > node (or 100 Mbit in total). This is twice what we expected to see,
> > > > and it seems to saturate the bandwidth on our m1.xlarge machines. In
> > > > other words, we expected the incoming 25 Mbit to be amplified to 50
> > > > Mbit, not 100.
> > > >
> > > > One thing that could explain it, and that we don't really know how
> > > > to verify, is that the inter-node communication is not compressed.
> > > > We aren't sure about what compression ratio we get on the incoming
> > > > data, but 50% sounds reasonable. Could this explain what we're
> > > > seeing? Is there a configuration property to enable compression on
> > > > the replication traffic that we've missed?
> > > >
> > > > yours
> > > > Theo
> > > >
> > >
> > >
> > >
> > > --
> > > -- Guozhang
> > >
> >
>
>
>
> --
> -- Guozhang
>
