Thanks Neha. I will try num.retries again with this version and post my feedback here.
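For reference, this is roughly the standalone test producer I plan to run the num.retries test with. It is just a minimal sketch: the broker id/host in broker.list, the topic name, and the message are placeholders for our real setup, and the retry count is arbitrary.

import java.util.Properties;

import kafka.javaapi.producer.Producer;
import kafka.javaapi.producer.ProducerData;
import kafka.producer.ProducerConfig;

public class NumRetriesTestProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Static broker list; the id:host:port value below is a placeholder.
        props.put("broker.list", "0:dualstack.kafka-xyz.us-east-1.elb.amazonaws.com:9092");
        props.put("serializer.class", "kafka.serializer.StringEncoder");
        // Same async setup as our webapp.
        props.put("producer.type", "async");
        props.put("batch.size", "100");
        // The new 0.7.1 setting under test: number of resend attempts (defaults to 0).
        props.put("num.retries", "3");

        Producer<String, String> producer =
                new Producer<String, String>(new ProducerConfig(props));
        try {
            // Hypothetical topic name, just for the test.
            producer.send(new ProducerData<String, String>("test-impressions", "test message"));
        } finally {
            producer.close();
        }
    }
}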
Regards,
Vaibhav

On Wed, Jun 27, 2012 at 3:13 PM, Neha Narkhede <neha.narkh...@gmail.com> wrote:
> You can download it from here -
>
> https://www.apache.org/dyn/closer.cgi/incubator/kafka/kafka-0.7.1-incubating/
>
> Thanks,
> Neha
>
> On Wed, Jun 27, 2012 at 3:03 PM, Vaibhav Puranik <vpura...@gmail.com> wrote:
> > Thanks Jun. How do I download 0.7.1?
> >
> > I checked the SVN tags, but the last tag seems to be
> > kafka-0.7.1-incubating-candidate-3/
> > <http://svn.apache.org/repos/asf/incubator/kafka/tags/kafka-0.7.1-incubating-candidate-3/>
> >
> > Regards,
> > Vaibhav
> >
> > On Wed, Jun 27, 2012 at 2:56 PM, Jun Rao <jun...@gmail.com> wrote:
> >
> >> num.retries was added in 0.7.1, which is just out.
> >>
> >> Thanks,
> >>
> >> Jun
> >>
> >> On Wed, Jun 27, 2012 at 2:43 PM, Vaibhav Puranik <vpura...@gmail.com> wrote:
> >>
> >> > Jun,
> >> >
> >> > I wrote a test producer to check whether num.retries is working, but I found
> >> > that it is not. No matter how many retries I set, whenever a message send
> >> > fails, it never gets to the broker.
> >> > I am using Kafka 0.7.0.
> >> >
> >> > Is this a known problem? Do I need to file a JIRA issue?
> >> >
> >> > Because we are using the async producer, we have no way to catch the
> >> > exception ourselves and act on it. Is that right? Any ideas how we can
> >> > ensure that every single message is sent with retries?
> >> >
> >> > Regards,
> >> > Vaibhav
> >> >
> >> > On Tue, Jun 26, 2012 at 4:52 PM, Jun Rao <jun...@gmail.com> wrote:
> >> >
> >> > > Set num.retries in the producer config property file. It defaults to 0.
> >> > >
> >> > > Thanks,
> >> > >
> >> > > Jun
> >> > >
> >> > > On Tue, Jun 26, 2012 at 4:46 PM, Vaibhav Puranik <vpura...@gmail.com> wrote:
> >> > >
> >> > > > I reduced the batch size and reduced the pooled connections. The number
> >> > > > of errors has gone down significantly, but they are not eliminated yet.
> >> > > >
> >> > > > We definitely don't want to lose any events.
> >> > > >
> >> > > > Jun, how do I configure the client resend you mentioned below? I
> >> > > > couldn't find any configuration.
> >> > > >
> >> > > > Regards,
> >> > > > Vaibhav
> >> > > >
> >> > > > On Tue, Jun 26, 2012 at 9:27 AM, Vaibhav Puranik <vpura...@gmail.com> wrote:
> >> > > >
> >> > > > > These are great pointers.
> >> > > > > I found some more discussion here:
> >> > > > > https://forums.aws.amazon.com/thread.jspa?threadID=33427
> >> > > > >
> >> > > > > I can do the following to keep using the elastic load balancer:
> >> > > > >
> >> > > > > 1) Reduce the producer pool size to 1 or 2, because it looks like the
> >> > > > > connections are sitting idle. Our volume does not require that big a pool.
> >> > > > > 2) Reduce the batch size so that the webapp frequently flushes the data
> >> > > > > to the brokers. It's better for us anyway.
> >> > > > >
> >> > > > > I will try both of these options and report back.
> >> > > > >
> >> > > > > Thank you very much, Jun and Niek.
> >> > > > >
> >> > > > > Regards,
> >> > > > > Vaibhav
> >> > > > >
> >> > > > > On Tue, Jun 26, 2012 at 8:52 AM, Niek Sanders <niek.sand...@gmail.com> wrote:
> >> > > > >
> >> > > > >> ELBs will close connections that have no data going across them over
> >> > > > >> a 60 sec period. A reference to this behavior can be found at the
> >> > > > >> bottom of this page:
> >> > > > >>
> >> > > > >> http://aws.amazon.com/articles/1636185810492479
> >> > > > >>
> >> > > > >> There is currently no way for customers to increase this timeout. If
> >> > > > >> this timeout is in fact the problem, then the alternative is to use
> >> > > > >> HAProxy for load balancing instead.
> >> > > > >>
> >> > > > >> - Niek
> >> > > > >>
> >> > > > >> On Tue, Jun 26, 2012 at 7:55 AM, Jun Rao <jun...@gmail.com> wrote:
> >> > > > >> > Vaibhav,
> >> > > > >> >
> >> > > > >> > Does the elastic load balancer have any timeouts or quotas that kill
> >> > > > >> > existing socket connections? Does the client resend succeed (you can
> >> > > > >> > configure resend in DefaultEventHandler)?
> >> > > > >> >
> >> > > > >> > Thanks,
> >> > > > >> >
> >> > > > >> > Jun
> >> > > > >> >
> >> > > > >> > On Mon, Jun 25, 2012 at 6:01 PM, Vaibhav Puranik <vpura...@gmail.com> wrote:
> >> > > > >> >
> >> > > > >> >> Hi all,
> >> > > > >> >>
> >> > > > >> >> We are sending our ad impressions to Kafka 0.7.0. I am using async
> >> > > > >> >> producers in our web app.
> >> > > > >> >> I am pooling Kafka producers with commons-pool. Pool size - 10.
> >> > > > >> >> batch.size is 100.
> >> > > > >> >>
> >> > > > >> >> We have 3 c1.xlarge instances with Kafka brokers installed behind an
> >> > > > >> >> elastic load balancer in AWS.
> >> > > > >> >> Every minute we lose some events because of the following exception:
> >> > > > >> >>
> >> > > > >> >> - Disconnecting from dualstack.kafka-xyz.us-east-1.elb.amazonaws.com:9092
> >> > > > >> >> - Error in handling batch of 64 events
> >> > > > >> >> java.io.IOException: Connection timed out
> >> > > > >> >>   at sun.nio.ch.FileDispatcher.write0(Native Method)
> >> > > > >> >>   at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:29)
> >> > > > >> >>   at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:104)
> >> > > > >> >>   at sun.nio.ch.IOUtil.write(IOUtil.java:75)
> >> > > > >> >>   at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:334)
> >> > > > >> >>   at kafka.network.BoundedByteBufferSend.writeTo(BoundedByteBufferSend.scala:51)
> >> > > > >> >>   at kafka.network.Send$class.writeCompletely(Transmission.scala:76)
> >> > > > >> >>   at kafka.network.BoundedByteBufferSend.writeCompletely(BoundedByteBufferSend.scala:25)
> >> > > > >> >>   at kafka.producer.SyncProducer.liftedTree1$1(SyncProducer.scala:88)
> >> > > > >> >>   at kafka.producer.SyncProducer.send(SyncProducer.scala:87)
> >> > > > >> >>   at kafka.producer.SyncProducer.multiSend(SyncProducer.scala:128)
> >> > > > >> >>   at kafka.producer.async.DefaultEventHandler.send(DefaultEventHandler.scala:52)
> >> > > > >> >>   at kafka.producer.async.DefaultEventHandler.handle(DefaultEventHandler.scala:46)
> >> > > > >> >>   at kafka.producer.async.ProducerSendThread.tryToHandle(ProducerSendThread.scala:119)
> >> > > > >> >>   at kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:98)
> >> > > > >> >>   at kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:74)
> >> > > > >> >>   at scala.collection.immutable.Stream.foreach(Stream.scala:254)
> >> > > > >> >>   at kafka.producer.async.ProducerSendThread.processEvents(ProducerSendThread.scala:73)
> >> > > > >> >>   at kafka.producer.async.ProducerSendThread.run(ProducerSendThread.scala:43)
> >> > > > >> >> - Connected to dualstack.kafka-xyz.us-east-1.elb.amazonaws.com:9092 for producing
> >> > > > >> >>
> >> > > > >> >> Has anybody faced this kind of timeout before? Does it indicate any
> >> > > > >> >> resource misconfiguration? The CPU usage on the brokers is pretty low.
> >> > > > >> >> Also, in spite of setting the batch size to 100, the failing batch
> >> > > > >> >> usually only has 50 to 60 events. Is there any other limit I am hitting?
> >> > > > >> >>
> >> > > > >> >> Any help is appreciated.
> >> > > > >> >>
> >> > > > >> >> Regards,
> >> > > > >> >> Vaibhav
> >> > > > >> >> GumGum
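P.S. If num.retries still doesn't cover us on the webapp side, the fallback sketch I have in mind is to switch those sends to the sync producer and retry around it in our own code, roughly like this (illustrative only: the class and topic names are made up, the retry/backoff numbers are arbitrary, and whatever still fails after the last attempt would need to be logged or spooled somewhere):

import java.util.Properties;

import kafka.javaapi.producer.Producer;
import kafka.javaapi.producer.ProducerData;
import kafka.producer.ProducerConfig;

public class RetryingSyncSender {
    private final Producer<String, String> producer;
    private final int maxAttempts;

    public RetryingSyncSender(Properties props, int maxAttempts) {
        // Leave producer.type at its default (sync) so send() fails in the
        // calling thread and we get to see the exception ourselves.
        this.producer = new Producer<String, String>(new ProducerConfig(props));
        this.maxAttempts = maxAttempts;
    }

    // Returns true if the message made it out within maxAttempts tries.
    public boolean sendWithRetries(String topic, String message) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                producer.send(new ProducerData<String, String>(topic, message));
                return true;
            } catch (Exception e) {
                // Connection timed out / reset by the ELB, etc.
                if (attempt == maxAttempts) {
                    return false; // caller logs or spools the event
                }
                try {
                    Thread.sleep(100L * attempt); // crude backoff
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
        }
        return false;
    }

    public void close() {
        producer.close();
    }
}

The obvious downside is that the send then happens on the request thread, which is exactly what the async producer was buying us, so I would much rather have num.retries just work.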