Vaibhav,

>> No matter how many retries I set, whenever a message send fails, it
>> never reaches the broker.

Can you please send the error message that you see on the producer side?

Thanks,
Neha

On Wed, Jun 27, 2012 at 2:43 PM, Vaibhav Puranik <vpura...@gmail.com> wrote:
> Jun,
>
> I wrote a test producer to check whether num.retries is working, and I found
> that it isn't. No matter how many retries I set, whenever a message send
> fails, it never reaches the broker.
> I am using Kafka 0.7.0.
>
> Is this a known problem? Do I need to file a JIRA issue?
>
> Because we are using the async producer, we have no way to catch the
> exception ourselves and act on it. Is that right? Any ideas on how we can
> ensure that every single message is sent, with retries?
>
> Regards,
> Vaibhav
>
> On Tue, Jun 26, 2012 at 4:52 PM, Jun Rao <jun...@gmail.com> wrote:
>
>> Set num.retries in the producer config property file. It defaults to 0.
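>>
>> For example, something along these lines in the producer properties should
>> turn on resends (the broker list below is just a placeholder for your setup):
>>
>>   serializer.class=kafka.serializer.StringEncoder
>>   producer.type=async
>>   broker.list=0:broker1:9092,1:broker2:9092,2:broker3:9092
>>   batch.size=100
>>   num.retries=3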
>>
>> Thanks,
>>
>> Jun
>>
>> On Tue, Jun 26, 2012 at 4:46 PM, Vaibhav Puranik <vpura...@gmail.com>
>> wrote:
>>
>> > I reduced the batch size and the number of pooled connections. The number
>> > of errors has gone down significantly, but they are not eliminated yet.
>> >
>> > We definitely don't want to lose any events.
>> >
>> > Jun, how do I configure the client resend you mentioned below? I couldn't
>> > find any configuration.
>> >
>> > Regards,
>> > Vaibhav
>> >
>> > On Tue, Jun 26, 2012 at 9:27 AM, Vaibhav Puranik <vpura...@gmail.com>
>> > wrote:
>> >
>> > > These are great pointers.
>> > > I found some more discussion here:
>> > > https://forums.aws.amazon.com/thread.jspa?threadID=33427
>> > >
>> > > I can do the following to keep using the elastic load balancer:
>> > >
>> > > 1) Reduce the producer pool size to 1 or 2, because it looks like
>> > > connections are sitting idle. My volume does not require that big a pool.
>> > > 2) Reduce the batch size so that the webapp flushes data to the brokers
>> > > more frequently. That is better for us anyway.
>> > >
>> > > I will try both of these options and report back.
>> > >
>> > > Thank you very much Jun and Niek.
>> > >
>> > > Regards,
>> > > Vaibhav
>> > >
>> > >
>> > > On Tue, Jun 26, 2012 at 8:52 AM, Niek Sanders <niek.sand...@gmail.com> wrote:
>> > >
>> > >> ELBs will close connections that have had no data going across them for
>> > >> 60 seconds. A reference to this behavior can be found at the bottom of
>> > >> this page:
>> > >>
>> > >> http://aws.amazon.com/articles/1636185810492479
>> > >>
>> > >> There is currently no way for customers to increase this timeout. If
>> > >> this timeout is in fact the problem, the alternative is to use HAProxy
>> > >> for load balancing instead.
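>> > >>
>> > >> For example, a rough haproxy.cfg along these lines (broker addresses and
>> > >> timeout values are placeholders) keeps idle producer connections open far
>> > >> longer than the ELB would:
>> > >>
>> > >>   listen kafka
>> > >>       bind *:9092
>> > >>       mode tcp
>> > >>       balance roundrobin
>> > >>       timeout client 1h
>> > >>       timeout server 1h
>> > >>       server broker1 10.0.0.1:9092 check
>> > >>       server broker2 10.0.0.2:9092 check
>> > >>       server broker3 10.0.0.3:9092 check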
>> > >>
>> > >> - Niek
>> > >>
>> > >>
>> > >>
>> > >>
>> > >> On Tue, Jun 26, 2012 at 7:55 AM, Jun Rao <jun...@gmail.com> wrote:
>> > >> > Vaibhav,
>> > >> >
>> > >> > Does the elastic load balancer have any timeouts or quotas that kill
>> > >> > existing socket connections? Does the client resend succeed (you can
>> > >> > configure resends in DefaultEventHandler)?
>> > >> >
>> > >> > Thanks,
>> > >> >
>> > >> > Jun
>> > >> >
>> > >> > On Mon, Jun 25, 2012 at 6:01 PM, Vaibhav Puranik <vpura...@gmail.com> wrote:
>> > >> >
>> > >> >> Hi all,
>> > >> >>
>> > >> >> We are sending our ad impressions to Kafka 0.7.0. I am using async
>> > >> >> producers in our web app.
>> > >> >> I am pooling Kafka producers with commons pool. The pool size is 10 and
>> > >> >> batch.size is 100.
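>> > >> >>
>> > >> >> For reference, the pooling setup is roughly the sketch below (topic name,
>> > >> >> broker address, and class name are made up; it assumes the Kafka 0.7
>> > >> >> javaapi producer plus commons-pool 1.x):
>> > >> >>
>> > >> >> import java.util.Properties;
>> > >> >>
>> > >> >> import org.apache.commons.pool.BasePoolableObjectFactory;
>> > >> >> import org.apache.commons.pool.impl.GenericObjectPool;
>> > >> >>
>> > >> >> import kafka.javaapi.producer.Producer;
>> > >> >> import kafka.javaapi.producer.ProducerData;
>> > >> >> import kafka.producer.ProducerConfig;
>> > >> >>
>> > >> >> public class PooledImpressionSender {
>> > >> >>
>> > >> >>     private final GenericObjectPool pool;
>> > >> >>
>> > >> >>     public PooledImpressionSender() {
>> > >> >>         final Properties props = new Properties();
>> > >> >>         props.put("serializer.class", "kafka.serializer.StringEncoder");
>> > >> >>         props.put("producer.type", "async");          // async producer
>> > >> >>         props.put("batch.size", "100");               // flush every 100 events
>> > >> >>         props.put("broker.list", "0:kafka-elb:9092"); // placeholder ELB address
>> > >> >>
>> > >> >>         // Each pooled object is its own async producer instance.
>> > >> >>         BasePoolableObjectFactory factory = new BasePoolableObjectFactory() {
>> > >> >>             @Override
>> > >> >>             public Object makeObject() {
>> > >> >>                 return new Producer<String, String>(new ProducerConfig(props));
>> > >> >>             }
>> > >> >>         };
>> > >> >>         pool = new GenericObjectPool(factory);
>> > >> >>         pool.setMaxActive(10); // pool size 10, as described above
>> > >> >>     }
>> > >> >>
>> > >> >>     public void send(String impressionJson) throws Exception {
>> > >> >>         // Borrow a producer from the pool, send one event, and return it.
>> > >> >>         Producer<String, String> producer =
>> > >> >>             (Producer<String, String>) pool.borrowObject();
>> > >> >>         try {
>> > >> >>             producer.send(new ProducerData<String, String>("ad_impressions", impressionJson));
>> > >> >>         } finally {
>> > >> >>             pool.returnObject(producer);
>> > >> >>         }
>> > >> >>     }
>> > >> >> }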
>> > >> >>
>> > >> >> We have 3 c1.xlarge instances with Kafka brokers installed behind an
>> > >> >> elastic load balancer in AWS.
>> > >> >> Every minute we lose some events because of the following exception:
>> > >> >>
>> > >> >> - Disconnecting from dualstack.kafka-xyz.us-east-1.elb.amazonaws.com:9092
>> > >> >> - Error in handling batch of 64 events
>> > >> >> java.io.IOException: Connection timed out
>> > >> >>    at sun.nio.ch.FileDispatcher.write0(Native Method)
>> > >> >>    at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:29)
>> > >> >>    at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:104)
>> > >> >>    at sun.nio.ch.IOUtil.write(IOUtil.java:75)
>> > >> >>    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:334)
>> > >> >>    at kafka.network.BoundedByteBufferSend.writeTo(BoundedByteBufferSend.scala:51)
>> > >> >>    at kafka.network.Send$class.writeCompletely(Transmission.scala:76)
>> > >> >>    at kafka.network.BoundedByteBufferSend.writeCompletely(BoundedByteBufferSend.scala:25)
>> > >> >>    at kafka.producer.SyncProducer.liftedTree1$1(SyncProducer.scala:88)
>> > >> >>    at kafka.producer.SyncProducer.send(SyncProducer.scala:87)
>> > >> >>    at kafka.producer.SyncProducer.multiSend(SyncProducer.scala:128)
>> > >> >>    at kafka.producer.async.DefaultEventHandler.send(DefaultEventHandler.scala:52)
>> > >> >>    at kafka.producer.async.DefaultEventHandler.handle(DefaultEventHandler.scala:46)
>> > >> >>    at kafka.producer.async.ProducerSendThread.tryToHandle(ProducerSendThread.scala:119)
>> > >> >>    at kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:98)
>> > >> >>    at kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:74)
>> > >> >>    at scala.collection.immutable.Stream.foreach(Stream.scala:254)
>> > >> >>    at kafka.producer.async.ProducerSendThread.processEvents(ProducerSendThread.scala:73)
>> > >> >>    at kafka.producer.async.ProducerSendThread.run(ProducerSendThread.scala:43)
>> > >> >> - Connected to dualstack.kafka-xyz.us-east-1.elb.amazonaws.com:9092 for producing
>> > >> >>
>> > >> >> Has anybody faced this kind of timeout before? Does it indicate any
>> > >> >> resource misconfiguration? The CPU usage on the brokers is pretty low.
>> > >> >> Also, in spite of setting the batch size to 100, the failing batches
>> > >> >> usually only have 50 to 60 events. Is there any other limit I am hitting?
>> > >> >>
>> > >> >> Any help is appreciated.
>> > >> >>
>> > >> >>
>> > >> >> Regards,
>> > >> >> Vaibhav
>> > >> >> GumGum
>> > >> >>
>> > >>
>> > >
>> > >
>> >
>>
