I reduced the batch size and the number of pooled connections. The number of errors has gone down significantly, but they are not eliminated yet.
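In case it helps, here is roughly how the pool is set up now. This is a simplified sketch, not our exact code: the ZooKeeper address is a placeholder, and the two reduced sizes are the values I'm currently experimenting with.

    // Simplified sketch of our producer pool (commons-pool 1.x, Kafka 0.7 javaapi).
    import java.util.Properties;
    import org.apache.commons.pool.BasePoolableObjectFactory;
    import org.apache.commons.pool.impl.GenericObjectPool;
    import kafka.javaapi.producer.Producer;
    import kafka.producer.ProducerConfig;

    public class KafkaProducerPool {
        public static GenericObjectPool create() {
            final Properties props = new Properties();
            props.put("zk.connect", "zk1:2181");  // placeholder address
            props.put("serializer.class", "kafka.serializer.StringEncoder");
            props.put("producer.type", "async");
            props.put("batch.size", "50");        // reduced from 100; exact value still an experiment
            GenericObjectPool pool = new GenericObjectPool(new BasePoolableObjectFactory() {
                @Override
                public Object makeObject() {
                    return new Producer<String, String>(new ProducerConfig(props));
                }
                @Override
                public void destroyObject(Object producer) {
                    ((Producer<String, String>) producer).close();
                }
            });
            // Reduced from 10 so connections don't sit idle long enough for the ELB to kill them.
            pool.setMaxActive(2);
            return pool;
        }
    }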
We definitely don't want to lose any events.

Jun, how do I configure the client resend you mentioned below? I couldn't find any configuration. (The stopgap retry I'm considering in the meantime is sketched below the quoted thread.)

Regards,
Vaibhav

On Tue, Jun 26, 2012 at 9:27 AM, Vaibhav Puranik <vpura...@gmail.com> wrote:
> These are great pointers.
> I found some more discussion here:
> https://forums.aws.amazon.com/thread.jspa?threadID=33427
>
> I can do the following to keep using the elastic load balancer:
>
> 1) Reduce the producer pool size to 1 or 2, because it looks like connections
> are sitting idle. My volume does not require that big a pool.
> 2) Reduce the batch size so that the webapp dumps data to the brokers more
> frequently. It's better for us anyway.
>
> I will try both of these options and report back.
>
> Thank you very much Jun and Niek.
>
> Regards,
> Vaibhav
>
>
> On Tue, Jun 26, 2012 at 8:52 AM, Niek Sanders <niek.sand...@gmail.com> wrote:
>
>> ELBs will close connections that have no data going across them over a
>> 60 sec period. A reference to this behavior can be found at the
>> bottom of this page:
>>
>> http://aws.amazon.com/articles/1636185810492479
>>
>> There is currently no way for customers to increase this timeout. If
>> this timeout is in fact the problem, then the alternative is to use
>> HAProxy for load balancing instead.
>>
>> - Niek
>>
>>
>> On Tue, Jun 26, 2012 at 7:55 AM, Jun Rao <jun...@gmail.com> wrote:
>> > Vaibhav,
>> >
>> > Does the elastic load balancer have any timeouts or quotas that kill
>> > existing socket connections? Does the client resend succeed (you can
>> > configure resend in DefaultEventHandler)?
>> >
>> > Thanks,
>> >
>> > Jun
>> >
>> > On Mon, Jun 25, 2012 at 6:01 PM, Vaibhav Puranik <vpura...@gmail.com> wrote:
>> >
>> >> Hi all,
>> >>
>> >> We are sending our ad impressions to Kafka 0.7.0. I am using async
>> >> producers in our web app.
>> >> I am pooling Kafka producers with commons-pool. Pool size: 10. batch.size
>> >> is 100.
>> >>
>> >> We have 3 c1.xlarge instances with Kafka brokers installed behind an
>> >> elastic load balancer in AWS.
>> >> Every minute we lose some events because of the following exception:
>> >>
>> >> - Disconnecting from dualstack.kafka-xyz.us-east-1.elb.amazonaws.com:9092
>> >> - Error in handling batch of 64 events
>> >> java.io.IOException: Connection timed out
>> >>   at sun.nio.ch.FileDispatcher.write0(Native Method)
>> >>   at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:29)
>> >>   at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:104)
>> >>   at sun.nio.ch.IOUtil.write(IOUtil.java:75)
>> >>   at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:334)
>> >>   at kafka.network.BoundedByteBufferSend.writeTo(BoundedByteBufferSend.scala:51)
>> >>   at kafka.network.Send$class.writeCompletely(Transmission.scala:76)
>> >>   at kafka.network.BoundedByteBufferSend.writeCompletely(BoundedByteBufferSend.scala:25)
>> >>   at kafka.producer.SyncProducer.liftedTree1$1(SyncProducer.scala:88)
>> >>   at kafka.producer.SyncProducer.send(SyncProducer.scala:87)
>> >>   at kafka.producer.SyncProducer.multiSend(SyncProducer.scala:128)
>> >>   at kafka.producer.async.DefaultEventHandler.send(DefaultEventHandler.scala:52)
>> >>   at kafka.producer.async.DefaultEventHandler.handle(DefaultEventHandler.scala:46)
>> >>   at kafka.producer.async.ProducerSendThread.tryToHandle(ProducerSendThread.scala:119)
>> >>   at kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:98)
>> >>   at kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:74)
>> >>   at scala.collection.immutable.Stream.foreach(Stream.scala:254)
>> >>   at kafka.producer.async.ProducerSendThread.processEvents(ProducerSendThread.scala:73)
>> >>   at kafka.producer.async.ProducerSendThread.run(ProducerSendThread.scala:43)
>> >> - Connected to dualstack.kafka-xyz.us-east-1.elb.amazonaws.com:9092 for producing
>> >>
>> >> Has anybody faced this kind of timeout before? Does it indicate any
>> >> resource misconfiguration? The CPU usage on the brokers is pretty low.
>> >> Also, in spite of setting the batch size to 100, the failing batches
>> >> usually only have 50 to 60 events. Is there another limit I am hitting?
>> >>
>> >> Any help is appreciated.
>> >>
>> >> Regards,
>> >> Vaibhav
>> >> GumGum
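
P.S. Since I couldn't find a resend setting, the stopgap I'm considering is an application-level retry around send(). Note this only covers producer.type=sync sends; with async producers the failure happens on the background ProducerSendThread (as in the stack trace above), where the webapp can't catch it. This is only a sketch, not anything from the Kafka docs, and the retry count and backoff are arbitrary:

    // Hypothetical retry wrapper around the Kafka 0.7 javaapi producer.
    import kafka.javaapi.producer.Producer;
    import kafka.javaapi.producer.ProducerData;

    public class RetryingSender {
        private static final int RETRIES = 3; // arbitrary; tune for your latency budget

        public static void sendWithRetry(Producer<String, String> producer,
                                         ProducerData<String, String> data)
                throws InterruptedException {
            for (int attempt = 1; ; attempt++) {
                try {
                    producer.send(data);  // synchronous when producer.type=sync
                    return;
                } catch (Exception e) {   // socket errors from the Scala client surface here
                    if (attempt >= RETRIES) {
                        // Give up; the caller can log or spool the event for replay.
                        throw new RuntimeException("send failed after " + RETRIES + " attempts", e);
                    }
                    Thread.sleep(200L * attempt); // simple linear backoff between attempts
                }
            }
        }
    }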
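And if we end up replacing the ELB with HAProxy as Niek suggests, I assume a minimal TCP-mode config would look roughly like this. The host names and the 5-minute timeouts are placeholders, not recommendations; the point is only that the idle timeout becomes configurable, unlike the ELB's fixed 60 seconds:

    # haproxy.cfg sketch: TCP passthrough to the three brokers.
    defaults
        mode tcp
        timeout connect 5s
        timeout client  300s
        timeout server  300s

    listen kafka
        bind *:9092
        balance roundrobin
        server broker1 broker1.example.com:9092 check
        server broker2 broker2.example.com:9092 check
        server broker3 broker3.example.com:9092 check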