Jun, I wrote a test producer to check whether num.retries is working. It
doesn't appear to be: no matter how many retries I set, whenever a message
send fails it never reaches the broker. I am using Kafka 0.7.0.
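For reference, a minimal sketch of such a test producer (the broker address,
topic, and retry count are placeholders; it assumes the 0.7 Java producer
API and the broker.list connection style we use behind the load balancer):

    import java.util.Properties;

    import kafka.javaapi.producer.Producer;
    import kafka.javaapi.producer.ProducerData;
    import kafka.producer.ProducerConfig;

    public class RetryTestProducer {
        public static void main(String[] args) {
            Properties props = new Properties();
            // 0.7 broker.list entries are brokerid:host:port; host is a placeholder
            props.put("broker.list",
                    "0:dualstack.kafka-xyz.us-east-1.elb.amazonaws.com:9092");
            props.put("serializer.class", "kafka.serializer.StringEncoder");
            props.put("producer.type", "async");
            props.put("batch.size", "100");
            // the property under test; per Jun's note below it defaults to 0
            props.put("num.retries", "3");

            Producer<String, String> producer =
                    new Producer<String, String>(new ProducerConfig(props));

            // with producer.type=async, send() only enqueues; the background
            // ProducerSendThread batches the events and writes them to the broker
            producer.send(new ProducerData<String, String>("test-topic", "test-message"));
            producer.close();
        }
    }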
Is this a known problem? Do I need to file a JIRA issue? Because we are
using the async producer, we have no way to catch the exception ourselves
and act on it. Is that right? Any ideas how we can ensure that every single
message is sent, with retries?

Regards,
Vaibhav

On Tue, Jun 26, 2012 at 4:52 PM, Jun Rao <jun...@gmail.com> wrote:

> Set num.retries in the producer config property file. It defaults to 0.
>
> Thanks,
>
> Jun
>
> On Tue, Jun 26, 2012 at 4:46 PM, Vaibhav Puranik <vpura...@gmail.com>
> wrote:
>
> > I reduced the batch size and the number of pooled connections. The
> > number of errors has gone down significantly, but they are not
> > eliminated yet.
> >
> > We definitely don't want to lose any events.
> >
> > Jun, how do I configure the client resend you mentioned below? I
> > couldn't find any configuration for it.
> >
> > Regards,
> > Vaibhav
> >
> > On Tue, Jun 26, 2012 at 9:27 AM, Vaibhav Puranik <vpura...@gmail.com>
> > wrote:
> >
> > > These are great pointers.
> > > I found some more discussion here:
> > > https://forums.aws.amazon.com/thread.jspa?threadID=33427
> > >
> > > I can do the following to keep using the elastic load balancer:
> > >
> > > 1) Reduce the producer pool size to 1 or 2, since it looks like
> > > connections are sitting idle. Our volume does not require that big
> > > a pool.
> > > 2) Reduce the batch size so that the webapp flushes data to the
> > > brokers more frequently. That is better for us anyway.
> > >
> > > I will try both of these options and report back.
> > >
> > > Thank you very much, Jun and Niek.
> > >
> > > Regards,
> > > Vaibhav
> > >
> > > On Tue, Jun 26, 2012 at 8:52 AM, Niek Sanders <niek.sand...@gmail.com>
> > > wrote:
> > >
> > >> ELBs will close connections that have no data going across them for a
> > >> 60 sec period. A reference to this behavior can be found at the
> > >> bottom of this page:
> > >>
> > >> http://aws.amazon.com/articles/1636185810492479
> > >>
> > >> There is currently no way for customers to increase this timeout. If
> > >> this timeout is in fact the problem, then the alternative is to use
> > >> HAProxy for load balancing instead.
> > >>
> > >> - Niek
> > >>
> > >> On Tue, Jun 26, 2012 at 7:55 AM, Jun Rao <jun...@gmail.com> wrote:
> > >> > Vaibhav,
> > >> >
> > >> > Does the elastic load balancer have any timeouts or quotas that
> > >> > kill existing socket connections? Does the client resend succeed
> > >> > (you can configure resends in DefaultEventHandler)?
> > >> >
> > >> > Thanks,
> > >> >
> > >> > Jun
> > >> >
> > >> > On Mon, Jun 25, 2012 at 6:01 PM, Vaibhav Puranik
> > >> > <vpura...@gmail.com> wrote:
> > >> >
> > >> >> Hi all,
> > >> >>
> > >> >> We are sending our ad impressions to Kafka 0.7.0. I am using async
> > >> >> producers in our web app.
> > >> >> I am pooling Kafka producers with Commons Pool. Pool size - 10.
> > >> >> batch.size is 100.
> > >> >>
> > >> >> We have 3 c1.xlarge instances with Kafka brokers installed behind
> > >> >> an elastic load balancer in AWS.
> > >> >>
> > >> >> Every minute we lose some events because of the following exception:
> > >> >>
> > >> >> - Disconnecting from dualstack.kafka-xyz.us-east-1.elb.amazonaws.com:9092
> > >> >> - Error in handling batch of 64 events
> > >> >> java.io.IOException: Connection timed out
> > >> >>     at sun.nio.ch.FileDispatcher.write0(Native Method)
> > >> >>     at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:29)
> > >> >>     at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:104)
> > >> >>     at sun.nio.ch.IOUtil.write(IOUtil.java:75)
> > >> >>     at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:334)
> > >> >>     at kafka.network.BoundedByteBufferSend.writeTo(BoundedByteBufferSend.scala:51)
> > >> >>     at kafka.network.Send$class.writeCompletely(Transmission.scala:76)
> > >> >>     at kafka.network.BoundedByteBufferSend.writeCompletely(BoundedByteBufferSend.scala:25)
> > >> >>     at kafka.producer.SyncProducer.liftedTree1$1(SyncProducer.scala:88)
> > >> >>     at kafka.producer.SyncProducer.send(SyncProducer.scala:87)
> > >> >>     at kafka.producer.SyncProducer.multiSend(SyncProducer.scala:128)
> > >> >>     at kafka.producer.async.DefaultEventHandler.send(DefaultEventHandler.scala:52)
> > >> >>     at kafka.producer.async.DefaultEventHandler.handle(DefaultEventHandler.scala:46)
> > >> >>     at kafka.producer.async.ProducerSendThread.tryToHandle(ProducerSendThread.scala:119)
> > >> >>     at kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:98)
> > >> >>     at kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:74)
> > >> >>     at scala.collection.immutable.Stream.foreach(Stream.scala:254)
> > >> >>     at kafka.producer.async.ProducerSendThread.processEvents(ProducerSendThread.scala:73)
> > >> >>     at kafka.producer.async.ProducerSendThread.run(ProducerSendThread.scala:43)
> > >> >> - Connected to dualstack.kafka-xyz.us-east-1.elb.amazonaws.com:9092 for producing
> > >> >>
> > >> >> Has anybody faced this kind of timeout before? Does it indicate any
> > >> >> resource misconfiguration? The CPU usage on the brokers is pretty low.
> > >> >> Also, in spite of setting the batch size to 100, the failing batches
> > >> >> usually have only 50 to 60 events. Is there any other limit I am
> > >> >> hitting?
> > >> >>
> > >> >> Any help is appreciated.
> > >> >>
> > >> >> Regards,
> > >> >> Vaibhav
> > >> >> GumGum
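To make the pooled-producer setup described above concrete, here is a
minimal sketch of that kind of arrangement, with the reduced pool size and
batch size from the plan earlier in the thread (class names and property
values are illustrative; it assumes the commons-pool 1.x API and the 0.7
Java producer API):

    import java.util.Properties;

    import kafka.javaapi.producer.Producer;
    import kafka.javaapi.producer.ProducerData;
    import kafka.producer.ProducerConfig;

    import org.apache.commons.pool.BasePoolableObjectFactory;
    import org.apache.commons.pool.impl.GenericObjectPool;

    public class PooledKafkaSender {

        // creates one async producer per pooled slot
        static class ProducerFactory extends BasePoolableObjectFactory {
            public Object makeObject() {
                Properties props = new Properties();
                // placeholder broker list (0.7 format: brokerid:host:port)
                props.put("broker.list",
                        "0:dualstack.kafka-xyz.us-east-1.elb.amazonaws.com:9092");
                props.put("serializer.class", "kafka.serializer.StringEncoder");
                props.put("producer.type", "async");
                // a smaller batch flushes sooner, keeping the connection busy
                props.put("batch.size", "20");
                return new Producer<String, String>(new ProducerConfig(props));
            }

            public void destroyObject(Object obj) {
                ((Producer) obj).close();
            }
        }

        private final GenericObjectPool pool;

        public PooledKafkaSender(int poolSize) {
            pool = new GenericObjectPool(new ProducerFactory());
            pool.setMaxActive(poolSize); // e.g. 1 or 2 instead of 10
        }

        @SuppressWarnings("unchecked")
        public void send(String topic, String message) throws Exception {
            Producer<String, String> producer =
                    (Producer<String, String>) pool.borrowObject();
            try {
                producer.send(new ProducerData<String, String>(topic, message));
            } finally {
                pool.returnObject(producer);
            }
        }
    }

The idea, per Niek's note about the ELB, is that fewer pooled producers and
smaller batches mean each connection carries traffic more often, so it is
less likely to sit idle past the 60-second limit and get closed.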