num.retries was added in 0.7.1, which has just been released.
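
For anyone who lands on this thread later, here is a minimal sketch of
turning it on (this assumes the 0.7.x Java producer API; the ZooKeeper
address, topic name, and payload are placeholders, not taken from this
thread):

  import java.util.Properties;
  import kafka.javaapi.producer.Producer;
  import kafka.javaapi.producer.ProducerData;
  import kafka.producer.ProducerConfig;

  public class RetryingProducerSketch {
      public static void main(String[] args) {
          Properties props = new Properties();
          props.put("zk.connect", "zk1:2181");  // placeholder ZooKeeper address
          props.put("serializer.class", "kafka.serializer.StringEncoder");
          props.put("producer.type", "async");
          props.put("batch.size", "100");
          props.put("num.retries", "3");        // new in 0.7.1; defaults to 0

          Producer<String, String> producer =
              new Producer<String, String>(new ProducerConfig(props));
          producer.send(new ProducerData<String, String>("impressions", "hello"));
          producer.close();
      }
  }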

Thanks,

Jun

On Wed, Jun 27, 2012 at 2:43 PM, Vaibhav Puranik <vpura...@gmail.com> wrote:

> Jun,
>
> I wrote a test producer to check whether num.retries is working, and it
> appears it is not. No matter how many retries I set, whenever a message
> send fails, the message never gets to the broker.
> I am using Kafka 0.7.0.
>
> Is this a known problem? Do I need to file a JIRA issue?
>
> Because we are using the async producer, we have no way to catch the
> exception ourselves and act on it. Is that right? Any ideas on how we can
> ensure that every single message is sent, with retries?
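>
> (For context, here is the sort of thing we would have to hand-roll if we
> switched to a sync producer, with producer.type=sync so that send()
> throws to the caller. A sketch only; producer, event, and saveForReplay
> are placeholder names:)
>
>   int maxRetries = 3;
>   for (int attempt = 0; attempt <= maxRetries; attempt++) {
>       try {
>           producer.send(new ProducerData<String, String>("impressions", event));
>           break;  // sent successfully, stop retrying
>       } catch (RuntimeException e) {
>           if (attempt == maxRetries) {
>               saveForReplay(event);  // hypothetical helper: persist for later replay
>           }
>       }
>   }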
>
> Regards,
> Vaibhav
>
> On Tue, Jun 26, 2012 at 4:52 PM, Jun Rao <jun...@gmail.com> wrote:
>
> > Set num.retries in the producer config property file. It defaults to 0.
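> >
> > As a sketch, the property file entries would look like this (the value
> > 3 is just an example):
> >
> >   producer.type=async
> >   num.retries=3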
> >
> > Thanks,
> >
> > Jun
> >
> > On Tue, Jun 26, 2012 at 4:46 PM, Vaibhav Puranik <vpura...@gmail.com>
> > wrote:
> >
> > > I reduced the batch size and reduced the number of pooled connections.
> > > The number of errors has gone down significantly, but they are not
> > > eliminated yet.
> > >
> > > We definitely don't want to lose any events.
> > >
> > > Jun, how do I configure the client resend you mentioned below? I
> > > couldn't find any configuration for it.
> > >
> > > Regards,
> > > Vaibhav
> > >
> > > On Tue, Jun 26, 2012 at 9:27 AM, Vaibhav Puranik <vpura...@gmail.com>
> > > wrote:
> > >
> > > > These are great pointers.
> > > > I found some more discussion here:
> > > > https://forums.aws.amazon.com/thread.jspa?threadID=33427
> > > >
> > > > I can do the following to keep using the elastic load balancer:
> > > >
> > > > 1) Reduce the producer pool size to 1 or 2, because it looks like
> > > > connections are sitting idle. Our volume does not require a pool
> > > > that big.
> > > > 2) Reduce the batch size so that the webapp flushes data to the
> > > > brokers more frequently. That's better for us anyway (sketched below).
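> > > >
> > > > Roughly along these lines (a sketch only; producerFactory stands in
> > > > for whatever PoolableObjectFactory we already give commons-pool 1.x,
> > > > and the numbers are illustrative):
> > > >
> > > >   // org.apache.commons.pool.impl.GenericObjectPool (commons-pool 1.x)
> > > >   GenericObjectPool producerPool = new GenericObjectPool(producerFactory);
> > > >   producerPool.setMaxActive(2);   // was 10; fewer connections left idle
> > > >   producerPool.setMaxIdle(2);
> > > >
> > > >   Properties props = new Properties();
> > > >   props.put("producer.type", "async");
> > > >   props.put("batch.size", "20");  // was 100; flush to the brokers sooner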
> > > >
> > > > I will try both of these options and report back.
> > > >
> > > > Thank you very much Jun and Niek.
> > > >
> > > > Regards,
> > > > Vaibhav
> > > >
> > > >
> > > > On Tue, Jun 26, 2012 at 8:52 AM, Niek Sanders <niek.sand...@gmail.com> wrote:
> > > >
> > > >> ELBs will close connections that have had no data going across them
> > > >> for 60 seconds. A reference to this behavior can be found at the
> > > >> bottom of this page:
> > > >>
> > > >> http://aws.amazon.com/articles/1636185810492479
> > > >>
> > > >> There is currently no way for customers to increase this timeout. If
> > > >> this timeout is in fact the problem, then the alternative is to use
> > > >> HAProxy for load balancing instead.
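> > > >>
> > > >> For reference, a minimal haproxy.cfg sketch of that setup (HAProxy
> > > >> 1.4-era syntax; the broker addresses are placeholders):
> > > >>
> > > >>   listen kafka
> > > >>       bind *:9092
> > > >>       mode tcp
> > > >>       balance roundrobin
> > > >>       timeout client 300s
> > > >>       timeout server 300s
> > > >>       server broker1 10.0.0.1:9092 check
> > > >>       server broker2 10.0.0.2:9092 check
> > > >>       server broker3 10.0.0.3:9092 check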
> > > >>
> > > >> - Niek
> > > >>
> > > >>
> > > >>
> > > >>
> > > >> On Tue, Jun 26, 2012 at 7:55 AM, Jun Rao <jun...@gmail.com> wrote:
> > > >> > Vaibhav,
> > > >> >
> > > >> > Does the elastic load balancer have any timeouts or quotas that
> > > >> > kill existing socket connections? Does the client resend succeed
> > > >> > (you can configure resends in DefaultEventHandler)?
> > > >> >
> > > >> > Thanks,
> > > >> >
> > > >> > Jun
> > > >> >
> > > >> > On Mon, Jun 25, 2012 at 6:01 PM, Vaibhav Puranik <vpura...@gmail.com> wrote:
> > > >> >
> > > >> >> Hi all,
> > > >> >>
> > > >> >> We are sending our ad impressions to Kafka 0.7.0. I am using async
> > > >> >> producers in our web app.
> > > >> >> I am pooling Kafka producers with commons pool. The pool size is
> > > >> >> 10, and batch.size is 100.
> > > >> >>
> > > >> >> We have 3 c1.xlarge instances with Kafka brokers installed behind
> > > >> >> an Elastic Load Balancer in AWS.
> > > >> >> Every minute we lose some events because of the following exception:
> > > >> >>
> > > >> >> - Disconnecting from dualstack.kafka-xyz.us-east-1.elb.amazonaws.com:9092
> > > >> >> - Error in handling batch of 64 events
> > > >> >> java.io.IOException: Connection timed out
> > > >> >>    at sun.nio.ch.FileDispatcher.write0(Native Method)
> > > >> >>    at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:29)
> > > >> >>    at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:104)
> > > >> >>    at sun.nio.ch.IOUtil.write(IOUtil.java:75)
> > > >> >>    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:334)
> > > >> >>    at kafka.network.BoundedByteBufferSend.writeTo(BoundedByteBufferSend.scala:51)
> > > >> >>    at kafka.network.Send$class.writeCompletely(Transmission.scala:76)
> > > >> >>    at kafka.network.BoundedByteBufferSend.writeCompletely(BoundedByteBufferSend.scala:25)
> > > >> >>    at kafka.producer.SyncProducer.liftedTree1$1(SyncProducer.scala:88)
> > > >> >>    at kafka.producer.SyncProducer.send(SyncProducer.scala:87)
> > > >> >>    at kafka.producer.SyncProducer.multiSend(SyncProducer.scala:128)
> > > >> >>    at kafka.producer.async.DefaultEventHandler.send(DefaultEventHandler.scala:52)
> > > >> >>    at kafka.producer.async.DefaultEventHandler.handle(DefaultEventHandler.scala:46)
> > > >> >>    at kafka.producer.async.ProducerSendThread.tryToHandle(ProducerSendThread.scala:119)
> > > >> >>    at kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:98)
> > > >> >>    at kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:74)
> > > >> >>    at scala.collection.immutable.Stream.foreach(Stream.scala:254)
> > > >> >>    at kafka.producer.async.ProducerSendThread.processEvents(ProducerSendThread.scala:73)
> > > >> >>    at kafka.producer.async.ProducerSendThread.run(ProducerSendThread.scala:43)
> > > >> >> - Connected to dualstack.kafka-xyz.us-east-1.elb.amazonaws.com:9092 for producing
> > > >> >>
> > > >> >> Has anybody faced this kind of timeout before? Does it indicate
> > > >> >> any resource misconfiguration? The CPU usage on the brokers is
> > > >> >> pretty low.
> > > >> >> Also, in spite of setting batch.size to 100, the failing batches
> > > >> >> usually only have 50 to 60 events. Is there some other limit I am
> > > >> >> hitting?
> > > >> >>
> > > >> >> Any help is appreciated.
> > > >> >>
> > > >> >>
> > > >> >> Regards,
> > > >> >> Vaibhav
> > > >> >> GumGum
> > > >> >>
> > > >>
> > > >
> > > >
> > >
> >
>
