If linger.ms is 0, batching does not add to the latency. It actually
improves throughput without affecting latency. Enabling batching does not
mean the producer waits for a batch to fill up. Whatever accumulates while
the previous batch is being sent goes out in the next batch, even if it is
smaller than batch.size.
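
For reference, here is a minimal sketch of the two settings discussed above,
built with plain java.util.Properties. The broker address and the class and
method names are placeholders I made up for illustration; the values shown
for linger.ms and batch.size are the documented defaults, so setting them
explicitly only makes the behaviour visible.

```java
import java.util.Properties;

public class ProducerBatchingConfig {

    // Builds producer properties with the batching knobs spelled out.
    static Properties batchingProps() {
        Properties props = new Properties();
        // Placeholder broker address (assumption, adjust for your cluster).
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        // Do not wait for more records before sending a batch.
        props.put("linger.ms", "0");
        // Upper bound on a batch, in bytes; a batch is sent even when smaller.
        props.put("batch.size", "16384");
        return props;
    }

    public static void main(String[] args) {
        Properties props = batchingProps();
        System.out.println("linger.ms=" + props.getProperty("linger.ms"));
        System.out.println("batch.size=" + props.getProperty("batch.size"));
    }
}
```

These properties would then be passed to the KafkaProducer constructor.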

You do not have to work with Future. With a callback you essentially get an
async model (and you can make use of it if your webservice is using Servlet
3.0).


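producer.send(record, new AsyncCallback(request, response));


// Callback and RecordMetadata come from org.apache.kafka.clients.producer.
static final class AsyncCallback implements Callback {

    private final HttpServletRequest request;
    private final HttpServletResponse response;

    AsyncCallback(HttpServletRequest request, HttpServletResponse response) {
        this.request = request;
        this.response = response;
    }

    @Override
    public void onCompletion(RecordMetadata metadata, Exception exception) {
        // exception is null when the send succeeded.
        // Check exception and send the appropriate response.
    }
}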
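
If a single request carries a batch of records, one way to avoid juggling
one future per record is to share a latch and a counter across the
callbacks. The sketch below is only an illustration under assumptions:
AsyncSender is a hypothetical stand-in for a callback-taking send() so the
example runs without a broker, and BatchAckSketch and sendAll are names I
made up, not part of the Kafka API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BiConsumer;

public class BatchAckSketch {

    /** Hypothetical stand-in for a callback-taking send(); not a Kafka type. */
    interface AsyncSender {
        void send(String record, BiConsumer<String, Exception> onCompletion);
    }

    /**
     * Sends every record, waits up to timeoutMs for the callbacks, and returns
     * how many records were not confirmed (failed or still pending).
     */
    static int sendAll(AsyncSender sender, List<String> records, long timeoutMs)
            throws InterruptedException {
        CountDownLatch pending = new CountDownLatch(records.size());
        AtomicInteger successes = new AtomicInteger();
        for (String record : records) {
            sender.send(record, (ack, exception) -> {
                if (exception == null) {
                    successes.incrementAndGet();
                }
                pending.countDown();
            });
        }
        // Whether or not the wait times out, anything not confirmed counts
        // as a failure.
        pending.await(timeoutMs, TimeUnit.MILLISECONDS);
        return records.size() - successes.get();
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        // Fake sender: completes on another thread and rejects empty records.
        AsyncSender fake = (record, callback) -> pool.submit(() -> {
            if (record.isEmpty()) {
                callback.accept(null, new IllegalArgumentException("empty record"));
            } else {
                callback.accept("ack:" + record, null);
            }
        });
        int failed = sendAll(fake, Arrays.asList("a", "", "c"), 1000);
        System.out.println("unconfirmed sends: " + failed);
        pool.shutdown();
    }
}
```

The webservice would respond with success only when the returned count is
zero, which keeps the "only consider messages sent once the callback
confirms them" rule for the whole batch.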

On Mon, Aug 17, 2015 at 10:49 AM Neelesh <neele...@gmail.com> wrote:

> Thanks for the answers. Indeed, the callback model is the same regardless
> of batching. But for a synchronous web service, batching creates a latency
> issue. linger.ms is by default set to zero. Also, Java futures are hard to
> work with compared to Scala futures. The current API also returns one
> future per single record send (correct me if I missed another variant),
> which leaves the client code to deal with hundreds of futures and/or
> callbacks. Maybe I'm missing something very obvious in the new API, but
> this model and the fact that the Scala APIs are going away make writing an
> ingestion service in front of Kafka more involved than with the 0.8.1 API.
>
> On Sun, Aug 16, 2015 at 12:02 AM, Kishore Senji <kse...@gmail.com> wrote:
>
> > Adding to what Gwen already mentioned -
> >
> > The programming model for the Producer is send() with an optional
> > callback, and we get a Future. This model does not change whether
> > batching is done behind the scenes or not. So your fault tolerance
> > logic really should not depend on whether batching is done over the
> > wire for performance reasons. Assuming that you will get better fault
> > tolerance without batching is also not accurate, as you still have to
> > check for an exception in onCompletion().
> >
> > The webservice should have a callback registered (using which you
> > essentially get async model) for every send() and based on that it should
> > respond to its clients whether the call is successful or not. The clients
> > of your webservice should have fault tolerance built on top of your
> > response codes.
> >
> > I think batching is a good thing as you get better throughput. Plus, if
> > you do not have linger.ms set, it does not wait until it completely
> > reaches the batch.size, so all the concurrent requests to your
> > webservice will get batched and sent to the broker, which will increase
> > the throughput of the Producer and in turn your webservice.
> >
> > On Fri, Aug 14, 2015 at 6:10 PM Gwen Shapira <g...@confluent.io> wrote:
> >
> > > Hi Neelesh :)
> > >
> > > The new producer has configuration for controlling the batch sizes.
> > > By default, it will batch as much as possible without delay
> > > (controlled by linger.ms) and without using too much memory
> > > (controlled by batch.size).
> > >
> > > As mentioned in the docs, you can set batch.size to 0 to disable
> > > batching completely if you want.
> > >
> > > It is worthwhile to consider using the producer callback to avoid
> > > losing messages when the webservice crashes (for example, have the
> > > webservice only consider messages as sent if the callback is
> > > triggered for a successful send).
> > >
> > > You can read more information on batching here:
> > >
> > > http://ingest.tips/2015/07/19/tips-for-improving-performance-of-kafka-producer/
> > >
> > > And some examples on how to produce data to Kafka with the new
> > > producer, both with futures and callbacks, here:
> > >
> > > https://github.com/gwenshap/kafka-examples/blob/master/SimpleCounter/src/main/java/com/shapira/examples/producer/simplecounter/DemoProducerNewJava.java
> > >
> > > Gwen
> > >
> > >
> > >
> > > On Fri, Aug 14, 2015 at 5:07 PM, Neelesh <neele...@gmail.com> wrote:
> > >
> > > > We are fronting all our Kafka requests with a simple web service
> > > > (we do some additional massaging and writing to other stores as
> > > > well). The new KafkaProducer in 0.8.2 seems very geared towards
> > > > producer batching. Most of our payloads are single messages.
> > > >
> > > > Producer batching basically sets us up for lost messages if our web
> > > > service goes down with unflushed messages in the producer.
> > > >
> > > > Another issue is when we have a batch of records. It looks like I
> > > > have to call producer.send for each record and deal with the
> > > > individual futures returned.
> > > >
> > > > Are there any patterns for primarily single message requests, without
> > > > losing data? I understand the throughput will be low.
> > > >
> > > > Thanks!
> > > > -Neelesh
> > > >
> > >
> >
>
